Center for Automated Learning and Discovery
School of Computer Science, Carnegie Mellon University
Automatic Discovery of Latent Variable Models
How to formalize the process of discovering latent variables and models associated with them is the goal of this thesis. More than finding a good probabilistic model that fits the data well, we describe how, in some situations, we can identify causal features common to all models that equally explain the data. Such common features describe causal relations among observed and hidden variables. Although this goal might seem ambitious, it is a natural extension of several years of work in discovering causal models from observational data through the use of graphical models. Learning causal relations without experiments basically amounts to discovering an unobservable fact (does A cause B?) from observable measurements (the joint distribution of a set of variables that include A and B). We take this idea one step further by discovering which hidden variables exist to begin with.
More specifically, we describe algorithms for learning causal latent variable models when ob- served variables are noisy linear measurements of unobservable entities, without postulating a priori which latents might exist. Most of the thesis concerns how to identify latents by describing which observed variables are their respective measurements. In some situations, we will also assume that latents are linearly dependent, and in this case causal relations among latents can be partially identified. While continuous variables are the main focus of the thesis, we also describe how to adapt this idea to the case where observed variables are ordinal or binary. Finally, we examine density estimation, where knowing causal relations or the true model behind a data generating process is not necessary. However, we illustrate how ideas developed in causal discovery can help the design of algorithms for multivariate density estimation.
||SCS Technical Report Collection
School of Computer Science homepage
This page maintained by firstname.lastname@example.org