CMU-CS-22-146 Computer Science Department School of Computer Science, Carnegie Mellon University
Algorithms for Learning Latent Models: Ainesh Bakshi Ph.D. Thesis August 2022
Modern machine learning relies on algorithms that fit expressive latent models to large datasets. While such tasks are easy in low dimensions, real-world datasets are truly high-dimensional, often leading to computational intractability. Additionally, a prerequisite to deploying models in real-world systems is to ensure that their behavior degrades gracefully when the modeling assumptions no longer hold. Therefore, there is a growing need for efficient algorithms that fit reliable and robust models to data and are accompanied with provable guarantees. In this thesis, we focus on designing such efficient and robust algorithms for learning latent variable models. In particular, we investigate two complementary regimes arising in learning latent models: establishing computational tractability and approaching computational optimality. The first regime considers learning high-dimensional latent models where no efficient algorithms were known. We resolve several central open questions in this regime, by providing the first polynomial time algorithms for robustly learning a mixture of Gaussians, robust linear regression and learning two-layer neural networks. The second regime considers models where polynomial time algorithms were already well-established. Here, we show that we can obtain algorithms with information-theoretically minimal running time and sample complexity. In particular, we show that for several low-rank models there is no statistical vs. computational trade-off.
629 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department
| |
Return to:
SCS Technical Report Collection This page maintained by reports@cs.cmu.edu |