CMU-ML-19-105
Machine Learning Department
School of Computer Science, Carnegie Mellon University




Statistical and Computational Properties of Some
"User-Friendly" Methods for High-Dimensional Estimation

Alnur Ali

May 2019

Ph.D. Thesis





Historically, the choice of method for a given statistical problem has been driven primarily by two criteria: a method's statistical properties and its computational properties. But as tools for high-dimensional estimation become ubiquitous, it is clear that other considerations, going beyond pure accuracy and computational efficiency, are equally (if not more) important. One such consideration is a method's "user-friendliness", a term we use to encapsulate the various properties that make a method easy to work with in practice, exemplified by a method being (i) easy to implement, (ii) interpretable, and (iii) computationally cheap. In this thesis, we present new statistical and computational results for three different user-friendly methods in various high-dimensional estimation settings.

First, we give conditions for the existence and uniqueness of solutions to the generalized lasso problem, a generalization of the standard lasso problem that allows the user to easily impose domain-appropriate structure on the fitted coefficients. The conditions are very weak and essentially guarantee uniqueness in many settings of practical interest, even in high dimensions; these results are useful from the standpoints of both interpretability and prediction. Second, we consider early-stopped gradient descent (as an estimator), giving a number of results that tightly couple the risk profile of the iterates generated by gradient descent, when run on the fundamental problem of least squares regression, to that of ridge regression. These results are favorable for gradient descent, as it is relatively easy to implement as well as computationally cheap. We also discuss extending the analysis to give a similar coupling for (the arguably even more user-friendly) stochastic gradient descent. Finally, we present a new user-friendly, pseudolikelihood-based method for robust undirected graphical modeling that we call the Multiple Quantile Graphical Model (MQGM), showing that the MQGM recovers the population-level conditional independencies with high probability; this is again a useful result from an interpretability standpoint. We also give a highly efficient algorithm, based on the alternating direction method of multipliers, for fitting the MQGM to high-dimensional and potentially non-Gaussian data.
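For reference, the generalized lasso problem mentioned above takes the following standard form (a sketch, not quoted from the thesis), where D is a user-chosen penalty matrix encoding the desired structure, e.g., a discrete difference operator for the fused lasso:

\begin{equation*}
\hat{\beta} \in \operatorname*{argmin}_{\beta \in \mathbb{R}^p} \; \frac{1}{2} \| y - X \beta \|_2^2 + \lambda \| D \beta \|_1 .
\end{equation*}

The flavor of the gradient descent / ridge regression coupling can be conveyed with a small numerical sketch. The Python snippet below is an illustration under stated assumptions (synthetic data, step size 1/L, and the heuristic calibration lambda = 1/(k * eps)); it is not code from the thesis.

import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = X @ rng.standard_normal(p) + rng.standard_normal(n)

L = np.linalg.norm(X, 2) ** 2      # Lipschitz constant of the least squares gradient
eps = 1.0 / L                      # step size (assumption: 1/L)
beta_gd = np.zeros(p)

for k in range(1, 501):
    # gradient step on (1/2) * ||y - X beta||_2^2
    beta_gd -= eps * (X.T @ (X @ beta_gd - y))
    if k in (10, 100, 500):
        lam = 1.0 / (k * eps)      # heuristic: stopping time k*eps <-> ridge parameter 1/lam
        beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
        rel_diff = np.linalg.norm(beta_gd - beta_ridge) / np.linalg.norm(beta_ridge)
        print(f"k = {k:4d}   relative difference from ridge: {rel_diff:.3f}")

As k grows (and the matched lambda shrinks), the early-stopped iterate and the calibrated ridge solution stay close, giving a rough numerical sense of the coupling described in the abstract; the thesis itself couples their risk profiles.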

129 pages

Thesis Committee:
J. Zico Kolter (Co-Chair)
Ryan J. Tibshirani (Co-Chair)
Sivaraman Balakrishnan (Statistics)
Ameet Talwalkar
John C. Duchi (Stanford University)

Roni Rosenfeld, Head, Machine Learning Department
Tom M. Mitchell, Interim Dean, School of Computer Science

