CMU-CS-24-120 Computer Science Department School of Computer Science, Carnegie Mellon University
Data-driven algorithm design and principled Dravyansh Sharma Ph.D. Thesis July 2024
For any new machine learning technique, a large body of research often fol- lows up in order to tune the technique to work suitably for each of the numerous application areas, requiring significant scientific and engineering efforts. Moreover, this typically involves unprincipled approaches for hyperparameter selection without any guarantee on global optimality. Inspired from the recently proposed paradigm of ‘data-driven algorithm design’, this thesis develops first principled hyperparame- ter tuning techniques for several core machine learning algorithms, including semi- supervised learning, regularized linear regression, robust nearest neighbors and de- cision trees, with formal near-optimality guarantees in statistical and online learning settings. Given multiple problem instances of a learning problem from some problem domain, we develop approaches to learn provably well-tuned parameters over the domain and answer questions related to the number of problem samples needed to learn a well-tuned learning algorithm. In addition, we also develop online learning techniques when the problem instances arrive in a potentally adversarial sequence. Our approaches apply to the following diverse scenarios:
240 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department
| |
Return to:
SCS Technical Report Collection This page maintained by reports@cs.cmu.edu |