COMPUTER SCIENCE TECHNICAL REPORT ABSTRACTS

CMU-CS-23-116
Computer Science Department
School of Computer Science, Carnegie Mellon University

CMU-CS-23-116

Designing and Analyzing Machine Learning
Algorithms in the Presence of Strategic

Hanrui Zhang

Ph.D. Thesis

July 2023

CMU-CS-23-116.pdf

Keywords: Machine Learning, Mechanism Design, Sample Complexity, Markov Decision Processes

Machine learning algorithms now play a major role in all kinds of decision-making scenarios, such as college admissions, credit approval, and resume screening. When the stakes are high, self-interested agents – about whom decisions are being made – are increasingly tempted to manipulate the machine learning algorithm, in order to better fulfill their own goals, which are generally different from the decision maker's. The fact that many machine learning algorithms can be manipulated also raises a fairness concern, since algorithms that are manipulable tend to be manipulated most effectively by those who have more resources and who are already entrenched in the system. All this highlights the importance of making machine learning algorithms robust against manipulation. The main focus of this dissertation is on designing and analyzing machine learning algorithms that are robust against strategic manipulation, which is different from the relatively well-studied notion of adversarial robustness.

This dissertation sets the foundations for several key problems in machine learning in the presence of strategic behavior:

Empirical risk minimization and generalization in classification problems [66, 111, 115]: Traditional wisdom suggests that a classifier trained on historical observations (i.e., an empirical risk minimizer) usually also works well on future data points to be classified. Is this still true in the presence of strategic manipulation?
Distinguishing distributions with samples [113, 114, 116]: Due to various constraints, often we have to judge the "quality" of an agent based on a few samples(e.g., screening job candidates based on a few selected papers). How should we calibrate our judgment when these samples are strategically selected or trans- formed?
Planning in Markov decision processes [110, 117, 118]: Dynamic decision-making problems (traditionally modeled using Markov decision processes) can be solved efficiently when the decision maker always has complete and reliable information about the state of the world, as well as full control over which actions to take. What happens when the state of the world is reported by a strategic agent, or when a self-interested agent may interfere with the actions taken?

201 pages

Thesis Committee:
Vincent Conitzer (Chair)
Maria-Florina Balcan
Tuomas Sandholm
Nika Haghtalab (University of California, Berkeley)
Vahab Mirrokni (Google Research)
Renato Paes Leme (Google Research)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by reports@cs.cmu.edu