COMPUTER SCIENCE TECHNICAL REPORT ABSTRACTS

CMU-CS-23-146
Computer Science Department
School of Computer Science, Carnegie Mellon University

CMU-CS-23-146

Robust Mean Estimation Against Oblivious Adversaries

Shuchen Li

M.S. Thesis

December 2023

CMU-CS-23-146.pdf

Keywords: Robust mean estimation, sparse Fourier transform

We give the first algorithm that achieves arbitrarily high accuracy for robust mean estimation on Gaussian and Laplace distribution when the adversary corrupts the samples before the noise is added. For sufficiently small constant α > 0 and accuracy ε > 0, our algorithm takes as input samples Y ⊆ ℝ^d of size n obtained by i.i.d. samples X ⊆ ℝ^d of size n from the true distribution with unknown mean μ, while an adversary corrupts an α fraction of the points before the (Gaussian or Laplace) noise is added. When the true distribution is Gaussian with unknown mean μ and covariance I, our algorithm needs sample size n = 2^O^(d/ε²) . When the true distribution is Laplace with unknown mean μ and covariance I, our algorithm needs sample size n = Ŏ(d²/ε⁴⁾. Our algorithm runs in Ŏ(nd) time, and outputs an estimation μ̂ that ∥μ̂ - μ∥₂ ≤ ε with high probability. Our method is to transform the sample to a Fourier-sparse signal that encodes the true mean μ and apply the sparse Fourier transform to decode μ.

In Huber's contamination model, where the adversary corrupts an α fraction of the sample after the noise is added, it is known that there is a Ω(α) lower bound on the error of the estimator. In contrast, our algorithm can estimate the mean arbitrarily closely in this corruption-before-noise setting.

Our algorithm in this new setting has many possible applications. For example, the one that motivated us is max-affine regression. In max-affine regression, the model is the maximum of k linear models, where the noise is added after taking the maximum. If we can extend our algorithm to the list-decodable setting, then we immediately get an algorithm for the max-affine regression

35 pages

Thesis Committee:
Pravesh K. Kothari (Chair)
David P. Woodruff

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by reports@cs.cmu.edu