CMU-CS-15-149
Computer Science Department
School of Computer Science, Carnegie Mellon University



CMU-CS-15-149

An Experimental Study into Spectral and
Geometric Approaches to Data Clustering

Prashant Sridhar

October 2015

M.S. Thesis

CMU-CS-15-149.pdf


Keywords: NA Geometry, Nonparametric, Clustering

Common clustering techniques involve assumptions on the distribution that the data is drawn from, with linearity often being a standard assumption. These techniques work poorly on irregular data sets or on rare clusters. Non-parametric solutions are often inefficient to use in practice and still fail to find rare clusters. This thesis explores geometric approaches to non-parametric clustering that should be much better at identifying these rare clusters. The first approach explores how the Gabriel graph can be used to compute density based distance metrics and how the Gabriel graph itself might be used to cluster data. The second approach views the probability density function as a heat distribution over a metal plate. Traditionally, finite element or control volume methods construct graphs over meshes on the metal plate. This approach explores how these meshes could be used to partition space. Finally, we present the results of running these clustering algorithms on flow cytometry data and compare these to present state of the art non-parametric methods on the field.

56 pages

Thesis Committee:
Gary Miller (Chair)
Alex Smola

Frank Pfenning, Head, Computer Science Department
Andrew W. Moore, Dean, School of Computer Science



Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by reports@cs.cmu.edu