CMU-CS-15-149Computer Science Department School of Computer Science, Carnegie Mellon University
CMU-CS-15-149
Prashant Sridhar October 2015 M.S. Thesis
Keywords: NA
Geometry, Nonparametric, Clustering
Common clustering techniques involve assumptions on the distribution that the data is drawn from, with linearity often being a standard assumption. These techniques work poorly on irregular data sets or on rare clusters. Non-parametric solutions are often inefficient to use in practice and still fail to find rare clusters. This thesis explores geometric approaches to non-parametric clustering that should be much better at identifying these rare clusters. The first approach explores how the Gabriel graph can be used to compute density based distance metrics and how the Gabriel graph itself might be used to cluster data. The second approach views the probability density function as a heat distribution over a metal plate. Traditionally, finite element or control volume methods construct graphs over meshes on the metal plate. This approach explores how these meshes could be used to partition space. Finally, we present the results of running these clustering algorithms on flow cytometry data and compare these to present state of the art non-parametric methods on the field. 56 pages
Frank Pfenning, Head, Computer Science Department
| |

Return to:
SCS Technical Report Collection This page maintained by reports@cs.cmu.edu |