CMU-CS-04-109
Computer Science Department
School of Computer Science, Carnegie Mellon University
Drug Screening by Nonparametric Posterior Estimation
Alexander Gray
February 2004
To be presented at ENAR '04.
CMU-CS-04-109.ps
CMU-CS-04-109.pdf
Keywords: Visual screening, high-throughput screening,
classification, nonparametric, metric learning, nonisotropic, kernel
density estimation
Automated high-throughput drug screening constitutes a critical
emerging approach in modern pharmaceutical research. The statistical
task of interest is that of discriminating active versus inactive
molecules given a target molecule, in order to rank potential drug
candidates for further testing. Because the core problem is one of
ranking, our approach concentrates on accurate estimation of unknown
class probabilities, in contrast to popular non-probabilistic methods
which simply estimate decision boundaries. While this motivates
nonparametric density estimation, we are faced with the fact that the
molecular descriptors used in practice typically contain thousands of
binary features. In this paper we attempt to make kernel density
estimation work better in high-dimensional discrimination
settings. We present a synthesis of techniques
(SLAMDUNK: Sphere, Learn A Metric, Discriminate Using Nonisotropic
Kernels) which yields favorable performance in comparison to previously
published approaches to drug screening, as tested on a large
proprietary pharmaceutical dataset.
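
As a rough illustration of the general idea described above (rank candidates by an estimated class probability obtained from per-class kernel density estimates), here is a minimal sketch. It is not the SLAMDUNK implementation from the report: the function names (sphere, log_kde, posterior_active) are hypothetical, the metric-learning step is replaced by a hand-chosen per-dimension bandwidth vector, and the data are small continuous toy samples rather than high-dimensional binary descriptors.

```python
import numpy as np

def sphere(X, eps=1e-6):
    """Return (mean, W) so that (X - mean) @ W has roughly identity covariance."""
    mu = X.mean(axis=0)
    cov = np.cov(X - mu, rowvar=False) + eps * np.eye(X.shape[1])
    vals, vecs = np.linalg.eigh(cov)
    W = vecs / np.sqrt(vals)  # scale each eigenvector column by 1/sqrt(eigenvalue)
    return mu, W

def log_kde(Q, X, h):
    """Log Gaussian kernel density at rows of Q, estimated from rows of X.

    h holds one bandwidth per dimension (a diagonal, nonisotropic kernel);
    a scalar is broadcast to all dimensions.
    """
    n, d = X.shape
    h = np.broadcast_to(np.asarray(h, dtype=float), (d,))
    diff = (Q[:, None, :] - X[None, :, :]) / h
    log_k = (-0.5 * np.sum(diff ** 2, axis=2)
             - np.sum(np.log(h)) - 0.5 * d * np.log(2 * np.pi))
    m = log_k.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return (m + np.log(np.exp(log_k - m).sum(axis=1, keepdims=True))).ravel() - np.log(n)

def posterior_active(Q, X_act, X_inact, h):
    """Estimate P(active | x) by Bayes' rule from the two class densities."""
    prior = len(X_act) / (len(X_act) + len(X_inact))
    log_a = np.log(prior) + log_kde(Q, X_act, h)
    log_i = np.log(1.0 - prior) + log_kde(Q, X_inact, h)
    return np.exp(-np.logaddexp(0.0, log_i - log_a))  # stable 1 / (1 + exp(log_i - log_a))

# Toy usage: two Gaussian blobs stand in for active and inactive molecules.
rng = np.random.default_rng(0)
act = rng.normal(2.0, 1.0, size=(50, 2))
inact = rng.normal(-2.0, 1.0, size=(50, 2))
mu, W = sphere(np.vstack([act, inact]))
to_z = lambda A: (A - mu) @ W  # map raw points into the sphered space

candidates = np.array([[2.0, 2.0], [0.0, 0.0], [-2.0, -2.0]])
p = posterior_active(to_z(candidates), to_z(act), to_z(inact), h=[0.5, 0.5])
ranking = np.argsort(-p)  # candidate indices, most probably active first
```

Because the output is a probability rather than a hard label, candidates can be sorted directly by it, which is the property the ranking argument above relies on. The bandwidths here are fixed by hand purely for illustration.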
9 pages