CMU-ML-10-103
Machine Learning Department
School of Computer Science, Carnegie Mellon University




Active Learning for Fast Drug Discovery

Anqui Cui*, Jeff Schneider

March 2010



Keywords: Active learning, bandit problems, drug discovery, function approximator, Gaussian process, regression tree


Drug discovery is costly in both time and money: chemists must run many experiments to screen for useful compounds. In this paper, we present active learning algorithms for fast drug discovery. The algorithms reduce the number of experiments required to find the best-performing compounds among a large number of candidates. The problem is a classic exploration vs. exploitation dilemma, and our approach is based on the multi-armed bandit problem and on other function approximators. We propose expected improvement estimation as a method for scoring untested compounds. We also apply several traditional models, including UCB algorithms, Gaussian processes, and regression trees. Our results show that the algorithms presented in this paper significantly raise the best compound performance found within a given number of picks. The number of picks needed to first discover the best compound is also reduced to about half of that required by random selection.
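
As a rough illustration of the expected improvement idea mentioned above, the sketch below scores untested compounds and picks the next one to test. It assumes each compound's predicted performance is Gaussian with a known mean and standard deviation, and uses the standard closed-form expression under that assumption; the abstract does not specify the report's own estimator, so this may differ from it. The names `expected_improvement`, `pick_next`, and the example compounds are hypothetical.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Expected amount by which a compound beats the best value seen so far.

    Assumption (not from the report): the predicted performance is
    Gaussian with the given mean and standard deviation, and higher
    performance is better.
    """
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


def pick_next(candidates, best_so_far):
    """Return the name of the candidate with the largest expected improvement.

    `candidates` maps a compound name to a (mean, std) prediction.
    """
    return max(
        candidates,
        key=lambda name: expected_improvement(*candidates[name], best_so_far),
    )


# Hypothetical predictions for two untested compounds.
candidates = {"compound_a": (0.50, 0.01), "compound_b": (0.40, 0.50)}
next_pick = pick_next(candidates, best_so_far=0.50)
```

In this toy example the more uncertain compound is chosen even though its predicted mean is lower, which is how an expected improvement score trades exploration against exploitation.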

28 pages

*State Key Lab of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China

