MACHINE LEARNING TECHNICAL REPORT ABSTRACTS

CMU-ML-10-103
Machine Learning Department
School of Computer Science, Carnegie Mellon University

Active Learning for Fast Drug Discovery

Anqui Cui*, Jeff Schneider

March 2010

CMU-ML-10-103.pdf

Keywords: Active learning, bandit problems, drug discovery, function approximator, Gaussian process, regression tree

Drug discovery is costly in both time and money. Chemists must run many experiments to screen for useful compounds. In this paper, we present active learning algorithms for fast drug discovery. The algorithms reduce the number of experiments required to find the best-performing compounds among a large number of candidates. The problem is a classic exploration vs. exploitation dilemma, and our approach is based on the multi-armed bandit problem and on function approximators. We propose expected improvement estimation as a method for scoring untested compounds. We also apply several traditional models, including UCB algorithms, Gaussian processes, and regression trees. Our results show that the algorithms presented in this paper significantly raise the best performance of compounds found within a given number of picks. The number of picks needed to first discover the best compound is also reduced to about half that of random selection.

28 pages

*State Key Lab of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing, P.R. China
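The abstract names expected improvement as the score for untested compounds but does not give a formula. As an illustration only, the sketch below uses the standard closed form for a Gaussian prediction when maximizing; it may differ from the report's exact estimator. The function names, the (mean, std) inputs, and the numeric values are all hypothetical and are not taken from the report.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Expected improvement of a Gaussian prediction over best_so_far.

    Assumes we are maximizing compound performance and that the model
    (e.g. a Gaussian process) returns a predictive mean and std.
    """
    if std <= 0.0:
        # No uncertainty: improvement is certain or impossible.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


def pick_next(predictions, best_so_far):
    """Index of the untested compound with the largest expected improvement."""
    return max(
        range(len(predictions)),
        key=lambda i: expected_improvement(*predictions[i], best_so_far),
    )


# Hypothetical (mean, std) predictions for three untested compounds.
predictions = [(0.50, 0.01), (0.45, 0.30), (0.20, 0.05)]
choice = pick_next(predictions, best_so_far=0.55)
```

In this toy setup the second compound is chosen: its predicted mean is lower than the first's, but its larger uncertainty leaves more room to beat the current best. That is the exploration vs. exploitation trade-off the abstract describes.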
