CMU-ML-11-102
Machine Learning Department
School of Computer Science, Carnegie Mellon University




Beyond Keyword Search:
Discovering Relevant Scientific Literature

Khalid El-Arini, Carlos Guestrin

June 2011

CMU-ML-11-102.pdf


Keywords: Personalization, Citation Analysis, Query Formulation


In scientific research, it is often difficult to express information needs as simple keyword queries. We present a more natural way of searching for relevant scientific literature. Rather than a string of keywords, we define a query as a small set of papers deemed relevant to the research task at hand. By optimizing an objective function based on a fine-grained notion of influence between documents, our approach efficiently selects a set of highly relevant articles. Moreover, as scientists trust some authors more than others, results are personalized to individual preferences. In a user study, researchers found the papers recommended by our method to be more useful, trustworthy and diverse than those selected by popular alternatives, such as Google Scholar and a state-of-the-art topic modeling approach.

35 pages

