Computer Science Department
School of Computer Science, Carnegie Mellon University
Searching Complex Data Without an Index
Mahadev Satyanarayanan, Rahul Sukthankar*, Adam Goode,
We show how query-specific content-based computation pipelined with human cognition can be used for interactive search when a pre-computed index is not available. More specifically, we use query-specific parallel computation on large collections of complex data spread across multiple Internet servers to shrink a search task down to human scale. The expertise, judgement, and intuition of the user performing the search can then be brought to bear on the specificity and selectivity of the current search. Rather than text or numeric data, our focus is on complex data such as digital photographs and medical images. We describe Diamond, a system that can perform such interactive searches on stored data as well as live Web data. Diamond is able to narrow the focus of a non-indexed search by using structured data sources such as relational databases. It can also leverage domain-specific software tools in search computations. We report on the design and implementation of Diamond, and its use in the health sciences.
*Intel Labs Pittsburgh