CMU-CALD-02-107
Center for Automated Learning and Discovery
School of Computer Science, Carnegie Mellon University




Learning from Labeled and Unlabeled Data
with Label Propagation

Xiaojin Zhu, Zoubin Ghahramani

June 2002

CMU-CALD-02-107.pdf


Keywords: Artificial intelligence: learning, pattern recognition: models-statistical, pattern recognition: design methodology-classifier design and evaluation, algorithms, semi-supervised learning, label propagation

We investigate the use of unlabeled data to help labeled data in classification. We propose a simple iterative algorithm, label propagation, which propagates labels through the dataset along high-density areas defined by the unlabeled data. We analyze the algorithm, give its solution, and show its connection to several other algorithms. We also show how to learn its parameters with a minimum spanning tree heuristic and with entropy minimization, and we demonstrate the algorithm's ability to do feature selection. Experimental results are promising.
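The iterative idea in the abstract can be illustrated with a minimal sketch. It is not taken from the report and makes several assumptions: a Gaussian similarity kernel with a hypothetical width parameter `sigma`, a row-normalized transition matrix, uniform initial label distributions for unlabeled points, and a fixed iteration count in place of a convergence test. The function name `label_propagation` and its arguments are illustrative only.

```python
import math


def label_propagation(points, labels, num_classes, sigma=1.0, iters=200):
    """Propagate labels from labeled to unlabeled points (illustrative sketch).

    points: list of coordinate tuples.
    labels: class index for labeled points, None for unlabeled points.
    """
    n = len(points)

    # Assumed Gaussian similarity: closer points get larger weights.
    def weight(p, q):
        dist_sq = sum((a - b) ** 2 for a, b in zip(p, q))
        return math.exp(-dist_sq / sigma ** 2)

    w = [[weight(points[i], points[j]) for j in range(n)] for i in range(n)]

    # Row-normalize so each row is a probability distribution over neighbors.
    t = [[w_ij / sum(row) for w_ij in row] for row in w]

    # Start every point with a uniform label distribution.
    y = [[1.0 / num_classes] * num_classes for _ in range(n)]

    def clamp(dist):
        # Reset labeled points to their known one-hot labels.
        for i, lab in enumerate(labels):
            if lab is not None:
                dist[i] = [1.0 if c == lab else 0.0 for c in range(num_classes)]

    for _ in range(iters):
        clamp(y)
        # One propagation step: each point averages its neighbors' distributions.
        y = [[sum(t[i][k] * y[k][c] for k in range(n)) for c in range(num_classes)]
             for i in range(n)]
    clamp(y)

    # Predicted class is the most probable label for each point.
    return [max(range(num_classes), key=lambda c: y[i][c]) for i in range(n)]
```

For example, with two well-separated one-dimensional clusters and one labeled point in each, `label_propagation([(0.0,), (0.5,), (1.0,), (5.0,), (5.5,), (6.0,)], [0, None, None, None, None, 1], 2)` assigns each unlabeled point the label of the cluster it sits in. Clamping the labeled points at every step keeps the known labels fixed while the unlabeled points absorb label mass through dense regions.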

19 pages

