Machine Learning Department
School of Computer Science, Carnegie Mellon University


Stacked Graphical Learning

Zhanzhen Kou

December 2007

Ph.D. Thesis


Keywords: Statistical relational learning, graphical models, machine learning

In reality there are many relational datasets in which both features of instances and the relationships among the instances are recorded, such as hyperlinked web pages, scientific literature with citations, and social networks. Collective classification has been widely used to classify a group of related instances simultaneously. Recently there have been several studies on statistical relational learning for collective classification, including relational dependency networks, relational Markov networks, and Markov logic networks. In statistical relational learning models, collective classification is usually formulated as an inference problem over graphical models. Hence the existing collective classification methods are expensive due to the iterative inference procedure required for general graphical models. Procedures that learn collective classifiers are also expensive, especially if they are based on iterative optimization of an expensive iterative inference procedure. Our goal is to develop an efficient model for collective classification for relational datasets.

In my thesis, I have studied a learning scheme called stacked graphical learning. In stacked graphical learning, a base learner is augmented by providing the predicted labels of relevant instances. That is, first, a base learner is applied to the training data to make predictions using a cross validation-like technique. Then we expand the features by adding the predictions of relevant examples into the feature vector. Finally the base learner is applied to the expanded feature sets to make the final predictions. The intuition behind stacked graphical learning is that combining the predictions on the neighbors with local features can capture the dependencies among examples; hence we can rely on the base learner to classify the instances using the expanded feature set.

We have applied stacked graphical learning to many real problems including collective classification, sequential partitioning, information extraction, and multi-task problems in an information extraction system. Stacked graphical learning has been demonstrated to achieve competitive performance to the state-of-art relational graphical models with much less inference time.

In addition to exploring many applications of stacked graphical learn- ing in real problems, we formally analyze an idealized version of the algorithm, which can be formulate as an inhomogeneous Gibbs sampling process with parameters learned in a greedy manner, provide proof of convergence of the idealized version of stacking, and discuss the conditions under which the algorithm of stacked graphical learning is nearly identical to the idealized stacked graphical learning.

We also studied an online version of stacked graphical learning, which integrates a single-pass online learning algorithm Modified Balanced Winnow with stacked learning. Online stacked graphical learning can save training time and is capable to handle large streaming datasets with min- imal memory overhead. We analyze the time and memory cost of online stacked graphical learning and applied it to several real problems.

150 pages

SCS Technical Report Collection
School of Computer Science homepage

This page maintained by