Computer Science Department
School of Computer Science, Carnegie Mellon University


Diffusion Kernels on Statistical Manifolds

John Lafferty, Guy Lebanon

January 2004

Keywords: Kernels, heat equation, diffusion, information geometry, text classification

A family of kernels for statistical learning is introduced that exploits the geometric structure of statistical models. The kernels are based on the heat equation on the Riemannian manifold defined by the Fisher information metric associated with a statistical family, and they generalize the Gaussian kernel of Euclidean space. As an important special case, kernels based on the geometry of multinomial families are derived, leading to kernel-based learning algorithms that apply naturally to discrete data. Bounds on covering numbers and Rademacher averages for the kernels are proved using bounds on the eigenvalues of the Laplacian on Riemannian manifolds. Experimental results are presented for document classification, for which the use of multinomial geometry is natural and well motivated; improvements are obtained over Gaussian and linear kernels, which have been the standard for text classification.
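To make the multinomial special case concrete, the following is a minimal sketch rather than the report's own implementation. It relies on a standard fact from information geometry: under the Fisher information metric, the geodesic distance between two multinomial parameter vectors p and q is 2 arccos(sum_i sqrt(p_i q_i)). The kernel form used here, a Gaussian-style expression exp(-d^2 / 4t) with Euclidean distance replaced by that geodesic distance, is an assumption motivated by the abstract's statement that the kernels generalize the Gaussian kernel. The function names and the count-normalization helper are hypothetical and chosen only for illustration.

```python
import math

def counts_to_multinomial(counts):
    """Hypothetical helper: normalize term counts to a point on the simplex."""
    total = float(sum(counts))
    return [c / total for c in counts]

def fisher_geodesic_distance(p, q):
    """Geodesic distance on the multinomial simplex under the Fisher metric:
    d(p, q) = 2 * arccos(sum_i sqrt(p_i * q_i))."""
    affinity = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp to [0, 1] so floating-point rounding cannot push acos out of range.
    affinity = min(1.0, max(0.0, affinity))
    return 2.0 * math.acos(affinity)

def diffusion_kernel_approx(p, q, t):
    """Assumed Gaussian-style approximation to the heat kernel:
    K_t(p, q) = (4*pi*t)^(-n/2) * exp(-d(p, q)^2 / (4*t)),
    where n = len(p) - 1 is the dimension of the simplex."""
    n = len(p) - 1
    d = fisher_geodesic_distance(p, q)
    return (4.0 * math.pi * t) ** (-n / 2.0) * math.exp(-d * d / (4.0 * t))

# Usage: compare two toy "documents" represented by term counts.
doc_a = counts_to_multinomial([2, 1, 1])
doc_b = counts_to_multinomial([1, 1, 2])
similarity = diffusion_kernel_approx(doc_a, doc_b, t=0.5)
```

The time parameter t plays the role of the bandwidth in a Gaussian kernel: larger values make the kernel decay more slowly with geodesic distance. Documents with zero counts are handled without smoothing here, since the square-root terms simply vanish.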

39 pages
