Machine Learning Department
School of Computer Science, Carnegie Mellon University


Maximum Entropy Discrimination Markov Networks

Jun Zhu*, Eric Xing, Bo Zhang*

February 2008


Keywords: Maximum entropy discrimination Markov networks, Bayesian max-margin Markov networks, Laplace max-margin Markov networks, structured prediction

Standard max-margin structured prediction methods concentrate directly on the input-output mapping, and the lack of an elegant probabilistic interpretation causes limitations. In this paper, we present a novel framework called Maximum Entropy Discrimination Markov Networks (MaxEntNet) to do Bayesian max-margin structured learning by using expected margin constraints to define a feasible distribution subspace and applying the maximum entropy principle to choose the best distribution from this subspace. We show that MaxEntNet subsumes the standard max-margin Markov networks (M3N) as a spacial case where the predictive model is assumed to be linear and the parameter prior is a standard normal. Based on this understanding, we propose the Laplace max-margin Markov networks (LapM3N) which use the Laplace prior instead of the standard normal. We show that the adoption of a Laplace prior of the parameter makes LapM3N enjoy properties expected from a sparsified M3N. Unlike the L1-regularized maximum likelihood estimation which sets small weights to zeros to achieve sparsity, LapM3N posteriorly weights the parameters and features with smaller weights are shrunk more. This posterior weighting effect makes LapM3N more stable with respect to the magnitudes of the regularization coefficients and more generalizable. To learn a LapM3N, we present an efficient iterative learning algorithm based on variational approximation and existing convex optimization methods employed in M3N. The feasibility and promise of LapM3N are demonstrated on both synthetic and real OCR data sets.

34 pages

*Department of Computer Science and Technology, Tsinghua University, Beijing 100084 China **Eric Xing, addressee for all correspondence, Carnegie Mellon University

SCS Technical Report Collection
School of Computer Science homepage

This page maintained by