Computer Science Department
School of Computer Science, Carnegie Mellon University


Using Asymmetric Distributions to Improve Classifier Probabilities:
A Comparison of New and Standard Parametric Methods

Paul N. Bennett

April 2002

An abbreviated version of this report will also appear in the
Proceedings of the 26th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR 2003)

Toronto, Canada, July 28 - August 1, 2003.

Keywords: Calibration, well-calibrated, reliability, posterior, text classification, cost-sensitive learning, active learning, post-processing, probability estimates

For many discriminative classifiers, it is desirable to convert an unnormalized confidence score output from the classifier to a normalized probability estimate. Such a method can also be used for creating better estimates from a probabilistic classifier that outputs poor estimates. Typical parametric methods have an underlying assumption that the score distribution for a class is symmetric; we motivate why this assumption is undesirable, especially when the scores are output by a classifier. Two asymmetric families, an asymmetric generalization of a Gaussian and a Laplace distribution, are presented, and a method of fitting them in expected linear time is described. Finally, an experimental analysis of parametric fits to the outputs of two text classifiers, naive Bayes (which is known to emit poor probabilities) and a linear SVM, is conducted. The analysis shows that one of these asymmetric families is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable.
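The conversion the abstract describes amounts to fitting a class-conditional density to the scores of each class on held-out data and then applying Bayes' rule. The sketch below illustrates this recipe with an asymmetric Laplace (mode theta, separate scales b_l and b_r on each side). It is a hedged illustration, not the report's method: the function names are invented here, and the mode search below is a simple O(n log n) scan over candidate modes with prefix sums, standing in for the expected-linear-time fit the report describes. The closed-form scale estimates for a fixed mode follow from setting the log-likelihood derivatives to zero.

```python
import math

def fit_asymmetric_laplace(xs):
    """Maximum-likelihood fit of an asymmetric Laplace:
    density (1/(b_l+b_r)) * exp(-(theta-x)/b_l) for x <= theta,
            (1/(b_l+b_r)) * exp(-(x-theta)/b_r) for x >  theta.
    Tries each data point as the mode theta; for a fixed theta the
    ML scales have the closed form b_l = (S_l + sqrt(S_l*S_r))/n,
    b_r = (S_r + sqrt(S_l*S_r))/n, where S_l, S_r are the summed
    absolute deviations on each side of theta."""
    xs = sorted(xs)
    n = len(xs)
    prefix = [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
    total = prefix[-1]
    best = None
    for i, theta in enumerate(xs):
        s_l = theta * (i + 1) - prefix[i + 1]        # sum of theta - x, x <= theta
        s_r = (total - prefix[i + 1]) - theta * (n - i - 1)  # sum of x - theta, x > theta
        root = math.sqrt(s_l * s_r)
        b_l = (s_l + root) / n
        b_r = (s_r + root) / n
        if b_l <= 0 or b_r <= 0:
            continue  # degenerate candidate: all mass on one side of theta
        loglik = -n * math.log(b_l + b_r) - s_l / b_l - s_r / b_r
        if best is None or loglik > best[0]:
            best = (loglik, theta, b_l, b_r)
    return best[1], best[2], best[3]

def asym_laplace_pdf(x, theta, b_l, b_r):
    """Density of the asymmetric Laplace parameterized above."""
    c = 1.0 / (b_l + b_r)
    if x <= theta:
        return c * math.exp(-(theta - x) / b_l)
    return c * math.exp(-(x - theta) / b_r)

def calibrate(score, pos_params, neg_params, p_pos):
    """Bayes-rule conversion of a raw classifier score to P(+ | score),
    given fitted per-class score densities and the class prior p_pos."""
    num = asym_laplace_pdf(score, *pos_params) * p_pos
    den = num + asym_laplace_pdf(score, *neg_params) * (1.0 - p_pos)
    return num / den
```

In use, one would fit `fit_asymmetric_laplace` separately to the positive-class and negative-class scores from a validation set, then map any new score to a probability with `calibrate`. The asymmetry matters because, as the abstract argues, classifier score distributions are typically skewed (e.g. scores pile up near the decision boundary on one side), so a symmetric fit misplaces probability mass exactly where calibration matters most.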

24 pages
