CMU-CS-04-166
Computer Science Department
School of Computer Science, Carnegie Mellon University




Evaluation of the Haplotype Motif Model using
the Principle of Minimum Description

Srinath Sridhar, Kedar Dhamdhere, Guy E. Blelloch,
R. Ravi*, Russell Schwartz**

October 2004

CMU-CS-04-166.ps
CMU-CS-04-166.pdf


Keywords: Single nucleotide polymorphism, haplotypes, minimum description length


We apply minimum description length (MDL) principles to evaluate the merit of relaxing the rigidity of block models of haplotype structure. We accomplish this by developing an MDL formulation of the more general "haplotype motif" model of haplotype structure, similar to an approach proposed independently by Koivisto et al. Comparison of equivalent block and motif MDL models on real and simulated data reveals that the more flexible motif models can yield substantially shorter explanations of the data, suggesting that motifs more accurately capture the true nature of haplotype conservation. These benefits are less pronounced in real than in simulated data, however, and depend on the coverage level, marker density, and intrinsic recombination rates of specific data sets.

16 pages

* The Tepper School of Business, Carnegie Mellon University
** Department of Biological Sciences, Carnegie Mellon University

