Computer Science Department
School of Computer Science, Carnegie Mellon University
Nonlinear Switching State-Space Models
DNA copy number aberrations serve as key biological markers for cancer and many other diseases, and determining their location has important applications in cancer diagnosis, drug development, and molecular therapy. Analysis of a variety of cancers has revealed that gains in proto-oncogenes and losses of tumor suppressor genes seriously affect the growth-limiting functions, cell-death programs, and self-repair processes of cancerous regions. Methods that detect these aberrations efficiently and accurately are therefore an important step toward understanding the behavior of cancer, with significant consequences for diagnosis and the development of treatments. Array comparative genomic hybridization (aCGH) provides an efficient method for full-genome analysis of DNA copy numbers, but its measurements are corrupted by serious systematic errors such as impurity of the DNA sample, heterogeneity of copy numbers among defective cells, and measurement noise. Previous methods have shortcomings in inferring dosage states: they fail to properly annotate genomes that display spatially correlated samples and large spikes in log-ratio. A new model, the Nonlinear Genome Imbalance Scanner (NL-GIMscan), is proposed. It captures both the nonlinear spatial drift of aCGH intensities and measurement noise by fitting state-specific nonlinear hidden trajectories with an overlying first-order Markov switching process. NL-GIMscan is demonstrated on two datasets of malignant tumors, where it shows improved performance over existing models such as hidden Markov models (HMMs). A software implementation of the model is available from the author.
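To make the model structure concrete, the following is a minimal generative sketch of a switching state-space model of the kind described above. It is not the NL-GIMscan implementation. The three dosage states, their baseline levels, the tanh drift, the noise scales, and the function name `simulate_switching_ssm` are all assumptions chosen for illustration.

```python
import math
import random


def simulate_switching_ssm(n_probes=200, stay_prob=0.95, seed=0):
    """Simulate log-ratios from a toy switching state-space model (illustrative only)."""
    rng = random.Random(seed)
    # Hypothetical dosage states and their baseline log-ratio levels.
    levels = {"loss": -0.5, "normal": 0.0, "gain": 0.5}
    states = list(levels)
    # One hidden trajectory per dosage state, each started at its baseline.
    hidden = dict(levels)
    state = "normal"
    path, observations = [], []
    for _ in range(n_probes):
        # First-order Markov switching: keep the current state with
        # probability stay_prob, otherwise jump uniformly to another state.
        if rng.random() > stay_prob:
            state = rng.choice([s for s in states if s != state])
        # Every state-specific trajectory evolves at every probe. The tanh
        # pull toward the baseline is an assumed nonlinearity, not the paper's.
        for s in states:
            drift = 0.2 * math.tanh(levels[s] - hidden[s])
            hidden[s] += drift + rng.gauss(0.0, 0.03)
        # The observed log-ratio is the active trajectory plus measurement noise.
        observations.append(hidden[state] + rng.gauss(0.0, 0.05))
        path.append(state)
    return path, observations


path, observations = simulate_switching_ssm()
```

In this sketch the hidden trajectories carry the spatially correlated drift, while the independent Gaussian term stands in for measurement noise. Inference would run in the opposite direction, recovering `path` from `observations`.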