Institute for Software Research
School of Computer Science, Carnegie Mellon University


Predictions for Biomedical Decision Support

Xiaoqian Jiang

December 2010

Ph.D. Thesis


Keywords: Discrimination, calibration, AUC, Hosmer-Lemeshow test, isotonic regression, Platt scaling, reliability diagram, adaptive learning, structured learning, maximum margin optimization, convex optimization, Markov network, conditional random fields, time series regression, Hidden Markov Models

Medications designed for a general population do not work the same for each individual. Similarly, patterns observed in naturally occurring disease outbreaks do not necessarily describe purposeful outbreaks (e.g., bioterrorism). To tackle the challenges posed by individual differences, my thesis introduces data-driven paradigms that predict whether a particular case will have the outcome of interest. My insight is to accommodate individual differences by coherently leveraging information from complementary perspectives (e.g., temporal dependency, relational correlation, feature similarity, and estimation uncertainty) to provide more reliable predictions than are possible with existing cohort-based approaches.

Specifically, I investigated two representative problems, bioterrorism-related disease outbreaks and personalized clinical decision support, for which previous research does not provide satisfactory solutions. I developed a Temporal Maximum Margin Markov Network framework that considers temporal correlation concurrently with relational dependency in bioterrorism-related disease outbreaks. This framework reduces the ambiguity in estimating outcome variables from noisy manifestations by considering complementary information. It outperformed state-of-the-art models on synthetic and real-world datasets, and improved average state prediction accuracy in predicting simulated biohazards. Regarding personalized clinical decision support, I focused on an important but little-studied measurement, "calibration," which stratifies how outcomes affect various genetic population groups within a patient-diagnosis population. I designed a joint optimization framework to combine discrimination and calibration, and demonstrated that models (DP-SVM, SIO and AC-LR) developed under this multi-targeted framework perform better on both metrics than single-targeted models. I conducted various real-data experiments, including Hospital Discharge Error, Myocardial Infarction and Breast Cancer Gene Expression Data, to verify the efficacy of my joint optimization framework.
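
To make the distinction between discrimination and calibration concrete, the following is a minimal sketch and not the thesis's method. It does not implement DP-SVM, SIO or AC-LR. It assumes discrimination is measured by the AUC and calibration by a simple binned reliability gap; the function names (auc, reliability_bins, calibration_gap), the bin count, and the toy data are all hypothetical and chosen only for illustration.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def reliability_bins(probs, labels, n_bins=5):
    """Equal-width bins over [0, 1]; returns (mean predicted, observed rate,
    count) for each non-empty bin, i.e. the points of a reliability diagram."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            rate = sum(y for _, y in b) / len(b)
            rows.append((mean_p, rate, len(b)))
    return rows


def calibration_gap(probs, labels, n_bins=5):
    """Count-weighted mean absolute gap between predicted and observed rates."""
    rows = reliability_bins(probs, labels, n_bins)
    total = sum(c for _, _, c in rows)
    return sum(c * abs(m - r) for m, r, c in rows) / total


# Toy data, made up for illustration only.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

# A strictly increasing distortion keeps the ranking (same AUC)
# but changes the probability values (different calibration).
squashed = [p ** 3 for p in probs]

print("AUC original / squashed:", auc(probs, labels), auc(squashed, labels))
print("calibration gap original:", calibration_gap(probs, labels))
print("calibration gap squashed:", calibration_gap(squashed, labels))
```

Because the two score lists rank the cases identically, their AUC is the same, while the binned gap between predicted and observed rates differs. This is one way to see why a model tuned for a single metric can fall short on the other, which is the motivation the abstract gives for optimizing both jointly.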

280 pages
