Computer Science Department
School of Computer Science, Carnegie Mellon University
Model Validation and Discovery for Complex
Sumit Kumar Jha
In this thesis, we study two fundamental problems that arise in the modeling of stochastic systems: (i) Validation of stochastic models against behavioral specifications such as temporal logics, and (ii) Discovery of kinetic parameters of stochastic biochemical models from behavioral specifications.
We present a new Bayesian algorithm for Statistical Model Checking of stochastic systems based on a sequential version of Jeffreys' Bayes Factor test. We argue that the Bayesian approach is more suited for application domains like systems biology modeling, where distributions on nuisance parameters and priors may be known. We prove that our Bayesian Statistical Model Checking algorithm terminates for a large subclass of prior probabilities. We also characterize the Type I/II errors associated with our algorithm. We experimentally demonstrate that this algorithm is suitable for the analysis of complex biochemical models like those written in the BioNetGen language. We then argue that i.i.d. sampling based Statistical Model Checking algorithms are not an effective way to study rare behaviors of stochastic models and present another Bayesian Statistical Model Checking algorithm that can incorporate non-i.i.d. sampling strategies.
We also present algorithms for synthesis of chemical kinetic parameters of stochastic biochemical models from high level behavioral specifications. We consider the setting where a modeler knows facts that must hold on the stochastic model but is not confident about some of the kinetic parameters in her model. We suggest algorithms for discovering these kinetic parameters from facts stated in appropriate formal probabilistic specification languages. Our algorithms are based on our theoretical results characterizing the probability of a specification being true on a stochastic biochemical model. We have applied this algorithm to discover kinetic parameters for biochemical models with as many as six unknown parameters.