Lane Center for Computational Biology
School of Computer Science, Carnegie Mellon University
Automated Construction of Dynamic Models
Taráz E. Buck
Proteins specifically localize to various subcellular structures, and both the localization patterns and the structures themselves change over time. Protein location is essential information for understanding subcellular signaling networks, because proteins that are never in the same compartment or localized to the same protein complexes or scaffolds cannot interact directly. Furthermore, the probability that a set of proteins can interact is proportional to the local concentrations of those proteins. Location proteomics complements the study of an organism's complete set of protein sequences, structures, and behaviors by gathering knowledge about the positions of all proteins within the cell under all conditions. Many computational approaches have been developed relatively recently for quantifying the subcellular distributions of proteins, the differences among them, and the shapes of the membranes that bound them, e.g., for understanding the differences between cells obtained from normal and diseased tissues or over the cell cycle, modeling cytoskeletal dynamics, learning the range of possible nuclear and cellular shapes, and learning the effects of gene expression changes on cellular shapes. Investigating the dynamics of this patterning and structure extends the often static approach to location proteomics, and it becomes significant in light of studies showing cell cycle-related changes in the levels or subcellular distribution of 19% and 23% of human proteins, respectively.
We present work on three projects creating models of dynamic protein localization and nuclear and cellular shape. First, we learned a model of cell cycle-related variation in images of nuclei in an unsupervised manner, i.e., without information on the cell cycle phase of a cell or artificial synchronization of cells to the same phase, using manifold learning. The manifold's coordinates predict ground truth cell cycle phase with an adjusted R-squared of 0.70 on test data. Second, we extended previous work that created a nonparametric, generative shape space model of two-dimensional nuclear shape to represent the joint distribution of three-dimensional nuclear and plasma membrane shapes. To extend this static representation to a dynamic one, we proposed a nonparametric, generative model of trajectories in shape spaces based on kernel density estimation, and we synthesized videos of nuclear and plasma membrane shape dynamics by performing a random walk through shape space. We also performed simulation experiments to investigate reducing the computational complexity of shape space construction from quadratic to linear time. Third, we learned maps of the spatiotemporal localization of nine proteins in helper T cells during the process of synapse formation with antigen-presenting cells. These maps were built under two experimental conditions: when antigen-presenting cells presented a full set of stimulatory surface proteins, and when one of these surface proteins, B7, was blocked. We found statistically significant differences in the distribution of four of these proteins between the two conditions, which have implications for understanding T cell signaling.
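The idea of a random walk through shape space driven by a kernel density estimate of trajectories can be illustrated with a minimal sketch. It is not the implementation used in this work. It assumes that shapes have already been embedded as points in a low-dimensional space, and the function name `sample_walk`, the Gaussian kernel, the single shared bandwidth, and the toy circular data are all hypothetical choices made for illustration.

```python
import numpy as np

def sample_walk(starts, steps, x0, n_steps, bandwidth, rng):
    """Random walk whose steps are drawn from a conditional Gaussian KDE.

    starts: (n, d) observed positions in an assumed shape space.
    steps:  (n, d) observed displacements that began at those positions.
    """
    path = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        x = path[-1]
        # Gaussian kernel weights: observed steps that began near x count more.
        d2 = np.sum((starts - x) ** 2, axis=1)
        # Subtracting the minimum keeps the exponentials from underflowing.
        w = np.exp(-0.5 * (d2 - d2.min()) / bandwidth ** 2)
        w /= w.sum()
        # Pick one observed displacement according to the kernel weights.
        i = rng.choice(len(steps), p=w)
        # Add kernel noise so the sample comes from the smoothed density
        # rather than replaying an observed displacement exactly.
        step = steps[i] + rng.normal(scale=bandwidth, size=x.shape)
        path.append(x + step)
    return np.stack(path)

# Toy data (an assumption): points on a unit circle that rotate by 0.1 rad.
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
starts = np.c_[np.cos(angles), np.sin(angles)]
ends = np.c_[np.cos(angles + 0.1), np.sin(angles + 0.1)]
steps = ends - starts

walk = sample_walk(starts, steps, x0=[1.0, 0.0], n_steps=50,
                   bandwidth=0.05, rng=rng)
```

Each row of `walk` is one position in the assumed shape space. In a generative shape model, each position would be mapped back to a shape to produce one frame of a synthesized video.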