CMU-CB-21-101
Ray and Stephanie Lane Computational Biology Department
School of Computer Science, Carnegie Mellon University




Automated analysis of protein subcellular location
in immunohistochemistry images for cancer diagnosis

Aparna Kumar

May 2021

Ph.D. Thesis

CMU-CB-21-101.pdf


Keywords: Immunohistochemistry, pathology, automated image analysis, cancer, protein subcellular location, biomarkers, liver lesions

Protein subcellular location and compartmentalization play an important role in regulating cellular processes. Protein mislocalization alters cell signaling and is observed in diverse diseases (Hung and Link 2011). Drug resistance can occur when proteins are mislocalized to the cytoplasm and nucleus, suggesting that measuring protein location can help clinicians personalize therapies and diagnose disease. Here, two projects explore how automatically quantifying subcellular location from pathology images can be used in diagnostics and for understanding disease. 1) We developed an automated pipeline to compare the subcellular location of proteins between two sets of immunohistochemistry (IHC) images. We used the pipeline to compare images of healthy and tumor tissue from the Human Protein Atlas, ranking hundreds of proteins in breast, liver, prostate, and bladder tissue by how much their location was estimated to have changed. The performance of the system was evaluated by determining whether proteins previously known to change location in tumors were ranked highly. We present a number of new candidate location biomarkers for each tissue. Further, we identified biochemical pathways that are enriched in proteins that change location. We confirmed some previously implicated pathways and report changes in pathways not previously associated with cancer. 2) We extended the IHC pipeline to process full slide images. Using the pipeline, we explored how measuring changes in protein subcellular location can aid in identifying adult and pediatric liver lesions. Our results indicate that, in most cases, single-protein measurements are poor markers for the lesions. Next, we explored lesion-specific protein signatures for identifying diseases. Within our dataset, we found a signature set of proteins that identifies liver lesions in adult and pediatric populations with perfect accuracy. Finally, we report two new proteins that aid in classifying the lesions when used as part of a signature protein set.

109 pages

Thesis Committee:
Robert F. Murphy (Chair)
Russell Schwartz
Chakra Chennubhotla (University of Pittsburgh)
John Ozolek (West Virginia University School of Medicine)
Gustavo Rohde (University of Virginia)

Russell Schwartz, Head, Computational Biology Department
Martial Hebert, Dean, School of Computer Science


