CMU-CS-21-129 Computer Science Department School of Computer Science, Carnegie Mellon University
Deep Learning Based Data Augmentation for
Zhendong Yuan
M.S. Thesis
August 2021
Deep learning has become increasingly popular across a wide range of applications in the past few years. Performance improvements in hardware and machine learning models have made it possible to train deeper and wider networks that achieve state-of-the-art (SOTA) performance in those applications. However, several obstacles remain before a model can be genuinely useful in practice. One common obstacle concerns the data itself. Training data collected from a small hospital can be limited in quantity, and a pre-trained model taken from other hospitals can generalize poorly due to differences in the X-ray machines and the environment in which the mammograms are taken [41]. Moreover, since the majority of mammogram data comes from patients who have no illness, the training data can exhibit a serious imbalance between positive and negative cases. A model trained on such data could naively achieve extremely high overall accuracy by predicting every case as normal, and would have no practical value. Lesion/cancer detection, however, requires predictions that are accurate for both positive and negative cases, resilient to noise, and consistent across different data sources.

In this thesis, we provide workarounds for these issues. Our experiments are based on the UPITT mammogram dataset, comprising 79,501 images collected from approximately 22,267 distinct patients. To deal with the limited dataset size and to obtain localized explanations, we use a patch-based model for lesion classification. Normal patches are extracted from the breast tissue in images with a BI-RADS score of 1; lesion patches are extracted, using computer vision techniques, from the ROI (region of interest) labeled by a radiologist in images with a BI-RADS score of 0 or 2.
We designed our own techniques to address the severe data imbalance via deep learning-based SMOTE [9] and GANs [6, 12, 18, 28], and tested those techniques with a deep convolutional model similar to VGG16 [35].

54 pages
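The classic SMOTE idea underlying the deep-learning-based variant can be sketched in a few lines. This is an illustrative implementation of vanilla SMOTE over feature vectors, not the thesis's method; the function name and parameters are assumptions.

```python
import numpy as np

def smote_oversample(minority, n_new, k=5, rng=None):
    """Generate synthetic minority-class samples (classic SMOTE).

    Each new sample is a random interpolation between a minority sample
    and one of its k nearest minority-class neighbors, so the synthetic
    points lie on line segments inside the minority region.
    """
    rng = np.random.default_rng(rng)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    new_samples = []
    for _ in range(n_new):
        i = rng.integers(n)
        # Distances from sample i to every other minority sample.
        d = np.linalg.norm(minority - minority[i], axis=1)
        d[i] = np.inf  # exclude the sample itself
        neighbors = np.argsort(d)[:k]
        j = rng.choice(neighbors)
        lam = rng.random()  # interpolation factor in [0, 1)
        new_samples.append(minority[i] + lam * (minority[j] - minority[i]))
    return np.array(new_samples)
```

A deep-learning-based variant would apply the same interpolation in a learned feature space (e.g. an encoder's latent space) rather than in raw pixel space, which is better suited to high-dimensional image patches.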
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department