CMU-ISR-13-111
Institute for Software Research
School of Computer Science, Carnegie Mellon University




Applying Autonomic Diagnosis at Samsung Electronics

Paulo Casanova, Bradley Schmerl, David Garlan, Rui Abreu*, Jungsik Ahn**

September 2013
Available June 2015

CMU-ISR-13-111.pdf


Keywords: Samsung Electronics, Fault Diagnosis, Autonomic Computing, Self-Adaptive Systems

An increasingly essential aspect of many critical software systems is the ability to quickly diagnose and locate faults so that appropriate corrective measures can be taken. Large, complex software systems fail unpredictably, and pinpointing the source of a failure is a challenging task. In this paper, we explore how our recently developed technique for automatic diagnosis performs in detecting failures and localizing faults in a critical manufacturing control system at Samsung Electronics, where failures can result in large financial and schedule losses. We show how our approach scales to such systems to diagnose intermittent faults, connectivity problems, protocol violations, and timing failures. We propose a set of accuracy and performance measures that can be used to evaluate run-time diagnosis. We present lessons learned from this work, including how instrumentation limitations may impair diagnostic accuracy: unless these limitations are overcome, there is a limit to the kinds of faults that can be detected.

25 pages

*University of Porto, Porto, Portugal
**Samsung Electronics, Republic of Korea

