CMU-ML-18-104
Machine Learning Department
School of Computer Science, Carnegie Mellon University



Stress Detection for Keystroke Dynamics

Shing-Hon Lau

May 2018

Ph.D. Thesis


Keywords: Keystroke dynamics, keystroke biometrics, behavioral biometrics, stress, stress detection, affect detection


Background. Stress can profoundly affect human behavior. Critical-infrastructure operators (e.g., at nuclear power plants) may make more errors when overstressed; malicious insiders may experience stress while engaging in rogue behavior; and chronic stress has deleterious effects on mental and physical health. If stress could be detected unobtrusively, without special equipment, these situations could be remedied. In this study, a common computer keyboard and everyday typing are the primary instruments for detecting stress.

Aim. The goal of this dissertation is to detect stress via keystroke dynamics – the analysis of a user's typing rhythms – and to detect the changes to those rhythms concomitant with stress. Additionally, we pinpoint markers for stress (e.g., a 10% increase in typing speed), analogous to the antigens used as markers for blood type. We seek markers that are universal across all typists, as well as markers that apply only to groups or clusters of typists, or even only to individual typists.

Data. Five types of data were collected from 116 subjects: (1) demographic data, which can reveal factors (e.g., gender) that influence subjects' reactions to stress; (2) psychological data, which capture a subject's general susceptibility to stress and anxiety, as well as his/her current stress state; (3) physiological data (e.g., heart-rate variability and blood pressure) that permit an objective and independent assessment of a subject's stress level; (4) self-report data, consisting of subjective self-reports regarding the subject's stress, anxiety, and workload levels; and (5) typing data from subjects, in both neutral and stressed states, measured in terms of keystroke timings – hold and latency times – and typographical errors. Differences in typing rhythms between neutral and stressed states were examined to seek specific markers for stress.
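As an illustration of the keystroke timings mentioned above, the following is a minimal sketch and not the thesis's own feature-extraction code. It assumes that hold time is the interval between pressing and releasing the same key, and that latency is the interval between the presses of two consecutive keys; the abstract does not specify either definition. The event format and the function names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical event format: (key, press_time_ms, release_time_ms),
# listed in the order the keys were pressed.
KeyEvent = Tuple[str, int, int]


def hold_times(events: List[KeyEvent]) -> List[int]:
    """Hold time per key: release time minus press time (assumed definition)."""
    return [release - press for _, press, release in events]


def latencies(events: List[KeyEvent]) -> List[int]:
    """Latency per key pair: press-to-press interval (assumed definition)."""
    return [nxt[1] - cur[1] for cur, nxt in zip(events, events[1:])]


if __name__ == "__main__":
    sample = [("a", 0, 100), ("b", 150, 220), ("c", 300, 380)]
    print(hold_times(sample))
    print(latencies(sample))
```

Other latency definitions (for example, release of one key to press of the next) would change only the subtraction in `latencies`.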

Method. An ABA, single-subject design was used, in which subjects act as their own controls. Each subject provided 80 typing samples in each of three conditions: (A) baseline/neutral, (B) induced stress, and (A) post-stress return/recovery-to-baseline. Physiological measures were analyzed to ascertain the subject's stress level when providing each sample. Typing data were analyzed, using a variety of statistical and machine learning techniques, to elucidate markers of stress. Clustering techniques (e.g., K-means) were also employed to detect groups of users whose responses to stress are similar.
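To make the within-subject comparison concrete, the sketch below shows one simple way an ABA design can be summarized for a single typing feature: the relative change of the stressed condition (B) and of the recovery condition (second A) against the first baseline (A). This is only an illustration under stated assumptions; the thesis applies a variety of statistical and machine learning techniques that are not reproduced here, and the function names and sample values are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def relative_change(baseline: Sequence[float], condition: Sequence[float]) -> float:
    """Relative change of the condition mean with respect to the baseline mean."""
    base = mean(baseline)
    return (mean(condition) - base) / base


def aba_summary(a1: Sequence[float], b: Sequence[float],
                a2: Sequence[float]) -> Dict[str, float]:
    """Compare stress (B) and recovery (second A) against the first baseline (A)."""
    return {
        "stress_vs_baseline": relative_change(a1, b),
        "recovery_vs_baseline": relative_change(a1, a2),
    }


if __name__ == "__main__":
    # Hypothetical per-sample values of one feature for one subject.
    baseline = [100.0, 102.0, 98.0]
    stressed = [110.0, 111.0, 109.0]
    recovery = [101.0, 99.0, 100.0]
    print(aba_summary(baseline, stressed, recovery))
```

Because each subject acts as his or her own control, such a summary would be computed separately per subject and per feature rather than pooled across subjects.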

Results. Our stressor paradigm was effective for all 116 subjects, as confirmed through analysis of physiological and self-report data. We were able to identify markers for stress within each subject; i.e., we can discriminate between neutral and stressed typing when examining any subject individually. However, despite our best attempts, and the use of state-of-the-art machine learning techniques, we were not able to identify universal markers for stress, across subjects, nor were we able to identify clusters of subjects whose stress responses were similar. Subjects' stress responses, in typing data, appear to be highly individualized. Consequently, effective deployment in a real-world environment may require an approach similar to that taken in personalized medicine.

232 pages

Thesis Committee:
Roy Maxion (Chair)
Tom Mitchell
Daniel Siewiorek
Peter Strick (University of Pittsburgh)
David Banks (Duke University)
Mark Wetherell (Northumbria University)

Manuela M. Veloso, Head, Machine Learning Department
Andrew W. Moore, Dean, School of Computer Science

