CMU-HCII-24-102
Human-Computer Interaction Institute
School of Computer Science, Carnegie Mellon University



Computational Understanding of User Interfaces

Jason Wu

July 2024

Ph.D. Thesis

CMU-HCII-24-102.pdf


Keywords: Human-Computer Interaction, Machine Learning, User Interface, Accessibility, User Modeling, UI Modeling, UI Adaptation


A grand challenge in human-computer interaction (HCI) is constructing user interfaces (UIs) that make computers useful for all users across all contexts. Conventional UI development processes have approached this goal by iteratively converging towards a single "final" UI through prototyping, implementation, and testing. However, even when following best practices, this approach locks in a set of assumptions that often cannot accommodate the diversity of user abilities, usage contexts, or computing technologies, ultimately limiting how we can use computers. In this dissertation, I propose a new approach that uses machine-learning-driven systems that automatically understand and manipulate existing UIs as a way to overcome this challenge. Using content and functionality inferred from an existing application, combined with sensed usage context, the UI can be dynamically tailored to the immediate needs of use, e.g., through interface adaptation or generation.

The work presented in this dissertation represents the initial technical foundation for this vision. First, I describe approaches for understanding user ability and context (user understanding), which HCI suggests is the basis for building good interfaces. I describe a recommendation system that suggests device settings (e.g., accessibility features) based on sensed usage behaviors and user interaction logs. While users often found these suggestions helpful, this approach of adapting interfaces through configuration changes has traditionally been limited, since applications often do not properly expose their semantics to external services. To this end, I describe several projects in the area of UI understanding, which show that it is possible to overcome this barrier using data-driven ML models that predict interface layout, structure, and functionality from visual information, which is how UIs are generally assumed to be used. These predicted semantics can enable many forms of existing computing infrastructure, such as accessibility services and UI agents, to work more reliably and robustly. Finally, I combine both user and UI understanding to dynamically generate and adapt UIs that meet the specific needs of users. I describe ML-driven systems that generate UIs by modifying existing application layouts and generating UI code based on personalized user profiles and design objectives. Ultimately, through my work, I show that computational understanding of user interfaces allows UIs to be transformed from static objects into malleable representations that can be dynamically reshaped for new devices, modalities, and users.

229 pages

Thesis Committee:
Jeffrey P. Bigham (Chair)
Jodi Forlizzi
Sherry Tongshuang Wu
Tom Mitchell
Jeffrey Nichols (Apple)

Brad A. Myers, Head, Human-Computer Interaction Institute
Martial Hebert, Dean, School of Computer Science


