INSTITUTE FOR SOFTWARE RESEARCH TECHNICAL REPORT ABSTRACTS

CMU-ISR-19-105
Institute for Software Research
School of Computer Science, Carnegie Mellon University

Can Machine Learning Help People Configure
Their Mobile App Privacy Settings?

Bin Liu

December 2019

Ph.D. Thesis
Societal Computing

CMU-ISR-19-105.pdf

Keywords: Mobile Privacy, Mobile Apps, Android Permissions, Assistants

Technologies such as mobile apps, web browsers, social networking sites, and IoT devices provide sophisticated services to users. At the same time, they are also increasingly collecting privacy-sensitive data about them. In some domains, such as mobile apps, this trend has resulted in an increase in the breadth of privacy settings made available to users. These settings are necessary because not all users feel comfortable having their data collected by some of these technologies. On mobile phones alone, the sheer number of apps users download is staggering. The variety of sensitive data and functionality requested by these apps has led to a demand for much more specific privacy settings. The same is true in other domains as well, such as social networks, browsers, and various IoT technologies. The result of this situation is that users feel overwhelmed by all of the settings available to them, and are thus unable to take advantage of them effectively.

This dissertation examines whether machine learning techniques can help users manage an increasingly large number of privacy settings, focusing specifically on mobile app permissions. The research presented herein aims to simplify the task of managing a large number of app privacy settings. We present the methods we used to develop models of users' privacy preferences, and describe the interactive assistant we designed based on these models to help users configure their settings using personalized recommendations. The objective of this work is to alleviate the burden placed on users while increasing alignment between their preferences and the privacy settings on their phones.

This dissertation details three different studies. Specifically, in the first study, we used a dataset of mobile app permission settings obtained from over 200K Android users, explored different machine learning models, and analyzed different combinations of features to predict users' mobile app permission settings. The study includes the development and evaluation of profile-based models as well as individual prediction models. It also includes simulation studies, wherein we explored the viability of different interactive configuration scenarios by testing different ways of combining dialogue inputs from users with recommendations based on machine learning models. The results of these simulations suggest that by selectively prompting users to indicate how they would like to configure a relatively small percentage of their permission settings, it is possible to accurately predict many of their remaining permission settings. Another significant finding of this first study is that a relatively small number of privacy profiles derived from clusters of like-minded users can help predict many of the permission settings that users in a given cluster prefer.
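As a rough illustration of the profile-based idea described above, the following minimal Python sketch derives one majority-vote profile per cluster and then uses a single answered setting to recommend the remaining ones. It is not the dissertation's actual model: the toy data, the cluster labels ("cautious", "permissive"), the (app category, permission) keys, and the function names build_profiles, closest_profile, and recommend are all assumptions made for this example, and the cluster assignments are taken as given rather than learned.

```python
from collections import Counter, defaultdict

ALLOW, DENY = 1, 0

def build_profiles(users, assignments):
    """Take the majority decision per setting within each cluster."""
    votes = defaultdict(lambda: defaultdict(Counter))
    for user_id, settings in users.items():
        cluster = assignments[user_id]
        for key, decision in settings.items():
            votes[cluster][key][decision] += 1
    return {
        cluster: {key: counts.most_common(1)[0][0]
                  for key, counts in per_key.items()}
        for cluster, per_key in votes.items()
    }

def closest_profile(profiles, answered):
    """Pick the profile that agrees with the most answered settings."""
    def agreement(cluster):
        profile = profiles[cluster]
        return sum(1 for key, value in answered.items()
                   if profile.get(key) == value)
    # Sorting makes tie-breaking deterministic (first name alphabetically).
    return max(sorted(profiles), key=agreement)

def recommend(profiles, answered, pending):
    """Recommend pending settings from the closest profile."""
    profile = profiles[closest_profile(profiles, answered)]
    return {key: profile[key] for key in pending if key in profile}

# Hypothetical toy data: keys are (app category, permission) pairs.
users = {
    "u1": {("social", "location"): DENY, ("maps", "location"): ALLOW,
           ("games", "contacts"): DENY},
    "u2": {("social", "location"): DENY, ("maps", "location"): ALLOW,
           ("games", "contacts"): DENY},
    "u3": {("social", "location"): ALLOW, ("maps", "location"): ALLOW,
           ("games", "contacts"): ALLOW},
    "u4": {("social", "location"): ALLOW, ("maps", "location"): ALLOW,
           ("games", "contacts"): ALLOW},
}
# Assumed cluster assignments; in practice these would come from clustering.
assignments = {"u1": "cautious", "u2": "cautious",
               "u3": "permissive", "u4": "permissive"}

profiles = build_profiles(users, assignments)

# A new user answers one prompt; the rest are predicted from the profile.
answered = {("social", "location"): DENY}
pending = [("games", "contacts"), ("maps", "location")]
recommendations = recommend(profiles, answered, pending)
print(recommendations)
```

In this toy setup, one answered prompt is enough to match the new user to the "cautious" profile, which mirrors in miniature the claim that a small number of user decisions can drive predictions for many remaining settings.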

The second study was designed to validate these findings in a field study with actual users. We designed an enhanced version of Android's permission manager and collected rich information on users' actual app permission settings. While results from this study involve a much smaller number of users, they were obtained using privacy nudges designed to increase user awareness of data being collected about them and, as a result, their engagement with their permission settings. Using data collected as part of this study, we were able to generate and analyze privacy profiles built for groups of like-minded users who exhibited similar privacy preferences. Results of this study confirm that a relatively small number of profiles (or clusters of users) can capture a large percentage of users' diverse privacy preferences and help predict many of their desired privacy settings. They also indicate that privacy nudges can be very effective in motivating users to engage with their permission settings and in deriving privacy profiles with strong predictive power.

In the third study, we evaluated our profile-based preference models by developing a privacy assistant that helps users configure their app permission settings based on the developed profiles from our second study. We report on the results of a pilot study (N=72) conducted with actual Android users who used our privacy assistant on their smartphones while performing their regular daily activities. The results indicate that participants accepted 78.7% of the recommendations made by the privacy assistant and kept 94.9% of these settings on their phones over the following six days, all while receiving daily nudges designed to motivate them to further review their settings. The dissertation also discusses the privacy profiles designed for this research and identifies essential attributes that separate people associated with different profiles (or clusters). A refined version of the Personalized Privacy Assistant was released to the Google Play store and used to collect some additional data.

In summary, through a series of three studies, this dissertation shows that using a small number of privacy decisions made by a given smartphone user, it is often possible to predict a large fraction of the mobile app permission settings this user would want to have. The dissertation further shows how we have been able to effectively operationalize this finding in the form of personalized privacy assistants that can help users configure mobile app permission settings on their smartphones.

143 pages

Thesis Committee:
Norman Sadeh (Chair)
Alessandro Acquisti (Heinz College)
Lorrie Cranor
Florian Schaub (University of Michigan)
Nina Taft (Google, Inc.)

James D. Herbsleb, Director, Institute for Software Research
Martial Hebert, Dean, School of Computer Science
