CMU-ISR-18-100R
Institute for Software Research
School of Computer Science, Carnegie Mellon University




Which Apps have Privacy Policies?
An analysis of over one million Google Play Store apps

Peter Story, Sebastian Zimmeck, Norman Sadeh

July 2018

CMU-ISR-18-100R.pdf

Supersedes Institute for Software Research
Technical Report CMU-ISR-18-100


Keywords: Privacy, privacy policy, smartphone, smartphone apps

Smartphone app privacy policies are intended to describe smartphone apps' data collection and use practices. However, not all apps have privacy policies. Without prominent privacy policies, it becomes more difficult for users, regulators, and privacy organizations to evaluate apps' privacy practices. This study sheds light on the distribution of privacy policies in the Android ecosystem. We answer the question "Which apps have privacy policies?" by analyzing the metadata of over one million apps from the Google Play Store. Only about half of the apps we examined link to a policy from their Play Store pages. First, we conducted an exploratory data analysis of the relationship between app metadata features and whether apps link to privacy policies. We found that only 55.3% of apps rated as suitable for Everyone 10+ link to privacy policies. While this finding requires further investigation, it suggests that a number of apps might not be compliant with the Children's Online Privacy Protection Act. Next, we trained a logistic regression model to predict the probability that individual apps will have policy links. Using the model, we explore which app features are most influential for this determination. We find that apps with more ratings and in-app purchases are associated with greater odds of having links to privacy policies. In contrast, apps in Google Play's "Books and Reference" category or with the "Sexual Themes" content descriptor are associated with lower odds of having policy links. Regulators might use this information to focus their enforcement actions on areas that are particularly prone to lacking policy links. Alternatively, regulators could use our model to identify apps for which it is particularly unusual that they do not have policy links. According to our model, it is unusual for apps with more than 100,000 ratings and millions of downloads not to link to privacy policies, yet such apps are present on the Play Store.
Finally, by comparing three crawls of the Play Store, we observe an overall increase in the percentage of apps with policy links between September 2017 and May 2018 (from 41.7% to 51.8%). This increase could be associated with a Play Store purge performed by Google to eliminate or hide apps which collected "Personal and Sensitive Information" but did not have links to privacy policies.
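
As a rough illustration of how a logistic regression model of the kind described in the abstract relates app features to the odds of having a policy link, the following minimal sketch may help. It is not the authors' code. The feature names, the intercept, and every coefficient value are hypothetical placeholders chosen only so that the signs match the directions reported in the abstract; none of the numbers come from the report.

```python
import math

# Hypothetical values for illustration only; NOT taken from the report.
INTERCEPT = -0.5
COEFFICIENTS = {
    "log10_rating_count": 0.6,             # more ratings: greater odds
    "has_in_app_purchases": 0.8,           # in-app purchases: greater odds
    "category_books_and_reference": -0.7,  # this category: lower odds
}


def predicted_probability(intercept, coefficients, features):
    """Return p = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))."""
    log_odds = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit feature increase."""
    return math.exp(coefficient)


# A hypothetical app with 100,000 ratings and in-app purchases.
app = {
    "log10_rating_count": 5.0,
    "has_in_app_purchases": 1,
    "category_books_and_reference": 0,
}
p = predicted_probability(INTERCEPT, COEFFICIENTS, app)
print(f"Predicted probability of a policy link: {p:.3f}")

for name, beta in COEFFICIENTS.items():
    print(f"{name}: odds ratio {odds_ratio(beta):.2f}")
```

In a sketch like this, an odds ratio above 1 corresponds to a feature associated with greater odds of a policy link, and a ratio below 1 to lower odds. An app with a high predicted probability but no actual link would be one of the "unusual" cases the abstract suggests regulators could prioritize.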

26 pages

