CMU-ISR-18-100
Institute for Software Research
School of Computer Science, Carnegie Mellon University




Which Apps have Privacy Policies?
An analysis of over one million Google Play Store apps

Peter Story, Sebastian Zimmeck, Norman Sadeh

February 2018

CMU-ISR-18-100.pdf

Superseded by Institute for Software Research
Technical Report CMU-ISR-18-100R

Keywords: Privacy, privacy policy, smartphone, smartphone apps

Privacy policies are intended to describe smartphone apps' data collection and use practices. However, fewer than half of the apps on the US Google Play Store link to a policy. The resulting opacity hinders users, regulators, and privacy organizations from evaluating apps' privacy practices. This study sheds light on the distribution of privacy policies in the Android ecosystem. We answer the question "Which apps have privacy policies?" by analyzing the metadata of over one million apps from the Google Play Store. First, we conducted an exploratory data analysis of the relationship between app metadata features and whether apps link to privacy policies. We found that only 55.3% of apps rated as suitable for Everyone 10+ link to privacy policies. While this finding requires further investigation, it suggests that a number of apps might not be compliant with the Children's Online Privacy Protection Act. Next, we trained a logistic regression model to predict the probability that an individual app has a policy link. Using the model, we explored which app features are most influential for this prediction. We found that apps with more ratings and apps with in-app purchases are associated with greater odds of having links to privacy policies. In contrast, apps in Google Play's "Books and Reference" category or with the "Sexual Themes" content descriptor are associated with lower odds of having policy links. Regulators might use this information to focus their enforcement actions on areas where policy links are most often missing. Alternatively, regulators could use our model to identify apps for which the absence of a policy link is particularly unusual. According to our model, it is unusual for apps with more than 100,000 ratings and millions of downloads not to link to privacy policies, yet such apps are present on the Play Store.
Finally, by comparing two crawls of the Google Play Store, we observe an overall increase in the percentage of apps with policy links between September and December 2017. This increase could be associated with a Play Store purge performed by Google to eliminate apps that collected "Personal and Sensitive Information" but did not have links to privacy policies.
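
As an illustration of how a logistic regression model relates app features to the odds of having a policy link, the following is a minimal sketch only. The feature names and coefficient values are hypothetical and chosen for demonstration; they are not taken from the report, and the report's actual feature set and fitted coefficients may differ.

```python
import math

# Hypothetical coefficients for illustration only.
# These are NOT the fitted values from the report.
COEFFICIENTS = {
    "intercept": -0.5,
    "log10_rating_count": 0.8,            # more ratings -> higher odds
    "has_in_app_purchases": 0.6,          # in-app purchases -> higher odds
    "category_books_and_reference": -0.7, # this category -> lower odds
}


def odds_ratio(coefficient: float) -> float:
    """Multiplicative change in the odds for a one-unit increase in a feature."""
    return math.exp(coefficient)


def policy_link_probability(features: dict) -> float:
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = COEFFICIENTS["intercept"]
    for name, value in features.items():
        z += COEFFICIENTS[name] * value
    return 1.0 / (1.0 + math.exp(-z))


# Two made-up apps: one with many ratings and in-app purchases,
# one with few ratings in the "Books and Reference" category.
popular_app = {
    "log10_rating_count": 5.0,
    "has_in_app_purchases": 1.0,
    "category_books_and_reference": 0.0,
}
niche_app = {
    "log10_rating_count": 1.0,
    "has_in_app_purchases": 0.0,
    "category_books_and_reference": 1.0,
}

if __name__ == "__main__":
    for name, coef in COEFFICIENTS.items():
        if name != "intercept":
            print(f"{name}: odds ratio {odds_ratio(coef):.2f}")
    print(f"popular app: p = {policy_link_probability(popular_app):.3f}")
    print(f"niche app:   p = {policy_link_probability(niche_app):.3f}")
```

In this sketch, a positive coefficient yields an odds ratio above 1 (greater odds of a policy link) and a negative coefficient yields one below 1. An app that lacks a link despite a high predicted probability is the kind of case the abstract describes as unusual.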

28 pages

