Institute for Software Research
School of Computer Science, Carnegie Mellon University


The Usable Privacy Policy Project:
Combining Crowdsourcing, Machine Learning and Natural Language Processing
to Semi-Automatically Answer Those Privacy Questions Users Care About

Norman Sadeh, Alessandro Acquisti, Travis D. Breaux, Lorrie Faith Cranor,
Aleecia M. McDonald*, Joel R. Reidenberg**, Noah A. Smith, Fei Liu,
N. Cameron Russell**, Florian Schaub, Shomir Wilson

December 2013


Keywords: Privacy, online privacy, usability, law, public policy, behavioral economics, natural language processing, privacy policies, privacy notices, notice & choice, privacy policy analysis, privacy decision making, crowdsourcing, machine learning, privacy preferences, privacy preference modeling, cognitive biases

Natural language privacy policies have become a de facto standard to address expectations of "notice and choice" on the Web. However, users generally do not read these policies, and those who do read them struggle to understand their content. Initiatives aimed at addressing this problem through the development of machine-readable standards have run into obstacles, with many website operators showing reluctance to commit to anything more than what they currently do. This project builds on recent advances in natural language processing, privacy preference modeling, crowdsourcing, formal methods, and privacy interface design to develop a practical framework based on websites' existing natural language privacy policies that empowers users to more meaningfully control their privacy, without requiring additional cooperation from website operators. Our approach combines fundamental research with the development of scalable technologies to (1) semi-automatically extract key privacy policy features from natural language privacy policies, and (2) present these features to users in an easy-to-digest format that enables them to make more informed privacy decisions as they interact with different websites. This work will also involve the systematic collection and analysis of website privacy policies, looking for trends and deficiencies in both the wording and content of these policies across different sectors and using this analysis to inform public policy. This report outlines the project's research agenda and overall approach.

24 pages

*The Center for Internet and Society, Stanford Law School, Stanford, CA 94305
**Center on Law and Information Policy, School of Law, Fordham University, New York, NY 10023
