CMU-HCII-22-105
Human-Computer Interaction Institute
School of Computer Science, Carnegie Mellon University
Modular Privacy Flows: A Design Pattern for Data Minimization
Haojian Jin
September 2022
Ph.D. Thesis
This dissertation introduces a new design pattern, called Modular Privacy Flows (MPF), for designing systems that allow developers to collect data on a need-to-know basis. MPF combines three simple ideas. First, instead of enumerating all-or-nothing fine-grained data access, system builders offer a small and fixed set of stateless operators to developers. Second, developers declare intended data access by authoring a Unix-like pipeline using these operators and save the pipeline representation in a text-based manifest. Third, given a manifest, a trusted runtime assembles a data transformation executable from pre-loaded open-source operator implementations; the executable relays data flows in a structured and enforceable manner.
MPF offers a few important advantages over the conventional all-or-nothing permission approach and other relevant approaches. First, system builders can support numerous fine-grained APIs by implementing a small set of reusable operators. Second, developers only need to learn the semantics of a few operators to customize their data access. Third, since the operators have clearly defined semantics and the manifests are non-proprietary, MPF can facilitate many independent privacy features that help users manage their privacy in a centralized and unified manner. Further, MPF allows third-party privacy advocates (e.g., consumer reports) to analyze manifests programmatically and alert users to bad practices.
This dissertation has three main parts. The first part includes three empirical studies that characterize developers' data collection behaviors, illustrating that most developers only need partial or derived data rather than raw data. The second part introduces two MPF software architectures (Peekaboo and MapAggregate) that can reduce developers' data collection and demonstrate the advantages described above. The third part presents two design methods to help developers navigate the design space of data minimization, including data collection decision-making and designing independent privacy features through MPF. Taken together, the three parts of this dissertation will scaffold the future development of data minimization.
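The three ideas behind MPF (a fixed operator set, a text manifest describing a Unix-like pipeline, and a trusted runtime that assembles the pipeline) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the operator names `select` and `coarsen`, the `name key=value | name key=value` manifest syntax, and the helper functions are hypothetical and are not the operators or manifest format of Peekaboo or MapAggregate.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]
Args = Dict[str, str]

# Hypothetical stateless operators: each is a pure function of (record, args).
def op_select(record: Record, args: Args) -> Record:
    """Keep only the comma-separated fields named in `fields`."""
    wanted = args["fields"].split(",")
    return {k: record[k] for k in wanted if k in record}

def op_coarsen(record: Record, args: Args) -> Record:
    """Round a numeric field to the nearest multiple of `step`."""
    field, step = args["field"], float(args["step"])
    out = dict(record)
    out[field] = round(out[field] / step) * step
    return out

# The small, fixed operator set pre-loaded by the (hypothetical) runtime.
OPERATORS: Dict[str, Callable[[Record, Args], Record]] = {
    "select": op_select,
    "coarsen": op_coarsen,
}

def parse_manifest(text: str) -> List[Tuple[str, Args]]:
    """Parse 'op key=value ... | op key=value ...' into a list of stages."""
    stages: List[Tuple[str, Args]] = []
    for stage in text.split("|"):
        tokens = stage.split()
        if not tokens:
            raise ValueError("empty pipeline stage")
        name, *pairs = tokens
        if name not in OPERATORS:
            # Only pre-loaded operators may appear in a manifest.
            raise ValueError(f"unknown operator: {name}")
        stages.append((name, dict(p.split("=", 1) for p in pairs)))
    return stages

def assemble(text: str) -> Callable[[Record], Record]:
    """Build an executable pipeline from a manifest string."""
    stages = parse_manifest(text)

    def run(record: Record) -> Record:
        data = dict(record)  # never mutate the caller's raw record
        for name, args in stages:
            data = OPERATORS[name](data, args)
        return data

    return run

if __name__ == "__main__":
    manifest = "select fields=heart_rate | coarsen field=heart_rate step=5"
    pipeline = assemble(manifest)
    raw = {"user_id": "u1", "heart_rate": 72.4, "location": "home"}
    print(pipeline(raw))
```

In this sketch the developer receives only the coarsened heart rate, not the user ID or location. Because the manifest is plain text and every operator has fixed semantics, a third party could call `parse_manifest` on its own to see which operators and fields a developer requests without running any developer code.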
268 pages
Jodi Forlizzi, Head, Human-Computer Interaction Institute