CMU-CS-25-100
Computer Science Department
School of Computer Science, Carnegie Mellon University



CMU-CS-25-100

New Techniques for Parallelism and
Concurrency in Nearest Neighbor Search

Magdalen Dobson Manohar

Ph.D. Thesis

May 2025

CMU-CS-25-113.pdf


Keywords: Nearest Neighbor Search, Information Retrieval, Range Search, Parallelism, Concurrency

Nearest neighbor search in both high and low dimensions is an important problem in the field of computer science and beyond. In this thesis, we extend the capabilities of nearest neighbor search algorithms to meet modern demands, including support for billion-scale indices, parallelism and concurrency on machines with hundreds of cores, efficient updates, and extensions to related geometric problems such as range search. We progress towards this goal by introducing the zd-tree, a data structure for low-dimensional nearest neighbor search with provable guarantees on the work and span of search, build, and update, a scalable parallel build, and the ability to perform batch-dynamic updates in parallel. Building on the zd-tree, we also present the CLEANN-Tree (for Concurrent Linearizable Efficient Augmented Nearest Neighbor Search Tree), a generalization of the zd-tree which supports concurrent queries and updates utilizing versioned pointers and lock-free locks. In high dimensions, we introduce techniques to make existing nearest neighbor search algorithms lock-free, deterministic, and scalable to billion-size datasets. We apply these techniques to four existing graph-based nearest neighbor search algorithms in a library called ParlayANN. Building off our work in ParlayANN, we extend the search algorithms for graph-based nearest neighbor indices to the related but relatively under-studied problem of range search in high dimensions, making significant gains over a naive baseline.

157 pages

Thesis Committee:
Guy E. Blelloch (Chair)
Phil Gibbons
Andy Pavlo
Matthijs Douze (Meta AI Research
Harsha Vardhan Simhadri (Microsoft Azure)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science


Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by reports@cs.cmu.edu