Evaluation
Scaling Laws for Grid-Based Approximate Nearest Neighbor Search in High Dimensions
** Matthew J Liu, Wei Hang Zheng, Vidhan Purohit, Siqi Xie, Chieh-En Li, Jerry Li, Noah Flynn
Scaling Laws for Grid-Based Approximate Nearest Neighbor Search in High Dimensions
Authors: Matthew J Liu, Wei Hang Zheng, Vidhan Purohit, Siqi Xie, Chieh-En Li, Jerry Li, Noah Flynn
arXiv ID: 2607.01283
Problem: Grid-based ANN methods have been absent from modern scaling analyses, so the paper characterizes their scaling behavior against dataset size N and dimensionality d to identify competitive regimes.
Key Methodology:
- Systematic characterization of a multiprobe grid algorithm that decouples cell selection (in PCA subspace ℝᵐ, m ≪ d) from candidate re-ranking (in full ℝᵈ)
- Derived a closed-form log-linear QPS–recall relationship from grid geometry and exponential decay of nearest-neighbor membership probability across probed cells
- Benchmarked against Voyager, PyNNDescent, Annoy, and FAISS-IVF on GloVe (angular) and SIFT-128 (Euclidean) datasets using the ann-benchmarks framework
Key Results:
- At recall@k=10 = 0.80 on GloVe-200, multiprobe grid has N-scaling exponent α_N = -0.94 (R²=1.00) - near-linear; baselines range from -0.44 to -0.59
- On SIFT-128 at same recall, α_N = -0.83 (R²=1.00) vs. -0.27 to -0.39 for baselines
- d-scaling crossover: multiprobe grid maintains a flat α_d across recall while all other algorithms steepen sharply beyond recall 0.7, meaning it becomes more competitive as dimensionality grows
- Index build time at N=1.18×10⁶: 4–36 sec (multiprobe grid) vs. 206 sec (FAISS-IVF), 333 sec (Annoy), 500 sec (PyNNDescent), 1,569 sec (Voyager)
- Multiprobe grid builds ~190× faster than Voyager but per-query latency ~120× slower; achieves lower total cost when rebuild-to-query ratio exceeds ~1:2,600–20,400
Applied Context: Multiprobe grid is a strong choice for rebuild-heavy or high-dimensional pipelines (streaming embeddings, RAG with frequent index refreshes, KV-cache-like incremental updates) - its near-zero indexing cost and dimensional robustness outweigh its slower per-query throughput in settings where the index is rebuilt often relative to the number of queries served.