
CRINN: Automatic Optimization Of Approximate Nearest Neighbor Search Algorithms Using Reinforcement Learning
3 main points
✔️ CRINN uses reinforcement learning to automatically optimize ANNS code, eliminating the need for manual tuning by human experts
✔️ Balances the tradeoff between QPS and recall, using the execution speed of generated code as the reward signal
✔️ Evaluated on six benchmarks, achieving significant speedups on MNIST-784 and GIST-960
CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search
written by Xiaoya Li, Xiaofei Sun, Albert Wang, Chris Shum, Jiwei Li
(Submitted on 4 Aug 2025 (v1), last revised 20 Aug 2025 (this version, v2))
Comments: Preprint Version
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Databases (cs.DB)
The images used in this article are from the paper, the introductory slides, or were created based on them.
Summary
This paper focuses on the optimization of Approximate Nearest Neighbor Search (ANNS) in high-dimensional vector spaces.
ANNS is a technique that dramatically improves search speed at the cost of a small loss in search accuracy, and it has recently become a core building block of Retrieval-Augmented Generation (RAG) and agent-based LLM applications.
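As a toy illustration of this speed-accuracy tradeoff (our own sketch, not from the paper), the snippet below compares exact brute-force search against a crude "approximate" search that scans only a random subset of the database, and measures recall@k:

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.standard_normal((10_000, 128)).astype(np.float32)   # database vectors
queries = rng.standard_normal((100, 128)).astype(np.float32)   # query vectors
k = 10

def knn(q, db, k):
    # exact k-NN by squared Euclidean distance (brute force)
    d = (q * q).sum(1)[:, None] - 2.0 * q @ db.T + (db * db).sum(1)[None, :]
    return np.argsort(d, axis=1)[:, :k]

exact = knn(queries, base, k)

# crude "approximate" search: scan only a random 20% subset of the database
subset = rng.choice(len(base), size=len(base) // 5, replace=False)
approx = subset[knn(queries, base[subset], k)]   # map local ids back to global

# recall@k: fraction of true nearest neighbors recovered
recall = np.mean([len(set(a) & set(e)) / k for a, e in zip(approx, exact)])
print(f"recall@{k}: {recall:.2f}")   # ~0.2 here; real ANNS (e.g. graph-based) does far better
```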
Conventionally, optimization has relied on human experts profiling implementations, analyzing cache misses, adjusting data structures, and repeatedly tuning parameters by hand.
However, this approach demands expertise and labor, and it struggles to keep pace with evolving hardware and application environments.
Therefore, the authors propose CRINN, a new optimization framework that combines LLMs with reinforcement learning.
CRINN treats the execution speed of a code implementation as the reward and automatically generates efficient ANNS code through contrastive reinforcement learning.
This makes it possible to produce successively improved implementations without relying on manual tuning, pushing retrieval performance beyond hand-optimized baselines.
Proposed Methodology
CRINN frames ANNS optimization as a reinforcement learning problem and combines it with contrastive learning to improve performance.
Specifically, the prompt includes existing implementations together with their measured execution speeds, allowing the LLM to compare them and analyze why one implementation is faster than another.
In this way, the model learns patterns that lead to speedups and generates new, better code.
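A minimal sketch of how such a contrastive prompt could be assembled, assuming each candidate implementation is paired with its measured QPS; the function name and prompt wording are illustrative, not the paper's exact template:

```python
def build_contrastive_prompt(implementations):
    """Assemble a contrastive prompt from (source_code, qps) pairs for the
    same ANNS module. Illustrative sketch; the paper's template may differ."""
    ranked = sorted(implementations, key=lambda pair: pair[1], reverse=True)
    parts = []
    for i, (code, qps) in enumerate(ranked, 1):
        parts.append(f"### Implementation {i} (measured speed: {qps:.0f} QPS)\n{code}")
    parts.append(
        "Compare the implementations above and explain why the faster ones are "
        "faster (memory layout, cache behavior, pruning, SIMD, ...). Then write "
        "a new implementation that is faster than all of them."
    )
    return "\n\n".join(parts)

# usage: prompt = build_contrastive_prompt([(code_a, 5200.0), (code_b, 6800.0)])
```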
The generated code is then executed, and a reward is assigned based on its speed and recall.
This reward drives reinforcement learning with Group Relative Policy Optimization (GRPO), and the model is updated iteratively.
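GRPO dispenses with a learned value baseline by normalizing each reward within its sampling group; a minimal sketch of the group-relative advantage computation (standard GRPO, not CRINN-specific code):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages as in standard GRPO: for a group of G code
    samples generated from the same prompt, each sample's advantage is its
    reward standardized against the group mean and standard deviation."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# rewards for 4 generated implementations (e.g., their QPS-recall AUC scores)
print(grpo_advantages([120.0, 95.0, 150.0, 80.0]))
# faster-than-average implementations get positive advantages, so the policy
# update (a PPO-style clipped ratio objective) makes similar code more likely
```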
For the reward design, the authors focus on the trade-off between Queries Per Second (QPS) and recall, using the area under the QPS-recall curve over the recall range [0.85, 0.95] as the scalar reward.
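Concretely, the scalar reward can be obtained by integrating QPS over the target recall interval; a sketch using trapezoidal integration (the parameter sweep and interpolation details are our assumptions, not taken from the paper):

```python
import numpy as np

def qps_recall_auc(recalls, qps, lo=0.85, hi=0.95, grid_points=50):
    """Area under the QPS-vs-recall curve restricted to recall in [lo, hi].
    recalls/qps come from benchmarking the generated code at several search
    parameter settings; linear interpolation here is an assumption."""
    recalls, qps = np.asarray(recalls, float), np.asarray(qps, float)
    order = np.argsort(recalls)
    grid = np.linspace(lo, hi, grid_points)
    qps_on_grid = np.interp(grid, recalls[order], qps[order])
    return np.trapz(qps_on_grid, grid)

# sweep over, e.g., increasing search-width settings: recall rises, QPS falls
print(qps_recall_auc([0.80, 0.88, 0.93, 0.97], [9000, 7000, 5000, 2500]))
```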
Furthermore, starting from the existing ANNS library GLASS as the initial foundation, optimization proceeds sequentially over each module: graph construction, search, and refinement.
This structured approach automates tuning that has traditionally depended on experts and enables the development of efficient search algorithms.
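Put together, the module-by-module loop might look like the skeleton below; the module names follow the text, while benchmark_reward, generate_candidate, and the loop structure are illustrative assumptions rather than the paper's actual pipeline:

```python
import random

MODULES = ("graph_construction", "search", "refinement")

def benchmark_reward(library):
    """Stub standing in for: compile the library, run the benchmark sweep,
    and compute the QPS-recall AUC reward sketched above. Random here."""
    return random.random()

def optimize(generate_candidate, rounds=3, group_size=4):
    """Optimize one module at a time while keeping the others fixed."""
    library = {m: f"GLASS baseline for {m}" for m in MODULES}  # initial foundation
    best_reward = benchmark_reward(library)
    for module in MODULES:                        # sequential, module by module
        for _ in range(rounds):
            for _ in range(group_size):           # a GRPO-style sampling group
                candidate = generate_candidate(module, library[module])
                trial = {**library, module: candidate}
                r = benchmark_reward(trial)
                if r > best_reward:               # keep the best-scoring variant
                    library, best_reward = trial, r
    return library

random.seed(0)
print(optimize(lambda module, current: f"LLM rewrite of <{current}>"))
```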
Experiments
Experiments evaluated CRINN on six benchmark datasets: SIFT-128, GIST-960, MNIST-784, GloVe-25, GloVe-100, and NYTimes-256.
Representative open-source ANNS implementations, including ParlayANN, GLASS, NNDescent, PyNNDescent, Vearch, and Voyager, were selected for comparison. Training used only SIFT-128 (Euclidean distance), after which generalization performance was evaluated on the remaining datasets.
As a result, CRINN achieved speedups of up to 85% on MNIST-784 and GIST-960, with particularly large gains coming from the graph construction module.
On the other hand, performance degradation was observed on some datasets, such as NYTimes-256, suggesting that the optimization can be constrained by the distance metric and data characteristics.
In addition, the effectiveness of incremental, module-by-module optimization was demonstrated, confirming the potential for continuous improvement over the underlying GLASS baseline.
Overall, CRINN delivers a combination of speed and accuracy that outperforms existing methods, and it points to a new direction for code optimization with reinforcement learning.