
CRINN: Automatic Optimization Of Approximate Nearest Neighbor Search Algorithms Using Reinforcement Learning
3 main points
✔️ CRINN uses reinforcement learning to automatically optimize ANNS code, eliminating the need for manual tuning by human experts
✔️ Balances the tradeoff between QPS and recall, using the execution speed of generated code as the reward signal
✔️ Evaluated on six benchmarks, achieving significant speedups on MNIST-784 and GIST-960
CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search
written by Xiaoya Li, Xiaofei Sun, Albert Wang, Chris Shum, Jiwei Li
(Submitted on 4 Aug 2025 (v1), last revised 20 Aug 2025 (this version, v2))
Comments: Preprint Version
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Databases (cs.DB)
The images used in this article are from the paper, the introductory slides, or were created based on them.
Summary
This paper focuses on the optimization of Approximate Nearest Neighbor Search (ANNS) in high-dimensional vector spaces.
ANNS is a technique that dramatically improves search speed at the cost of a small loss in search accuracy, and it has recently become a core building block of Retrieval-Augmented Generation (RAG) and agent-based LLM applications.
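As a toy illustration of this speed-accuracy tradeoff (our own sketch, not from the paper), the snippet below compares exact brute-force search against a crude "approximate" search that scans only a random subset of the database, and measures recall@k:

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.standard_normal((10_000, 128)).astype(np.float32)   # database vectors
queries = rng.standard_normal((100, 128)).astype(np.float32)   # query vectors
k = 10

def knn(q, db, k):
    # exact k-NN by squared Euclidean distance (brute force)
    d = (q * q).sum(1)[:, None] - 2.0 * q @ db.T + (db * db).sum(1)[None, :]
    return np.argsort(d, axis=1)[:, :k]

exact = knn(queries, base, k)

# crude "approximate" search: scan only a random 20% subset of the database
subset = rng.choice(len(base), size=len(base) // 5, replace=False)
approx = subset[knn(queries, base[subset], k)]   # map local ids back to global

# recall@k: fraction of true nearest neighbors recovered
recall = np.mean([len(set(a) & set(e)) / k for a, e in zip(approx, exact)])
print(f"recall@{k}: {recall:.2f}")   # ~0.2 here; real ANNS (e.g. graph-based) does far better
```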
Conventionally, optimization has relied on human experts profiling implementations, analyzing cache misses, adjusting data structures, and repeatedly tuning parameters by hand.
However, this approach demands expertise and labor, and it struggles to keep pace with evolving hardware and application environments.
Therefore, the authors propose CRINN, a new optimization framework that combines LLMs with reinforcement learning.
CRINN treats the execution speed of a code implementation as the reward and automatically generates efficient ANNS code through contrastive reinforcement learning.
This makes it possible to produce successively improved implementations without relying on manual tuning, pushing retrieval performance beyond hand-optimized baselines.
Proposed Methodology
CRINN frames ANNS optimization as a reinforcement learning problem and combines it with contrastive learning to improve performance.
Specifically, the prompt includes existing implementations together with their measured execution speeds, allowing the LLM to compare them and analyze why one implementation is faster than another.
In this way, the model learns patterns that lead to speedups and generates new, better code.
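A minimal sketch of how such a contrastive prompt could be assembled, assuming each candidate implementation is paired with its measured QPS; the function name and prompt wording are illustrative, not the paper's exact template:

```python
def build_contrastive_prompt(implementations):
    """Assemble a contrastive prompt from (source_code, qps) pairs for the
    same ANNS module. Illustrative sketch; the paper's template may differ."""
    ranked = sorted(implementations, key=lambda pair: pair[1], reverse=True)
    parts = []
    for i, (code, qps) in enumerate(ranked, 1):
        parts.append(f"### Implementation {i} (measured speed: {qps:.0f} QPS)\n{code}")
    parts.append(
        "Compare the implementations above and explain why the faster ones are "
        "faster (memory layout, cache behavior, pruning, SIMD, ...). Then write "
        "a new implementation that is faster than all of them."
    )
    return "\n\n".join(parts)

# usage: prompt = build_contrastive_prompt([(code_a, 5200.0), (code_b, 6800.0)])
```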
The generated code is then executed, and a reward is assigned based on its speed and recall.
This reward drives reinforcement learning with Group Relative Policy Optimization (GRPO), and the model is updated iteratively.
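GRPO dispenses with a learned value baseline by normalizing each reward within its sampling group; a minimal sketch of the group-relative advantage computation (standard GRPO, not CRINN-specific code):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages as in standard GRPO: for a group of G code
    samples generated from the same prompt, each sample's advantage is its
    reward standardized against the group mean and standard deviation."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + eps)

# rewards for 4 generated implementations (e.g., their QPS-recall AUC scores)
print(grpo_advantages([120.0, 95.0, 150.0, 80.0]))
# faster-than-average implementations get positive advantages, so the policy
# update (a PPO-style clipped ratio objective) makes similar code more likely
```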
For the reward design, the authors focus on the trade-off between Queries Per Second (QPS) and recall, using the area under the QPS-recall curve over the recall range [0.85, 0.95] as the scalar reward.
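Concretely, the scalar reward can be obtained by integrating QPS over the target recall interval; a sketch using trapezoidal integration (the parameter sweep and interpolation details are our assumptions, not taken from the paper):

```python
import numpy as np

def qps_recall_auc(recalls, qps, lo=0.85, hi=0.95, grid_points=50):
    """Area under the QPS-vs-recall curve restricted to recall in [lo, hi].
    recalls/qps come from benchmarking the generated code at several search
    parameter settings; linear interpolation here is an assumption."""
    recalls, qps = np.asarray(recalls, float), np.asarray(qps, float)
    order = np.argsort(recalls)
    grid = np.linspace(lo, hi, grid_points)
    qps_on_grid = np.interp(grid, recalls[order], qps[order])
    return np.trapz(qps_on_grid, grid)

# sweep over, e.g., increasing search-width settings: recall rises, QPS falls
print(qps_recall_auc([0.80, 0.88, 0.93, 0.97], [9000, 7000, 5000, 2500]))
```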
Furthermore, starting from the existing ANNS library GLASS as the initial foundation, optimization proceeds sequentially over each module: graph construction, search, and refinement.
This structured approach automates tuning that has traditionally depended on experts and enables the development of efficient search algorithms.
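Put together, the module-by-module loop might look like the skeleton below; the module names follow the text, while benchmark_reward, generate_candidate, and the loop structure are illustrative assumptions rather than the paper's actual pipeline:

```python
import random

MODULES = ("graph_construction", "search", "refinement")

def benchmark_reward(library):
    """Stub standing in for: compile the library, run the benchmark sweep,
    and compute the QPS-recall AUC reward sketched above. Random here."""
    return random.random()

def optimize(generate_candidate, rounds=3, group_size=4):
    """Optimize one module at a time while keeping the others fixed."""
    library = {m: f"GLASS baseline for {m}" for m in MODULES}  # initial foundation
    best_reward = benchmark_reward(library)
    for module in MODULES:                        # sequential, module by module
        for _ in range(rounds):
            for _ in range(group_size):           # a GRPO-style sampling group
                candidate = generate_candidate(module, library[module])
                trial = {**library, module: candidate}
                r = benchmark_reward(trial)
                if r > best_reward:               # keep the best-scoring variant
                    library, best_reward = trial, r
    return library

random.seed(0)
print(optimize(lambda module, current: f"LLM rewrite of <{current}>"))
```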
Experiments
Experiments evaluated CRINN on six benchmark datasets: SIFT-128, GIST-960, MNIST-784, GloVe-25, GloVe-100, and NYTimes-256.
Representative open-source ANNS implementations, including ParlayANN, GLASS, NNDescent, PyNNDescent, Vearch, and Voyager, were selected for comparison. Training used only SIFT-128 (Euclidean distance), after which generalization performance was evaluated on the remaining datasets.
As a result, CRINN achieved speedups of up to 85% on MNIST-784 and GIST-960, with particularly large gains coming from the graph construction module.
On the other hand, performance degradation was observed on some datasets, such as NYTimes-256, suggesting that the optimization can be constrained by the distance metric and data characteristics.
In addition, the effectiveness of incremental, module-by-module optimization was demonstrated, confirming the potential for continuous improvement over the underlying GLASS baseline.
Overall, CRINN delivers a combination of speed and accuracy that outperforms existing methods, and it points to a new direction for code optimization with reinforcement learning.