Model Overview

Given a user request to change a codebase, you want to retrieve only the files relevant to implementing that request. This matters for two reasons:

Polluting the context window with irrelevant files makes the generated code worse.
The fewer files you pass in, the more you save on input tokens.

We trained our reranker on hundreds of thousands of user query/code pairs to make it best in class for AI codegen applications.

Accuracy Benchmark

We evaluated our model on two retrieval benchmarks: an in-house dataset of query/codebase pairs for prompt-to-app tasks, and a more general dataset of open-source GitHub PRs.

For an example with k relevant files, recall@k is the fraction of those relevant files that were ranked in the top k results. For codegen, high recall is essential because omitting a relevant file from the context breaks the generation entirely.
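As a rough illustration of the metric as defined above, here is a minimal Python sketch. The function name `recall_at_k` and the file paths are hypothetical, chosen only for this example, and it assumes the ranked list contains no duplicate paths. It is not the evaluation code used for the benchmark.

```python
from typing import Sequence, Set


def recall_at_k(ranked_files: Sequence[str], relevant_files: Set[str]) -> float:
    """Fraction of the relevant files that appear in the top k ranked results,
    where k is the number of relevant files for this example."""
    k = len(relevant_files)
    if k == 0:
        raise ValueError("recall@k is undefined when there are no relevant files")
    # Only the first k entries of the ranking are considered.
    top_k = ranked_files[:k]
    hits = sum(1 for path in top_k if path in relevant_files)
    return hits / k


# Hypothetical example: two relevant files, so k = 2.
ranked = ["src/app.py", "README.md", "src/db.py", "tests/test_app.py"]
relevant = {"src/app.py", "src/db.py"}
score = recall_at_k(ranked, relevant)  # only "src/app.py" is in the top 2 -> 0.5
```

Averaging this value over every example in a dataset gives one number per benchmark.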

Comparison to Gemini 2.0 Flash-Lite

Many people start with retrieval by passing the query/codebase pair into a model with a huge context window, such as Gemini 2.0 Flash-Lite, and prompting it to score the relevance of each file. We beat the accuracy of Gemini 2.0 Flash-Lite at 2x the speed and 2/3 the cost.

Relace Reranker vs. Gemini 2.0 Flash-Lite

For more code examples and endpoint info, see the API reference.
