Building a scalable AI code editing application usually requires:
Durable storage of source code
Versioning, so that unsuccessful or undesired changes can be easily reverted
Low-latency reads/writes on the working version of the code
Support for automated interactions with thousands of isolated repos
Integration with GitHub, to allow human developers to easily collaborate on the same code base
Relace Repos offers a centralized platform that satisfies all of these requirements, seamlessly integrated with our state-of-the-art embedding and retrieval models.
Industry-standard version control services like GitHub are designed primarily for human developers, which creates some pain points for AI applications.
Humans interact with these services manually and at low frequency, through the website or the git CLI. The limits are sized accordingly, and they are insufficient for large-scale automated AI applications:
A single account/organization may not exceed 100,000 repos
REST API requests may not exceed 5,000 per hour (15,000 for a GitHub app owned by an enterprise organization)
Most text-to-app systems treat each user application as a single repo, so the repo limit is reached very quickly. Systems with many concurrent users also hit the rate limit relatively quickly: assuming a single AI edit requires at least two API calls (one pull and one push), even the higher enterprise limit caps you at roughly two edits per second.
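As a quick back-of-the-envelope check, using the limits quoted above:

```python
# Back-of-the-envelope throughput for a GitHub-backed AI editing pipeline,
# using the rate limits quoted above.
REQUESTS_PER_HOUR = 15_000   # GitHub App owned by an enterprise organization
CALLS_PER_EDIT = 2           # one pull + one push per AI edit

edits_per_hour = REQUESTS_PER_HOUR / CALLS_PER_EDIT
edits_per_second = edits_per_hour / 3600

print(f"{edits_per_hour:.0f} edits/hour, ~{edits_per_second:.1f} edits/second")
# 7500 edits/hour, ~2.1 edits/second
```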
Human developer workflows are also less constrained and more complex than what a typical AI application needs. The result is a system that requires many distinct steps to make even a simple change to the codebase:
Clone/checkout the repository locally
Edit the files
Stage and commit the changes
Push to the remote
Since commits are made locally, you must set up a git library or the git CLI on your host. Because most AI workflows run in a serverless or sandboxed environment, this also means full source retrieval must happen every time you spin up the environment, which drives up cold-start latency and often forces some sort of file caching strategy. A relatively simple workflow, where an agent reads a codebase and edits some files, quickly becomes a highly complex infrastructure problem.
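To make the point concrete, here is a minimal sketch of that loop in Python, shelling out to the git CLI. It assumes git is installed on the host and that the git identity and push credentials are already configured; the repository URL, file path, and content are placeholders.

```python
import subprocess
import tempfile
from pathlib import Path

def apply_ai_edit(repo_url: str, file_path: str, new_content: str, message: str) -> None:
    """Apply a single AI-generated edit via the standard git workflow."""
    with tempfile.TemporaryDirectory() as workdir:
        # 1. Clone/checkout the repository locally (full source retrieval on every cold start)
        subprocess.run(["git", "clone", "--depth", "1", repo_url, workdir], check=True)

        # 2. Edit the files
        (Path(workdir) / file_path).write_text(new_content)

        # 3. Stage and commit the changes
        subprocess.run(["git", "-C", workdir, "add", file_path], check=True)
        subprocess.run(["git", "-C", workdir, "commit", "-m", message], check=True)

        # 4. Push to the remote
        subprocess.run(["git", "-C", workdir, "push"], check=True)
```

Each of these steps is another failure mode to handle, and the clone alone dominates cold-start latency for any non-trivial repository.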
Providing the right code context to your LLM is essential to getting the best results without sacrificing cost or latency. A reranker model offers the highest-quality retrieval, but is computationally expensive to run (retrieval cost scales with the size of the codebase). Pre-computing vector embeddings for your code and retrieving with a vector similarity search is much faster and cheaper, but less accurate. The best approach tends to be a hybrid system, where a vector search produces a reduced set of candidate files for a reranker (sketched after the list below).
Building this kind of system yourself presents several challenges:
Embedding and reranker models have a hard limit on the size of input files, which means large files need to be chunked
Embeddings need to be stored in a vector database, and kept in sync with your source code
Computing embeddings is slow; this means it must be done asynchronously, exposing potential race conditions
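For reference, here is a minimal sketch of the hybrid pattern, assuming you already have an `embed` function and a `rerank` model to call. It deliberately ignores chunking, the vector database, and the synchronization issues listed above, which is exactly where most of the real engineering effort goes.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query: str, files: dict[str, str], file_embeddings: dict[str, np.ndarray],
             embed, rerank, top_k: int = 20) -> list[str]:
    """Two-stage retrieval: cheap vector search to shortlist, expensive reranker to order.

    `embed` and `rerank` are stand-ins for whichever models you run; chunking,
    persistence, and keeping embeddings in sync with edits are all omitted.
    """
    # Stage 1: vector similarity search over precomputed file embeddings
    query_vec = embed(query)
    scored = sorted(
        file_embeddings.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    candidates = [path for path, _ in scored[:top_k]]

    # Stage 2: rerank only the reduced candidate set by relevance to the query
    return rerank(query, {path: files[path] for path in candidates})
```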
Relace Repos handles all of this complexity for you with our two-stage retrieval system (a usage sketch follows at the end of this section):
Stage 1: Indexing
Large files are chunked into meaningful segments and passed to our optimized Embeddings model
Embeddings are updated asynchronously as your codebase evolves. Metadata is updated atomically with codebase edits to provide strongly consistent read-after-write retrieval.
A vector similarity search over the stored embeddings is used to produce a set of candidate files for the reranker. Recently edited files that do not yet have stored embeddings are always included in the reranker input.
Stage 2: Reranking
Candidate files from the first stage are passed to our Code Reranker model
Files are ranked by the model based on relevance to the input query
Irrelevant code snippets are filtered out entirely
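From the application's point of view, both stages sit behind a single retrieval request. The snippet below is only an illustration: the endpoint URL, request fields, and response shape are hypothetical placeholders, so consult the Relace Repos API reference for the actual schema.

```python
import requests

RELACE_API_KEY = "YOUR_API_KEY"  # placeholder

def retrieve_relevant_files(repo_id: str, query: str) -> list[dict]:
    """Fetch the files most relevant to a query from a Relace repo.

    Hypothetical sketch: the URL and JSON fields below are illustrative only.
    Both retrieval stages (embedding search + reranking) run server-side, so the
    application makes a single request and gets back ranked, filtered files.
    """
    response = requests.post(
        f"https://api.relace.ai/repos/{repo_id}/retrieve",  # hypothetical endpoint
        headers={"Authorization": f"Bearer {RELACE_API_KEY}"},
        json={"query": query},                              # hypothetical request body
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["results"]                       # hypothetical response field
```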