Repos is now in Public Beta! Reach out to us at info@relace.ai if you have any questions or need help with the integration.
Introduction
Relace Repos is a fully managed version control system specifically designed for AI applications, offering:
- Lightweight read/write operations enabling easy integration with your backend (see the sketch after this list).
- Semantic search that maintains low latency on large codebases.
- Multi-tenancy that scales to millions of repositories across thousands of users.
- Permissive rate limits that allow interactions across many repos simultaneously.
- Full git compatibility, allowing easy interactions from the git CLI.
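To make "lightweight read/write" concrete, here is a minimal sketch of a backend integration. The base URL, endpoint paths, and payload shapes below are illustrative assumptions, not the documented Repos API:

```python
# Hypothetical sketch of reading and writing a file through a managed
# Repos HTTP API. Base URL, paths, and payload shapes are assumptions.
import requests

BASE = "https://api.relace.ai/v1/repos"  # hypothetical base URL
HEADERS = {"Authorization": "Bearer <YOUR_API_KEY>"}
REPO_ID = "my-repo"                      # hypothetical repo identifier

# Read a file from the working version of the repo.
resp = requests.get(f"{BASE}/{REPO_ID}/files/src/main.py", headers=HEADERS)
resp.raise_for_status()
source = resp.text

# Write the updated file back in a single call -- no local clone,
# staging, commit, or push required on the client side.
resp = requests.put(
    f"{BASE}/{REPO_ID}/files/src/main.py",
    headers=HEADERS,
    data=source.replace("DEBUG = True", "DEBUG = False"),
)
resp.raise_for_status()
```

Because repos are fully git-compatible, the same repository can also be cloned and edited with ordinary git commands whenever a human needs to step in.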
Source Management for AI
Building a scalable AI code editing application usually requires:
- Durable storage of source code
- Versioning, so that unsuccessful or undesired changes can be easily reverted
- Low-latency reads/writes on the working version of the code
- Support for automated interactions with thousands of isolated repos
- Integration with GitHub, to allow human developers to easily collaborate on the same codebase
Existing Alternatives
Industry-standard version control services like GitHub are designed primarily for human developers, which leads to some pain points for AI applications.

Service limits
Humans interact manually and at low frequency, through websites and the git CLI. This results in service limits that are insufficient for large-scale automated AI applications:
- A single account/organization may not exceed 100,000 repos
- REST API requests may not exceed 5,000 per hour (about 1.4 requests per second), or 15,000 per hour for a GitHub App owned by an enterprise organization
Integration complexity
Human developer workflows are less constrained and more complex than those of a typical AI application. This leads to a system that requires many distinct steps to make even a simple change to the codebase (sketched after this list):
- Clone/checkout the repository locally
- Edit the files
- Stage and commit the changes
- Push to the remote
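For an automated backend, each of those steps is code that must be written, run, and monitored on every edit. A rough sketch of that overhead, with a placeholder remote URL and file path:

```python
# Rough sketch of the per-edit overhead of a clone/commit/push workflow.
# The remote URL and file path are placeholders.
import subprocess
import tempfile
from pathlib import Path

def git(*args: str, cwd: str) -> None:
    subprocess.run(["git", *args], cwd=cwd, check=True)

with tempfile.TemporaryDirectory() as workdir:
    # 1. Clone/checkout the repository locally.
    subprocess.run(
        ["git", "clone", "https://github.com/acme/example.git", workdir],
        check=True,
    )

    # 2. Edit the files.
    target = Path(workdir) / "src" / "main.py"
    target.write_text(target.read_text().replace("DEBUG = True", "DEBUG = False"))

    # 3. Stage and commit the changes.
    git("add", "src/main.py", cwd=workdir)
    git("commit", "-m", "Disable debug mode", cwd=workdir)

    # 4. Push to the remote.
    git("push", cwd=workdir)
```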
Built-in Two-Stage Retrieval
Providing the right code context to your LLM is essential to producing the best results without sacrificing cost or latency. A reranker model offers the highest-quality retrieval, but is computationally expensive to run (retrieval cost scales with the size of the codebase). Pre-computing vector embeddings for code and using vector similarity search for retrieval is much faster and cheaper, but less accurate. The best approach tends to be a hybrid system, where a vector search provides a reduced set of candidate files for a reranker. Building this kind of system poses several challenges:
- Embedding and reranker models have a hard limit on input size, so large files need to be chunked (illustrated after this list)
- Embeddings need to be stored in a vector database and kept in sync with your source code
- Computing embeddings is slow, so it must be done asynchronously, which exposes potential race conditions
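To see why chunking is a design problem rather than a preprocessing detail, consider the most naive possible chunker, which splits on a fixed character budget (the budget below is arbitrary). It satisfies the model's input limit but cuts straight through functions and classes, which is exactly what hurts retrieval quality:

```python
# Naive chunker: fixed-size pieces under an arbitrary character budget.
# It respects the model's input limit but ignores semantic boundaries,
# happily splitting mid-function or mid-class.
def naive_chunks(source: str, budget: int = 4000) -> list[str]:
    return [source[i : i + budget] for i in range(0, len(source), budget)]
```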
Repos handles all of this for you with a built-in two-stage retrieval pipeline.

Stage 1: Embedding search
- Large files are chunked into meaningful segments and passed to our optimized Embeddings model.
- Embeddings are updated asynchronously as your codebase evolves.
- A vector similarity search over the stored embeddings is used to produce a set of candidate files for the reranker. Recently edited files that do not yet have stored embeddings are always included in the reranker input.
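Conceptually, the candidate-selection step is a top-k similarity search over the stored embeddings. The sketch below shows the shape of that operation; it is an illustration, not Relace's implementation:

```python
# Conceptual top-k candidate selection by cosine similarity.
import numpy as np

def top_k_candidates(
    query_embedding: np.ndarray,   # shape (d,)
    file_embeddings: np.ndarray,   # shape (n_files, d)
    file_paths: list[str],
    k: int = 50,
) -> list[str]:
    # Cosine similarity reduces to a dot product on normalized vectors.
    q = query_embedding / np.linalg.norm(query_embedding)
    m = file_embeddings / np.linalg.norm(file_embeddings, axis=1, keepdims=True)
    scores = m @ q
    best = np.argsort(scores)[::-1][:k]
    return [file_paths[i] for i in best]
```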
Stage 2: Reranking
- Candidate files from the first stage are passed to our Code Reranker model.
- The model ranks files by relevance to the input query.
- Irrelevant code snippets are filtered out entirely.
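Putting the two stages together, the hybrid pipeline has roughly the following shape. This is a conceptual sketch: the embed_search and rerank callables stand in for Relace's hosted embedding search and Code Reranker, and the score threshold is illustrative:

```python
# Conceptual two-stage retrieval: a cheap vector search narrows the field,
# then an expensive reranker orders and filters the survivors.
from typing import Callable

def two_stage_retrieve(
    query: str,
    files: dict[str, str],                          # path -> contents
    embed_search: Callable[[str, int], list[str]],  # stage 1 (stand-in)
    rerank: Callable[[str, str], float],            # stage 2 (stand-in)
    k: int = 50,
    threshold: float = 0.5,                         # illustrative cutoff
) -> list[str]:
    # Stage 1: vector search reduces thousands of files to k candidates.
    candidates = embed_search(query, k)

    # Stage 2: score each candidate against the query, rank by relevance,
    # and drop irrelevant files entirely.
    scored = sorted(
        ((path, rerank(query, files[path])) for path in candidates),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [path for path, score in scored if score >= threshold]
```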