Overview
When using frontier models to edit your codebase, you’re paying premium rates for both valuable changes and unchanged sections alike. Instant apply is about the separation of concerns — use heavyweight models for new sections of code, and use a lightweight model to merge the new into the old. Usingrelace-apply-3 running at 10k+ tok/s, this method is over 3x faster and cheaper than rewriting files from scratch.
We explain how we train/inference the model to achieve these speeds in our blog post.
Prerequisites
Generate Code Snippets
Add instructions for formatting edits somewhere in the system prompt for your LLM of choice. See our edit tool definition if you’re building an agent.
Merge with Instant Apply
Pass the abbreviated edit snippet along with the initial code to the Instant Apply endpoint.