Overview
When you use frontier models to edit your codebase, you pay premium rates for valuable changes and unchanged sections alike. Instant Apply separates these concerns: use a heavyweight model to write the new sections of code, and a lightweight model to merge the new code into the old. With relace-apply-3 running at 10k+ tok/s, this method is over 3x faster and cheaper than rewriting files from scratch.
Our blog post explains how we train the model and run inference to achieve these speeds.
Prerequisites
1
Generate Code Snippets
Add instructions for formatting edits somewhere in the system prompt for your LLM of choice. See our edit tool definition if you’re building an agent.
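As a rough illustration of this step, the sketch below appends edit-formatting instructions to an existing system prompt. The wording of the instructions, the `// ... existing code ...` placeholder, and the names `EDIT_FORMAT_INSTRUCTIONS` and `build_system_prompt` are assumptions made for this example, not the official prompt; prefer the edit tool definition where it applies.

```python
# Hypothetical instruction text: the exact wording and the placeholder comment
# are assumptions for illustration, not an official prompt.
EDIT_FORMAT_INSTRUCTIONS = """\
When editing a file, do not rewrite it in full.
Return only the lines that change, with enough surrounding context to locate them.
Replace every unchanged region with a single comment line:
// ... existing code ...
"""


def build_system_prompt(base_prompt: str) -> str:
    """Append the edit-formatting instructions to an existing system prompt."""
    return base_prompt.rstrip() + "\n\n" + EDIT_FORMAT_INSTRUCTIONS


if __name__ == "__main__":
    print(build_system_prompt("You are a coding assistant."))
```

Keeping the instructions in one constant makes it easy to reuse the same format across every model you route edits through.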
2
Merge with Instant Apply
Pass the abbreviated edit snippet along with the initial code to the Instant Apply endpoint.
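A minimal sketch of such a request, using only the Python standard library, might look like the following. The JSON field names (`model`, `initial_code`, `edit_snippet`), the bearer-token header, and the helper names are assumptions for illustration; the endpoint URL is left as a parameter, so check the API reference for the actual values.

```python
import json
import urllib.request


def build_apply_payload(initial_code: str, edit_snippet: str,
                        model: str = "relace-apply-3") -> dict:
    """Assemble the request body. The field names are assumed, not confirmed."""
    return {
        "model": model,
        "initial_code": initial_code,
        "edit_snippet": edit_snippet,
    }


def build_apply_request(endpoint_url: str, api_key: str,
                        payload: dict) -> urllib.request.Request:
    """Wrap the payload in a JSON POST request with a bearer token (assumed auth scheme)."""
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_apply_request(request: urllib.request.Request,
                       timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it lets you inspect or log the payload before any network call is made.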
3
Parse Final Code from Response
Collect the merged code from the response JSON.
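The parsing step could be sketched as below. The field name `mergedCode` is an assumption about the response shape, as is the helper name `extract_merged_code`; pass a different `field` if the actual response uses another key.

```python
import json


def extract_merged_code(response_text: str, field: str = "mergedCode") -> str:
    """Return the merged source code from a JSON response body.

    The default field name is an assumption, not a confirmed part of the API.
    """
    body = json.loads(response_text)
    if not isinstance(body, dict) or field not in body:
        raise KeyError(f"response has no {field!r} field")
    merged = body[field]
    if not isinstance(merged, str):
        raise TypeError(f"{field!r} should be a string, got {type(merged).__name__}")
    return merged


if __name__ == "__main__":
    # Stand-in response body; a real one would come from the apply endpoint.
    sample = json.dumps({"mergedCode": "def f():\n    return 2\n"})
    print(extract_merged_code(sample))
```

Failing loudly on a missing or non-string field avoids silently writing an empty or malformed file back to disk.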