POST /v1/code/apply
cURL
curl --request POST \
  --url https://instantapply.endpoint.relace.run/v1/code/apply \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "initial_code": "function calculateTotal(items) {\n  let total = 0;\n  \n  for (const item of items) {\n    total += item.price * item.quantity;\n  }\n  \n  return total;\n}",
  "edit_snippet": "// ... keep existing code\n\nfunction applyDiscount(total, discountRules) {\n  let discountedTotal = total;\n  \n  if (discountRules.percentOff) {\n    discountedTotal -= (total * discountRules.percentOff / 100);\n  }\n  \n  if (discountRules.fixedAmount && discountRules.fixedAmount < discountedTotal) {\n    discountedTotal -= discountRules.fixedAmount;\n  }\n  \n  return Math.max(0, discountedTotal);\n}",
  "stream": false
}
'
{
  "mergedCode": "function calculateTotal(items) {\n  let total = 0;\n  \n  for (const item of items) {\n    total += item.price * item.quantity;\n  }\n  \n  return total;\n}\n\nfunction applyDiscount(total, discountRules) {\n  let discountedTotal = total;\n  \n  if (discountRules.percentOff) {\n    discountedTotal -= (total * discountRules.percentOff / 100);\n  }\n  \n  if (discountRules.fixedAmount && discountRules.fixedAmount < discountedTotal) {\n    discountedTotal -= discountRules.fixedAmount;\n  }\n  \n  return Math.max(0, discountedTotal);\n}",
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 187,
    "total_tokens": 432
  }
}
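
For reference, here is a minimal TypeScript sketch of the same request using fetch. The endpoint, headers, and body fields mirror the cURL example above; error handling is kept minimal, and the RELACE_API_KEY environment variable is an assumption.
async function applyEdit(initialCode: string, editSnippet: string): Promise<string> {
  // Call the Instant Apply REST endpoint with the documented JSON body.
  const res = await fetch('https://instantapply.endpoint.relace.run/v1/code/apply', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.RELACE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      initial_code: initialCode,
      edit_snippet: editSnippet,
      stream: false,
    }),
  });

  if (!res.ok) {
    throw new Error(`Apply request failed: ${res.status} ${await res.text()}`);
  }

  const data = await res.json();
  return data.mergedCode;
}
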
Relace apply runs at over 10,000 tok/s, making complicated code merges feel instantaneous.

Models

We have deprecated all previous versions of the apply models in favor of relace-apply-3, which outperforms them on accuracy, speed, and long-context requests.
Model            Speed           Max Input     Max Output
relace-apply-3   ~10k tokens/s   128k tokens   128k tokens

OpenAI Compatible Endpoint

If the Relace REST API is inconvenient, we also support an OpenAI-compatible endpoint for our apply models.
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://instantapply.endpoint.relace.run/v1/apply',
});

const userMessage = `
<instruction>{instruction}</instruction>
<code>{initial_code}</code>
<update>{edit_snippet}</update>
`;

const response = await client.chat.completions.create({
  model: 'auto',
  messages: [
    {
      role: 'user',
      content: userMessage,
    },
  ],
});
The user message must include <code> and <update> tags following the format above. The <instruction> tag is optional.
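
Assuming the merged file comes back as the assistant message content in the standard chat completions shape (a reasonable assumption for an OpenAI-compatible endpoint, though not spelled out above), reading the result looks like this:
// Extract the merged file from the OpenAI-style response (non-streaming request).
const mergedCode = response.choices[0].message.content;
console.log(mergedCode);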

OpenRouter

Relace Apply is also available via OpenRouter’s OpenAI-compatible endpoint. See the model page for current availability and details. The performance section of the model page shows our real-time throughput and latency numbers from customer requests.
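
A sketch of the same call routed through OpenRouter's OpenAI-compatible API. The base URL is OpenRouter's standard one; the model slug below is an assumption, so confirm the exact identifier on the model page.
import OpenAI from 'openai';

const openrouter = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: 'https://openrouter.ai/api/v1',
});

const completion = await openrouter.chat.completions.create({
  // Model slug is an assumption; check the OpenRouter model page for the exact name.
  model: 'relace/relace-apply-3',
  // userMessage is the tagged prompt built in the example above.
  messages: [{ role: 'user', content: userMessage }],
});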

Fallbacks

We recommend using GPT-4.1-mini with predictive edits as a fallback. This option is 10-20x slower than Relace’s apply models, but it’s useful for redundancy. Relace apply models also return a 400 error code when input exceeds their context limits (see the table above). For those cases, GPT-4.1-mini’s 1M-token context window makes it a reliable fallback. However, even frontier LLMs struggle with long context, so we recommend proactively refactoring files larger than 32k tokens to improve output quality.
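
As a sketch of the fallback path, OpenAI's Predicted Outputs feature lets you pass the original file as a prediction so unchanged regions are reproduced quickly. The prompt wording and sample inputs below are illustrative, not a Relace-defined format, and Predicted Outputs support should be confirmed for the specific model you use.
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Illustrative inputs, following the same pattern as the Relace request above.
const initialCode = 'function add(a, b) {\n  return a + b;\n}';
const editSnippet = '// ... keep existing code\n\nfunction subtract(a, b) {\n  return a - b;\n}';

// Fallback merge via a general-purpose model. Passing the original file as a
// prediction lets the model copy unchanged spans faster.
const fallback = await openai.chat.completions.create({
  model: 'gpt-4.1-mini',
  messages: [
    {
      role: 'user',
      content: `Apply the edit to the code and return the complete updated file.\n<code>${initialCode}</code>\n<update>${editSnippet}</update>`,
    },
  ],
  prediction: { type: 'content', content: initialCode },
});

const merged = fallback.choices[0].message.content;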

Authorizations

Authorization
string, header, required

Relace API key Authorization header using the Bearer scheme.

Body

application/json

Initial code and edits to apply

initial_code
string, required

The original code that needs to be modified.

edit_snippet
string, required

The code changes to be applied to the initial code.

model
enum<string>, default: relace-apply-3

Choice of apply model to use. Available options: relace-apply-3

instruction
string

Optional single-line instruction to disambiguate the edit snippet, e.g. "Remove the captcha from the login page."

stream
boolean, default: false

Whether to stream the response back.

relace_metadata
object

Optional metadata for logging and tracking purposes.
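
Putting these fields together, a full request body might look like the sketch below. The relace_metadata keys are placeholders, since the schema of that object is not specified here.
// Sketch of a request body using the optional fields.
const requestBody = {
  initial_code: 'function add(a, b) {\n  return a + b;\n}',                                    // required
  edit_snippet: '// ... keep existing code\n\nfunction subtract(a, b) {\n  return a - b;\n}',  // required
  model: 'relace-apply-3',                          // optional, default relace-apply-3
  instruction: 'Add a subtract helper below add',   // optional
  stream: false,                                    // optional, default false
  relace_metadata: { request_id: 'example-123' },   // optional; placeholder keys
};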

Response

Code successfully applied

mergedCode
string

The merged code with the changes applied.

usage
object

Token usage information for the request.
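
For reference, the response shape documented here (and shown in the example at the top of the page) can be typed roughly as follows:
// Rough typing of the apply response, based on the documented fields
// and the example response above.
interface ApplyUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

interface ApplyResponse {
  mergedCode: string; // the merged code with the changes applied
  usage: ApplyUsage;  // token usage information for the request
}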