POST /v1/code/apply
cURL
curl --request POST \
  --url https://instantapply.endpoint.relace.run/v1/code/apply \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "initial_code": "function calculateTotal(items) {\n  let total = 0;\n  \n  for (const item of items) {\n    total += item.price * item.quantity;\n  }\n  \n  return total;\n}",
  "edit_snippet": "// ... keep existing code\n\nfunction applyDiscount(total, discountRules) {\n  let discountedTotal = total;\n  \n  if (discountRules.percentOff) {\n    discountedTotal -= (total * discountRules.percentOff / 100);\n  }\n  \n  if (discountRules.fixedAmount && discountRules.fixedAmount < discountedTotal) {\n    discountedTotal -= discountRules.fixedAmount;\n  }\n  \n  return Math.max(0, discountedTotal);\n}",
  "stream": false
}'
{
  "mergedCode": "function calculateTotal(items) {\n  let total = 0;\n  \n  for (const item of items) {\n    total += item.price * item.quantity;\n  }\n  \n  return total;\n}\n\nfunction applyDiscount(total, discountRules) {\n  let discountedTotal = total;\n  \n  if (discountRules.percentOff) {\n    discountedTotal -= (total * discountRules.percentOff / 100);\n  }\n  \n  if (discountRules.fixedAmount && discountRules.fixedAmount < discountedTotal) {\n    discountedTotal -= discountRules.fixedAmount;\n  }\n  \n  return Math.max(0, discountedTotal);\n}",
  "usage": {
    "prompt_tokens": 245,
    "completion_tokens": 187,
    "total_tokens": 432
  }
}
Relace apply models run at up to 10,000 tok/s, making complicated code merges feel instantaneous.
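
For callers who prefer Python to cURL, the request above can be sketched with the standard library alone. The helper names `build_apply_request` and `apply_edit` and the 30-second timeout are illustrative assumptions, not part of the Relace API; the URL, headers, and body fields are copied from the cURL example.

```python
import json
import urllib.request

APPLY_URL = "https://instantapply.endpoint.relace.run/v1/code/apply"


def build_apply_request(api_key, initial_code, edit_snippet):
    """Build (but do not send) the POST request shown in the cURL example."""
    body = {
        "initial_code": initial_code,
        "edit_snippet": edit_snippet,
        "stream": False,
    }
    return urllib.request.Request(
        APPLY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def apply_edit(api_key, initial_code, edit_snippet, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_apply_request(api_key, initial_code, edit_snippet)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a valid key, `apply_edit(key, code, snippet)["mergedCode"]` would hold the merged file, following the response shape shown above.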

Models

We support two model families: lite and main. The lite family has fewer parameters and is highly accurate for shorter requests (under 16k tokens). The main family is designed specifically to improve accuracy on long-context tasks (over 16k tokens).

| Model | Speed | Max Input | Use Case |
| --- | --- | --- | --- |
| auto | | 128k tok | Auto-route based on input size |
| relace-apply-2.5-lite | ~10k tok/s | 16k tok | Fast & accurate on short context |
| relace-apply-3 | ~7.5k tok/s | 128k tok | Highly accurate on long context |

We strongly recommend using the auto option for best performance.
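
To make the size limits concrete, here is a minimal sketch of size-based selection done on the client. It is only an illustration of the documented limits, not a description of how the auto option routes requests internally, and `pick_model` is a hypothetical helper. Reading "16k" and "128k" as 16,000 and 128,000 tokens is also an assumption.

```python
# Limits taken from the model list above. Interpreting "16k" and "128k"
# as 16_000 and 128_000 tokens is an assumption of this sketch.
LITE_MAX_TOKENS = 16_000
MAIN_MAX_TOKENS = 128_000


def pick_model(input_tokens):
    """Return a model name for the given input size, or None if too large."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    if input_tokens <= LITE_MAX_TOKENS:
        return "relace-apply-2.5-lite"
    if input_tokens <= MAIN_MAX_TOKENS:
        return "relace-apply-3"
    return None  # larger than every documented input limit
```

In practice, passing `model="auto"` avoids the need for this logic; the sketch is mainly useful for deciding ahead of time whether a request will exceed every documented limit.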

OpenAI Compatible Endpoint

If the Relace REST API is inconvenient, we also support an OpenAI compatible endpoint for our apply models.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://instantapply.endpoint.relace.run/v1/apply"
)

user_message = """
<instruction>{instruction}</instruction>
<code>{initial_code}</code>
<update>{edit_snippet}</update>
"""

response = client.chat.completions.create(
    model="auto",
    messages=[
        {
            "role": "user",
            "content": user_message
        }
    ]
)
The user message must include <code> and <update> tags following the format above. The <instruction> tag is optional.
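
The tag format can be assembled with a small helper that leaves out the optional tag when no instruction is given. This is a minimal sketch; `build_user_message` is a hypothetical name, and joining the tags with newlines mirrors the template above.

```python
def build_user_message(initial_code, edit_snippet, instruction=None):
    """Wrap the inputs in the <instruction>, <code>, and <update> tags."""
    parts = []
    if instruction:
        # The <instruction> tag is optional, so it is only emitted when set.
        parts.append(f"<instruction>{instruction}</instruction>")
    parts.append(f"<code>{initial_code}</code>")
    parts.append(f"<update>{edit_snippet}</update>")
    return "\n".join(parts)
```

The returned string would be passed as the `content` of the single user message in `client.chat.completions.create`.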

Fallbacks

We recommend using GPT-4.1-mini with predictive edits as a fallback. This option is 10-20x slower than Relace’s apply models, but it’s useful for redundancy. Relace apply models also return a 400 error code when the input exceeds context limits (see the table above). For these cases, GPT-4.1-mini’s 1M token context window is a reliable fallback option. However, even frontier LLMs struggle with long context, so we recommend proactively refactoring files larger than 32k tokens to improve output quality.
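
One way to wire up such a fallback is sketched below. The two backends are passed in as plain callables so the control flow stays independent of any particular client library. `ApplyRequestError` and `apply_with_fallback` are hypothetical names, and routing 5xx responses to the fallback is an assumption made here for redundancy; only the 400 case is documented above.

```python
class ApplyRequestError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status_code, message=""):
        super().__init__(message or f"request failed with status {status_code}")
        self.status_code = status_code


def apply_with_fallback(primary, fallback, initial_code, edit_snippet):
    """Try `primary`; on a 400 or a server-side failure, use `fallback`.

    Both callables take (initial_code, edit_snippet) and return the merged
    code as a string. Returns a (merged_code, backend_name) tuple.
    """
    try:
        return primary(initial_code, edit_snippet), "primary"
    except ApplyRequestError as error:
        # 400 is the documented response when input exceeds context limits.
        # Sending 5xx errors to the fallback as well is an assumption.
        if error.status_code == 400 or error.status_code >= 500:
            return fallback(initial_code, edit_snippet), "fallback"
        raise  # e.g. 401: a bad API key should not be masked by a fallback
```

Returning the backend name alongside the result makes it easy to log how often the slower path is taken.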

Authorizations

Authorization
string
header
required

Relace API key Authorization header using the Bearer scheme.

Body

application/json

Initial code and edits to apply

initial_code
string
required

The original code that needs to be modified

edit_snippet
string
required

The code changes to be applied to the initial code

model
enum<string>
default:auto

Choice of apply model to use

Available options: auto, relace-apply-2.5-lite, relace-apply-2

instruction
string

Optional single-line instruction to disambiguate the edit snippet, e.g. "Remove the captcha from the login page".

stream
boolean
default:false

Whether to stream the response back

relace_metadata
object

Optional metadata for logging and tracking purposes.
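
The body fields above can be collected into a JSON-ready dictionary as in the following sketch. `build_apply_body` is a hypothetical helper. Collapsing a multi-line instruction into one line is a local choice based on the "single line" wording, not documented server behavior, and the contents of `relace_metadata` are left entirely to the caller because its schema is not specified here.

```python
def build_apply_body(initial_code, edit_snippet, model="auto",
                     instruction=None, stream=False, relace_metadata=None):
    """Assemble the request body, omitting optional fields that are unset."""
    if not isinstance(initial_code, str) or not isinstance(edit_snippet, str):
        raise TypeError("initial_code and edit_snippet must be strings")
    body = {
        "initial_code": initial_code,  # required
        "edit_snippet": edit_snippet,  # required
        "model": model,                # documented default: auto
        "stream": stream,              # documented default: false
    }
    if instruction is not None:
        # Collapse newlines and runs of whitespace into single spaces so the
        # instruction stays on one line (a choice made by this sketch).
        body["instruction"] = " ".join(instruction.split())
    if relace_metadata is not None:
        body["relace_metadata"] = relace_metadata
    return body
```

The result can be serialized with `json.dumps` and sent as the `--data` payload of the request.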

Response

Code successfully applied

mergedCode
string

The merged code with the changes applied

usage
object

Token usage information for the request
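
A response of this shape can be unpacked into a typed object, as in the sketch below. `ApplyResult` and `parse_apply_response` are hypothetical names; the field names `mergedCode`, `prompt_tokens`, `completion_tokens`, and `total_tokens` come from the example response, and defaulting missing usage counts to 0 is an assumption.

```python
import json
from dataclasses import dataclass


@dataclass
class ApplyResult:
    merged_code: str
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def parse_apply_response(raw_json):
    """Parse a non-streaming response body into an ApplyResult."""
    payload = json.loads(raw_json)
    usage = payload.get("usage", {})
    return ApplyResult(
        merged_code=payload["mergedCode"],  # KeyError if the merge is absent
        # Missing counters default to 0 (an assumption of this sketch).
        prompt_tokens=usage.get("prompt_tokens", 0),
        completion_tokens=usage.get("completion_tokens", 0),
        total_tokens=usage.get("total_tokens", 0),
    )
```

In the example response, the usage counters add up: 245 prompt tokens plus 187 completion tokens give the reported total of 432.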