# Apply Code

> Merge code snippets from an LLM into your existing codebase.

Relace apply runs at over 10,000 tok/s, making complicated code merges feel instantaneous.

## Models

We have deprecated all previous versions of the apply models in favor of `relace-apply-3`, which improves on them in accuracy, speed, and long-context requests.

| Model            | Speed          | Max Input   | Max Output  |
| :--------------- | :------------- | :---------- | :---------- |
| `relace-apply-3` | \~10k tokens/s | 128k tokens | 128k tokens |

## OpenAI Compatible Endpoint

If the Relace REST API is inconvenient, we also support an OpenAI compatible endpoint for our apply models.

```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://instantapply.endpoint.relace.run/v1/apply',
});

const userMessage = `
<instruction>{instruction}</instruction>
<code>{initial_code}</code>
<update>{edit_snippet}</update>
`;

const response = await client.chat.completions.create({
  model: 'auto',
  messages: [
    {
      role: 'user',
      content: userMessage,
    },
  ],
});
```

The user message must include `<code>` and `<update>` tags following the format above. The `<instruction>` tag is optional.
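
A small helper can keep the message layout consistent across calls. The sketch below is illustrative only: `buildApplyMessage` and `ApplyMessageInput` are hypothetical names, not part of any Relace SDK, and the tag names `<instruction>`, `<code>`, and `<update>` are assumed to be the ones the endpoint expects.

```typescript
// Hypothetical helper: assembles the user message for the OpenAI compatible endpoint.
// Assumption: the endpoint expects <instruction>, <code>, and <update> tags.
interface ApplyMessageInput {
  initialCode: string;
  editSnippet: string;
  instruction?: string; // optional; omitted from the message when empty
}

function buildApplyMessage(input: ApplyMessageInput): string {
  const parts: string[] = [];
  // Only emit the instruction tag when a non-blank instruction was provided.
  if (input.instruction !== undefined && input.instruction.trim() !== '') {
    parts.push(`<instruction>${input.instruction}</instruction>`);
  }
  parts.push(`<code>${input.initialCode}</code>`);
  parts.push(`<update>${input.editSnippet}</update>`);
  return parts.join('\n');
}
```

The returned string can be passed as the `content` of the single `user` message in the request above.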



### OpenRouter

Relace Apply is also available via OpenRouter's OpenAI-compatible endpoint. See the [model page](https://openrouter.ai/relace/relace-apply-3) for current availability and details.

The [performance](https://openrouter.ai/relace/relace-apply-3/performance) section of the model page shows our real-time throughput and latency numbers from customer requests.

## Fallbacks

We recommend using GPT-4.1-mini with [predicted outputs](https://platform.openai.com/docs/guides/predicted-outputs) as a fallback. This option is 10-20x slower than Relace's apply models, but it's useful for redundancy.

Relace apply models also return a `400` error code when input exceeds context limits (see the table under Models). For these cases, GPT-4.1-mini's 1M token context window is a reliable fallback option.

However, even frontier LLMs struggle with long context. We recommend proactively refactoring files larger than 32k tokens to improve output quality.
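
The fallback strategy described in this section could be wired up roughly as follows. This is a minimal sketch under stated assumptions: `Applier`, `applyWithFallback`, `estimateTokens`, and `exceedsRefactorThreshold` are hypothetical names, both appliers are injected by the caller (for example, one wrapping the Relace endpoint and one wrapping a general-purpose LLM), and the 4-characters-per-token ratio is a rough heuristic rather than a real tokenizer.

```typescript
// Hypothetical signature: any function that merges an edit snippet into code.
type Applier = (initialCode: string, editSnippet: string) => Promise<string>;

interface ApplyResult {
  mergedCode: string;
  source: 'primary' | 'fallback'; // which applier produced the result
}

// Try the primary applier first; on any failure (for example a 400 for
// oversized input, or an outage) retry once with the fallback applier.
async function applyWithFallback(
  primary: Applier,
  fallback: Applier,
  initialCode: string,
  editSnippet: string,
): Promise<ApplyResult> {
  try {
    return { mergedCode: await primary(initialCode, editSnippet), source: 'primary' };
  } catch {
    return { mergedCode: await fallback(initialCode, editSnippet), source: 'fallback' };
  }
}

// Assumption: roughly 4 characters per token. Use a real tokenizer for accuracy.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Flags files that are likely above the 32k-token refactoring guideline.
function exceedsRefactorThreshold(code: string, thresholdTokens: number = 32000): boolean {
  return estimateTokens(code) > thresholdTokens;
}
```

Catching every error keeps the sketch short; a production version would likely distinguish retryable failures from errors such as an invalid API key, where falling back only hides the problem.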


## OpenAPI

````yaml POST /v1/code/apply
openapi: 3.0.1
info:
  title: Relace API
  description: API for accessing Relace code generation models.
  version: 1.0.0
  license:
    name: MIT
servers:
  - url: https://instantapply.endpoint.relace.run
    description: Server for code application endpoints
  - url: https://ranker.endpoint.relace.run
    description: Server for code ranking endpoints
  - url: https://embeddings.endpoint.relace.run
    description: Server for code embedding endpoints
  - url: https://api.relace.run
    description: Server for general infrastructure
security:
  - bearerAuth: []
paths:
  /v1/code/apply:
    post:
      description: Merge code snippets from an LLM into your existing codebase.
      requestBody:
        description: Initial code and edits to apply
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/InstantApplyRequest'
            example:
              initial_code: |-
                function calculateTotal(items) {
                  let total = 0;
                  
                  for (const item of items) {
                    total += item.price * item.quantity;
                  }
                  
                  return total;
                }
              edit_snippet: |-
                // ... keep existing code

                function applyDiscount(total, discountRules) {
                  let discountedTotal = total;
                  
                  if (discountRules.percentOff) {
                    discountedTotal -= (total * discountRules.percentOff / 100);
                  }
                  
                  if (discountRules.fixedAmount && discountRules.fixedAmount < discountedTotal) {
                    discountedTotal -= discountRules.fixedAmount;
                  }
                  
                  return Math.max(0, discountedTotal);
                }
              stream: false
      responses:
        '200':
          description: Code successfully applied
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/InstantApplyResponse'
            text/event-stream:
              schema:
                type: string
                description: >-
                  Stream of results from the Instant Apply model (compatible
                  with OpenAI streaming format)
        '400':
          description: Bad request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error400'
        '401':
          description: Unauthorized
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error401'
        '404':
          description: API key not found
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error404'
        '429':
          description: Rate limit exceeded
          headers:
            X-RateLimit-Limit:
              schema:
                type: string
              description: Rate limit ceiling for the API key
            X-RateLimit-Remaining:
              schema:
                type: string
              description: Number of requests left for the time window
            X-RateLimit-Reset:
              schema:
                type: string
              description: Time at which the rate limit resets
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error429'
        '500':
          description: Internal server error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error500'
      servers:
        - url: https://instantapply.endpoint.relace.run
components:
  schemas:
    InstantApplyRequest:
      type: object
      required:
        - initial_code
        - edit_snippet
      properties:
        model:
          type: string
          description: Choice of apply model to use
          enum:
            - relace-apply-3
          default: relace-apply-3
        initial_code:
          type: string
          description: The original code that needs to be modified
        edit_snippet:
          type: string
          description: The code changes to be applied to the initial code
        instruction:
          type: string
          description: >-
            Optional single-line instruction to disambiguate the edit
            snippet. *e.g.* `Remove the captcha from the login page`
        stream:
          type: boolean
          description: Whether to stream the response back
          default: false
        relace_metadata:
          type: object
          description: Optional metadata for logging and tracking purposes.
          additionalProperties: true
    InstantApplyResponse:
      type: object
      properties:
        mergedCode:
          type: string
          description: The merged code with the changes applied
        usage:
          type: object
          properties:
            prompt_tokens:
              type: integer
              description: Number of tokens in the prompt
            completion_tokens:
              type: integer
              description: Number of tokens in the completion
            total_tokens:
              type: integer
              description: Total number of tokens used
          description: Token usage information for the request
      example:
        mergedCode: |-
          function calculateTotal(items) {
            let total = 0;
            
            for (const item of items) {
              total += item.price * item.quantity;
            }
            
            return total;
          }

          function applyDiscount(total, discountRules) {
            let discountedTotal = total;
            
            if (discountRules.percentOff) {
              discountedTotal -= (total * discountRules.percentOff / 100);
            }
            
            if (discountRules.fixedAmount && discountRules.fixedAmount < discountedTotal) {
              discountedTotal -= discountRules.fixedAmount;
            }
            
            return Math.max(0, discountedTotal);
          }
        usage:
          prompt_tokens: 245
          completion_tokens: 187
          total_tokens: 432
    Error400:
      type: object
      properties:
        error:
          type: string
          description: Error message
          example: Invalid JSON in request body
    Error401:
      type: object
      properties:
        error:
          type: string
          description: Error message
          example: Authorized header required
    Error404:
      type: object
      properties:
        error:
          type: string
          description: Error message
          example: 'Bad Request: Route not found'
    Error429:
      type: object
      properties:
        error:
          type: string
          description: Error message
          example: Rate limit exceeded
    Error500:
      type: object
      properties:
        error:
          type: string
          description: Error message
          example: Error fetching from origin server
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: Relace API key Authorization header using the Bearer scheme.

````
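
As an illustration of the specification above, a non-streaming call to `POST /v1/code/apply` might look like the sketch below. The field names, URL, and Bearer authentication come from the spec; `applyCode`, `FetchLike`, and `APPLY_URL` are hypothetical names introduced here, and the fetch function is injected so the sketch does not depend on a particular runtime.

```typescript
// Request and response shapes mirror InstantApplyRequest / InstantApplyResponse.
interface InstantApplyRequest {
  model?: string;
  initial_code: string;
  edit_snippet: string;
  instruction?: string;
}

interface InstantApplyResponse {
  mergedCode: string;
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Narrow fetch-like type (hypothetical) so a stub can stand in for the network.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const APPLY_URL = 'https://instantapply.endpoint.relace.run/v1/code/apply';

async function applyCode(
  fetchFn: FetchLike,
  apiKey: string,
  request: InstantApplyRequest,
): Promise<InstantApplyResponse> {
  const response = await fetchFn(APPLY_URL, {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    // This sketch only handles the JSON response, so streaming is disabled.
    body: JSON.stringify({ ...request, stream: false }),
  });

  const payload = await response.json();
  if (!response.ok) {
    // Error bodies carry an `error` string per the Error400..Error500 schemas.
    const message = (payload as { error?: string } | null)?.error ?? 'unknown error';
    throw new Error(`Apply request failed with status ${response.status}: ${message}`);
  }
  return payload as InstantApplyResponse;
}
```

In Node 18+ or a browser, the global `fetch` can be passed as `fetchFn`, for example `await applyCode(fetch, apiKey, { initial_code, edit_snippet })`.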