Chat Completions

POST /v1/chat/completions

Generate a model response for a conversation. Compatible with the OpenAI Chat Completions API.

Headers

Header | Required | Description
Authorization | Yes | Bearer <api-key-or-jwt>
Content-Type | Yes | application/json
X-Quantized-Provider | No | Force a specific provider (openrouter, anthropic, bedrock)

Request body

Required fields

Field | Type | Description
model | string | Model identifier (e.g., openai/gpt-4.1-mini)
messages | array | List of conversation messages

Generation parameters

Field | Type | Default | Description
max_tokens | integer | null | Maximum tokens in the completion (minimum: 1)
max_completion_tokens | integer | null | Alternative to max_tokens (minimum: 1)
temperature | float | null | Sampling temperature (0–2). Lower is more deterministic
top_p | float | null | Nucleus sampling threshold (0–1)
frequency_penalty | float | null | Penalize tokens by frequency (−2 to 2)
presence_penalty | float | null | Penalize tokens by presence (−2 to 2)
repetition_penalty | float | null | Repetition penalty factor (0–2). OpenRouter-specific
stop | string or array | null | Stop sequence(s)
seed | integer | null | Seed for deterministic generation (best effort)
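
The ranges in this table can be checked client-side before a request is sent. The following is a minimal sketch; check_generation_params is a hypothetical helper, not part of any SDK, and the router's own validation remains authoritative.

```python
# Hypothetical client-side check; ranges are copied from the table above.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "repetition_penalty": (0.0, 2.0),
}
MIN_ONE = ("max_tokens", "max_completion_tokens")


def check_generation_params(params: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside {low}..{high}")
    for name in MIN_ONE:
        value = params.get(name)
        if value is not None and value < 1:
            problems.append(f"{name}={value} must be at least 1")
    return problems


print(check_generation_params({"temperature": 0.7, "max_tokens": 128}))
print(check_generation_params({"temperature": 2.5, "max_tokens": 0}))
```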

Output control

Field | Type | Default | Description
response_format | object | null | Output format: {"type": "json_object"} or {"type": "json_schema", "json_schema": {...}}

JSON output is guaranteed parseable

When response_format is json_object or json_schema, the router guarantees that the choices[].message.content value can be passed directly to JSON.parse. Some models (notably Claude Haiku 4.5 when routed through Anthropic) wrap JSON output in markdown fences; the router strips those fences before returning the response. See the Anthropic provider notes for details.
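
To illustrate the guarantee, the sketch below builds a json_object request body and decodes the content of a response with json.loads, the Python counterpart of JSON.parse. parse_json_content is a hypothetical helper and sample_response is a hand-written stand-in, not real API output.

```python
import json

# Request body asking for JSON output; the model name is the one used elsewhere on this page.
request_body = {
    "model": "openai/gpt-4.1-mini",
    "messages": [
        {"role": "user", "content": "Return the capital of France as JSON with a 'capital' key."}
    ],
    "response_format": {"type": "json_object"},
}


def parse_json_content(response: dict) -> dict:
    """Decode choices[0].message.content without any fence stripping."""
    return json.loads(response["choices"][0]["message"]["content"])


# Hand-written stand-in for a real response, used only to exercise the parser.
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": '{"capital": "Paris"}'}}
    ]
}
print(parse_json_content(sample_response)["capital"])
```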

Tool calling

Field | Type | Default | Description
tools | array | null | Tool/function definitions. See Tool format below
tool_choice | string or object | null | "auto", "none", "required", or {"type": "function", "function": {"name": "..."}}
parallel_tool_calls | boolean | null | Allow parallel tool execution

Reasoning

Field | Type | Default | Description
reasoning | object | null | Reasoning config for thinking models. effort: "none", "low", "medium", or "high". Optional exclude (boolean) controls whether reasoning content is included in the response. Example: {"effort": "low", "exclude": false}

Streaming

Field | Type | Default | Description
stream | boolean | false | Enable SSE streaming
stream_options | object | null | Streaming configuration, e.g. {"include_usage": true}. Usage is always included in the final SSE chunk regardless of this setting

Advanced

Field | Type | Default | Description
logprobs | boolean | null | Enable token log probabilities in the response
top_logprobs | integer | null | Number of most likely tokens to return per position (0–20). Requires logprobs: true
logit_bias | object | null | Map of token IDs to bias values (−100 to 100). Adjusts likelihood of specific tokens appearing in the output
user | string | null | End-user identifier for tracking and abuse detection

Strict validation

The API uses strict parameter validation. Any field not listed above will be rejected with a 422 error. Parameters like top_k, modalities, audio, web_search_options, and metadata are not currently supported.
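
Because unknown fields are rejected outright, it can help to check a request body before sending it. The sketch below collects the top-level fields documented on this page into a set; unsupported_fields is a hypothetical helper, and the set would need updating if the API adds parameters.

```python
# Top-level request fields documented on this page; anything else is rejected with a 422.
SUPPORTED_FIELDS = frozenset({
    "model", "messages",
    "max_tokens", "max_completion_tokens", "temperature", "top_p",
    "frequency_penalty", "presence_penalty", "repetition_penalty", "stop", "seed",
    "response_format",
    "tools", "tool_choice", "parallel_tool_calls",
    "reasoning",
    "stream", "stream_options",
    "logprobs", "top_logprobs", "logit_bias", "user",
})


def unsupported_fields(body: dict) -> list[str]:
    """Return the keys in `body` that are not documented, in sorted order."""
    return sorted(set(body) - SUPPORTED_FIELDS)


body = {"model": "openai/gpt-4.1-mini", "messages": [], "top_k": 40, "metadata": {}}
print(unsupported_fields(body))
```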

Messages

Each entry in messages must be an object with role (required) and content. A bare string in place of a message object is not accepted.

Message fields

Field | Type | Required | Description
role | string | Yes | One of: system, user, assistant, tool
content | string, array, or null | No | Text content, content parts (for vision), or null (for tool call messages)
name | string | No | Participant name (for multi-user conversations)
tool_calls | array | No | Tool calls made by the assistant (in assistant messages)
tool_call_id | string | No | Links a tool response to its call (required for tool messages)
refusal | string or null | No | Model refusal text. Only valid in assistant messages
reasoning | string or null | No | Reasoning text from thinking models. Only valid in assistant messages
reasoning_details | array or null | No | Detailed reasoning steps. Only valid in assistant messages

{"role": "user", "content": "What is 2+2?"}

Roles

Role | Description
system | Sets the model’s behavior and context
user | The user’s input
assistant | The model’s previous response (for multi-turn)
tool | Response from a tool call (must include tool_call_id)
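
The message rules above (valid roles, tool_call_id on tool messages, assistant-only fields) can be expressed as a small check. This is a sketch under the rules as stated on this page; check_message is a hypothetical helper and does not reproduce the server's full validation.

```python
VALID_ROLES = {"system", "user", "assistant", "tool"}
# Fields the tables above describe as belonging to assistant messages.
ASSISTANT_ONLY = ("tool_calls", "refusal", "reasoning", "reasoning_details")


def check_message(message) -> list[str]:
    """Return a list of problems with one message; an empty list means OK."""
    if not isinstance(message, dict):
        return ["message must be an object, not " + type(message).__name__]
    problems = []
    role = message.get("role")
    if role not in VALID_ROLES:
        problems.append(f"unknown role: {role!r}")
    if role == "tool" and not message.get("tool_call_id"):
        problems.append("tool messages require tool_call_id")
    if role != "assistant":
        for field in ASSISTANT_ONLY:
            if message.get(field) is not None:
                problems.append(f"{field} is only valid in assistant messages")
    return problems


print(check_message({"role": "user", "content": "What is 2+2?"}))
print(check_message({"role": "tool", "content": "{}"}))
```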

Text messages

[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Hello!"}
]

Multi-turn conversations

[
  {"role": "user", "content": "What is the capital of France?"},
  {"role": "assistant", "content": "The capital of France is Paris."},
  {"role": "user", "content": "What about Germany?"}
]

Multimodal content parts

To send anything other than plain text, set content to an array of content parts. Each part declares a type and a type-specific payload. The following part types are accepted:

Content type | Modality | Shape
text | text | {"type": "text", "text": "..."}
image_url | image | {"type": "image_url", "image_url": {"url": "..."}}
input_audio | audio | {"type": "input_audio", "input_audio": {"data": "<base64>", "format": "mp3"}}
video_url | video | {"type": "video_url", "video_url": {"url": "..."}}
file | document | {"type": "file", "file": {"filename": "...", "file_data": "..."}}

Any other type value is rejected with a 422 error.

Image

Image URLs can be an HTTPS URL or a base64 data URI.

[
  {
    "role": "user",
    "content": [
      {"type": "text", "text": "What is in this image?"},
      {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
    ]
  }
]

Accepted URL forms:

https://example.com/image.png
data:image/jpeg;base64,/9j/4AAQ...
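
For local images, the data URI form can be produced with the standard library. A minimal sketch follows; image_part_from_bytes is a hypothetical helper, and the four bytes used here are just the start of a JPEG header standing in for a real file.

```python
import base64


def image_part_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes in an image_url content part using a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}}


# First bytes of a JPEG file, used as stand-in data instead of reading a real image.
part = image_part_from_bytes(b"\xff\xd8\xff\xe0")
print(part["image_url"]["url"])  # data:image/jpeg;base64,/9j/4A==
```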

Audio

input_audio.data must be a raw base64 string (no data-URI prefix). input_audio.format is one of wav, mp3, aiff, aac, ogg, flac, m4a, pcm16, pcm24.

[
  {
    "role": "user",
    "content": [
      {"type": "text", "text": "Transcribe and summarize this audio."},
      {"type": "input_audio", "input_audio": {"data": "<base64-mp3>", "format": "mp3"}}
    ]
  }
]
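
Unlike images, audio is sent as raw base64 with the format named separately. The sketch below shows one way to build the part; audio_part_from_bytes is a hypothetical helper, and the three-byte input is placeholder data rather than a playable file.

```python
import base64

# Formats listed above for input_audio.format.
AUDIO_FORMATS = {"wav", "mp3", "aiff", "aac", "ogg", "flac", "m4a", "pcm16", "pcm24"}


def audio_part_from_bytes(data: bytes, audio_format: str) -> dict:
    """Build an input_audio content part from raw audio bytes."""
    if audio_format not in AUDIO_FORMATS:
        raise ValueError(f"unsupported audio format: {audio_format}")
    encoded = base64.b64encode(data).decode("ascii")  # raw base64, no "data:" prefix
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": audio_format}}


# Placeholder bytes; a real call would pass the contents of an audio file.
part = audio_part_from_bytes(b"ID3", "mp3")
print(part)
```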

Video

video_url.url can be an HTTPS URL or a base64 data URI with a video MIME type (video/mp4, video/mpeg, video/mov, video/webm).

[
  {
    "role": "user",
    "content": [
      {"type": "text", "text": "Describe what is happening in this video."},
      {"type": "video_url", "video_url": {"url": "data:video/mp4;base64,<base64>"}}
    ]
  }
]

Document (PDF)

file.file_data can be an HTTPS URL or a base64 data URI with application/pdf. Either file_data or file_id is required.

[
  {
    "role": "user",
    "content": [
      {"type": "text", "text": "Summarize this document."},
      {
        "type": "file",
        "file": {
          "filename": "report.pdf",
          "file_data": "data:application/pdf;base64,<base64>"
        }
      }
    ]
  }
]

Model requirements

The model you target must declare support for each non-text modality you send. If a request contains audio parts but the chosen model’s input_modality.audio is false, the request is rejected with a 400:

{"error": {"message": "Model 'openai/gpt-4.1-nano' does not support audio input"}}

See the Providers capability matrix for per-provider support, and the Models endpoint for per-model modality flags. PDFs (file) are routed through OpenRouter’s universal PDF parser and do not require a dedicated modality flag on the model.
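
A client can anticipate this 400 by comparing the parts it is about to send against the model's modality flags. The sketch below assumes the flags are available as a plain dict such as {"image": True, "audio": False}; the exact shape returned by the Models endpoint is not shown on this page, and missing_modalities is a hypothetical helper.

```python
# Maps content part types to the modality flag they need; "file" (PDF) needs none.
PART_MODALITY = {"image_url": "image", "input_audio": "audio", "video_url": "video"}


def missing_modalities(messages: list[dict], input_modality: dict) -> list[str]:
    """Return the modalities used in `messages` that the model does not declare."""
    needed = set()
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):  # string or null content is text-only
            for part in content:
                modality = PART_MODALITY.get(part.get("type"))
                if modality:
                    needed.add(modality)
    return sorted(m for m in needed if not input_modality.get(m, False))


messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe and summarize this audio."},
            {"type": "input_audio", "input_audio": {"data": "<base64-mp3>", "format": "mp3"}},
        ],
    }
]
print(missing_modalities(messages, {"image": True, "audio": False}))
```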

Tool calls (multi-turn)

When the model calls a tool, continue the conversation by including the assistant’s tool call and your tool’s response:

[
  {"role": "user", "content": "What's the weather in Paris?"},
  {
    "role": "assistant",
    "content": null,
    "tool_calls": [
      {
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"}
      }
    ]
  },
  {"role": "tool", "tool_call_id": "call_1", "content": "{\"temp\": 18, \"unit\": \"C\"}"}
]
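
The step between the assistant's tool call and the tool message is up to the client. One possible shape is sketched below: TOOLS, get_weather, and run_tool_calls are hypothetical names, and the weather function returns fixed data rather than calling a real service.

```python
import json


def get_weather(city: str) -> dict:
    """Stand-in tool; a real implementation would query a weather service."""
    return {"temp": 18, "unit": "C"}


# Registry of local functions the model is allowed to call, keyed by tool name.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Execute each tool call and return the matching tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
        output = TOOLS[function["name"]](**arguments)
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(output)}
        )
    return results


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
        }
    ],
}
print(run_tool_calls(assistant_message))
```

The returned messages are appended after the assistant message before the next request is sent.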

Tool format

Define tools using the standard function calling format. The format is the same regardless of which provider handles the request; the provider layer converts it automatically.

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string", "description": "The city name"}
          },
          "required": ["city"]
        }
      }
    }
  ],
  "tool_choice": "auto"
}

Examples

cURL

curl -X POST https://api.quantized.us/v1/chat/completions \
  -H "Authorization: Bearer sk-quantized-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 128,
    "temperature": 0.7
  }'

Python

import httpx

response = httpx.post(
    "https://api.quantized.us/v1/chat/completions",
    headers={"Authorization": "Bearer sk-quantized-YOUR-KEY"},
    # json= serializes the body and sets Content-Type: application/json
    json={
        "model": "openai/gpt-4.1-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "max_tokens": 128,
        "temperature": 0.7,
    },
    timeout=60.0,  # httpx defaults to 5 seconds, which longer completions can exceed
)
response.raise_for_status()  # surface 4xx/5xx errors before reading the body
data = response.json()
print(data["choices"][0]["message"]["content"])

OpenAI SDK

from openai import OpenAI

# Point the official SDK at the router instead of api.openai.com
client = OpenAI(
    api_key="sk-quantized-YOUR-KEY",
    base_url="https://api.quantized.us/v1",
)

response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    max_tokens=128,
    temperature=0.7,
)
print(response.choices[0].message.content)

Response

{
  "id": "gen-abc123",
  "object": "chat.completion",
  "model": "openai/gpt-4.1-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "refusal": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33,
    "credits_used": 2400,
    "credits_remaining": 997600,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "cache_write_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0
    }
  },
  "created": 1719000000
}

Response fields

Field | Type | Description
id | string | Unique completion ID
object | string | Always "chat.completion"
model | string | Model that generated the response
created | integer | Unix timestamp
choices | array | List of completion choices
choices[].index | integer | Choice index
choices[].message.role | string | Always "assistant"
choices[].message.content | string or null | The generated text (null when tool_calls present)
choices[].message.refusal | string or null | Model’s refusal message if it declined to answer
choices[].message.tool_calls | array or null | Tool calls made by the model (present when model calls tools)
choices[].message.reasoning | string or null | Reasoning text from thinking models (present when reasoning is enabled)
choices[].message.reasoning_details | array or null | Detailed reasoning steps (present when reasoning is enabled)
choices[].finish_reason | string | "stop", "length", or "tool_calls"
choices[].logprobs | object or null | Token log probabilities (present when logprobs is true in the request)
usage.prompt_tokens | integer | Input tokens
usage.completion_tokens | integer | Output tokens
usage.total_tokens | integer | Total tokens
usage.credits_used | integer | Micro-credits consumed
usage.credits_remaining | integer or null | Micro-credits remaining (null if unlimited)
usage.prompt_tokens_details | object or null | Token breakdown: cached_tokens, cache_write_tokens, audio_tokens
usage.completion_tokens_details | object or null | Token breakdown: reasoning_tokens, audio_tokens
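
The fields above can be read defensively, since several of them may be null or absent. The following sketch pulls out the values a caller most often needs; summarize_response is a hypothetical helper, and sample is a trimmed copy of the example response on this page.

```python
def summarize_response(response: dict) -> dict:
    """Extract the commonly used fields from a chat completion response."""
    choice = response["choices"][0]
    message = choice["message"]
    usage = response.get("usage") or {}
    return {
        "text": message.get("content"),  # None when the model called tools
        "tool_calls": message.get("tool_calls") or [],
        "truncated": choice["finish_reason"] == "length",
        "credits_used": usage.get("credits_used"),
        "credits_remaining": usage.get("credits_remaining"),  # None if unlimited
    }


# Trimmed copy of the example response shown earlier on this page.
sample = {
    "id": "gen-abc123",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The capital of France is Paris.", "refusal": None},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 25, "completion_tokens": 8, "total_tokens": 33,
              "credits_used": 2400, "credits_remaining": 997600},
}
print(summarize_response(sample))
```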

Streaming

Set "stream": true to receive Server-Sent Events. See the Streaming guide for details and code examples.
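
As a rough illustration of consuming the stream, the sketch below assumes the OpenAI-style wire format: data: lines carrying JSON chunks with choices[].delta.content, ending with data: [DONE]. This page does not spell out that shape, so treat it as an assumption and check the Streaming guide. iter_content_deltas and sample_lines are hypothetical, and the sample lines are hand-written rather than captured output.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines (assumed OpenAI-style chunk format)."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written lines standing in for a real event stream.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Par"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "is"}}]}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 33}}',
    "",
    "data: [DONE]",
]
print("".join(iter_content_deltas(sample_lines)))
```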

Errors

Status | Condition
400 | Invalid request (missing model, bad field types, input modality not supported by the model)
401 | Invalid or missing API key
402 | Insufficient credits
404 | Model not found
422 | Unsupported parameter or invalid field structure
503 | Provider unavailable
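
One way a client might act on these codes is sketched below. Treating only 503 as worth retrying is an assumption made for illustration, not documented behavior, and classify_status is a hypothetical helper.

```python
# (description, worth_retrying) per documented status code.
# Assumption: only 503 is transient; the 4xx codes need a changed request or account.
ERRORS = {
    400: ("invalid request", False),
    401: ("invalid or missing API key", False),
    402: ("insufficient credits", False),
    404: ("model not found", False),
    422: ("unsupported parameter or invalid field structure", False),
    503: ("provider unavailable", True),
}


def classify_status(status_code: int) -> tuple[str, bool]:
    """Return (description, worth_retrying) for a response status code."""
    if 200 <= status_code < 300:
        return ("ok", False)
    return ERRORS.get(status_code, ("undocumented status", False))


for code in (200, 422, 503):
    print(code, classify_status(code))
```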