SDKs & Libraries
Quantized is OpenAI-compatible for the Chat Completions and Responses endpoints. You can use the official OpenAI SDKs as-is by pointing them at https://api.quantized.us/v1.
OpenAI Python SDK
```bash
pip install openai
```
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-quantized-YOUR-KEY",
    base_url="https://api.quantized.us/v1",
)

# Chat completion
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a haiku about code."}],
    stream=True,
)
for chunk in stream:
    # Final chunks may use an empty `choices` list (e.g. usage-only); skip those.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta and delta.content:
        print(delta.content, end="")
```
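If you need the assembled text rather than printing deltas as they arrive, a small accumulator works on the same chunk shape. This is a hypothetical helper (not part of the SDK), using the same empty-`choices` guard as the loop above:

```python
def collect_stream_text(stream):
    """Join the content deltas of a chat-completions stream into one string.

    Usage-only chunks arrive with an empty `choices` list and carry no
    delta, so they are skipped.
    """
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        if delta and delta.content:
            parts.append(delta.content)
    return "".join(parts)
```

Pass it the iterator returned by `create(..., stream=True)` and it returns the full completion text once the stream ends.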
OpenAI JavaScript/TypeScript SDK
```bash
npm install openai
```
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-quantized-YOUR-KEY',
  baseURL: 'https://api.quantized.us/v1',
});

const response = await client.chat.completions.create({
  model: 'openai/gpt-4.1-mini',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
```
Direct HTTP (Python httpx)
```python
import httpx

response = httpx.post(
    "https://api.quantized.us/v1/chat/completions",
    headers={"Authorization": "Bearer sk-quantized-YOUR-KEY"},
    json={
        "model": "openai/gpt-4.1-mini",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())
```
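If you'd rather avoid a third-party dependency, the same request can be built with the standard library alone. A minimal sketch; `build_chat_request` is a hypothetical helper, not part of the API:

```python
import json
import urllib.request

API_URL = "https://api.quantized.us/v1/chat/completions"

def build_chat_request(api_key, model, messages):
    """Prepare a urllib Request for the chat-completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Sending it (requires a valid key and network access):
# with urllib.request.urlopen(build_chat_request(key, model, msgs)) as r:
#     print(json.load(r))
```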
Direct HTTP (cURL)
```bash
curl -X POST https://api.quantized.us/v1/chat/completions \
  -H "Authorization: Bearer sk-quantized-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Model naming
Models use a provider/model naming convention:
| Model ID | Provider | Model |
|---|---|---|
| openai/gpt-4.1-mini | OpenRouter | GPT-4.1 Mini |
| openai/gpt-4.1-nano | OpenRouter | GPT-4.1 Nano |
| anthropic/claude-sonnet-4-20250514 | OpenRouter | Claude Sonnet 4 |
| meta-llama/llama-4-maverick | OpenRouter | Llama 4 Maverick |
| claude-sonnet-4-20250514 | Anthropic (direct) | Claude Sonnet 4 |
Use the Models endpoint to list all available models and their pricing.
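If your code needs to branch on the provider, the convention above can be parsed with a one-liner. A hypothetical helper (not part of any SDK); IDs without a slash, like the direct-Anthropic row, yield `None` for the provider:

```python
def split_model_id(model_id):
    """Split a provider/model ID into (provider, model).

    Direct-provider IDs with no slash (e.g. claude-sonnet-4-20250514)
    return (None, model_id) unchanged.
    """
    provider, sep, model = model_id.partition("/")
    return (provider, model) if sep else (None, model_id)
```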
Web search and fetch
These endpoints are Quantized-specific (not OpenAI-compatible), so use them via direct HTTP calls:
```python
import httpx

# Web search
response = httpx.post(
    "https://api.quantized.us/v1/web-search",
    headers={"Authorization": "Bearer sk-quantized-YOUR-KEY"},
    json={"query": "Python 3.13 release date", "num_results": 5},
)

# Fetch content
response = httpx.post(
    "https://api.quantized.us/v1/fetch",
    headers={"Authorization": "Bearer sk-quantized-YOUR-KEY"},
    json={"urls": ["https://docs.python.org/3/whatsnew/3.13.html"]},
)
```