SDKs & Libraries
Quantized is OpenAI-compatible for the Chat Completions and Responses endpoints. You can use the official OpenAI SDKs as-is by pointing them at https://api.quantized.us/v1.
OpenAI Python SDK
```bash
pip install openai
```
```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-quantized-YOUR-KEY",
    base_url="https://api.quantized.us/v1",
)

# Chat completion
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a haiku about code."}],
    stream=True,
)
for chunk in stream:
    # Final chunks may use an empty `choices` list (e.g. usage-only); skip those.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta and delta.content:
        print(delta.content, end="")
```
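If you need the assembled text rather than printing deltas as they arrive, a small accumulator works on the same chunk shape. This is a hypothetical helper (not part of the SDK), using the same empty-`choices` guard as the loop above:

```python
def collect_stream_text(stream):
    """Join the content deltas of a chat-completions stream into one string.

    Usage-only chunks arrive with an empty `choices` list and carry no
    delta, so they are skipped.
    """
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        if delta and delta.content:
            parts.append(delta.content)
    return "".join(parts)
```

Pass it the iterator returned by `create(..., stream=True)` and it returns the full completion text once the stream ends.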
OpenAI JavaScript/TypeScript SDK
```bash
npm install openai
```
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-quantized-YOUR-KEY',
  baseURL: 'https://api.quantized.us/v1',
});

const response = await client.chat.completions.create({
  model: 'openai/gpt-4.1-mini',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
```
Direct HTTP (Python httpx)
```python
import httpx

response = httpx.post(
    "https://api.quantized.us/v1/chat/completions",
    headers={"Authorization": "Bearer sk-quantized-YOUR-KEY"},
    json={
        "model": "openai/gpt-4.1-mini",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(response.json())
```
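If you'd rather avoid a third-party dependency, the same request can be built with the standard library alone. A minimal sketch; `build_chat_request` is a hypothetical helper, not part of the API:

```python
import json
import urllib.request

API_URL = "https://api.quantized.us/v1/chat/completions"

def build_chat_request(api_key, model, messages):
    """Prepare a urllib Request for the chat-completions endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Sending it (requires a valid key and network access):
# with urllib.request.urlopen(build_chat_request(key, model, msgs)) as r:
#     print(json.load(r))
```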
Direct HTTP (cURL)
```bash
curl -X POST https://api.quantized.us/v1/chat/completions \
  -H "Authorization: Bearer sk-quantized-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
Model naming
Models use a provider/model naming convention:
| Model ID | Provider | Model |
|---|---|---|
| openai/gpt-4.1-mini | OpenRouter | GPT-4.1 Mini |
| openai/gpt-4.1-nano | OpenRouter | GPT-4.1 Nano |
| anthropic/claude-sonnet-4-20250514 | OpenRouter | Claude Sonnet 4 |
| meta-llama/llama-4-maverick | OpenRouter | Llama 4 Maverick |
| claude-sonnet-4-20250514 | Anthropic (direct) | Claude Sonnet 4 |
Use the Models endpoint to list all available models and their pricing.
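If your code needs to branch on the provider, the convention above can be parsed with a one-liner. A hypothetical helper (not part of any SDK); IDs without a slash, like the direct-Anthropic row, yield `None` for the provider:

```python
def split_model_id(model_id):
    """Split a provider/model ID into (provider, model).

    Direct-provider IDs with no slash (e.g. claude-sonnet-4-20250514)
    return (None, model_id) unchanged.
    """
    provider, sep, model = model_id.partition("/")
    return (provider, model) if sep else (None, model_id)
```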
Web search and fetch
These endpoints are Quantized-specific (not OpenAI-compatible), so use them via direct HTTP calls:
```python
import httpx

# Web search
response = httpx.post(
    "https://api.quantized.us/v1/web-search",
    headers={"Authorization": "Bearer sk-quantized-YOUR-KEY"},
    json={"query": "Python 3.13 release date", "num_results": 5},
)

# Fetch content
response = httpx.post(
    "https://api.quantized.us/v1/fetch",
    headers={"Authorization": "Bearer sk-quantized-YOUR-KEY"},
    json={"urls": ["https://docs.python.org/3/whatsnew/3.13.html"]},
)
```