One Token. Every model. Drop-in OpenAI SDK.
Point your existing OpenAI SDK at TonBo's base URL and route every request to GPT-5, Claude, Gemini, Grok, Llama, or Qwen — same key, same call signature, pay per token.
Drop-in SDK examples
curl https://api.tonboai.com/v1/chat/completions \
-H "Authorization: Bearer $TONBO_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-5",
"messages": [{"role": "user", "content": "Hello"}],
"stream": true
}'
OpenAI SDK drop-in
Works with openai-python, openai-node, openai-go, and any other OpenAI-compatible client. Just change base_url — no other code changes.
One balance, all models
Pay-as-you-go tokens work across every model. No per-provider accounts, no per-vendor invoices — one dashboard, one bill.
Streaming out of the box
Server-Sent Events and chunked responses supported for every model. Build real-time agents without any protocol glue.
High availability
Automatic failover across upstream providers. If one vendor has an outage, TonBo re-routes so your product keeps responding.
Smart model routing
Pin a model explicitly, or let TonBo route by latency / cost / capability. Define routing policies per API key.
AES-256 tunnel
Every request traverses TonBo's encrypted AI tunnel with global smart routing — reliable access from any region, no data retention for training.
Start building
Ship multi-model apps without the plumbing
Sign up, get a token, paste the base URL — and your app can call every frontier model in under five minutes.