Compatibility Matrix
| Surface | Routes | HTTP | Streaming | WebSocket | Continuity |
|---|---|---|---|---|---|
| OpenAI Chat Completions | /v1/chat/completions | Yes | Yes | Yes | Yes |
| OpenAI Responses | /v1/responses | Yes | Yes | Yes | Yes |
| Anthropic Messages | /v1/messages | Yes | Yes | Yes | Yes |
| Gemini Developer API | /v1beta/models/{model}:{method} and /v1/models/{model}:{method} | Yes | Yes | No | Yes |
| Vertex AI Gemini | /v1/publishers/google/models/{model}:{method} and /v1/projects/{project}/locations/{location}/publishers/google/models/{model}:{method} | Yes | Yes | No | Yes |
OpenAI Chat Completions
Use the OpenAI Python or Node SDK and point the base URL at the canonical /v1 surface.
/v1/chat/completions
```python
from openai import OpenAI

client = OpenAI(
    api_key="rhone_sk_...",
    base_url="https://api.rhone.dev/v1",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)

print(response.choices[0].message.content)
```
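Server-sent streaming works over the same route. A minimal curl sketch, assuming the gateway passes the standard OpenAI `stream` flag through and returns the usual SSE `data:` chunks terminated by `data: [DONE]`:

```shell
# Request an SSE stream of chat-completion chunks instead of a single JSON body
curl https://api.rhone.dev/v1/chat/completions \
  -H "Authorization: Bearer rhone_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "stream": true,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```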
OpenAI Chat Stateful WebSocket
For OpenAI Chat Completions-compatible WebSocket mode, use the dedicated vai-openai-chat wrapper. It speaks rhone.ws.v1 under the hood while keeping Chat Completions request and chunk shapes.
```typescript
import OpenAI from "vai-openai-chat";
import WebSocket from "ws";

const client = new OpenAI({
  apiKey: "rhone_sk_...",
  baseURL: "https://api.rhone.dev",
  webSocket: WebSocket,
});

const connection = await client.chat.completions.connect();
await connection.bindSession({ session_id: "sess_01abc..." });

const stream = connection.create({
  model: "gpt-5.4-mini",
  messages: [{ role: "user", content: "Draft a haiku." }],
});

for await (const chunk of stream) {
  const text = chunk.choices[0]?.delta?.content;
  if (text) process.stdout.write(text);
}
```
OpenAI Responses
OpenAI Responses compatibility uses the canonical /v1/responses route.
/v1/responses
```python
from openai import OpenAI

client = OpenAI(
    api_key="rhone_sk_...",
    base_url="https://api.rhone.dev/v1",
)

response = client.responses.create(
    model="gpt-4o",
    input="Explain quantum computing in one paragraph.",
)

print(response.output_text)
```
OpenAI Responses Stateful WebSocket
For OpenAI Responses-compatible WebSocket mode, use the dedicated vai-openai-responses wrapper. It speaks rhone.ws.v1 under the hood while keeping Responses request and event shapes.
```typescript
import OpenAI from "vai-openai-responses";
import WebSocket from "ws";

const client = new OpenAI({
  apiKey: "rhone_sk_...",
  baseURL: "https://api.rhone.dev",
  webSocket: WebSocket,
});

const connection = await client.responses.connect();
await connection.bindSession({ session_id: "sess_01abc..." });

const stream = connection.create({
  model: "gpt-5.4-mini",
  input: "Draft a haiku.",
});

for await (const event of stream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta);
  }
}
```
Anthropic Messages
Anthropic HTTP and SSE compatibility uses the canonical /v1/messages route. The gateway requires the standard Anthropic anthropic-version header and accepts gateway auth through x-api-key or bearer auth.
/v1/messages
```python
import anthropic

client = anthropic.Anthropic(
    api_key="rhone_sk_...",
    base_url="https://api.rhone.dev",
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude!"},
    ],
)

print(message.content[0].text)
```
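SSE streaming uses the same route with Anthropic's standard `stream` flag. A sketch, assuming the gateway forwards Anthropic's event stream (message_start, content_block_delta, and so on) unchanged:

```shell
# Request an SSE stream of Anthropic events; x-api-key carries the gateway key
curl https://api.rhone.dev/v1/messages \
  -H "x-api-key: rhone_sk_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "stream": true,
    "messages": [{"role": "user", "content": "Hello, Claude!"}]
  }'
```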
Anthropic Stateful WebSocket
For Anthropic-compatible WebSocket mode, use the dedicated vai-anthropic wrapper. It speaks rhone.ws.v1 under the hood while keeping Anthropic request, response, and event shapes.
```typescript
import Anthropic from "vai-anthropic";
import WebSocket from "ws";

const client = new Anthropic({
  apiKey: "rhone_sk_...",
  baseURL: "https://api.rhone.dev",
  webSocket: WebSocket,
});

const connection = await client.messages.connect();
await connection.bindSession({ session_id: "sess_01abc..." });

const stream = connection.create({
  model: "claude-sonnet-4-6",
  max_tokens: 512,
  messages: [{ role: "user", content: "Draft a haiku." }],
});

for await (const event of stream) {
  if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
    process.stdout.write(event.delta.text);
  }
}
```
Gemini Developer API
Route to Gemini models through the Gemini Developer API compatibility layer. This surface uses the gemini provider identity and GEMINI_API_KEY. Compatibility WebSocket is intentionally not exposed; use native /v1/ws for Gemini WebSocket calls and runs.
/v1beta/models/{model}:generateContent
```shell
curl https://api.rhone.dev/v1beta/models/gemini-2.0-flash:generateContent \
  -H "Authorization: Bearer rhone_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Hello!"}]}
    ]
  }'
```
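Because the route pattern accepts any {method}, streaming should go through Gemini's standard streamGenerateContent method. A sketch, assuming the gateway honors the upstream Gemini API's `?alt=sse` query parameter for server-sent events:

```shell
# streamGenerateContent with alt=sse yields incremental SSE chunks
curl "https://api.rhone.dev/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse" \
  -H "Authorization: Bearer rhone_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Hello!"}]}
    ]
  }'
```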
Vertex AI Gemini
Route to Gemini models through the Vertex AI compatibility layer. This surface uses the vertex provider identity. API-key routing uses Vertex credentials only; project and location configuration is optional and enables project-scoped routing. Final function calls are supported; incremental streamed tool-call argument deltas are currently unsupported and unverified in acceptance testing.
/v1/publishers/google/models/{model}:generateContent
/v1/projects/{project}/locations/{location}/publishers/google/models/{model}:generateContent
```shell
curl https://api.rhone.dev/v1/publishers/google/models/gemini-2.0-flash:generateContent \
  -H "Authorization: Bearer rhone_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "Hello!"}]}
    ]
  }'
```
Rhone Continuity
Compatibility surfaces use the top-level rhone object for session continuity and context selection. Stateful mode is the default. Set rhone.session_id to continue an existing session, or set rhone.context_mode="stateless" to send a full replayed transcript explicitly.
```shell
# First call creates a session and returns rhone.session_id in the response body
curl https://api.rhone.dev/v1/messages \
  -H "x-api-key: rhone_sk_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role":"user","content":"Hi"}]
  }'

# Second call continues the same stateful session
curl https://api.rhone.dev/v1/messages \
  -H "x-api-key: rhone_sk_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [{"role":"user","content":"What did I just say?"}],
    "rhone": {"session_id":"sess_01abc..."}
  }'
```
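The stateless path works the other way around: set rhone.context_mode to "stateless" and replay the full transcript in every request. A sketch, where the assistant turn is a hypothetical placeholder for whatever the first call actually returned:

```shell
# Stateless mode: no session_id; the client owns and replays the transcript
curl https://api.rhone.dev/v1/messages \
  -H "x-api-key: rhone_sk_..." \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 256,
    "messages": [
      {"role":"user","content":"Hi"},
      {"role":"assistant","content":"Hello! How can I help?"},
      {"role":"user","content":"What did I just say?"}
    ],
    "rhone": {"context_mode": "stateless"}
  }'
```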