# Chat Completions API

Generate text completions using AI models through the ModelProxy.ai Chat Completions API.
The Chat Completions API is the primary endpoint for generating responses from AI models through ModelProxy.ai.
## Endpoint

```
POST https://modelproxy.theitdept.au/api/v1/chat/completions
```
## Headers

| Header | Value | Description |
|---|---|---|
| `Authorization` | `Bearer YOUR_API_KEY` | Your API key |
| `Content-Type` | `application/json` | Indicates a JSON payload |
## Request Body

```json
{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 150,
  "stream": false,
  "usage": {
    "include": true
  }
}
```
## Required Parameters

| Parameter | Type | Description |
|---|---|---|
| `model` | string | ID of the model to use. See Available Models for options. |
| `messages` | array | Array of message objects representing the conversation history. |
## Optional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `temperature` | number | 1.0 | Controls randomness (0 to 2). Lower values make responses more deterministic. |
| `max_tokens` | integer | Varies by model | Maximum number of tokens to generate. |
| `stream` | boolean | false | If true, partial message deltas are sent as a stream of Server-Sent Events. |
| `usage.include` | boolean | false | If true, includes token counts and cost information in the response. |
| `top_p` | number | 1.0 | Controls diversity via nucleus sampling. |
| `frequency_penalty` | number | 0.0 | Reduces repetition of token sequences (-2.0 to 2.0). |
| `presence_penalty` | number | 0.0 | Reduces repetition of topics (-2.0 to 2.0). |
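The parameters above can be assembled into a request body client-side. A minimal Python sketch (the `build_chat_request` helper and its `include_usage` flag are our own names, not part of the API; only the payload keys come from the tables above):

```python
def build_chat_request(model, messages, include_usage=False, **options):
    """Assemble a Chat Completions request body.

    Only `model` and `messages` are required; optional parameters are
    added when supplied, with `temperature` checked against its
    documented 0-2 range.
    """
    payload = {"model": model, "messages": messages}

    temperature = options.get("temperature")
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature

    # Pass through the remaining documented optional parameters as-is.
    for key in ("max_tokens", "stream", "top_p",
                "frequency_penalty", "presence_penalty"):
        if key in options:
            payload[key] = options[key]

    # usage.include is nested in the request body, unlike the flat options.
    if include_usage:
        payload["usage"] = {"include": True}

    return payload
```

Omitted parameters fall back to the server-side defaults listed above, so the helper only serializes what the caller actually sets.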
## Response Format

### Standard Response (Non-Streaming)
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Quantum computing is like a super-powered computer that uses the weird rules of quantum physics to solve problems that regular computers can't handle efficiently. Instead of using bits (0s and 1s), quantum computers use 'qubits' that can exist in multiple states at once, allowing them to process vast amounts of information simultaneously."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 23,
    "completion_tokens": 63,
    "total_tokens": 86,
    "cost": 0.00086
  }
}
```
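Reading a non-streaming response means pulling the assistant message out of the first `choices` entry. A small Python sketch using the field names from the sample above (the `parse_completion` helper is ours, for illustration):

```python
def parse_completion(response: dict) -> dict:
    """Extract the assistant reply and usage details from a
    non-streaming Chat Completions response."""
    choice = response["choices"][0]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # usage is only present when the request set usage.include
        "usage": response.get("usage"),
    }
```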
### Streaming Response

When `stream: true` is set, the API returns a stream of Server-Sent Events. Each event contains a delta of the message being generated:
```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{"content":"Quantum"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{"content":" computing"},"index":0,"finish_reason":null}]}

...

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}

data: [DONE]
```
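Reconstructing the full message client-side means concatenating the `delta.content` fragments until the `[DONE]` sentinel arrives. A minimal Python sketch (the `accumulate_stream` helper is ours; the `data:` line format matches the events shown above):

```python
import json

def accumulate_stream(lines):
    """Reassemble the assistant's full message from a sequence of
    Server-Sent Event lines like the ones shown above."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel marking the end of the stream
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        # The first chunk carries only the role; content chunks carry text.
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

In a real client these lines would come from the HTTP response body as it arrives, so partial text can be rendered before the stream finishes.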
## Error Codes

| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid request format or parameters |
| 401 | Unauthorized | Invalid API key |
| 402 | Payment Required | Insufficient credits |
| 404 | Not Found | Requested model not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Server Error | Internal server error |
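Of the codes above, only rate limits (429) and server errors (500) are transient; the other 4xx codes indicate a problem with the request or account that retrying will not fix. A small client-side sketch (the helper names and the jittered-backoff policy are our own conventions, not something the API mandates):

```python
import random

RETRYABLE_STATUS = {429, 500}  # rate limited / transient server error

def should_retry(status_code: int) -> bool:
    """Return True for status codes worth retrying per the table above."""
    return status_code in RETRYABLE_STATUS

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: a common pattern for
    spacing out retries without synchronized client stampedes."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```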
## Examples

### Basic Example
```bash
curl -X POST https://modelproxy.theitdept.au/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, who are you?"}
    ]
  }'
```
### Streaming Example
```bash
curl -X POST https://modelproxy.theitdept.au/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me a short story about a robot."}
    ],
    "stream": true
  }'
```
For more information on how to handle streaming responses, see our Streaming Guide.