# Chat Completions API

Generate text completions using AI models through the ModelProxy.ai Chat Completions API.
The Chat Completions API is the primary endpoint for generating responses from AI models through ModelProxy.ai.
## Endpoint

```
POST https://modelproxy.theitdept.au/api/v1/chat/completions
```
## Headers

| Header | Value | Description |
|---|---|---|
| `Authorization` | `Bearer YOUR_API_KEY` | Your API key |
| `Content-Type` | `application/json` | Indicates a JSON payload |
## Request Body

```json
{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 150,
  "stream": false,
  "usage": {
    "include": true
  }
}
```
## Required Parameters

| Parameter | Type | Description |
|---|---|---|
| `model` | string | ID of the model to use. See Available Models for options. |
| `messages` | array | Array of message objects representing the conversation history. |
## Optional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `temperature` | number | 1.0 | Controls randomness (0 to 2). Lower values make responses more deterministic. |
| `max_tokens` | integer | Varies by model | Maximum number of tokens to generate. |
| `stream` | boolean | false | If true, partial message deltas are sent as a stream of Server-Sent Events. |
| `usage.include` | boolean | false | If true, includes token counts and cost information in the response. |
| `top_p` | number | 1.0 | Controls diversity via nucleus sampling. |
| `frequency_penalty` | number | 0.0 | Reduces repetition of token sequences (-2.0 to 2.0). |
| `presence_penalty` | number | 0.0 | Reduces repetition of topics (-2.0 to 2.0). |
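The parameters above can be assembled into a request body client-side. A minimal Python sketch (the `build_chat_request` helper and its `include_usage` flag are our own names, not part of the API; only the payload keys come from the tables above):

```python
def build_chat_request(model, messages, include_usage=False, **options):
    """Assemble a Chat Completions request body.

    Only `model` and `messages` are required; optional parameters are
    added when supplied, with `temperature` checked against its
    documented 0-2 range.
    """
    payload = {"model": model, "messages": messages}

    temperature = options.get("temperature")
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        payload["temperature"] = temperature

    # Pass through the remaining documented optional parameters as-is.
    for key in ("max_tokens", "stream", "top_p",
                "frequency_penalty", "presence_penalty"):
        if key in options:
            payload[key] = options[key]

    # usage.include is nested in the request body, unlike the flat options.
    if include_usage:
        payload["usage"] = {"include": True}

    return payload
```

Omitted parameters fall back to the server-side defaults listed above, so the helper only serializes what the caller actually sets.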
## Response Format

### Standard Response (Non-Streaming)
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Quantum computing is like a super-powered computer that uses the weird rules of quantum physics to solve problems that regular computers can't handle efficiently. Instead of using bits (0s and 1s), quantum computers use 'qubits' that can exist in multiple states at once, allowing them to process vast amounts of information simultaneously."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "usage": {
    "prompt_tokens": 23,
    "completion_tokens": 63,
    "total_tokens": 86,
    "cost": 0.00086
  }
}
```
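Reading a non-streaming response means pulling the assistant message out of the first `choices` entry. A small Python sketch using the field names from the sample above (the `parse_completion` helper is ours, for illustration):

```python
def parse_completion(response: dict) -> dict:
    """Extract the assistant reply and usage details from a
    non-streaming Chat Completions response."""
    choice = response["choices"][0]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # usage is only present when the request set usage.include
        "usage": response.get("usage"),
    }
```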
### Streaming Response

When `stream: true` is set, the API returns a stream of Server-Sent Events. Each event contains a delta of the message being generated:
```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{"content":"Quantum"},"index":0,"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{"content":" computing"},"index":0,"finish_reason":null}]}

...

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1677858242,"model":"openai/gpt-4o","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}

data: [DONE]
```
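Reconstructing the full message client-side means concatenating the `delta.content` fragments until the `[DONE]` sentinel arrives. A minimal Python sketch (the `accumulate_stream` helper is ours; the `data:` line format matches the events shown above):

```python
import json

def accumulate_stream(lines):
    """Reassemble the assistant's full message from a sequence of
    Server-Sent Event lines like the ones shown above."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # sentinel marking the end of the stream
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        # The first chunk carries only the role; content chunks carry text.
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

In a real client these lines would come from the HTTP response body as it arrives, so partial text can be rendered before the stream finishes.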
## Error Codes

| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid request format or parameters |
| 401 | Unauthorized | Invalid API key |
| 402 | Payment Required | Insufficient credits |
| 404 | Not Found | Requested model not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Server Error | Internal server error |
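Of the codes above, only rate limits (429) and server errors (500) are transient; the other 4xx codes indicate a problem with the request or account that retrying will not fix. A small client-side sketch (the helper names and the jittered-backoff policy are our own conventions, not something the API mandates):

```python
import random

RETRYABLE_STATUS = {429, 500}  # rate limited / transient server error

def should_retry(status_code: int) -> bool:
    """Return True for status codes worth retrying per the table above."""
    return status_code in RETRYABLE_STATUS

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: a common pattern for
    spacing out retries without synchronized client stampedes."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```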
## Examples

### Basic Example
```bash
curl -X POST https://modelproxy.theitdept.au/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, who are you?"}
    ]
  }'
```
### Streaming Example
```bash
curl -X POST https://modelproxy.theitdept.au/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me a short story about a robot."}
    ],
    "stream": true
  }'
```
For more information on how to handle streaming responses, see our Streaming Guide.