Create Chat Completion

Create a chat completion request. The model generates a response based on the provided message list.

Content Field Description

The content field supports the following two forms:

Plain text string

"content": "Hello"

Array of objects (for multimodal input)

Each element in the array is distinguished by the type field:

"content": [
    { "type": "text", "text": "Describe this image" },
    { "type": "image_url", "image_url": { "url": "data:image/png;base64,..." } },
    { "type": "video_url", "video_url": { "url": "data:video/mp4;base64,..." } }
]

image_url and video_url also support passing a string directly, equivalent to the url field in object form:

{ "type": "image_url", "image_url": "data:image/png;base64,..." }

Parameter Description

Each element in the array has the following fields:

Parameter	Required	Description	Type
`type`	required	Content type	`"text"` | `"image_url"` | `"video_url"`
`text`	required when `type=text`	Text content	string
`image_url`	required when `type=image_url`	For transmitting images. Supports object form `{"url": "..."}` or a URL string directly	object | string
`video_url`	required when `type=video_url`	For transmitting videos. Supports object form `{"url": "..."}` or a URL string directly	object | string

When image_url is passed as an object, its fields are:

Parameter	Required	Description	Type
`url`	required	Image content specified via base64 encoding or file id	string

When video_url is passed as an object, its fields are:

Parameter	Required	Description	Type
`url`	required	Video content specified via base64 encoding or file id, for example `data:video/mp4;base64,...`	string

Both the object form (url field) and the string shorthand support the following formats:

Base64 encoding: data:image/png;base64,... or data:video/mp4;base64,...
File reference: ms://<file_id>
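
As a minimal sketch of the two formats above, the helpers below build an image part either from raw bytes (base64 data URL) or from a file id (ms:// reference), in object form or string shorthand. The helper names and the sample file id are hypothetical and not part of the API.

```python
import base64


def to_data_url(raw: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL, e.g. data:image/png;base64,..."""
    encoded = base64.b64encode(raw).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def to_file_ref(file_id: str) -> str:
    """Build a file reference in the ms://<file_id> form."""
    return f"ms://{file_id}"


def image_part(url: str, shorthand: bool = False) -> dict:
    """Build an image_url part in object form, or string shorthand if requested."""
    return {"type": "image_url", "image_url": url if shorthand else {"url": url}}


content = [
    {"type": "text", "text": "Describe this image"},
    # Object form with a base64 data URL (placeholder bytes, not a real image)
    image_part(to_data_url(b"\x89PNG", "image/png")),
    # String shorthand with a file reference (hypothetical file id)
    image_part(to_file_ref("example-file-id"), shorthand=True),
]
```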

For details, see "Use the Kimi Vision Model".

Usage Example

import os
import base64

from openai import OpenAI
from openai.types.chat import ChatCompletion

client: OpenAI = OpenAI(
    api_key=os.environ.get("MOONSHOT_API_KEY"),
    base_url="https://api.moonshot.ai/v1",
)

# Encode the image to base64
with open("your_image_path", "rb") as f:
    img_base: str = base64.b64encode(f.read()).decode("utf-8")

response: ChatCompletion = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{img_base}",
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image",
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)

Response Format

Non-streaming Response

{
    "id": "cmpl-04ea926191a14749b7f2c7a48a68abc6",
    "object": "chat.completion",
    "created": 1698999496,
    "model": "kimi-k2.6",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello, Li Lei! 1+1 equals 2. If you have any other questions, feel free to ask!"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 19,
        "completion_tokens": 21,
        "total_tokens": 40,
        "cached_tokens": 10
    }
}

Streaming Response

data: {"id":"cmpl-xxx","object":"chat.completion.chunk","created":1698999575,"model":"kimi-k2.6","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"cmpl-xxx","object":"chat.completion.chunk","created":1698999575,"model":"kimi-k2.6","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

...

data: {"id":"cmpl-xxx","object":"chat.completion.chunk","created":1698999575,"model":"kimi-k2.6","choices":[{"index":0,"delta":{},"finish_reason":"stop","usage":{"prompt_tokens":19,"completion_tokens":13,"total_tokens":32}}]}

data: [DONE]

The "model" field in the response echoes the model parameter of the request. For example, a request that uses the kimi-k2.6 model returns "model": "kimi-k2.6".
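
The streaming chunks shown above can be reassembled on the client by concatenating each delta's content until the [DONE] sentinel arrives. The following is a minimal sketch that works on already-received lines; the function name `collect_stream` is hypothetical, and a real client would read the lines from the HTTP response instead of a list.

```python
import json


def collect_stream(lines):
    """Concatenate delta content from SSE `data:` lines until [DONE]."""
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip the blank lines that separate events
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        choice = json.loads(payload)["choices"][0]
        # The first chunk carries an empty string; the last one has no content
        parts.append(choice["delta"].get("content") or "")
        finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason


sample = [
    'data: {"id":"cmpl-xxx","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"id":"cmpl-xxx","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"id":"cmpl-xxx","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text, reason = collect_stream(sample)
```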

Authorizations

Authorization

string

header

required

The Authorization header expects a Bearer token. Use your MOONSHOT_API_KEY as the token. This is a server-side secret key. Generate one on the API keys page in your dashboard.
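
For clients that do not use an SDK, the header can be assembled by hand. This is a sketch only: `build_headers` is a hypothetical helper, and the placeholder key is used solely so the snippet runs when the environment variable is unset.

```python
import os


def build_headers(api_key: str) -> dict:
    """Build request headers carrying the API key as a Bearer token."""
    if not api_key:
        raise ValueError("MOONSHOT_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Read the key from the environment; never hard-code a real key in source
headers = build_headers(os.environ.get("MOONSHOT_API_KEY") or "sk-placeholder")
```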

Body

application/json

kimi-k2.6
kimi-k2.5
kimi-k2
kimi-k2-thinking
moonshot-v1

messages

object[]

required

A list of messages in the conversation so far. Each element has the format {"role": "user", "content": "Hello"}. role supports system, user, or assistant. content must not be empty. The content field can be a string or an array[object] (for multimodal input).
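
The constraints above (allowed roles, non-empty content, string or array) can be checked before sending a request. The sketch below covers only what this description states; `find_message_errors` is a hypothetical helper, not part of any SDK.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def find_message_errors(messages: list) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for i, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            errors.append(f"messages[{i}]: unsupported role {role!r}")
        content = message.get("content")
        # content must be a string or an array, and must not be empty
        if not isinstance(content, (str, list)) or len(content) == 0:
            errors.append(f"messages[{i}]: content must be a non-empty string or array")
    return errors


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [{"type": "text", "text": "Hello"}]},
]
```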

model

enum<string>

default:kimi-k2.6

required

Model ID

Available options:

kimi-k2.6

max_tokens

integer

deprecated

Deprecated. Use max_completion_tokens instead.

max_completion_tokens

integer

The maximum number of tokens to generate for the chat completion. If not specified, defaults to a reasonable integer such as 1024. If the result reaches the maximum number of tokens without ending, the finish reason will be "length"; otherwise, it will be "stop". This refers to the length of tokens you expect us to return, not the total length of input plus output. If input plus max_completion_tokens exceeds the model context window, the API returns invalid_request_error.
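
The two rules above (the output budget must fit in the context window alongside the input, and a "length" finish reason signals truncation) can be sketched as follows. Both helper names are hypothetical, and the context window value is a made-up number for illustration; check the actual limit of the model you use.

```python
def fits_context(prompt_tokens: int, max_completion_tokens: int, context_window: int) -> bool:
    """True if input plus the requested output budget fits in the context window."""
    return prompt_tokens + max_completion_tokens <= context_window


def was_truncated(response: dict) -> bool:
    """True if generation stopped because max_completion_tokens was reached."""
    return response["choices"][0]["finish_reason"] == "length"


HYPOTHETICAL_CONTEXT_WINDOW = 8192  # illustrative only, not a documented value

ok = fits_context(prompt_tokens=19, max_completion_tokens=1024,
                  context_window=HYPOTHETICAL_CONTEXT_WINDOW)
truncated = was_truncated({"choices": [{"index": 0, "finish_reason": "length"}]})
```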

response_format

object

Controls the model output format. Default is {"type": "text"} for plain text output. Set to {"type": "json_object"} to enable JSON mode, ensuring output is a valid JSON object (you must guide the model to output JSON in the prompt). Set to {"type": "json_schema"} to enable Structured Output, constraining output to match a specified JSON Schema (recommended, requires the json_schema field). If you encounter schema validation issues, please submit feedback at walle GitHub Issues (https://github.com/MoonshotAI/walle/issues).
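
The three modes described above might be expressed as the following request fragments. This is a sketch under stated assumptions: the nested layout of the json_schema field ("name", "schema") is borrowed from the OpenAI-compatible convention and is not confirmed by this page, and `parse_json_reply` is a hypothetical helper.

```python
import json

# Default: plain text output
plain_text = {"type": "text"}

# JSON mode: the prompt itself must still ask the model to output JSON
json_mode = {"type": "json_object"}

# Structured Output. ASSUMPTION: the "name"/"schema" layout follows the
# OpenAI-compatible convention; verify it against the child attributes
# of response_format before relying on it.
structured_output = {
    "type": "json_schema",
    "json_schema": {
        "name": "math_answer",  # hypothetical schema name
        "schema": {
            "type": "object",
            "properties": {"answer": {"type": "integer"}},
            "required": ["answer"],
        },
    },
}


def parse_json_reply(content: str) -> dict:
    """Parse message.content from a JSON-mode reply; raises ValueError if invalid."""
    parsed = json.loads(content)
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    return parsed


reply = parse_json_reply('{"answer": 2}')
```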

stop

Stop words. Output halts when a full match is found, and the matched words themselves are not output. A maximum of 5 strings is allowed, and each string must not exceed 32 bytes.
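
The limits above can be checked on the client before a request is sent. A minimal sketch, assuming the 32-byte limit is measured on the UTF-8 encoding (this page does not name the encoding); `is_valid_stop` is a hypothetical helper.

```python
MAX_STOP_STRINGS = 5
MAX_STOP_BYTES = 32


def is_valid_stop(stop: list) -> bool:
    """Check the documented limits: at most 5 strings, each at most 32 bytes."""
    if len(stop) > MAX_STOP_STRINGS:
        return False
    # Assumption: byte length is measured on the UTF-8 encoding
    return all(len(word.encode("utf-8")) <= MAX_STOP_BYTES for word in stop)


stop = ["\n\n", "END"]
valid = is_valid_stop(stop)
```

Note that the limit is in bytes, not characters, so non-ASCII stop words reach it sooner than their character count suggests.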

stream

boolean

default:false

Whether to return the response in a streaming fashion. Default is false.

stream_options

object

Options for streaming responses

tools

object[]

A list of tools the model may call

Maximum array length: 128

prompt_cache_key

string

Used to cache responses for similar requests to optimize cache hit rates. For Coding Agents, this is typically a session id or task id representing a single session; if the session is exited and later resumed, this value should remain the same. For Kimi Code Plan, this field is required to improve cache hit rates. For other agents involving multi-turn conversations, setting this field is also recommended.

safety_identifier

string

A stable identifier used to help detect users of your application who may be violating usage policies. The ID should be a string that uniquely identifies each user. It is recommended to hash the username or email address to avoid sending any identifying information.
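
One way to follow the hashing recommendation above is sketched below. The choice of SHA-256 and the normalization step are assumptions made for illustration; this page does not prescribe a hash function, and `make_safety_identifier` is a hypothetical helper.

```python
import hashlib


def make_safety_identifier(email: str) -> str:
    """Derive a stable identifier from an email address without sending it in clear."""
    # Normalize first so the same user always maps to the same identifier
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


safety_identifier = make_safety_identifier("User@Example.com ")
```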

thinking

object

Controls whether thinking is enabled for the kimi-k2.6 model, and whether to fully preserve reasoning_content across multi-turn conversations. Optional parameter. Default value is {"type": "enabled"}.

Response

Chat completion response

id

string

Unique identifier for the completion

object

string

Object type

Example:

"chat.completion"

created

integer

Unix timestamp of when the completion was created

model

string

Model used for the completion

choices

object[]

List of completion choices

usage

object
