
Use Kimi API for Tool Calls

Tool calls (tool_calls) evolved from function calls (function_call). In certain contexts, or when reading compatibility code, you can treat tool_calls and function_call as the same mechanism: function_call is a subset of tool_calls.

What are Tool Calls?

Tool calls give the Kimi large language model the ability to perform specific actions. The Kimi large language model can engage in conversations and answer questions, which is its "talking" ability. Through tool calls, it also gains the ability to "do" things. With tool_calls, the Kimi large language model can help you search the internet, query databases, and even control smart home devices.

A tool call involves several steps:

  1. Define the tool using JSON Schema format;
  2. Submit the defined tool to the Kimi large language model via the tools parameter. You can submit multiple tools at once;
  3. The Kimi large language model will decide which tool(s) to use based on the context of the current conversation. It can also choose not to use any tools;
  4. The Kimi large language model will output the parameters and information needed to call the tool in JSON format;
  5. Use the parameters output by the Kimi large language model to execute the corresponding tool and submit the results back to the Kimi large language model;
  6. The Kimi large language model will respond to the user based on the results of the tool execution;

Reading the above steps, you might wonder:

Why can't the Kimi large language model execute the tools itself? Why do we need to "help" the Kimi large language model execute the tools based on the parameters it generates? If we are the ones executing the tool calls, what is the role of the Kimi large language model?

We will use a practical example of a tool call to explain these questions to the reader.

Enable the Kimi Large Language Model to Access the Internet via tool_calls

The knowledge of the Kimi large language model comes from its training data. For questions that are time-sensitive, the Kimi large language model cannot find answers from its existing knowledge. In such cases, we want the Kimi large language model to search the internet for the latest information and answer our questions based on that information.

Define the Tools

Imagine how we find the information we want on the internet:

  1. We open a search engine, such as Baidu or Bing, and search for the content we want. We then browse the search results and decide which one to click based on the website title and description;
  2. We might open one or more web pages from the search results and browse them to obtain the knowledge we need;

Reviewing our actions, we "use a search engine to search" and "open the web pages corresponding to the search results." The tools we use are the "search engine" and the "web browser." Therefore, we need to abstract these actions into tools in JSON Schema format and submit them to the Kimi large language model, allowing it to use search engines and browse web pages just like humans do.

Before we proceed, let's briefly introduce the JSON Schema format:

JSON Schema is a vocabulary that you can use to annotate and validate JSON documents. In other words, a JSON Schema is itself a JSON document, one that describes the format of other JSON data.

We define the following JSON Schema:

{
	"type": "object",
	"properties": {
		"name": {
			"type": "string"
		}
	}
}

This JSON Schema defines a JSON Object that contains a field named name, and the type of this field is string, for example:

{
	"name": "Hei"
}
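To make the idea concrete, here is a minimal, hand-rolled check (standard library only, supporting just the object and string types used above) showing what this schema expresses; real validators such as the third-party jsonschema package cover the full vocabulary:

```python
import json

schema = {"type": "object", "properties": {"name": {"type": "string"}}}
document = json.loads('{"name": "Hei"}')

# Map JSON Schema type names to Python types (a small subset, for illustration)
TYPES = {"object": dict, "string": str}

def matches(value, subschema):
    """Check a value against a very small subset of JSON Schema."""
    if not isinstance(value, TYPES[subschema["type"]]):
        return False
    for key, prop_schema in subschema.get("properties", {}).items():
        if key in value and not matches(value[key], prop_schema):
            return False
    return True

print(matches(document, schema))        # True: "name" is present and is a string
print(matches({"name": 42}, schema))    # False: "name" is not a string
```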

By describing our tool definitions using JSON Schema, we can make it clearer and more intuitive for the Kimi large language model to understand what parameters our tools require, as well as the type and description of each parameter. Now let's define the "search engine" and "web browser" tools mentioned earlier:

tools = [
	{
		"type": "function", # The agreed-upon field type, currently supports function as a value
		"function": { # When type is function, use the function field to define the specific function content
			"name": "search", # The name of the function. Please use English letters, numbers, hyphens, and underscores as the function name
			"description": """ 
				Search for content on the internet using a search engine.
 
				When your knowledge cannot answer the user's question, or when the user requests an online search, call this tool. Extract the content the user wants to search for from the conversation and use it as the value of the query parameter.
				The search results include the website title, address (URL), and description.
			""", # A description of the function, detailing its specific role and usage scenarios, to help the Kimi large language model correctly select which functions to use
			"parameters": { # Use the parameters field to define the parameters the function accepts
				"type": "object", # Always use type: object to make the Kimi large language model generate a JSON Object parameter
				"required": ["query"], # Use the required field to tell the Kimi large language model which parameters are mandatory
				"properties": { # The properties field contains the specific parameter definitions; you can define multiple parameters
					"query": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
						"type": "string", # Use type to define the parameter type
						"description": """
							The content the user wants to search for, extracted from the user's question or conversation context.
						""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
					}
				}
			}
		}
	},
	{
		"type": "function", # The agreed-upon field type, currently supports function as a value
		"function": { # When type is function, use the function field to define the specific function content
			"name": "crawl", # The name of the function. Please use English letters, numbers, hyphens, and underscores as the function name
			"description": """
				Retrieve web page content based on the website address (URL).
			""", # A description of the function, detailing its specific role and usage scenarios, to help the Kimi large language model correctly select which functions to use
			"parameters": { # Use the parameters field to define the parameters the function accepts
				"type": "object", # Always use type: object to make the Kimi large language model generate a JSON Object parameter
				"required": ["url"], # Use the required field to tell the Kimi large language model which parameters are mandatory
				"properties": { # The properties field contains the specific parameter definitions; you can define multiple parameters
					"url": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
						"type": "string", # Use type to define the parameter type
						"description": """
							The website address (URL) from which to retrieve content, usually obtained from search results.
						""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
					}
				}
			}
		}
	}
]

When defining tools using JSON Schema, we use the following fixed format:

{
	"type": "function",
	"function": {
		"name": "NAME",
		"description": "DESCRIPTION",
		"parameters": {
			"type": "object",
			"properties": {
				
			}
		}
	}
}

Here, name, description, and parameters.properties are defined by the tool provider. The description explains the specific function and when to use the tool, while parameters outlines the specific parameters needed to successfully call the tool, including parameter types and descriptions. Ultimately, the Kimi large language model will generate a JSON Object that meets the defined requirements as the parameters (arguments) for the tool call based on the JSON Schema.
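As an illustration of this fixed format, a hypothetical get_weather tool (not part of this guide's search example) with one required and one optional parameter could be defined like this:

```python
import json

get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name, for illustration only
        "description": "Query the current weather for a city.",
        "parameters": {
            "type": "object",
            "required": ["city"],  # "unit" is optional, so it is not listed here
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The name of the city to query.",
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],  # enum restricts the values the model may generate
                    "description": "The temperature unit to use in the reply.",
                },
            },
        },
    },
}

# A conforming arguments string the model might generate for this tool
print(json.dumps({"city": "Beijing", "unit": "celsius"}))
```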

Register Tools

Let's try submitting the search tool to the Kimi large language model to see if it can correctly call the tool:

from openai import OpenAI
 
 
client = OpenAI(
    api_key="MOONSHOT_API_KEY", # Replace MOONSHOT_API_KEY with the API Key you obtained from the Kimi Open Platform
    base_url="https://api.moonshot.ai/v1",
)
 
tools = [
	{
		"type": "function", # The field "type" is a convention, currently supporting "function" as its value
		"function": { # When "type" is "function", use the "function" field to define the specific function content
			"name": "search", # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
			"description": """ 
				Search for content on the internet using a search engine.
 
				When your knowledge cannot answer the user's question, or when the user requests an online search, call this tool. Extract the content the user wants to search from the conversation as the value of the query parameter.
				The search results include the website title, website address (URL), and website description.
			""", # Description of the function, write the specific function and usage scenarios here so that the Kimi large language model can correctly choose which functions to use
			"parameters": { # Use the "parameters" field to define the parameters accepted by the function
				"type": "object", # Always use "type": "object" to make the Kimi large language model generate a JSON Object parameter
				"required": ["query"], # Use the "required" field to tell the Kimi large language model which parameters are required
				"properties": { # The specific parameter definitions are in "properties", you can define multiple parameters
					"query": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
						"type": "string", # Use "type" to define the parameter type
						"description": """
							The content the user wants to search for, extract it from the user's question or chat context.
						""" # Use "description" to describe the parameter so that the Kimi large language model can better generate the parameter
					}
				}
			}
		}
	},
	# {
	# 	"type": "function", # The field "type" is a convention, currently supporting "function" as its value
	# 	"function": { # When "type" is "function", use the "function" field to define the specific function content
	# 		"name": "crawl", # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
	# 		"description": """
	# 			Get the content of a webpage based on the website address (URL).
	# 		""", # Description of the function, write the specific function and usage scenarios here so that the Kimi large language model can correctly choose which functions to use
	# 		"parameters": { # Use the "parameters" field to define the parameters accepted by the function
	# 			"type": "object", # Always use "type": "object" to make the Kimi large language model generate a JSON Object parameter
	# 			"required": ["url"], # Use the "required" field to tell the Kimi large language model which parameters are required
	# 			"properties": { # The specific parameter definitions are in "properties", you can define multiple parameters
	# 				"url": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
	# 					"type": "string", # Use "type" to define the parameter type
	# 					"description": """
	# 						The website address (URL) of the content to be obtained, which can usually be obtained from the search results.
	# 					""" # Use "description" to describe the parameter so that the Kimi large language model can better generate the parameter
	# 				}
	# 			}
	# 		}
	# 	}
	# }
]
 
completion = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {"role": "system", "content": "You are Kimi, an AI assistant provided by Moonshot AI. You are proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You refuse to answer any questions related to terrorism, racism, or explicit content. Moonshot AI is a proper noun and should not be translated."},
        {"role": "user", "content": "Please search the internet for 'Context Caching' and tell me what it is."} # In the question, we ask Kimi large language model to search online
    ],
    tools=tools, # <-- We pass the defined tools to Kimi large language model via the tools parameter
)
 
print(completion.choices[0].model_dump_json(indent=4))

When the above code runs successfully, we get the response from Kimi large language model:

{
    "finish_reason": "tool_calls",
    "message": {
        "content": "",
        "role": "assistant",
        "tool_calls": [
            {
                "id": "search:0",
                "function": {
                    "arguments": "{\n    \"query\": \"Context Caching\"\n}",
                    "name": "search"
                },
                "type": "function"
            }
        ]
    }
}

Notice that in this response, the value of finish_reason is tool_calls, which means this response is not the Kimi large language model's answer, but the tool calls it has chosen to make. You can determine whether the current response is a tool call by checking whether finish_reason equals tool_calls.

In the message section, the content field is empty because the model has issued tool_calls and has not yet generated a reply for the user. A new tool_calls field appears instead: it is a list containing all the tool calls for this round. This illustrates another characteristic of tool_calls: the model can request multiple tool calls at once, either different tools or the same tool with different arguments. Each element in tool_calls represents one tool call, and the Kimi large language model generates a unique id for each of them. The function.name field is the name of the function to execute, and its parameters are placed in function.arguments as a serialized JSON Object (a JSON string). The type field is currently always the fixed value function.
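To illustrate, parsing the sample response above with nothing but the standard library looks like this (the assistant message is reproduced as a plain dict):

```python
import json

# The assistant message from the sample response above, as a plain dict
message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {
            "id": "search:0",
            "type": "function",
            "function": {
                "name": "search",
                "arguments": "{\n    \"query\": \"Context Caching\"\n}",
            },
        },
    ],
}

for tool_call in message["tool_calls"]:
    # arguments is a serialized JSON Object, so deserialize it before use
    arguments = json.loads(tool_call["function"]["arguments"])
    print(tool_call["function"]["name"], arguments)  # search {'query': 'Context Caching'}
```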

Next, we should use the tool call parameters generated by Kimi large language model to execute the specific tools.

Execute the Tools

The Kimi large language model does not execute tools for us: after receiving the parameters it generates, we need to execute the corresponding tools ourselves. Before explaining how to do that, let's first address the question we raised earlier:

Why can't Kimi large language model execute the tools itself, but instead requires us to "help" it execute the tools based on the parameters generated by Kimi large language model? If we are the ones executing the tool calls, what is the purpose of Kimi large language model?

Let's imagine a scenario where we use the Kimi large language model: we provide users with a smart robot built on it. There are three roles here: the user, the robot, and the Kimi large language model. Without tools, the user asks the robot a question, the robot calls the Kimi API, and returns the result to the user. With tool_calls, the flow becomes: the user asks the robot a question; the robot calls the Kimi API with tools; the Kimi large language model returns tool_calls parameters; the robot executes the tool_calls and submits the results back to the Kimi API; the Kimi large language model generates the message to return to the user (finish_reason=stop); and only then does the robot reply to the user. Throughout this process, the tool_calls mechanism is transparent and invisible to the user.

Returning to the question above, as users, we are not actually executing the tool calls, nor do we directly "see" the tool calls. Instead, the robot that provides us with the service is completing the tool calls and presenting us with the final response generated by Kimi large language model.

Let's explain how to execute the tool_calls returned by Kimi large language model from the perspective of the "robot":

from typing import *
 
import json
 
import httpx  # search_impl and crawl_impl below use httpx to make HTTP requests
from openai import OpenAI
 
 
client = OpenAI(
    api_key="MOONSHOT_API_KEY", # Replace MOONSHOT_API_KEY with the API Key you obtained from the Kimi Open Platform
    base_url="https://api.moonshot.ai/v1",
)
 
tools = [
	{
		"type": "function", # The field type is agreed upon, and currently supports function as a value
		"function": { # When type is function, use the function field to define the specific function content
			"name": "search", # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
			"description": """ 
				Search for content on the internet using a search engine.
 
				When your knowledge cannot answer the user's question, or the user requests you to perform an online search, call this tool. Extract the content the user wants to search from the conversation as the value of the query parameter.
				The search results include the title of the website, the website address (URL), and a brief introduction to the website.
			""", # Introduction to the function, write the specific function here, as well as the usage scenario, so that the Kimi large language model can correctly choose which functions to use
			"parameters": { # Use the parameters field to define the parameters accepted by the function
				"type": "object", # Fixed use type: object to make the Kimi large language model generate a JSON Object parameter
				"required": ["query"], # Use the required field to tell the Kimi large language model which parameters are required
				"properties": { # The specific parameter definitions are in properties, and you can define multiple parameters
					"query": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
						"type": "string", # Use type to define the parameter type
						"description": """
							The content the user wants to search for, extracted from the user's question or chat context.
						""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
					}
				}
			}
		}
	},
	{
		"type": "function", # The field type is agreed upon, and currently supports function as a value
		"function": { # When type is function, use the function field to define the specific function content
			"name": "crawl", # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
			"description": """
				Get the content of a webpage based on the website address (URL).
			""", # Introduction to the function, write the specific function here, as well as the usage scenario, so that the Kimi large language model can correctly choose which functions to use
			"parameters": { # Use the parameters field to define the parameters accepted by the function
				"type": "object", # Fixed use type: object to make the Kimi large language model generate a JSON Object parameter
				"required": ["url"], # Use the required field to tell the Kimi large language model which parameters are required
				"properties": { # The specific parameter definitions are in properties, and you can define multiple parameters
					"url": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
						"type": "string", # Use type to define the parameter type
						"description": """
							The website address (URL) of the content to be obtained, which can usually be obtained from the search results.
						""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
					}
				}
			}
		}
	}
]
 
 
def search_impl(query: str) -> List[Dict[str, Any]]:
    """
    search_impl uses a search engine to search for query. Most mainstream search engines (such as Bing) provide API calls. You can choose
    your preferred search engine API and place the website title, link, and brief introduction information from the return results in a dict to return.
 
    This is just a simple example, and you may need to write some authentication, validation, and parsing code.
    """
    r = httpx.get("https://your.search.api", params={"query": query})
    return r.json()
 
 
def search(arguments: Dict[str, Any]) -> Any:
    query = arguments["query"]
    result = search_impl(query)
    return {"result": result}
 
 
def crawl_impl(url: str) -> str:
    """
    crawl_impl gets the content of a webpage based on the url.
 
    This is just a simple example. In actual web scraping, you may need to write more code to handle complex situations, such as asynchronously loaded data; and after obtaining
    the webpage content, you can clean the webpage content according to your needs, such as retaining only the text or removing unnecessary content (such as advertisements).
    """
    r = httpx.get(url)
    return r.text
 
 
def crawl(arguments: Dict[str, Any]) -> Dict[str, Any]:
    url = arguments["url"]
    content = crawl_impl(url)
    return {"content": content}
 
 
# Map each tool name and its corresponding function through tool_map so that when the Kimi large language model returns tool_calls, we can quickly find the function to execute
tool_map = {
    "search": search,
    "crawl": crawl,
}
 
messages = [
    {"role": "system",
     "content": "You are Kimi, an artificial intelligence assistant provided by Moonshot AI. You are better at conversing in Chinese and English. You provide users with safe, helpful, and accurate answers. At the same time, you will refuse to answer any questions involving terrorism, racial discrimination, pornography, and violence. Moonshot AI is a proper noun and should not be translated into other languages."},
    {"role": "user", "content": "Please search for Context Caching online and tell me what it is."}  # Request Kimi large language model to perform an online search in the question
]
 
finish_reason = None
 
 
# Our basic process is to ask the Kimi large language model questions with the user's question and tools. If the Kimi large language model returns finish_reason: tool_calls, we execute the corresponding tool_calls,
# and submit the execution results in the form of a message with role=tool back to the Kimi large language model. The Kimi large language model then generates the next content based on the tool_calls results:
#
#   1. If the Kimi large language model believes that the current tool call results can answer the user's question, it returns finish_reason: stop, and we exit the loop and print out message.content;
#   2. If the Kimi large language model believes that the current tool call results cannot answer the user's question and needs to call the tool again, we continue to execute the next tool_calls in the loop until finish_reason is no longer tool_calls;
#
# During this process, we only return the result to the user when finish_reason is stop.
 
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model="kimi-k2.5",
        messages=messages,
        tools=tools,  # <-- We submit the defined tools to the Kimi large language model through the tools parameter
    )
    choice = completion.choices[0]
    finish_reason = choice.finish_reason
    if finish_reason == "tool_calls": # <-- Determine whether the current return content contains tool_calls
        messages.append(choice.message) # <-- We add the assistant message returned to us by the Kimi large language model to the context so that the Kimi large language model can understand our request next time
        for tool_call in choice.message.tool_calls: # <-- tool_calls may be multiple, so we use a loop to execute them one by one
            tool_call_name = tool_call.function.name
            tool_call_arguments = json.loads(tool_call.function.arguments) # <-- arguments is a serialized JSON Object, and we need to deserialize it with json.loads
            tool_function = tool_map[tool_call_name] # <-- Quickly find which function to execute through tool_map
            tool_result = tool_function(tool_call_arguments)
 
            # Construct a message with role=tool using the function execution result to show the result of the tool call to the model;
            # Note that we need to provide the tool_call_id and name fields in the message so that the Kimi large language model
            # can correctly match the corresponding tool_call.
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call_name,
                "content": json.dumps(tool_result), # <-- We agree to submit the tool call result to the Kimi large language model in string format, so we use json.dumps to serialize the execution result into a string here
            })
 
print(choice.message.content) # <-- Here, we return the reply generated by the model to the user

We use a while loop to execute the code logic that includes tool calls because the Kimi large language model typically doesn't make just one tool call, especially in the context of online searching. Usually, Kimi will first call the search tool to get search results, and then call the crawl tool to convert the URLs in the search results into actual web page content. The overall structure of the messages is as follows:

system: prompt                                                                                               # System prompt
user: prompt                                                                                                 # User's question
assistant: tool_call(name=search, arguments={query: query})                                                  # Kimi returns a tool_call (single)
tool: search_result(tool_call_id=tool_call.id, name=search)                                                  # Submit the tool_call execution result
assistant: tool_call_1(name=crawl, arguments={url: url_1}), tool_call_2(name=crawl, arguments={url: url_2})  # Kimi continues to return tool_calls (multiple)
tool: crawl_content(tool_call_id=tool_call_1.id, name=crawl)                                                 # Submit the execution result of tool_call_1
tool: crawl_content(tool_call_id=tool_call_2.id, name=crawl)                                                 # Submit the execution result of tool_call_2
assistant: message_content(finish_reason=stop)                                                               # Kimi generates a reply to the user, ending the conversation
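Written out as the literal messages list the robot accumulates (the ids, URLs, and contents are illustrative placeholders, not real values):

```python
import json

messages = [
    {"role": "system", "content": "You are Kimi..."},                    # system prompt
    {"role": "user", "content": "Please search for Context Caching."},   # user's question
    {"role": "assistant", "content": "", "tool_calls": [                 # Kimi returns a single tool_call
        {"id": "search:0", "type": "function",
         "function": {"name": "search",
                      "arguments": json.dumps({"query": "Context Caching"})}},
    ]},
    {"role": "tool", "tool_call_id": "search:0", "name": "search",       # result of the search tool_call
     "content": json.dumps({"result": ["..."]})},
    {"role": "assistant", "content": "", "tool_calls": [                 # Kimi returns two tool_calls at once
        {"id": "crawl:0", "type": "function",
         "function": {"name": "crawl",
                      "arguments": json.dumps({"url": "https://example.com/1"})}},
        {"id": "crawl:1", "type": "function",
         "function": {"name": "crawl",
                      "arguments": json.dumps({"url": "https://example.com/2"})}},
    ]},
    {"role": "tool", "tool_call_id": "crawl:0", "name": "crawl",         # result of tool_call_1
     "content": json.dumps({"content": "..."})},
    {"role": "tool", "tool_call_id": "crawl:1", "name": "crawl",         # result of tool_call_2
     "content": json.dumps({"content": "..."})},
    {"role": "assistant", "content": "Context Caching is ..."},          # final reply, finish_reason=stop
]

print([m["role"] for m in messages])
# ['system', 'user', 'assistant', 'tool', 'assistant', 'tool', 'tool', 'assistant']
```

Note that each role=tool message carries the tool_call_id of the call it answers, which is how the model matches results to requests when several tool_calls are in flight.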

This completes the entire process of making "online query" tool calls. If you have implemented your own search and crawl methods, when you ask Kimi to search online, it will call the search and crawl tools and give you the correct response based on the tool call results.

Common Questions and Notes

About Streaming Output

In streaming output mode (stream), tool_calls are still applicable, but there are some additional things to note, as follows:

  • During streaming output, since finish_reason will appear in the last data chunk, it is recommended to check if the delta.tool_calls field exists to determine if the current response includes a tool call;
  • During streaming output, delta.content will be output first, followed by delta.tool_calls, so you must wait until delta.content has finished outputting before you can determine and identify tool_calls;
  • During streaming output, we will specify the tool_call.id and tool_call.function.name in the initial data chunk, and only tool_call.function.arguments will be output in subsequent chunks;
  • During streaming output, if Kimi returns multiple tool_calls at once, we will use an additional field called index to indicate the index of the current tool_call, so that you can correctly concatenate the tool_call.function.arguments parameters. We use a code example from the streaming output section (without using the SDK) to illustrate how to do this:
import os
import json
import httpx  # We use the httpx library to make our HTTP requests
 
 
 
tools = [
    {
        "type": "function",  # The type field is fixed as "function"
        "function": {  # When type is "function", use the function field to define the specific function content
            "name": "search",  # The name of the function, please use English letters, numbers, hyphens, and underscores
            "description": """ 
				Search the internet for content using a search engine.
 
				When your knowledge cannot answer the user's question or the user requests an online search, call this tool. Extract the content the user wants to search from the conversation as the value of the query parameter.
				The search results include the title of the website, the website's address (URL), and a brief introduction to the website.
			""",  # Description of the function, explaining its specific role and usage scenarios to help the Kimi large language model choose the right functions
            "parameters": {  # Use the parameters field to define the parameters the function accepts
                "type": "object",  # Always use type: object to make the Kimi large language model generate a JSON Object parameter
                "required": ["query"],  # Use the required field to tell the Kimi large language model which parameters are mandatory
                "properties": {  # Specific parameter definitions in properties, you can define multiple parameters
                    "query": {  # Here, the key is the parameter name, and the value is the specific definition of the parameter
                        "type": "string",  # Use type to define the parameter type
                        "description": """
							The content the user wants to search for, extracted from the user's question or chat context.
						"""  # Use description to help the Kimi large language model generate parameters more effectively
                    }
                }
            }
        }
    },
]
 
header = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ.get('MOONSHOT_API_KEY')}",
}
 
data = {
    "model": "kimi-k2.5",
    "messages": [
        {"role": "user", "content": "Please search for Context Caching technology online."}
    ],
    "stream": True,
    "tools": tools,  # <-- Add tool invocation
}
# Use httpx to send a chat request to the Kimi large language model and get the response r
r = httpx.post("https://api.moonshot.ai/v1/chat/completions",
               headers=header,
               json=data)
if r.status_code != 200:
    raise Exception(r.text)
 
data: str
 
# Here, we pre-build a List to hold the assembled message for each choice; this request uses the default n=1, so a single element is enough
messages = [{}]
 
# Here, we use the iter_lines method to read the response body line by line
for line in r.iter_lines():
    # Remove leading and trailing spaces from each line to better handle data blocks
    line = line.strip()
 
    # Next, we need to handle three different cases:
    #   1. If the current line is empty, it indicates that the previous data block has been received (as mentioned earlier, data blocks are ended with two newline characters). We can deserialize the data block and print the corresponding content;
    #   2. If the current line is not empty and starts with data:, it indicates the start of a data block transmission. After removing the data: prefix, first check if it is the end marker [DONE]. If not, save the data content to the data variable;
    #   3. If the current line is not empty but does not start with data:, it means the current line still belongs to the previous data block being transmitted. Append the content of the current line to the end of the data variable;
 
    if len(line) == 0:
        if not data:
            continue  # Skip blank lines when no data block has been buffered yet
        chunk = json.loads(data)
 
        # Loop through all choices in each data block to get the message object corresponding to the index
        for choice in chunk["choices"]:
            index = choice["index"]
            message = messages[index]
            usage = choice.get("usage")
            if usage:
                message["usage"] = usage
            delta = choice["delta"]
            role = delta.get("role")
            if role:
                message["role"] = role
            content = delta.get("content")
            if content:
                if "content" not in message:
                    message["content"] = content
                else:
                    message["content"] = message["content"] + content
 
            # From here, we start processing tool_calls
            tool_calls = delta.get("tool_calls")  # <-- First, check if the data block contains tool_calls
            if tool_calls:
                if "tool_calls" not in message:
                    message["tool_calls"] = []  # <-- If it contains tool_calls, initialize a list to store these tool_calls. Note that the list is empty at this point, with a length of 0
                for tool_call in tool_calls:
                    tool_call_index = tool_call["index"]  # <-- Get the index of the current tool_call
                    if len(message["tool_calls"]) < (
                            tool_call_index + 1):  # <-- Expand the tool_calls list according to the index to access the corresponding tool_call via index
                        message["tool_calls"].extend([{}] * (tool_call_index + 1 - len(message["tool_calls"])))
                    tool_call_object = message["tool_calls"][tool_call_index]  # <-- Access the corresponding tool_call via index
                    tool_call_object["index"] = tool_call_index
 
                    # The following steps fill in the id, type, and function fields of each tool_call based on the information in the data block
                    # In the function field, there are name and arguments fields. The arguments field will be supplemented by each data block
                    # in the same way as the delta.content field.
 
                    tool_call_id = tool_call.get("id")
                    if tool_call_id:
                        tool_call_object["id"] = tool_call_id
                    tool_call_type = tool_call.get("type")
                    if tool_call_type:
                        tool_call_object["type"] = tool_call_type
                    tool_call_function = tool_call.get("function")
                    if tool_call_function:
                        if "function" not in tool_call_object:
                            tool_call_object["function"] = {}
                        tool_call_function_name = tool_call_function.get("name")
                        if tool_call_function_name:
                            tool_call_object["function"]["name"] = tool_call_function_name
                        tool_call_function_arguments = tool_call_function.get("arguments")
                        if tool_call_function_arguments:
                            if "arguments" not in tool_call_object["function"]:
                                tool_call_object["function"]["arguments"] = tool_call_function_arguments
                            else:
                                tool_call_object["function"]["arguments"] = tool_call_object["function"][
                                                                            "arguments"] + tool_call_function_arguments  # <-- Supplement the value of the function.arguments field sequentially
                    message["tool_calls"][tool_call_index] = tool_call_object
 
            data = ""  # Reset data
    elif line.startswith("data: "):
        data = line[len("data: "):]  # Remove the data: prefix (str.lstrip would strip a character set, not the prefix)
 
        # When the data block content is [DONE], it indicates that all data blocks have been sent and the network connection can be disconnected
        if data == "[DONE]":
            break
    else:
        data = data + "\n" + line  # When appending content, add a newline character because this might be intentional line breaks in the data block
 
# After assembling all messages, print their contents separately
for index, message in enumerate(messages):
    print("index:", index)
    print("message:", json.dumps(message, ensure_ascii=False))
    print("")

Below is an example of handling tool_calls in streaming output using the OpenAI SDK:

import os
import json
 
from openai import OpenAI
 
client = OpenAI(
    api_key=os.environ.get("MOONSHOT_API_KEY"),
    base_url="https://api.moonshot.ai/v1",
)
 
tools = [
    {
        "type": "function",  # The agreed-upon field type, currently supports function as a value
        "function": {  # When type is function, use the function field to define the specific function content
            "name": "search",  # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
            "description": """ 
				Search for content on the internet using a search engine.
 
				When your knowledge cannot answer the user's question, or the user requests you to perform an online search, call this tool. Please extract the content the user wants to search from the conversation with the user as the value of the query parameter.
				The search results include the title of the website, the website's address (URL), and the website's description.
			""",  # The introduction of the function, write the specific function here and its usage scenarios so that the Kimi large language model can correctly choose which functions to use
            "parameters": {  # Use the parameters field to define the parameters accepted by the function
                "type": "object",  # Fixed use type: object to make the Kimi large language model generate a JSON Object parameter
                "required": ["query"],  # Use the required field to tell the Kimi large language model which parameters are required
                "properties": {  # The properties are the specific parameter definitions, you can define multiple parameters
                    "query": {  # Here, the key is the parameter name, and the value is the specific definition of the parameter
                        "type": "string",  # Use type to define the parameter type
                        "description": """
							The content the user is searching for, please extract it from the user's question or chat context.
						"""  # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
                    }
                }
            }
        }
    },
]
 
completion = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {"role": "user", "content": "Please search for Context Caching technology online."}
    ],
    n=2,  # Request 2 choices; the code below initializes the messages list with 2 elements accordingly
    stream=True,
    tools=tools,  # <-- Add tool invocation
)
 
# Here, we pre-build a List to store different response messages, since we set n=2, we initialize the List with 2 elements
messages = [{}, {}]
 
for chunk in completion:
    # Loop through all the choices in each data chunk and get the message object corresponding to the index
    for choice in chunk.choices:
        index = choice.index
        message = messages[index]
        delta = choice.delta
        role = delta.role
        if role:
            message["role"] = role
        content = delta.content
        if content:
            if "content" not in message:
                message["content"] = content
            else:
                message["content"] = message["content"] + content
 
        # From here, we start processing tool_calls
        tool_calls = delta.tool_calls  # <-- First check if the data chunk contains tool_calls
        if tool_calls:
            if "tool_calls" not in message:
                message["tool_calls"] = []  # <-- If it contains tool_calls, we initialize a list to save these tool_calls, note that the list is empty at this time with a length of 0
            for tool_call in tool_calls:
                tool_call_index = tool_call.index  # <-- Get the index of the current tool_call
                if len(message["tool_calls"]) < (
                        tool_call_index + 1):  # <-- Expand the tool_calls list according to the index so that we can access the corresponding tool_call via the subscript
                    message["tool_calls"].extend([{}] * (tool_call_index + 1 - len(message["tool_calls"])))
                tool_call_object = message["tool_calls"][tool_call_index]  # <-- Access the corresponding tool_call via the subscript
                tool_call_object["index"] = tool_call_index
 
                # The following steps are to fill in the id, type, and function fields of each tool_call based on the information in the data chunk
                # In the function field, there are name and arguments fields, the arguments field will be supplemented by each data chunk
                # Sequentially, just like the delta.content field.
 
                tool_call_id = tool_call.id
                if tool_call_id:
                    tool_call_object["id"] = tool_call_id
                tool_call_type = tool_call.type
                if tool_call_type:
                    tool_call_object["type"] = tool_call_type
                tool_call_function = tool_call.function
                if tool_call_function:
                    if "function" not in tool_call_object:
                        tool_call_object["function"] = {}
                    tool_call_function_name = tool_call_function.name
                    if tool_call_function_name:
                        tool_call_object["function"]["name"] = tool_call_function_name
                    tool_call_function_arguments = tool_call_function.arguments
                    if tool_call_function_arguments:
                        if "arguments" not in tool_call_object["function"]:
                            tool_call_object["function"]["arguments"] = tool_call_function_arguments
                        else:
                            tool_call_object["function"]["arguments"] = tool_call_object["function"][
                                                                            "arguments"] + tool_call_function_arguments  # <-- Sequentially supplement the value of the function.arguments field
                message["tool_calls"][tool_call_index] = tool_call_object
 
# After assembling all messages, we print their contents separately
for index, message in enumerate(messages):
    print("index:", index)
    print("message:", json.dumps(message, ensure_ascii=False))
    print("")
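Once the stream ends, each accumulated tool_call carries its complete function.arguments as a JSON string, which can then be parsed and dispatched to a local implementation. Below is a minimal sketch; the `message` shape follows the loops above, while the `search` function and its return value are hypothetical stand-ins for a real tool:

```python
import json

# A message as assembled by the streaming loops above (values are illustrative)
message = {
    "role": "assistant",
    "tool_calls": [
        {
            "index": 0,
            "id": "call_0",
            "type": "function",
            "function": {
                "name": "search",
                # arguments fragments concatenate into one valid JSON string
                "arguments": '{"query": "Context Caching"}',
            },
        }
    ],
}

def search(query: str) -> str:
    """Hypothetical stand-in for a real search implementation."""
    return f"results for {query}"

tool_map = {"search": search}  # function name -> local implementation

# Dispatch each accumulated tool_call by name with its parsed arguments
for tool_call in message.get("tool_calls", []):
    name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"])
    result = tool_map[name](**arguments)
    print(tool_call["id"], result)
```

The result of each call would then be sent back as a role=tool message, as described in the sections below.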

About tool_calls and function_call

tool_calls is the successor to function_call. Since OpenAI has marked function_call and its related parameters (such as functions) as deprecated, our API will no longer support function_call; use tool_calls instead. Compared with function_call, tool_calls has the following advantages:

  • It supports parallel calls: the Kimi large language model can return multiple tool_calls at once, and you can execute these tool_calls concurrently in your code to reduce latency;
  • For tool_calls with no dependencies between them, the Kimi large language model also tends to issue them in parallel, which reduces token consumption compared with function_call's sequential calls;
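Executing parallel tool_calls concurrently is straightforward with a thread pool. Below is a minimal sketch; the `search` implementation, the `tool_map` lookup, and the sample tool_calls are all illustrative assumptions:

```python
import json
from concurrent.futures import ThreadPoolExecutor

def search(query: str) -> str:
    """Hypothetical tool implementation."""
    return f"results for {query}"

tool_map = {"search": search}  # function name -> local implementation (assumption)

tool_calls = [  # illustrative tool_calls as returned in one assistant message
    {"id": "call_0", "type": "function",
     "function": {"name": "search", "arguments": '{"query": "A"}'}},
    {"id": "call_1", "type": "function",
     "function": {"name": "search", "arguments": '{"query": "B"}'}},
]

def run(tool_call: dict) -> dict:
    fn = tool_map[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    # Each result becomes a role=tool message, tied back by tool_call_id
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "name": tool_call["function"]["name"],
        "content": fn(**args),
    }

# Execute independent tool_calls in parallel instead of one after another
with ThreadPoolExecutor() as pool:
    tool_messages = list(pool.map(run, tool_calls))
```

`pool.map` preserves input order, so the resulting role=tool messages line up with the original tool_calls.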

About content

When using tool_calls, you may notice that even when finish_reason=tool_calls, the message.content field is occasionally not empty. Typically, this content is the Kimi large language model explaining which tools it is about to call and why. This is useful: if a tool call takes a long time, or completing a round of chat requires several sequential tool calls, showing the user a descriptive sentence before calling the tool reduces the anxiety or dissatisfaction caused by waiting. Explaining which tools are being called and why also helps the user understand the whole tool call process and intervene in time (for example, if the user thinks the current tool selection is wrong, they can terminate the tool call, or correct the model's tool selection through a prompt in the next round of chat).

About Tokens

The content of the tools parameter also counts toward the total number of tokens. Please ensure that the combined token count of tools and messages does not exceed the model's context window size.

About Message Layout

In scenarios where tools are called, our messages are no longer laid out like this:

system: ...
user: ...
assistant: ...
user: ...
assistant: ...

Instead, they will look like this:

system: ...
user: ...
assistant: ...
tool: ...
tool: ...
assistant: ...

It is important to note that when the Kimi large language model generates tool_calls, you must ensure that each tool_call has a corresponding message with role=tool, and that each of these messages carries the correct tool_call_id. If the number of role=tool messages does not match the number of tool_calls, or if a tool_call_id in a role=tool message cannot be matched with a tool_call.id in tool_calls, an error will occur.
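The matching rule can be illustrated with a concrete messages list: every entry in the assistant message's tool_calls has exactly one role=tool message whose tool_call_id equals its id. The values below are illustrative:

```python
messages = [
    {"role": "system", "content": "You are Kimi."},
    {"role": "user", "content": "Please search for Context Caching technology online."},
    {   # the assistant message returned by the Kimi API, kept as is, including tool_calls
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"id": "call_0", "type": "function",
             "function": {"name": "search", "arguments": '{"query": "Context Caching"}'}},
        ],
    },
    {   # exactly one role=tool message per tool_call, matched by tool_call_id
        "role": "tool",
        "tool_call_id": "call_0",
        "name": "search",
        "content": "search results go here",
    },
]

# Sanity check: every tool_call.id has a matching role=tool message, and vice versa
called = {tc["id"] for m in messages if m["role"] == "assistant"
          for tc in m.get("tool_calls", [])}
answered = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
assert called == answered
```

Running a check like this before sending the next request is a cheap way to catch mismatches early.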

If You Encounter the tool_call_id not found Error

If you encounter the tool_call_id not found error, it may be because you did not add the role=assistant message returned by the Kimi API to the messages list. The correct message sequence should look like this:

system: ...
user: ...
assistant: ...  # <-- Perhaps you did not add this assistant message to the messages list
tool: ...
tool: ...
assistant: ...

You can avoid the tool_call_id not found error by executing messages.append(message) each time you receive a response from the Kimi API, adding the returned message to the messages list.

Note: The assistant message added to the messages list before the role=tool messages must include the complete tool_calls field and its values as returned by the Kimi API. We recommend adding the choice.message returned by the Kimi API to the messages list as is, to avoid potential errors.