Use Kimi API for Tool Calls
Tool calls (tool_calls) evolved from function calls (function_call). In certain contexts, or when reading compatibility code, you can treat tool_calls and function_call as the same thing; function_call is a subset of tool_calls.
What are Tool Calls?
Tool calls give the Kimi large language model the ability to perform specific actions. The Kimi large language model can engage in conversations and answer questions, which is its "talking" ability. Through tool calls, it also gains the ability to "do" things. With tool_calls, the Kimi large language model can help you search the internet, query databases, and even control smart home devices.
A tool call involves several steps:
- Define the tool using JSON Schema format;
- Submit the defined tools to the Kimi large language model via the tools parameter. You can submit multiple tools at once;
- The Kimi large language model will decide which tool(s) to use based on the context of the current conversation. It can also choose not to use any tools;
- The Kimi large language model will output the parameters and information needed to call the tool in JSON format;
- Use the parameters output by the Kimi large language model to execute the corresponding tool and submit the results back to the Kimi large language model;
- The Kimi large language model will respond to the user based on the results of the tool execution;
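The steps above can be sketched as a small driver loop. In the sketch below, stub_chat is a stand-in for a real call to the Kimi API (it fakes one search tool call followed by a final answer), and search is a hypothetical tool implementation; a complete, real version of this loop appears later in this document.

```python
import json

# Hypothetical tool implementation; a real application would call an actual search API here
def search(arguments):
    return {"result": f"results for {arguments['query']}"}

tool_map = {"search": search}

# Stand-in for client.chat.completions.create: it first requests a tool call,
# then, once a role=tool message is present in the context, it answers normally
def stub_chat(messages):
    if any(m["role"] == "tool" for m in messages):
        return {"finish_reason": "stop",
                "message": {"role": "assistant", "content": "Context Caching is ..."}}
    return {"finish_reason": "tool_calls",
            "message": {"role": "assistant", "content": "",
                        "tool_calls": [{"id": "search:0", "type": "function",
                                        "function": {"name": "search",
                                                     "arguments": '{"query": "Context Caching"}'}}]}}

messages = [{"role": "user", "content": "Please search for Context Caching."}]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
    choice = stub_chat(messages)
    finish_reason = choice["finish_reason"]
    message = choice["message"]
    messages.append(message)  # keep the assistant message (with its tool_calls) in the context
    if finish_reason == "tool_calls":
        for tool_call in message["tool_calls"]:
            arguments = json.loads(tool_call["function"]["arguments"])
            result = tool_map[tool_call["function"]["name"]](arguments)
            messages.append({"role": "tool",
                             "tool_call_id": tool_call["id"],
                             "name": tool_call["function"]["name"],
                             "content": json.dumps(result)})  # submit the tool result back

print(message["content"])
```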
Reading the above steps, you might wonder:
Why can't the Kimi large language model execute the tools itself? Why do we need to "help" the Kimi large language model execute the tools based on the parameters it generates? If we are the ones executing the tool calls, what is the role of the Kimi large language model?
We will use a practical example of a tool call to explain these questions to the reader.
Enable the Kimi Large Language Model to Access the Internet via tool_calls
The knowledge of the Kimi large language model comes from its training data. For questions that are time-sensitive, the Kimi large language model cannot find answers from its existing knowledge. In such cases, we want the Kimi large language model to search the internet for the latest information and answer our questions based on that information.
Define the Tools
Imagine how we find the information we want on the internet:
- We open a search engine, such as Baidu or Bing, and search for the content we want. We then browse the search results and decide which one to click based on the website title and description;
- We might open one or more web pages from the search results and browse them to obtain the knowledge we need;
Reviewing our actions, we "use a search engine to search" and "open the web pages corresponding to the search results." The tools we use are the "search engine" and the "web browser." Therefore, we need to abstract these actions into tools in JSON Schema format and submit them to the Kimi large language model, allowing it to use search engines and browse web pages just like humans do.
Before we proceed, let's briefly introduce the JSON Schema format:
JSON Schema is a vocabulary that you can use to annotate and validate JSON documents. In other words, a JSON Schema is itself a JSON document that describes the format of other JSON data.
We define the following JSON Schema:
{
    "type": "object",
    "properties": {
        "name": {
            "type": "string"
        }
    }
}

This JSON Schema defines a JSON Object that contains a field named name, whose type is string, for example:
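To see what such a schema means in practice, here is a deliberately tiny check that covers only the two keywords used above; real applications should use a full JSON Schema validator library rather than this toy.

```python
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"}
    }
}

# A minimal check covering only "type": "object" / "string" and "properties";
# the real JSON Schema specification supports many more keywords
def toy_validate(instance, schema):
    if schema["type"] == "object":
        if not isinstance(instance, dict):
            return False
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance and not toy_validate(instance[key], sub_schema):
                return False
        return True
    if schema["type"] == "string":
        return isinstance(instance, str)
    return False

print(toy_validate({"name": "Hei"}, schema))  # True: a string name matches
print(toy_validate({"name": 42}, schema))     # False: a number does not
```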
{
    "name": "Hei"
}

By describing our tool definitions using JSON Schema, we can make it clearer and more intuitive for the Kimi large language model to understand what parameters our tools require, as well as the type and description of each parameter. Now let's define the "search engine" and "web browser" tools mentioned earlier:
tools = [
{
"type": "function", # The agreed-upon field type, currently supports function as a value
"function": { # When type is function, use the function field to define the specific function content
"name": "search", # The name of the function. Please use English letters, numbers, hyphens, and underscores as the function name
"description": """
Search for content on the internet using a search engine.
When your knowledge cannot answer the user's question, or when the user requests an online search, call this tool. Extract the content the user wants to search for from the conversation and use it as the value of the query parameter.
The search results include the website title, address (URL), and description.
""", # A description of the function, detailing its specific role and usage scenarios, to help the Kimi large language model correctly select which functions to use
"parameters": { # Use the parameters field to define the parameters the function accepts
"type": "object", # Always use type: object to make the Kimi large language model generate a JSON Object parameter
"required": ["query"], # Use the required field to tell the Kimi large language model which parameters are mandatory
"properties": { # The properties field contains the specific parameter definitions; you can define multiple parameters
"query": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
"type": "string", # Use type to define the parameter type
"description": """
The content the user wants to search for, extracted from the user's question or conversation context.
""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
}
}
}
}
},
{
"type": "function", # The agreed-upon field type, currently supports function as a value
"function": { # When type is function, use the function field to define the specific function content
"name": "crawl", # The name of the function. Please use English letters, numbers, hyphens, and underscores as the function name
"description": """
Retrieve web page content based on the website address (URL).
""", # A description of the function, detailing its specific role and usage scenarios, to help the Kimi large language model correctly select which functions to use
"parameters": { # Use the parameters field to define the parameters the function accepts
"type": "object", # Always use type: object to make the Kimi large language model generate a JSON Object parameter
"required": ["url"], # Use the required field to tell the Kimi large language model which parameters are mandatory
"properties": { # The properties field contains the specific parameter definitions; you can define multiple parameters
"url": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
"type": "string", # Use type to define the parameter type
"description": """
The website address (URL) from which to retrieve content, usually obtained from search results.
""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
}
}
}
}
}
]

When defining tools using JSON Schema, we use the following fixed format:
{
    "type": "function",
    "function": {
        "name": "NAME",
        "description": "DESCRIPTION",
        "parameters": {
            "type": "object",
            "properties": {
            }
        }
    }
}

Here, name, description, and parameters.properties are defined by the tool provider. The description explains what the tool does and when to use it, while parameters describes the specific parameters needed to call the tool successfully, including their types and descriptions. Based on this JSON Schema, the Kimi large language model ultimately generates a JSON Object that meets the defined requirements, which serves as the parameters (arguments) of the tool call.
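As a concrete example, suppose the model generates the following arguments string for the search tool defined earlier (the string itself is made up for illustration); deserializing it yields a JSON Object matching the schema:

```python
import json

# A hypothetical arguments string, as the model might generate it for the search tool
arguments = '{\n    "query": "Context Caching"\n}'

parsed = json.loads(arguments)
assert isinstance(parsed, dict)          # "type": "object" in the schema
assert "query" in parsed                 # "required": ["query"]
assert isinstance(parsed["query"], str)  # "type": "string" for the query parameter
print(parsed["query"])
```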
Register Tools
Let's try submitting the search tool to the Kimi large language model to see if it can correctly call the tool:
from openai import OpenAI
client = OpenAI(
api_key="MOONSHOT_API_KEY", # Replace MOONSHOT_API_KEY with the API Key you obtained from the Kimi Open Platform
base_url="https://api.moonshot.ai/v1",
)
tools = [
{
"type": "function", # The field "type" is a convention, currently supporting "function" as its value
"function": { # When "type" is "function", use the "function" field to define the specific function content
"name": "search", # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
"description": """
Search for content on the internet using a search engine.
When your knowledge cannot answer the user's question, or when the user requests an online search, call this tool. Extract the content the user wants to search from the conversation as the value of the query parameter.
The search results include the website title, website address (URL), and website description.
""", # Description of the function, write the specific function and usage scenarios here so that the Kimi large language model can correctly choose which functions to use
"parameters": { # Use the "parameters" field to define the parameters accepted by the function
"type": "object", # Always use "type": "object" to make the Kimi large language model generate a JSON Object parameter
"required": ["query"], # Use the "required" field to tell the Kimi large language model which parameters are required
"properties": { # The specific parameter definitions are in "properties", you can define multiple parameters
"query": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
"type": "string", # Use "type" to define the parameter type
"description": """
The content the user wants to search for, extract it from the user's question or chat context.
""" # Use "description" to describe the parameter so that the Kimi large language model can better generate the parameter
}
}
}
}
},
    # {
    #     "type": "function",  # The field "type" is a convention, currently supporting "function" as its value
    #     "function": {  # When "type" is "function", use the "function" field to define the specific function content
    #         "name": "crawl",  # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
    #         "description": """
    #             Get the content of a webpage based on the website address (URL).
    #         """,  # Description of the function, write the specific function and usage scenarios here so that the Kimi large language model can correctly choose which functions to use
    #         "parameters": {  # Use the "parameters" field to define the parameters accepted by the function
    #             "type": "object",  # Always use "type": "object" to make the Kimi large language model generate a JSON Object parameter
    #             "required": ["url"],  # Use the "required" field to tell the Kimi large language model which parameters are required
    #             "properties": {  # The specific parameter definitions are in "properties", you can define multiple parameters
    #                 "url": {  # Here, the key is the parameter name, and the value is the specific definition of the parameter
    #                     "type": "string",  # Use "type" to define the parameter type
    #                     "description": """
    #                         The website address (URL) of the content to be obtained, which can usually be obtained from the search results.
    #                     """  # Use "description" to describe the parameter so that the Kimi large language model can better generate the parameter
    #                 }
    #             }
    #         }
    #     }
    # }
]
completion = client.chat.completions.create(
model="kimi-k2.5",
messages=[
{"role": "system", "content": "You are Kimi, an AI assistant provided by Moonshot AI. You are proficient in Chinese and English conversations. You provide users with safe, helpful, and accurate answers. You refuse to answer any questions related to terrorism, racism, or explicit content. Moonshot AI is a proper noun and should not be translated."},
{"role": "user", "content": "Please search the internet for 'Context Caching' and tell me what it is."} # In the question, we ask Kimi large language model to search online
],
tools=tools, # <-- We pass the defined tools to Kimi large language model via the tools parameter
)
print(completion.choices[0].model_dump_json(indent=4))

When the above code runs successfully, we get the response from the Kimi large language model:
{
    "finish_reason": "tool_calls",
    "message": {
        "content": "",
        "role": "assistant",
        "tool_calls": [
            {
                "id": "search:0",
                "function": {
                    "arguments": "{\n \"query\": \"Context Caching\"\n}",
                    "name": "search"
                },
                "type": "function"
            }
        ]
    }
}

Notice that in this response, the value of finish_reason is tool_calls, which means that this response is not the Kimi large language model's answer, but rather the tool calls that the Kimi large language model has chosen to execute. You can determine whether the current response from the Kimi large language model is a tool call by checking whether the value of finish_reason is tool_calls.
In the message section, the content field is empty because the model is currently executing tool_calls and has not yet generated a response for the user. Meanwhile, a new field tool_calls has been added. The tool_calls field is a list that contains all the tool call information for this execution. This also indicates another characteristic of tool_calls: the model can choose to call multiple tools at once, which can be different tools or the same tool with different parameters. Each element in tool_calls represents a tool call. Kimi large language model generates a unique id for each tool call. The function.name field indicates the name of the function being executed, and the parameters are placed in function.arguments. The arguments parameter is a valid serialized JSON Object (additionally, the type parameter is currently a fixed value function).
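To make this concrete, the response above can be processed as plain data like this (with the openai SDK, the same fields are read as attributes, e.g. choice.message.tool_calls):

```python
import json

# The response from above, represented as a plain dict
response = {
    "finish_reason": "tool_calls",
    "message": {
        "content": "",
        "role": "assistant",
        "tool_calls": [
            {"id": "search:0", "type": "function",
             "function": {"name": "search",
                          "arguments": "{\n \"query\": \"Context Caching\"\n}"}}
        ],
    },
}

if response["finish_reason"] == "tool_calls":  # this response is a tool call, not an answer
    for tool_call in response["message"]["tool_calls"]:
        name = tool_call["function"]["name"]
        arguments = json.loads(tool_call["function"]["arguments"])  # arguments is a serialized JSON Object
        print(name, arguments)
```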
Next, we should use the tool call parameters generated by Kimi large language model to execute the specific tools.
Execute the Tools
The Kimi large language model does not execute tools for us. After receiving the tool call parameters generated by the Kimi large language model, we need to execute the corresponding tools ourselves. Before explaining how to execute the tools, let's first address the question we raised earlier:
Why can't Kimi large language model execute the tools itself, but instead requires us to "help" it execute the tools based on the parameters generated by Kimi large language model? If we are the ones executing the tool calls, what is the purpose of Kimi large language model?
Let's imagine a scenario where we use Kimi large language model: we provide users with a smart robot based on Kimi large language model. In this scenario, there are three roles: the user, the robot, and Kimi large language model. The user asks the robot a question, the robot calls the Kimi large language model API, and returns the API result to the user. When using tool_calls, the user asks the robot a question, the robot calls the Kimi API with tools, Kimi large language model returns the tool_calls parameters, the robot executes the tool_calls, submits the results back to the Kimi API, Kimi large language model generates the message to be returned to the user (finish_reason=stop), and only then does the robot return the message to the user. Throughout this process, the entire tool_calls process is transparent and implicit to the user.
Returning to the question above, as users, we are not actually executing the tool calls, nor do we directly "see" the tool calls. Instead, the robot that provides us with the service is completing the tool calls and presenting us with the final response generated by Kimi large language model.
Let's explain how to execute the tool_calls returned by Kimi large language model from the perspective of the "robot":
from typing import *

import json
import httpx  # <-- httpx is used below by search_impl and crawl_impl to make HTTP requests
from openai import OpenAI
client = OpenAI(
api_key="MOONSHOT_API_KEY", # Replace MOONSHOT_API_KEY with the API Key you obtained from the Kimi Open Platform
base_url="https://api.moonshot.ai/v1",
)
tools = [
{
"type": "function", # The field type is agreed upon, and currently supports function as a value
"function": { # When type is function, use the function field to define the specific function content
"name": "search", # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
"description": """
Search for content on the internet using a search engine.
When your knowledge cannot answer the user's question, or the user requests you to perform an online search, call this tool. Extract the content the user wants to search from the conversation as the value of the query parameter.
The search results include the title of the website, the website address (URL), and a brief introduction to the website.
""", # Introduction to the function, write the specific function here, as well as the usage scenario, so that the Kimi large language model can correctly choose which functions to use
"parameters": { # Use the parameters field to define the parameters accepted by the function
"type": "object", # Fixed use type: object to make the Kimi large language model generate a JSON Object parameter
"required": ["query"], # Use the required field to tell the Kimi large language model which parameters are required
"properties": { # The specific parameter definitions are in properties, and you can define multiple parameters
"query": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
"type": "string", # Use type to define the parameter type
"description": """
The content the user wants to search for, extracted from the user's question or chat context.
""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
}
}
}
}
},
{
"type": "function", # The field type is agreed upon, and currently supports function as a value
"function": { # When type is function, use the function field to define the specific function content
"name": "crawl", # The name of the function, please use English letters, numbers, plus hyphens and underscores as the function name
"description": """
Get the content of a webpage based on the website address (URL).
""", # Introduction to the function, write the specific function here, as well as the usage scenario, so that the Kimi large language model can correctly choose which functions to use
"parameters": { # Use the parameters field to define the parameters accepted by the function
"type": "object", # Fixed use type: object to make the Kimi large language model generate a JSON Object parameter
"required": ["url"], # Use the required field to tell the Kimi large language model which parameters are required
"properties": { # The specific parameter definitions are in properties, and you can define multiple parameters
"url": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
"type": "string", # Use type to define the parameter type
"description": """
The website address (URL) of the content to be obtained, which can usually be obtained from the search results.
""" # Use description to describe the parameter so that the Kimi large language model can better generate the parameter
}
}
}
}
}
]
def search_impl(query: str) -> List[Dict[str, Any]]:
"""
search_impl uses a search engine to search for query. Most mainstream search engines (such as Bing) provide API calls. You can choose
your preferred search engine API and place the website title, link, and brief introduction information from the return results in a dict to return.
This is just a simple example, and you may need to write some authentication, validation, and parsing code.
"""
r = httpx.get("https://your.search.api", params={"query": query})
return r.json()
def search(arguments: Dict[str, Any]) -> Any:
query = arguments["query"]
result = search_impl(query)
return {"result": result}
def crawl_impl(url: str) -> str:
    """
    crawl_impl gets the content of a webpage based on the url.

    This is just a simple example. In actual web scraping, you may need to write more code to handle complex situations,
    such as asynchronously loaded data; and after obtaining the webpage content, you can clean it according to your needs,
    such as retaining only the text or removing unnecessary content (such as advertisements).
    """
    r = httpx.get(url)
    return r.text

def crawl(arguments: Dict[str, Any]) -> Dict[str, str]:
    url = arguments["url"]
    content = crawl_impl(url)
    return {"content": content}
# Map each tool name and its corresponding function through tool_map so that when the Kimi large language model returns tool_calls, we can quickly find the function to execute
tool_map = {
"search": search,
"crawl": crawl,
}
messages = [
{"role": "system",
"content": "You are Kimi, an artificial intelligence assistant provided by Moonshot AI. You are better at conversing in Chinese and English. You provide users with safe, helpful, and accurate answers. At the same time, you will refuse to answer any questions involving terrorism, racial discrimination, pornography, and violence. Moonshot AI is a proper noun and should not be translated into other languages."},
{"role": "user", "content": "Please search for Context Caching online and tell me what it is."} # Request Kimi large language model to perform an online search in the question
]
finish_reason = None
# Our basic process is to ask the Kimi large language model questions with the user's question and tools. If the Kimi large language model returns finish_reason: tool_calls, we execute the corresponding tool_calls,
# and submit the execution results in the form of a message with role=tool back to the Kimi large language model. The Kimi large language model then generates the next content based on the tool_calls results:
#
# 1. If the Kimi large language model believes that the current tool call results can answer the user's question, it returns finish_reason: stop, and we exit the loop and print out message.content;
# 2. If the Kimi large language model believes that the current tool call results cannot answer the user's question and needs to call the tool again, we continue to execute the next tool_calls in the loop until finish_reason is no longer tool_calls;
#
# During this process, we only return the result to the user when finish_reason is stop.
while finish_reason is None or finish_reason == "tool_calls":
completion = client.chat.completions.create(
model="kimi-k2.5",
messages=messages,
tools=tools, # <-- We submit the defined tools to the Kimi large language model through the tools parameter
)
choice = completion.choices[0]
finish_reason = choice.finish_reason
if finish_reason == "tool_calls": # <-- Determine whether the current return content contains tool_calls
messages.append(choice.message) # <-- We add the assistant message returned to us by the Kimi large language model to the context so that the Kimi large language model can understand our request next time
for tool_call in choice.message.tool_calls: # <-- tool_calls may be multiple, so we use a loop to execute them one by one
tool_call_name = tool_call.function.name
tool_call_arguments = json.loads(tool_call.function.arguments) # <-- arguments is a serialized JSON Object, and we need to deserialize it with json.loads
tool_function = tool_map[tool_call_name] # <-- Quickly find which function to execute through tool_map
tool_result = tool_function(tool_call_arguments)
# Construct a message with role=tool using the function execution result to show the result of the tool call to the model;
# Note that we need to provide the tool_call_id and name fields in the message so that the Kimi large language model
# can correctly match the corresponding tool_call.
messages.append({
"role": "tool",
"tool_call_id": tool_call.id,
"name": tool_call_name,
"content": json.dumps(tool_result), # <-- We agree to submit the tool call result to the Kimi large language model in string format, so we use json.dumps to serialize the execution result into a string here
})
print(choice.message.content)  # <-- Here, we return the reply generated by the model to the user

We use a while loop to execute the code logic that includes tool calls because the Kimi large language model typically doesn't make just one tool call, especially in the context of online searching. Usually, Kimi will first call the search tool to get search results, and then call the crawl tool to convert the URLs in the search results into actual web page content. The overall structure of the messages is as follows:
system: prompt # System prompt
user: prompt # User's question
assistant: tool_call(name=search, arguments={query: query}) # Kimi returns a tool_call (single)
tool: search_result(tool_call_id=tool_call.id, name=search) # Submit the tool_call execution result
assistant: tool_call_1(name=crawl, arguments={url: url_1}), tool_call_2(name=crawl, arguments={url: url_2}) # Kimi continues to return tool_calls (multiple)
tool: crawl_content(tool_call_id=tool_call_1.id, name=crawl) # Submit the execution result of tool_call_1
tool: crawl_content(tool_call_id=tool_call_2.id, name=crawl) # Submit the execution result of tool_call_2
assistant: message_content(finish_reason=stop) # Kimi generates a reply to the user, ending the conversation

This completes the entire process of making "online query" tool calls. If you have implemented your own search and crawl methods, then when you ask Kimi to search online, it will call the search and crawl tools and give you the correct response based on the tool call results.
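A practical consequence of this message structure is that every tool_call id in an assistant message must be matched by a role=tool message carrying the same tool_call_id before the next request. The helper below is our own sanity check for a message list, not part of the Kimi API:

```python
# Check that every tool_call id in assistant messages has a matching role=tool reply.
# This is our own sanity check, not part of the Kimi API.
def unanswered_tool_calls(messages):
    requested = [tc["id"] for m in messages if m["role"] == "assistant"
                 for tc in m.get("tool_calls", [])]
    answered = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
    return [tc_id for tc_id in requested if tc_id not in answered]

messages = [
    {"role": "user", "content": "Please search for Context Caching online."},
    {"role": "assistant", "content": "",
     "tool_calls": [{"id": "crawl:0", "type": "function",
                     "function": {"name": "crawl", "arguments": '{"url": "https://example.com/a"}'}},
                    {"id": "crawl:1", "type": "function",
                     "function": {"name": "crawl", "arguments": '{"url": "https://example.com/b"}'}}]},
    {"role": "tool", "tool_call_id": "crawl:0", "name": "crawl", "content": "..."},
]

print(unanswered_tool_calls(messages))  # crawl:1 has not been answered yet
```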
Common Questions and Notes
About Streaming Output
In streaming output mode (stream), tool_calls are still applicable, but there are some additional things to note, as follows:
- During streaming output, since finish_reason only appears in the last data chunk, it is recommended to check whether the delta.tool_calls field exists to determine if the current response includes a tool call;
- During streaming output, delta.content is output first, followed by delta.tool_calls, so you must wait until delta.content has finished outputting before you can detect and identify tool_calls;
- During streaming output, we provide tool_call.id and tool_call.function.name in the initial data chunk; subsequent data chunks only output tool_call.function.arguments;
- During streaming output, if Kimi returns multiple tool_calls at once, we use an additional index field to indicate the index of the current tool_call, so that you can correctly concatenate the tool_call.function.arguments parameters.

We use a code example from the streaming output section (without using the SDK) to illustrate how to do this:
import os
import json
import httpx # We use the httpx library to make our HTTP requests
tools = [
{
"type": "function", # The type field is fixed as "function"
"function": { # When type is "function", use the function field to define the specific function content
"name": "search", # The name of the function, please use English letters, numbers, hyphens, and underscores
"description": """
Search the internet for content using a search engine.
When your knowledge cannot answer the user's question or the user requests an online search, call this tool. Extract the content the user wants to search from the conversation as the value of the query parameter.
The search results include the title of the website, the website's address (URL), and a brief introduction to the website.
""", # Description of the function, explaining its specific role and usage scenarios to help the Kimi large language model choose the right functions
"parameters": { # Use the parameters field to define the parameters the function accepts
"type": "object", # Always use type: object to make the Kimi large language model generate a JSON Object parameter
"required": ["query"], # Use the required field to tell the Kimi large language model which parameters are mandatory
"properties": { # Specific parameter definitions in properties, you can define multiple parameters
"query": { # Here, the key is the parameter name, and the value is the specific definition of the parameter
"type": "string", # Use type to define the parameter type
"description": """
The content the user wants to search for, extracted from the user's question or chat context.
""" # Use description to help the Kimi large language model generate parameters more effectively
}
}
}
}
},
]
header = {
"Content-Type": "application/json",
"Authorization": f"Bearer {os.environ.get('MOONSHOT_API_KEY')}",
}
data = {
"model": "kimi-k2.5",
"messages": [
{"role": "user", "content": "Please search for Context Caching technology online."}
],
"stream": True,
"tools": tools, # <-- Add tool invocation
}
# Use httpx to send a chat request to the Kimi large language model and get the response r
r = httpx.post("https://api.moonshot.ai/v1/chat/completions",
headers=header,
json=data)
if r.status_code != 200:
raise Exception(r.text)
data: str
# Here, we pre-build a List to store the response messages. Size it to match the n parameter of your request;
# this example uses the default n=1, so only messages[0] will actually be filled
messages = [{}, {}]
# Here, we use the iter_lines method to read the response body line by line
for line in r.iter_lines():
# Remove leading and trailing spaces from each line to better handle data blocks
line = line.strip()
# Next, we need to handle three different cases:
# 1. If the current line is empty, it indicates that the previous data block has been received (as mentioned earlier, data blocks are ended with two newline characters). We can deserialize the data block and print the corresponding content;
# 2. If the current line is not empty and starts with data:, it indicates the start of a data block transmission. After removing the data: prefix, first check if it is the end marker [DONE]. If not, save the data content to the data variable;
# 3. If the current line is not empty but does not start with data:, it means the current line still belongs to the previous data block being transmitted. Append the content of the current line to the end of the data variable;
if len(line) == 0:
chunk = json.loads(data)
# Loop through all choices in each data block to get the message object corresponding to the index
for choice in chunk["choices"]:
index = choice["index"]
message = messages[index]
            usage = choice.get("usage")
            if usage:
                message["usage"] = usage
            delta = choice["delta"]
            role = delta.get("role")
            if role:
                message["role"] = role
            content = delta.get("content")
            if content:
                if "content" not in message:
                    message["content"] = content
                else:
                    message["content"] = message["content"] + content
            # From here, we start processing tool_calls
            tool_calls = delta.get("tool_calls")  # <-- First, check whether the data block contains tool_calls
            if tool_calls:
                if "tool_calls" not in message:
                    message["tool_calls"] = []  # <-- If it does, initialize a list to store these tool_calls; note that the list is empty at this point, with a length of 0
                for tool_call in tool_calls:
                    tool_call_index = tool_call["index"]  # <-- Get the index of the current tool_call
                    if len(message["tool_calls"]) < (tool_call_index + 1):  # <-- Expand the tool_calls list according to the index so the corresponding tool_call can be accessed by index
                        message["tool_calls"].extend([{}] * (tool_call_index + 1 - len(message["tool_calls"])))
                    tool_call_object = message["tool_calls"][tool_call_index]  # <-- Access the corresponding tool_call via its index
                    tool_call_object["index"] = tool_call_index
                    # The following steps fill in the id, type, and function fields of each tool_call based on the information in the data block.
                    # Inside the function field there are name and arguments fields; the arguments field is supplemented by each
                    # data block sequentially, in the same way as the delta.content field.
                    tool_call_id = tool_call.get("id")
                    if tool_call_id:
                        tool_call_object["id"] = tool_call_id
                    tool_call_type = tool_call.get("type")
                    if tool_call_type:
                        tool_call_object["type"] = tool_call_type
                    tool_call_function = tool_call.get("function")
                    if tool_call_function:
                        if "function" not in tool_call_object:
                            tool_call_object["function"] = {}
                        tool_call_function_name = tool_call_function.get("name")
                        if tool_call_function_name:
                            tool_call_object["function"]["name"] = tool_call_function_name
                        tool_call_function_arguments = tool_call_function.get("arguments")
                        if tool_call_function_arguments:
                            if "arguments" not in tool_call_object["function"]:
                                tool_call_object["function"]["arguments"] = tool_call_function_arguments
                            else:
                                tool_call_object["function"]["arguments"] = tool_call_object["function"]["arguments"] + tool_call_function_arguments  # <-- Supplement the value of the function.arguments field sequentially
                    message["tool_calls"][tool_call_index] = tool_call_object
        data = ""  # Reset data
    elif line.startswith("data: "):
        data = line[len("data: "):]  # Note: slice off the prefix; str.lstrip would strip a character set, not a prefix
        # When the data block content is [DONE], all data blocks have been sent and the network connection can be disconnected
        if data == "[DONE]":
            break
    else:
        data = data + "\n" + line  # When appending content, add a newline character, because line breaks in the data block may be intentional

# After assembling all messages, print their contents separately
for index, message in enumerate(messages):
    print("index:", index)
    print("message:", json.dumps(message, ensure_ascii=False))
    print("")

Below is an example of handling tool_calls in streaming output using the openai SDK:
import os
import json

from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("MOONSHOT_API_KEY"),
    base_url="https://api.moonshot.ai/v1",
)

tools = [
    {
        "type": "function",  # The agreed-upon field type; currently only function is supported as a value
        "function": {  # When type is function, use the function field to define the specific function content
            "name": "search",  # The name of the function; please use only English letters, numbers, hyphens, and underscores
            "description": """
                Search for content on the internet using a search engine.
                When your knowledge cannot answer the user's question, or the user requests an online search, call this tool. Extract the content the user wants to search for from the conversation as the value of the query parameter.
                The search results include the title of the website, the website's address (URL), and the website's description.
            """,  # The description of the function; state what it does and when to use it so the Kimi large language model can correctly choose which functions to use
            "parameters": {  # Use the parameters field to define the parameters accepted by the function
                "type": "object",  # Always use type: object so the Kimi large language model generates a JSON Object as the arguments
                "required": ["query"],  # Use the required field to tell the Kimi large language model which parameters are required
                "properties": {  # properties contains the specific parameter definitions; you can define multiple parameters
                    "query": {  # The key is the parameter name; the value is the parameter's definition
                        "type": "string",  # Use type to define the parameter type
                        "description": """
                            The content the user is searching for, extracted from the user's question or the chat context.
                        """  # Use description to describe the parameter so the Kimi large language model can better generate it
                    }
                }
            }
        }
    },
]

completion = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {"role": "user", "content": "Please search for Context Caching technology online."}
    ],
    n=2,  # <-- Request two choices, matching the pre-built messages list below
    stream=True,
    tools=tools,  # <-- Add tool invocation
)

# Here, we pre-build a List to store the different response messages; since we set n=2, we initialize the List with 2 elements
messages = [{}, {}]

for chunk in completion:
    # Loop through all the choices in each data chunk and get the message object corresponding to the index
    for choice in chunk.choices:
        index = choice.index
        message = messages[index]
        delta = choice.delta
        role = delta.role
        if role:
            message["role"] = role
        content = delta.content
        if content:
            if "content" not in message:
                message["content"] = content
            else:
                message["content"] = message["content"] + content
        # From here, we start processing tool_calls
        tool_calls = delta.tool_calls  # <-- First, check whether the data chunk contains tool_calls
        if tool_calls:
            if "tool_calls" not in message:
                message["tool_calls"] = []  # <-- If it does, initialize a list to store these tool_calls; note that the list is empty at this point, with a length of 0
            for tool_call in tool_calls:
                tool_call_index = tool_call.index  # <-- Get the index of the current tool_call
                if len(message["tool_calls"]) < (tool_call_index + 1):  # <-- Expand the tool_calls list according to the index so the corresponding tool_call can be accessed by index
                    message["tool_calls"].extend([{}] * (tool_call_index + 1 - len(message["tool_calls"])))
                tool_call_object = message["tool_calls"][tool_call_index]  # <-- Access the corresponding tool_call via its index
                tool_call_object["index"] = tool_call_index
                # The following steps fill in the id, type, and function fields of each tool_call based on the information in the data chunk.
                # Inside the function field there are name and arguments fields; the arguments field is supplemented by each
                # data chunk sequentially, in the same way as the delta.content field.
                tool_call_id = tool_call.id
                if tool_call_id:
                    tool_call_object["id"] = tool_call_id
                tool_call_type = tool_call.type
                if tool_call_type:
                    tool_call_object["type"] = tool_call_type
                tool_call_function = tool_call.function
                if tool_call_function:
                    if "function" not in tool_call_object:
                        tool_call_object["function"] = {}
                    tool_call_function_name = tool_call_function.name
                    if tool_call_function_name:
                        tool_call_object["function"]["name"] = tool_call_function_name
                    tool_call_function_arguments = tool_call_function.arguments
                    if tool_call_function_arguments:
                        if "arguments" not in tool_call_object["function"]:
                            tool_call_object["function"]["arguments"] = tool_call_function_arguments
                        else:
                            tool_call_object["function"]["arguments"] = tool_call_object["function"]["arguments"] + tool_call_function_arguments  # <-- Supplement the value of the function.arguments field sequentially
                message["tool_calls"][tool_call_index] = tool_call_object

# After assembling all messages, print their contents separately
for index, message in enumerate(messages):
    print("index:", index)
    print("message:", json.dumps(message, ensure_ascii=False))
    print("")

About tool_calls and function_call
tool_calls is an upgraded version of function_call. Since OpenAI has marked function_call and its related parameters (such as functions) as "deprecated", our API no longer supports function_call; use tool_calls instead. Compared to function_call, tool_calls has the following advantages:

- It supports parallel calls. The Kimi large language model can return multiple tool_calls at once, and you can use concurrency in your code to execute these tool_calls simultaneously, reducing time consumption;
- For tool_calls that have no dependencies on each other, the Kimi large language model also tends to call them in parallel. Compared with the sequential calls of function_call, this reduces token consumption to some extent;
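As a sketch of how parallel tool_calls might be executed concurrently: the run_tool dispatcher, the placeholder search result, and the literal tool_call objects below are all hypothetical illustrations, not part of the API itself.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def run_tool(tool_call):
    # Hypothetical dispatcher: parse the model-generated JSON arguments and
    # produce the role=tool message for this tool_call.
    arguments = json.loads(tool_call["function"]["arguments"])
    result = {"query": arguments["query"], "results": []}  # placeholder for a real search
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "name": tool_call["function"]["name"],
        "content": json.dumps(result),
    }


# Two independent tool_calls returned in one assistant message (shape illustrative)
tool_calls = [
    {"id": "call_1", "type": "function",
     "function": {"name": "search", "arguments": '{"query": "Context Caching"}'}},
    {"id": "call_2", "type": "function",
     "function": {"name": "search", "arguments": '{"query": "tool_calls"}'}},
]

# Execute both tool_calls concurrently instead of one after another;
# pool.map preserves the input order, so the tool messages stay aligned
with ThreadPoolExecutor(max_workers=4) as pool:
    tool_messages = list(pool.map(run_tool, tool_calls))

for m in tool_messages:
    print(m["tool_call_id"], m["name"])
```

Each resulting message can then be appended to the conversation after the assistant message that produced the tool_calls.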
About content
When using tool_calls, you may notice that even when finish_reason=tool_calls, the message.content field is occasionally not empty. Typically, this content is the Kimi large language model explaining which tools it is about to call and why. This is useful: if your tool call process takes a long time, or if completing a round of chat requires multiple sequential tool calls, showing this sentence to the user before the tools run can reduce the anxiety or dissatisfaction users may feel while waiting. Explaining which tools are being called and why also helps users understand the tool call process and intervene in time (for example, if a user thinks the current tool selection is incorrect, they can terminate the tool call, or correct the model's tool selection in the next round of chat through a prompt).
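A minimal sketch of surfacing that explanatory content to the user before the tools run; the choice here is a plain dict shaped like a non-streaming response choice, and the "[status]" prefix is an arbitrary illustration:

```python
def handle_choice(choice):
    """Show the model's explanation (if any) before running its tool_calls."""
    message = choice["message"]
    if choice["finish_reason"] == "tool_calls":
        if message.get("content"):
            # Display the explanation while the (potentially slow) tools execute
            print("[status]", message["content"])
        return [tc["function"]["name"] for tc in message["tool_calls"]]
    return []


# Illustrative choice object with finish_reason=tool_calls and non-empty content
choice = {
    "finish_reason": "tool_calls",
    "message": {
        "role": "assistant",
        "content": "I will search the web for the latest information first.",
        "tool_calls": [
            {"id": "call_1", "type": "function",
             "function": {"name": "search", "arguments": '{"query": "Context Caching"}'}},
        ],
    },
}

print(handle_choice(choice))
```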
About Tokens
The content of the tools parameter also counts toward the total number of Tokens. Please ensure that the combined Tokens of tools and messages do not exceed the model's context window size.
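Because the tools definition is sent as text, serializing it gives a sense of its size. The chars-per-token divisor below is a crude assumption for illustration only, not the model's real tokenizer:

```python
import json

# A small tools definition (same shape as in the examples above)
tools = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search the internet for content the user asks about.",
            "parameters": {
                "type": "object",
                "required": ["query"],
                "properties": {"query": {"type": "string"}},
            },
        },
    },
]

# The serialized tools definition is counted toward the context window, just
# like the messages. Dividing characters by 4 is only a rough order-of-magnitude
# heuristic; use the provider's tokenizer for an accurate count.
serialized = json.dumps(tools, ensure_ascii=False)
estimated_tokens = len(serialized) // 4
print(len(serialized), estimated_tokens)
```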
About Message Layout
In scenarios where tools are called, our messages are no longer laid out like this:

system: ...
user: ...
assistant: ...
user: ...
assistant: ...

Instead, they will look like this:
system: ...
user: ...
assistant: ...
tool: ...
tool: ...
assistant: ...

It is important to note that when the Kimi large language model generates tool_calls, you must ensure that each tool_call has a corresponding message with role=tool, and that this message carries the correct tool_call_id. If the number of role=tool messages does not match the number of tool_calls, or if the tool_call_id in a role=tool message cannot be matched to a tool_call.id in tool_calls, an error will occur.
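The pairing rule above can be checked with a small helper; the message shapes here are illustrative dicts, and the helper itself is not part of the API:

```python
def check_tool_messages(messages):
    """Verify that every tool_call in an assistant message is answered by a
    role=tool message carrying the matching tool_call_id."""
    expected = set()
    for m in messages:
        if m.get("role") == "assistant":
            for tc in m.get("tool_calls") or []:
                expected.add(tc["id"])  # each tool_call must be answered
        elif m.get("role") == "tool":
            expected.discard(m["tool_call_id"])  # this tool_call was answered
    return not expected  # True when every tool_call has its role=tool message


messages = [
    {"role": "user", "content": "Please search for Context Caching."},
    {"role": "assistant", "content": "", "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "search", "arguments": '{"query": "Context Caching"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "name": "search", "content": "..."},
]

print(check_tool_messages(messages))        # every tool_call answered
print(check_tool_messages(messages[:-1]))   # the tool message is missing
```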
If You Encounter the tool_call_id not found Error
If you encounter the tool_call_id not found error, it may be because you did not add the role=assistant message returned by the Kimi API to the messages list. The correct message sequence should look like this:
system: ...
user: ...
assistant: ...  # <-- Perhaps you did not add this assistant message to the messages list
tool: ...
tool: ...
assistant: ...

You can avoid the tool_call_id not found error by executing messages.append(message) each time you receive a return value from the Kimi API, adding the message returned by the Kimi API to the messages list.
Note: Assistant messages added to the messages list before the role=tool message must fully include the tool_calls field and its values returned by the Kimi API. We recommend directly adding the choice.message returned by the Kimi API to the messages list "as is" to avoid potential errors.
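The append pattern can be sketched as a pure helper. Here choice_message is a plain dict standing in for the API's choice.message (with the openai SDK you can append choice.message directly), and tool_results_by_id is a hypothetical mapping from tool_call id to the tool's output:

```python
import json


def append_turn(messages, choice_message, tool_results_by_id):
    """Append the assistant message as-is (keeping its tool_calls field intact),
    then one role=tool message per tool_call, in order."""
    messages.append(choice_message)
    for tc in choice_message.get("tool_calls") or []:
        messages.append({
            "role": "tool",
            "tool_call_id": tc["id"],  # must match the tool_call.id above
            "name": tc["function"]["name"],
            "content": tool_results_by_id[tc["id"]],
        })
    return messages


messages = [{"role": "user", "content": "Please search for Context Caching."}]
assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "search", "arguments": '{"query": "Context Caching"}'}},
    ],
}
append_turn(messages, assistant_message, {"call_1": json.dumps({"results": []})})
print([m["role"] for m in messages])
```

Appending the assistant message before its role=tool replies is what prevents the tool_call_id not found error described above.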