
## Tool Calling

To enable tool calling, you may need to set certain tool-calling parser options when starting the service. See [deploy_guidance](./deploy_guidance.md) for details.

In Kimi-K2, a tool-calling process includes the following steps:

- Pass the function descriptions to Kimi-K2.
- Kimi-K2 decides to make a function call and returns the information needed for the call to the user.
- The user performs the function call, collects the call results, and passes them back to Kimi-K2.
- Kimi-K2 continues generating content based on the call results until it believes it has obtained sufficient information to respond to the user.

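The steps above can be sketched as a message flow. The conversation content below is purely illustrative, mirroring the `get_weather` example used throughout this guide:

```python
# A minimal sketch of the message roles exchanged during one tool-calling round.
# The conversation alternates: user -> assistant (tool_calls) -> tool -> assistant.
conversation = [
    # 1. The user asks a question; function descriptions are sent alongside as `tools`.
    {"role": "user", "content": "What's the weather like in Beijing today?"},
    # 2. Kimi-K2 returns a tool-call request (finish_reason == "tool_calls").
    {"role": "assistant", "content": "", "tool_calls": [{
        "id": "functions.get_weather:0",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    }]},
    # 3. The user executes the function and appends its result with role="tool".
    {"role": "tool", "tool_call_id": "functions.get_weather:0",
     "name": "get_weather", "content": '{"weather": "Sunny"}'},
    # 4. Kimi-K2 answers using the tool result (finish_reason != "tool_calls").
    {"role": "assistant", "content": "It is sunny in Beijing today."},
]
roles = [m["role"] for m in conversation]
print(roles)  # ['user', 'assistant', 'tool', 'assistant']
```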
### Preparing Tools

Suppose we have a function `get_weather` that queries weather conditions in real time. It accepts a city name as a parameter and returns the weather conditions. We need to prepare a structured description for it so that Kimi-K2 can understand its functionality.

```python
def get_weather(city):
    return {"weather": "Sunny"}


# Collect the tool descriptions in `tools`
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information. Call this tool when the user needs to get weather information",
        "parameters": {
            "type": "object",
            "required": ["city"],
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name",
                }
            }
        }
    }
}]

# Tool name -> function mapping for easy calling later
tool_map = {
    "get_weather": get_weather
}
```

### Chat with Tools

We use `openai.OpenAI` to send messages to Kimi-K2 along with the tool descriptions. Kimi-K2 autonomously decides whether, and how, to use the provided tools.

If Kimi-K2 decides a tool call is needed, it returns a result with `finish_reason='tool_calls'`; the returned message then includes the tool-call information.

After calling the tools with the provided information, we append the tool-call results to the chat history and call Kimi-K2 again. Kimi-K2 may need to call tools multiple times until it believes the current results can answer the user's question, so we should keep checking `finish_reason` until it is no longer `tool_calls`.

The results obtained from calling the tools should be appended to `messages` with `role='tool'`.

```python
import json

from openai import OpenAI

model_name = 'moonshotai/Kimi-K2-Instruct'
client = OpenAI(
    base_url=endpoint,  # `endpoint` is the base URL of your inference service
    api_key='xxx',
)

messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
    )
    choice = completion.choices[0]
    finish_reason = choice.finish_reason
    # Note: the finish_reason that marks the end of a tool call may vary across
    # inference engines, so adjust this condition check accordingly.
    if finish_reason == "tool_calls":
        messages.append(choice.message)
        for tool_call in choice.message.tool_calls:
            tool_call_name = tool_call.function.name
            tool_call_arguments = json.loads(tool_call.function.arguments)
            tool_function = tool_map[tool_call_name]
            # Unpack the argument dict so `get_weather(city=...)` receives named parameters
            tool_result = tool_function(**tool_call_arguments)
            print("tool_result:", tool_result)

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
print('-' * 100)
print(choice.message.content)
```

### Tool Calling in Streaming Mode

Tool calling can also be used in streaming mode. In this case, we need to collect the tool-call information returned in the stream until we have a complete tool call, as shown in the code below:

```python
messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
finish_reason = None
msg = ''
while finish_reason is None or finish_reason == "tool_calls":
    completion = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=0.3,
        tools=tools,
        tool_choice="auto",
        stream=True,
    )
    tool_calls = []
    for chunk in completion:
        delta = chunk.choices[0].delta
        if delta.content:
            msg += delta.content
        if delta.tool_calls:
            for tool_call_chunk in delta.tool_calls:
                if tool_call_chunk.index is not None:
                    # Extend the tool_calls list to cover this index
                    while len(tool_calls) <= tool_call_chunk.index:
                        tool_calls.append({
                            "id": "",
                            "type": "function",
                            "function": {
                                "name": "",
                                "arguments": "",
                            }
                        })

                    tc = tool_calls[tool_call_chunk.index]

                    if tool_call_chunk.id:
                        tc["id"] += tool_call_chunk.id
                    if tool_call_chunk.function.name:
                        tc["function"]["name"] += tool_call_chunk.function.name
                    if tool_call_chunk.function.arguments:
                        tc["function"]["arguments"] += tool_call_chunk.function.arguments

        finish_reason = chunk.choices[0].finish_reason
    # Note: the finish_reason that marks the end of a tool call may vary across
    # inference engines, so adjust this condition check accordingly.
    if finish_reason == "tool_calls":
        # Append the assistant turn carrying the tool calls before adding the
        # tool results, so the chat history stays well-formed for the next round
        messages.append({"role": "assistant", "content": msg, "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
        # The text generated during the tool call is not the final answer; reset msg
        msg = ''

print(msg)
```

### Manually Parsing Tool Calls

The tool-call requests generated by Kimi-K2 can also be parsed manually, which is especially useful when the service you are using does not provide a tool-call parser.

Tool-call requests are wrapped by `<|tool_calls_section_begin|>` and `<|tool_calls_section_end|>`, with each individual call wrapped by `<|tool_call_begin|>` and `<|tool_call_end|>`. The tool ID and the arguments are separated by `<|tool_call_argument_begin|>`. The format of the tool ID is `functions.{func_name}:{idx}`, from which we can parse the function name.
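For example, a raw completion containing a single call to `get_weather` would look roughly like this (a simplified illustration constructed from the rules above, not captured model output), and the function name can be recovered from the tool ID:

```python
# A simplified raw output illustrating the tool-call wrapping described above
raw_output = (
    "<|tool_calls_section_begin|>"
    "<|tool_call_begin|>functions.get_weather:0"
    '<|tool_call_argument_begin|>{"city": "Beijing"}<|tool_call_end|>'
    "<|tool_calls_section_end|>"
)

# Split out the single call: tool ID first, then the JSON arguments
inner = raw_output.split("<|tool_call_begin|>")[1]
tool_id, args = inner.split("<|tool_call_argument_begin|>")
args = args.split("<|tool_call_end|>")[0]

# The tool ID has the form `functions.{func_name}:{idx}`
func_name = tool_id.split('.')[1].split(':')[0]
print(func_name, args)  # get_weather {"city": "Beijing"}
```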

Based on the above rules, we can post requests directly to the completions interface and parse the tool calls manually.

```python
import json

import requests
from transformers import AutoTokenizer

messages = [
    {"role": "user", "content": "What's the weather like in Beijing today? Let's check using the tool."}
]
msg = ''
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
while True:
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        tools=tools,
        add_generation_prompt=True,
    )
    payload = {
        "model": model_name,
        "prompt": text,
        "max_tokens": 512,
    }
    response = requests.post(
        f"{endpoint}/completions",
        headers={"Content-Type": "application/json"},
        json=payload,
        stream=False,
    )
    raw_out = response.json()

    raw_output = raw_out["choices"][0]["text"]
    tool_calls = extract_tool_call_info(raw_output)
    if len(tool_calls) == 0:
        # No tool calls: this is the final answer
        msg = raw_output
        break
    else:
        # Append the assistant turn carrying the tool calls so the chat
        # template can render the conversation correctly on the next round
        messages.append({"role": "assistant", "content": "", "tool_calls": tool_calls})
        for tool_call in tool_calls:
            tool_call_name = tool_call['function']['name']
            tool_call_arguments = json.loads(tool_call['function']['arguments'])
            tool_function = tool_map[tool_call_name]
            tool_result = tool_function(**tool_call_arguments)

            messages.append({
                "role": "tool",
                "tool_call_id": tool_call['id'],
                "name": tool_call_name,
                "content": json.dumps(tool_result),
            })
print('-' * 100)
print(msg)
```

Here, `extract_tool_call_info` parses the model output and returns the tool-call information. A simple implementation would be:

```python
import re


def extract_tool_call_info(tool_call_rsp: str):
    if '<|tool_calls_section_begin|>' not in tool_call_rsp:
        # No tool calls
        return []
    pattern = r"<\|tool_calls_section_begin\|>(.*?)<\|tool_calls_section_end\|>"

    tool_calls_sections = re.findall(pattern, tool_call_rsp, re.DOTALL)

    # Extract multiple tool calls from the section
    func_call_pattern = r"<\|tool_call_begin\|>\s*(?P<tool_call_id>[\w\.]+:\d+)\s*<\|tool_call_argument_begin\|>\s*(?P<function_arguments>.*?)\s*<\|tool_call_end\|>"
    tool_calls = []
    for match in re.findall(func_call_pattern, tool_calls_sections[0], re.DOTALL):
        function_id, function_args = match
        # function_id looks like: functions.get_weather:0
        function_name = function_id.split('.')[1].split(':')[0]
        tool_calls.append({
            "id": function_id,
            "type": "function",
            "function": {
                "name": function_name,
                "arguments": function_args,
            }
        })
    return tool_calls
```

## FAQ

#### Q1: I received special tokens like `<|tool_call_begin|>` in the `content` field instead of a normal tool call.

This indicates a broken tool call, which most often occurs in multi-turn tool-calling scenarios due to an incorrect tool-call ID. Kimi-K2 expects the ID to follow the format `functions.{func_name}:{idx}`, where `functions` is a fixed string, `func_name` is the actual function name (such as `get_weather`), and `idx` is a global counter that starts at 0 and increments with each function invocation.

Please check all tool-call IDs in the message list.
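For example, two successive calls to `get_weather` in one conversation should carry the IDs below; note that the counter keeps increasing across turns rather than resetting per turn:

```python
# Build tool-call IDs in the `functions.{func_name}:{idx}` format expected by Kimi-K2.
def make_tool_call_id(func_name, idx):
    return f"functions.{func_name}:{idx}"

# idx is a global counter over the whole conversation, starting at 0
ids = [make_tool_call_id("get_weather", i) for i in range(2)]
print(ids)  # ['functions.get_weather:0', 'functions.get_weather:1']
```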

#### Q2: My tool-call ID is incorrect. How can I fix it?

First, make sure your code and chat template are up to date with the latest version from the Hugging Face repo.

If you are using vLLM or SGLang and they are generating random tool-call IDs, upgrade them to the latest release. For other frameworks, you must either parse the tool-call ID from the model output and set it correctly in the server-side response, or rewrite every tool-call ID according to the rules above on the client side before sending the messages to Kimi-K2.
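The client-side rewrite can be sketched as follows. This is a minimal, hypothetical helper (not part of any framework): it assumes an OpenAI-style message list and renumbers the IDs in place according to the rules above:

```python
def rewrite_tool_call_ids(messages):
    """Rewrite tool-call IDs in an OpenAI-style message list to the
    `functions.{func_name}:{idx}` format expected by Kimi-K2.
    The idx counter is global across the whole conversation."""
    idx = 0
    id_map = {}  # old ID -> rewritten ID
    for message in messages:
        if message.get("role") == "assistant":
            for tool_call in message.get("tool_calls") or []:
                new_id = f"functions.{tool_call['function']['name']}:{idx}"
                id_map[tool_call["id"]] = new_id
                tool_call["id"] = new_id
                idx += 1
        elif message.get("role") == "tool":
            # Keep each tool result pointing at its rewritten call ID
            message["tool_call_id"] = id_map.get(
                message["tool_call_id"], message["tool_call_id"])
    return messages

msgs = rewrite_tool_call_ids([
    {"role": "assistant", "content": "", "tool_calls": [
        {"id": "call_abc", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'}}]},
    {"role": "tool", "tool_call_id": "call_abc", "name": "get_weather",
     "content": '{"weather": "Sunny"}'},
])
print(msgs[0]["tool_calls"][0]["id"])  # functions.get_weather:0
```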

#### Q3: My tool-call ID is correct, but I still get crashes in multi-turn tool calling.

Please describe your situation in the [discussion](https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905/discussions).