@hksdpc255
Contributor

@hksdpc255 hksdpc255 commented Nov 19, 2025

Kimi-K2 still appears to have issues with tool-call parsing.

Closes #17155.

@github-actions github-actions bot added the testing Everything test related label Nov 19, 2025
@hksdpc255 hksdpc255 changed the title from "Fix Kimi-K2 tool-call parsing issues (community investigation needed)" to "Fix Kimi-K2 tool-call parsing issues (help wanted)" Nov 20, 2025
@fernandaspets

Seems to work. I've been using the copy of this PR in ik_llama.cpp, and it solved the problems I was having with Roo Code, which stopped working after they hardcoded OpenAI-compatible endpoints to use native tool calling instead of XML-style tool calls.

@hksdpc255 hksdpc255 marked this pull request as ready for review November 24, 2025 04:45
@hksdpc255 hksdpc255 requested a review from ggerganov as a code owner November 24, 2025 04:45
@hksdpc255 hksdpc255 changed the title from "Fix Kimi-K2 tool-call parsing issues (help wanted)" to "Fix Kimi-K2 tool-call parsing issues" Nov 24, 2025
@hksdpc255
Contributor Author

@ggerganov Could you take a look at this?

@aldehir
Collaborator

aldehir commented Dec 1, 2025

I'm looking at the tool calling done by Kimi K2, and I don't understand why it was implemented with the XML parsing.

The output looks more akin to GPT-OSS, with the function name separated from the arguments and the arguments provided in a single JSON object. I don't see any resemblance to the separate XML arguments that the other models exhibit (MiniMax M2, Qwen3-Coder).

What's the rationale for rolling this into the XML parsing?
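
For readers unfamiliar with the format under discussion, here is a minimal sketch of parsing a call in which the function name is separated from the arguments and the arguments arrive as a single JSON object. The marker names `<|tool_call_begin|>`, `<|tool_call_argument_begin|>`, and `<|tool_call_end|>`, the `functions.<name>:<index>` ID layout, and the helper `parse_calls` are assumptions made for illustration; none of them are quoted from this PR, and the real parser in llama.cpp is written in C++ and may work differently.

```python
import json
import re

# Assumed marker names (not taken from this PR): one call is
#   <|tool_call_begin|>ID<|tool_call_argument_begin|>{...json...}<|tool_call_end|>
CALL = re.compile(
    r"<\|tool_call_begin\|>\s*(?P<id>[^<]+?)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<args>\{.*?\})\s*"
    r"<\|tool_call_end\|>",
    re.DOTALL,
)


def parse_calls(section: str) -> list:
    """Extract (id, name, arguments) from each call found in `section`."""
    calls = []
    for m in CALL.finditer(section):
        call_id = m.group("id")
        # Assumed ID layout "functions.<name>:<index>"; fall back gracefully.
        name = call_id.split(":", 1)[0]
        if name.startswith("functions."):
            name = name[len("functions."):]
        calls.append(
            {
                "id": call_id,
                "name": name,
                # All arguments live in one JSON object, unlike XML-style
                # formats that emit each argument as its own element.
                "arguments": json.loads(m.group("args")),
            }
        )
    return calls


sample = (
    "<|tool_call_begin|>functions.get_weather:0"
    '<|tool_call_argument_begin|>{"city": "Paris", "opts": {"unit": "C"}}'
    "<|tool_call_end|>"
)
print(parse_calls(sample))
```

The point of the sketch is only that the argument payload can be handed to a JSON parser in one piece, which is the resemblance to GPT-OSS noted above.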

@hksdpc255
Contributor Author

hksdpc255 commented Dec 1, 2025

@aldehir You’re correct that Kimi-K2 uses something closer to a GPT-OSS style tool call. However, the XML tool-call parser can still handle it with roughly 30 lines of code, and no one has implemented a generic GPT-OSS style parser yet.

Additionally, another purpose of this PR is to make the previously untested allow_toolcall_in_think feature fully tested and usable.

@hksdpc255
Contributor Author

hksdpc255 commented Dec 5, 2025

I think it is ready for merge. @ggerganov @ngxson @CISC

@CISC CISC requested a review from pwilkin December 5, 2025 09:09
Collaborator

@pwilkin pwilkin left a comment

Some questions, some nitpicks, but the important part is we need to support the singular form <|tool_call_section_begin|> in addition to <|tool_calls_section_begin|>
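
As a rough illustration of the request above, a single pattern can accept both spellings of the section marker. This is a sketch only: the helper name `split_at_section_begin` is hypothetical, and the actual change in the PR is made in the C++ parser rather than with a Python regex.

```python
import re

# "calls?" makes the trailing "s" optional, so both
# <|tool_call_section_begin|> and <|tool_calls_section_begin|> match.
SECTION_BEGIN = re.compile(r"<\|tool_calls?_section_begin\|>")


def split_at_section_begin(text: str):
    """Return (content_before_marker, text_after_marker).

    If neither marker variant is present, return (text, None).
    """
    m = SECTION_BEGIN.search(text)
    if m is None:
        return text, None
    return text[: m.start()], text[m.end():]


for marker in ("<|tool_call_section_begin|>", "<|tool_calls_section_begin|>"):
    print(split_at_section_begin("Let me check." + marker + "rest"))
```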

Collaborator

@pwilkin pwilkin left a comment

Looks fine to me, the tests mirror those in vLLM so I guess they should be pretty comprehensive. @bartowski1182 do you potentially have any box where you could run a Kimi K2-Thinking with some tool-calling tests?

@20jeka08

20jeka08 commented Dec 8, 2025

> Looks fine to me, the tests mirror those in vLLM so I guess they should be pretty comprehensive. @bartowski1182 do you potentially have any box where you could run a Kimi K2-Thinking with some tool-calling tests?

Hello, I ran some tests of this PR on my local PC with 512 GB RAM + 32 GB VRAM (unsloth, kimi-k2-thinking Q3 quant). It looks good with Roo Code + native tool calls: it can create and edit files and run terminal commands. It also works with the Anthropic API without issues for me.

@pwilkin pwilkin merged commit 636fc17 into ggml-org:master Dec 8, 2025
68 of 78 checks passed
@pwilkin
Collaborator

pwilkin commented Dec 8, 2025

Merged into master.


Development

Successfully merging this pull request may close these issues.

Feature Request: Kimi-K2-Thinking reasoning and tool calling support
