Fix Kimi-K2 tool-call parsing issues #17376
Seems to work. I've been using the copy of this PR in ik_llama.cpp, and it solved the problems I was having with Roo Code, which stopped working after they hardcoded OpenAI-compatible endpoints to use native tool calling instead of XML-style tool calls.
@ggerganov Could you take a look at this?
I'm looking at the tool calling done by Kimi K2, and I don't understand why it was implemented with the XML parsing. The output looks more akin to GPT-OSS, with the function name separated from the arguments and the arguments provided in a single JSON object. I don't see any resemblance to the separate XML arguments that the other models exhibit (MiniMax M2, Qwen3-Coder). What's the rationale for rolling this into the XML parsing?
@aldehir You're correct that Kimi-K2 is closer to a GPT-OSS style tool call. However, the XML tool-call parser can still handle it with roughly 30 lines of code, and no one has implemented a generic GPT-OSS style parser yet. Additionally, another purpose of this PR is to make the previously untested
Removed TODO comment about untested tool call feature.
I think it is ready for merge. @ggerganov @ngxson @CISC |
pwilkin
left a comment
Some questions, some nitpicks, but the important part is we need to support the singular form <|tool_call_section_begin|> in addition to <|tool_calls_section_begin|>
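To illustrate the point about accepting both spellings, here is a minimal Python sketch. It is not the PR's C++ code. Only the two section-begin markers come from this thread; the section-end marker, the per-call markers (`<|tool_call_begin|>`, `<|tool_call_argument_begin|>`, `<|tool_call_end|>`), the `functions.<name>:<index>` id shape, and the helper name `parse_tool_calls` are assumptions made for the example.

```python
import json
import re

# Both spellings of the section-begin marker named in the review comment.
SECTION_BEGIN = re.compile(r"<\|tool_calls?_section_begin\|>")
# Assumption: the section-end marker mirrors the begin marker.
SECTION_END = re.compile(r"<\|tool_calls?_section_end\|>")
# Assumption: per-call markers; they are not quoted anywhere in this thread.
CALL = re.compile(
    r"<\|tool_call_begin\|>\s*(?P<id>.+?)\s*"
    r"<\|tool_call_argument_begin\|>\s*(?P<args>.*?)\s*"
    r"<\|tool_call_end\|>",
    re.DOTALL,
)


def parse_tool_calls(text):
    """Split model output into (plain content, list of tool calls)."""
    begin = SECTION_BEGIN.search(text)
    if begin is None:
        return text, []
    end = SECTION_END.search(text, begin.end())
    body = text[begin.end(): end.start() if end else len(text)]
    calls = []
    for m in CALL.finditer(body):
        call_id = m.group("id")
        # Assumption: ids look like "functions.<name>:<index>".
        name = call_id
        if name.startswith("functions."):
            name = name[len("functions."):]
        if ":" in name:
            name = name.rsplit(":", 1)[0]
        calls.append({
            "id": call_id,
            "name": name,
            # The arguments arrive as a single JSON object.
            "arguments": json.loads(m.group("args")),
        })
    return text[:begin.start()].strip(), calls
```

With the optional `s` in the section patterns, an output that uses `<|tool_call_section_begin|>` should parse to the same calls as one that uses `<|tool_calls_section_begin|>`.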
pwilkin
left a comment
Looks fine to me. The tests mirror those in vLLM, so I guess they should be pretty comprehensive. @bartowski1182 do you happen to have a box where you could run Kimi K2-Thinking with some tool-calling tests?
Hello, I ran some tests of this PR on my local PC with 512 GB RAM + 32 GB VRAM (unsloth, kimi-k2-thinking Q3 quant). It looks good with Roo Code + native tool calls: it can create and edit files and run terminal commands. It also works with the Anthropic API without issues for me.
Merged into master.
Kimi-K2 still appears to have issues with tool-call parsing.
Closes #17155.