
Conversation


loci-dev commented on Dec 3, 2025

Mirrored from ggml-org/llama.cpp#17707

DeepSeek V3.2 uses a new tool-call format like this:

<|DSML|function_calls>
<|DSML|invoke name="get_datetime">
<|DSML|parameter name="timezone" string="true">Asia/Shanghai</|DSML|parameter>
</|DSML|invoke>
</|DSML|function_calls>
<|DSML|function_calls>
<|DSML|invoke name="search">
<|DSML|parameter name="query" string="true">search agent benchmark 2024</|DSML|parameter>
<|DSML|parameter name="topn" string="false">10</|DSML|parameter>
<|DSML|parameter name="source" string="true">web</|DSML|parameter>
</|DSML|invoke>
<|DSML|invoke name="search">
<|DSML|parameter name="query" string="true">搜索智能体 基准测试</|DSML|parameter>
<|DSML|parameter name="topn" string="false">10</|DSML|parameter>
<|DSML|parameter name="source" string="true">web</|DSML|parameter>
</|DSML|invoke>
</|DSML|function_calls>

This PR introduces the tool-call parser for the new DeepSeek V3.2 model.
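
For illustration, here is a minimal sketch of how this block structure could be extracted. It is not the implementation added by this PR (which extends llama.cpp's chat-parser-xml-toolcall.cpp and parses incrementally for streaming); the names tool_param, tool_call, and parse_dsml are hypothetical, and escaping and error handling are omitted:

```cpp
// Minimal, self-contained sketch of extracting DSML tool calls.
// Hypothetical names; not the PR's actual (incremental) parser.
#include <iostream>
#include <regex>
#include <string>
#include <vector>

struct tool_param { std::string name; std::string value; bool is_string; };
struct tool_call  { std::string name; std::vector<tool_param> params; };

static std::vector<tool_call> parse_dsml(const std::string & text) {
    // One <|DSML|invoke ...> ... </|DSML|invoke> block per tool call.
    static const std::regex invoke_re(
        R"re(<\|DSML\|invoke name="([^"]+)">([\s\S]*?)</\|DSML\|invoke>)re");
    // string="true" marks a raw string value; string="false" marks a
    // JSON literal (number, boolean, ...) that should not be quoted.
    static const std::regex param_re(
        R"re(<\|DSML\|parameter name="([^"]+)" string="(true|false)">([\s\S]*?)</\|DSML\|parameter>)re");

    std::vector<tool_call> calls;
    for (std::sregex_iterator it(text.begin(), text.end(), invoke_re), end; it != end; ++it) {
        tool_call call;
        call.name = (*it)[1];
        const std::string body = (*it)[2];
        for (std::sregex_iterator p(body.begin(), body.end(), param_re), pend; p != pend; ++p) {
            call.params.push_back({ (*p)[1], (*p)[3], (*p)[2] == "true" });
        }
        calls.push_back(std::move(call));
    }
    return calls;
}

int main() {
    const std::string msg =
        "<|DSML|function_calls>\n"
        "<|DSML|invoke name=\"get_datetime\">\n"
        "<|DSML|parameter name=\"timezone\" string=\"true\">Asia/Shanghai</|DSML|parameter>\n"
        "</|DSML|invoke>\n"
        "</|DSML|function_calls>\n";
    for (const auto & call : parse_dsml(msg)) {
        std::cout << call.name << "\n";
        for (const auto & p : call.params) {
            std::cout << "  " << p.name << " = " << p.value
                      << (p.is_string ? " (string)" : " (literal)") << "\n";
        }
    }
    return 0;
}
```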

Since the official release does not provide a chat template, a provisional template has been added and tested only with llama.cpp. Compatibility with other inference engines is not guaranteed and may require further adjustments.

In addition, Minja polyfill detection has been slightly updated to accommodate the new template structure.

Needs PR #17376 to be merged first.

@loci-agentic-ai

Explore the complete analysis inside the Version Insights

Pull Request #405 - Performance Analysis Summary

PR Title: UPSTREAM PR #17707: common: Deepseek V3.2 tool call parser
Change Scope: 10 files modified (+729 additions, -60 deletions)


Analysis Classification: Condition 1 (No Performance Impact)

This PR introduces a new chat-template parser for the DeepSeek V3.2 model, which uses an XML-based tool-call format. The code changes add new parsing logic and test coverage without modifying existing inference paths.

Performance Impact Assessment:

The performance metrics show variations in STL container operations (vector::end(), map::begin()), with per-call timing changes ranging from 60 to 195 ns. However, these functions are not in the inference critical path. The actual changes are:

  • chat-parser-xml-toolcall.cpp: Added allowed_literal_between_kvsep field support for parsing boolean literals between key-value separators. Modified parse_msg_with_xml_tool_calls() to handle tool calls within thinking blocks when allow_toolcall_in_think is enabled. (A sketch of how such typed parameter values map onto JSON arguments follows this list.)

  • Power consumption: Three binaries show minimal increases: llama-tts (+914 nJ, +0.407%), llama-cvector-generator (+756 nJ, +0.343%), llama-run (+442 nJ, +0.230%). These are chat template utilities, not inference engines.
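
The string="true|false" attribute shown in the format above is what makes typed parameters possible: string parameters keep their raw text, while non-string parameters (such as topn, or the boolean literals mentioned in the first bullet) pass through as JSON literals. A hypothetical illustration of that mapping, using invented names rather than llama.cpp's internal API:

```cpp
// Hypothetical sketch of mapping typed DSML parameters onto a JSON
// "arguments" object; not llama.cpp's internal API.
#include <iostream>
#include <string>
#include <vector>

struct tool_param { std::string name; std::string value; bool is_string; };

// string="true"  -> emit as a quoted JSON string
// string="false" -> emit verbatim as a JSON literal (number, boolean, ...)
// NOTE: real code must JSON-escape string values and validate literals.
static std::string params_to_json(const std::vector<tool_param> & params) {
    std::string json = "{";
    for (size_t i = 0; i < params.size(); ++i) {
        if (i > 0) json += ",";
        json += "\"" + params[i].name + "\":";
        json += params[i].is_string ? "\"" + params[i].value + "\"" : params[i].value;
    }
    return json + "}";
}

int main() {
    // Mirrors the "search" invoke from the PR description above.
    std::vector<tool_param> params = {
        { "query",  "search agent benchmark 2024", true  },
        { "topn",   "10",                          false },
        { "source", "web",                         true  },
    };
    std::cout << params_to_json(params) << "\n";
    // -> {"query":"search agent benchmark 2024","topn":10,"source":"web"}
    return 0;
}
```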

Inference Impact: None. The modified functions (chat parsers, template renderers) execute before model inference begins. Functions like llama_decode, llama_encode, and llama_tokenize are unchanged. Tokens per second remains unaffected.

The observed STL performance variations are compiler optimization artifacts unrelated to the functional changes, which purely extend chat template parsing capabilities for a new model format.

loci-dev force-pushed the main branch 13 times, most recently from 738bfbf to f01b714 on December 4, 2025 at 09:11
loci-dev force-pushed the main branch 3 times, most recently from f72076f to 3f5e1ff on December 8, 2025 at 21:08
@loci-agentic-ai

Explore the complete analysis inside the Version Insights

loci-dev force-pushed the main branch 23 times, most recently from 78ff3d3 to 117bfc3 on December 11, 2025 at 18:11