Conversation


@vladnosiv vladnosiv commented Dec 8, 2025

Overview:

Add DeepSeek V3.2 chat template support (tool call parser support is not included yet).
NOTE: ~5k of the lines here are the official test files (json|txt) from DeepSeek: https://huggingface.co/deepseek-ai/DeepSeek-V3.2/tree/main/encoding

Details:

  • Implemented a native Rust renderer for DeepSeek V3.2 models (no Jinja template is available by design); the official Python implementation was used as a reference
  • Auto-detection in from_mdc for DeepSeek V3.2 and DeepSeek V3.2-Speciale
  • Tool call encoding support (the tool call parser will follow in a separate PR)
  • Tests, including 3 official DeepSeek test cases

Where should the reviewer start?

  • lib/llm/src/preprocessor/prompt/deepseek_v32.rs main implementation
  • lib/llm/tests/deepseek_v32_encoding.rs tests with official test data
  • lib/llm/src/preprocessor/prompt/template.rs renderer auto-detection

Related Issues:

Relates to #4796

Signed-off-by: Vladislav Nosivskoy <[email protected]>
@vladnosiv vladnosiv requested a review from a team as a code owner December 8, 2025 13:01

copy-pr-bot bot commented Dec 8, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


github-actions bot commented Dec 8, 2025

👋 Hi vladnosiv! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the feat and external-contribution (Pull request is from an external contributor) labels Dec 8, 2025

coderabbitai bot commented Dec 8, 2025

Walkthrough

This change introduces native DeepSeek V3.2 prompt formatting support to the LLM library. A new deepseek_v32 module is added with DSML-based message encoding, tool handling, and thinking mode support. Model detection logic in template.rs routes DeepSeek V3.2 models to the new formatter. Comprehensive test data and validation tests are included.

Changes

  • DeepSeek V3.2 Module Introduction (lib/llm/src/preprocessor/prompt.rs, lib/llm/src/preprocessor/prompt/deepseek_v32.rs): Declares and implements the new deepseek_v32 module with the DeepSeekV32Formatter struct, ThinkingMode enum, and encode_messages() function. Includes DSML token constants, tool rendering with JSON schema handling, message encoding by role (system, user, developer, assistant, tool), and a trait implementation for OAIPromptFormatter.
  • Integration with Prompt Routing (lib/llm/src/preprocessor/prompt/template.rs): Adds special-case detection for DeepSeek V3.2 models in PromptFormatter::from_mdc(). If the model name contains "deepseek" and "v3.2" (excluding "exp" variants), it bypasses MDC template matching and returns an OAI formatter wrapping DeepSeekV32Formatter::new_thinking(); see the sketch after this list.
  • Test Data Files (lib/llm/tests/data/deepseek-v3.2/*): Adds four test data files: test_input.json (weather tool workflow), test_input_search_wo_date.json (document search puzzle), test_output.txt (DSML-formatted weather conversation), and test_output_search_wo_date.txt (DSML tool workflow with reasoning blocks).
  • Test Suite (lib/llm/tests/deepseek_v32_encoding.rs): Implements a comprehensive validation test module that loads test data pairs and validates encode_messages() output against the expected DSML format, covering tool calls, function results, reasoning content, and thinking mode behavior.
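
A minimal sketch of the routing rule described above for template.rs. The helper name and the main() below are illustrative assumptions; only the contains-based matching, the "exp" exclusion, and DeepSeekV32Formatter::new_thinking() come from this PR.

```rust
// Illustrative only: the real logic lives inside PromptFormatter::from_mdc()
// and may be structured differently.
fn is_deepseek_v32(display_name: &str) -> bool {
    let name = display_name.to_ascii_lowercase();
    // Match DeepSeek V3.2 (including V3.2-Speciale) but skip "exp" variants.
    name.contains("deepseek") && name.contains("v3.2") && !name.contains("exp")
}

fn main() {
    assert!(is_deepseek_v32("deepseek-ai/DeepSeek-V3.2"));
    assert!(is_deepseek_v32("DeepSeek-V3.2-Speciale"));
    assert!(!is_deepseek_v32("DeepSeek-V3.2-Exp"));
    assert!(!is_deepseek_v32("meta-llama/Llama-3.1-8B"));
}
```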

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • deepseek_v32.rs: High-density logic with multiple public APIs (struct, enum, function), trait implementation, tool rendering with JSON handling, and message encoding pipeline. Requires verification of DSML format correctness and role-specific message handling.
  • template.rs: Model name detection logic and conditional routing; ensure pattern matching is robust and precedence is correct.
  • Test data alignment: Validate that test cases comprehensively cover tool usage, thinking blocks, and expected DSML output format.

Poem

🐰 A formatter hops in with V3.2 grace,
DSML tokens light up the space,
Tools and thinking modes intertwine,
Each message encoded in rhythms divine,
DeepSeek dreams rendered, perfectly fine! 🌟

Pre-merge checks

✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title clearly and concisely describes the main change: adding DeepSeek V3.2 chat template support. It accurately reflects the primary objective of the changeset.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
  • Description check: ✅ Passed. The PR description covers all required template sections with clear details about implementation, auto-detection, testing, and reviewer entry points.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
lib/llm/tests/deepseek_v32_encoding.rs (1)

41-47: Potential panic on malformed test input.

The code uses .unwrap() on as_object_mut() which will panic if the first message isn't an object. While acceptable for tests, consider using expect() with a descriptive message for better debugging.

-            first_msg
-                .as_object_mut()
-                .unwrap()
+            first_msg
+                .as_object_mut()
+                .expect("First message should be an object")
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)

66-68: Remove unused constant or add usage.

TOOL_CALLS_TEMPLATE is marked with #[allow(dead_code)] but appears unused. If it's reserved for future use, consider adding a TODO comment explaining the intent. Otherwise, remove it to reduce code clutter.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 046229f and ab84711.

📒 Files selected for processing (8)
  • lib/llm/src/preprocessor/prompt.rs (1 hunks)
  • lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1 hunks)
  • lib/llm/src/preprocessor/prompt/template.rs (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_input.json (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_output.txt (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt (1 hunks)
  • lib/llm/tests/deepseek_v32_encoding.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

Applied to files:

  • lib/llm/src/preprocessor/prompt.rs
🧬 Code graph analysis (3)
lib/llm/src/preprocessor/prompt/template.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (2)
  • new (420-422)
  • new_thinking (425-427)
lib/llm/tests/deepseek_v32_encoding.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
  • encode_messages (390-409)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
lib/llm/src/preprocessor/prompt.rs (6)
  • tools (53-55)
  • messages (52-52)
  • supports_add_generation_prompt (83-83)
  • supports_add_generation_prompt (96-98)
  • render (84-84)
  • render (100-116)
🪛 GitHub Actions: Copyright Checks
lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt

[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt.

lib/llm/tests/data/deepseek-v3.2/test_output.txt

[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output.txt.

🪛 LanguageTool
lib/llm/tests/data/deepseek-v3.2/test_output.txt

[grammar] ~17-~17: Use a hyphen to join words.
Context: ...ibute should be set to "true" for string type parameters and "false" for other ty...

(QB_NEW_EN_HYPHEN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: clippy (.)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (11)
lib/llm/src/preprocessor/prompt.rs (1)

26-26: LGTM!

The new deepseek_v32 module is correctly exposed as public, following the existing pattern in this file.

lib/llm/src/preprocessor/prompt/template.rs (1)

21-31: Model detection relies on display_name string matching.

The detection logic is reasonable for the current use case. However, string-based model detection can be fragile if model naming conventions change.

Consider documenting the expected model name patterns that will match this logic (e.g., "DeepSeek-V3.2", "DeepSeek-V3.2-Speciale") either in a comment here or in the module documentation.

lib/llm/tests/data/deepseek-v3.2/test_input.json (1)

1-149: LGTM!

Comprehensive test input covering the weather workflow scenario with tools, tool calls, reasoning content, and multi-turn conversation. The JSON structure is valid and exercises the key encoding features.

lib/llm/tests/deepseek_v32_encoding.rs (2)

403-408: Verify chat mode behavior.

The comment states "Chat mode should have </think>, thinking mode should have <think>" but the assertion on line 407 only checks that thinking mode contains <think> without checking it also contains </think>. This may be intentional based on the encoding logic, but worth verifying the expected behavior.


100-106: Verify that referenced test data files exist.

The test references test_input_search_w_date.json and test_output_search_w_date.txt. Confirm these test data files are present in the repository's test data directory.
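
A small sketch of an existence check for those fixtures. The paths are taken from this review thread and assumed to be relative to the lib/llm crate root; the test name is hypothetical.

```rust
// Hypothetical guard test: fails fast if the referenced fixtures are missing.
#[test]
fn search_w_date_fixtures_exist() {
    let fixtures = [
        "tests/data/deepseek-v3.2/test_input_search_w_date.json",
        "tests/data/deepseek-v3.2/test_output_search_w_date.txt",
    ];
    for path in fixtures {
        assert!(
            std::path::Path::new(path).exists(),
            "missing test fixture: {path}"
        );
    }
}
```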

lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1)

1-533: LGTM!

Comprehensive test input exercising the developer role, multi-step tool invocations (search, open, find), and extensive reasoning content. This provides good coverage for complex agentic workflows.

lib/llm/src/preprocessor/prompt/deepseek_v32.rs (5)

88-119: JSON formatting handles common cases but has edge cases.

The to_json() function adds spaces after colons and commas to match Python's json.dumps() format. The string detection logic (prev_char != '\\') handles simple escaped quotes but won't correctly handle sequences like \\" (escaped backslash followed by quote).

For test data compatibility this is likely sufficient, but if edge cases arise, consider using serde_json::to_string_pretty with custom formatting or a more robust state machine.
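
For reference, a minimal sketch of such a state machine, assuming the goal is only to insert spaces after ':' and ',' outside string literals to mimic Python's json.dumps default separators. This is not the PR's actual to_json implementation.

```rust
/// Insert a space after ':' and ',' that appear outside JSON string literals,
/// tracking escape state so sequences like \\" are classified correctly.
fn add_spaces_outside_strings(compact: &str) -> String {
    let mut out = String::with_capacity(compact.len() + compact.len() / 8);
    let mut in_string = false;
    let mut escaped = false;
    for c in compact.chars() {
        match c {
            '\\' if in_string && !escaped => {
                escaped = true;
                out.push(c);
                continue; // the next character is escaped; keep the flag set
            }
            '"' if !escaped => in_string = !in_string,
            _ => {}
        }
        escaped = false;
        out.push(c);
        if !in_string && (c == ':' || c == ',') {
            out.push(' ');
        }
    }
    out
}

fn main() {
    let compact = r#"{"a":"x\\\"y","n":[1,2]}"#;
    assert_eq!(
        add_spaces_outside_strings(compact),
        r#"{"a": "x\\\"y", "n": [1, 2]}"#
    );
}
```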


278-285: Verify thinking mode logic for assistant messages.

The condition last_user_idx.is_some_and(|idx| index > idx) correctly checks if this assistant message comes after the last user message. The comment on lines 276-277 clarifies that <think> was already added in the user message rendering.

This is correct behavior - just noting for reviewers that the thinking tag handling is split across user and assistant message rendering.


360-371: Verify tool result thinking tag logic.

The condition index >= idx (line 362) triggers thinking start when the tool result index is greater than or equal to last_user_idx. This differs from the assistant handling which uses index > idx.

Is the >= intentional? If the tool result is at exactly the same index as the last user (which shouldn't happen structurally), this could produce unexpected output. The > comparison would be more consistent with assistant handling.
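
A tiny illustration of the boundary case in question, assuming index and last_user_idx behave like the loop variables described above (the values are made up):

```rust
fn main() {
    // A tool result landing at the same index as the last user message
    // (structurally unexpected, per the comment above).
    let last_user_idx: Option<usize> = Some(3);
    let index = 3;

    let assistant_rule = last_user_idx.is_some_and(|idx| index > idx); // false
    let tool_result_rule = last_user_idx.is_some_and(|idx| index >= idx); // true

    // The two rules disagree only at exactly this boundary.
    assert!(!assistant_rule);
    assert!(tool_result_rule);
}
```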


1-3: LGTM on overall implementation.

Clean native Rust implementation of DeepSeek V3.2 prompt formatting. The code follows the official Python reference implementation structure and handles the key features: system/user/developer/assistant/tool messages, DSML token encoding, thinking mode, and tool call formatting.


440-458: Verify tool handling in render() method.

The render() method converts messages but doesn't call req.tools(). If the request provides tools separately via req.tools(), they won't be rendered unless pre-injected into messages. Confirm whether tools are always guaranteed to be embedded in messages before render() is called, or if tool injection logic is needed.

@rmccorm4 rmccorm4 requested a review from GuanLuo December 9, 2025 00:32
@dagil-nvidia dagil-nvidia added the backend::sglang Relates to the sglang backend label Dec 9, 2025
@vladnosiv vladnosiv requested a review from a team as a code owner December 9, 2025 11:05
vladnosiv added a commit to vladnosiv/dynamo that referenced this pull request Dec 9, 2025
Signed-off-by: Vladislav Nosivskoy <[email protected]>
@grahamking
Contributor

/ok to test 82faa91


copy-pr-bot bot commented Dec 9, 2025

/ok to test 82faa91

@grahamking, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@grahamking
Contributor

/ok to test 4b6b97b

@dmitry-tokarev-nv dmitry-tokarev-nv left a comment

added a comment wrt attribution

@dmitry-tokarev-nv dmitry-tokarev-nv left a comment

LGTM

@rmccorm4
Contributor

/ok to test 6652fc9

@rmccorm4
Contributor

/ok to test c462de6

@rmccorm4 rmccorm4 merged commit 1efc7d6 into ai-dynamo:main Dec 12, 2025
47 of 55 checks passed

Labels

  • backend::sglang (Relates to the sglang backend)
  • external-contribution (Pull request is from an external contributor)
  • feat
  • size/XXL
