Conversation


@vladnosiv vladnosiv commented Dec 8, 2025

Overview:

Add DeepSeek V3.2 chat template support (tool call parser support is not included yet).
NOTE: ~5k of the lines here are the official test files (json|txt) from DeepSeek: https://huggingface.co/deepseek-ai/DeepSeek-V3.2/tree/main/encoding

Details:

  • Implemented a native Rust renderer for DeepSeek V3.2 models (no Jinja template is available by design); the official Python implementation was used as a reference
  • Auto-detection in from_mdc for DeepSeek V3.2 and DeepSeek V3.2-Speciale
  • Tool call encoding support (the tool call parser will follow in a separate PR)
  • Tests, including 3 official DeepSeek test cases

Where should the reviewer start?

  • lib/llm/src/preprocessor/prompt/deepseek_v32.rs main implementation
  • lib/llm/tests/deepseek_v32_encoding.rs tests with official test data
  • lib/llm/src/preprocessor/prompt/template.rs renderer auto-detection

Related Issues:

Relates to #4796

Signed-off-by: Vladislav Nosivskoy <[email protected]>
@vladnosiv vladnosiv requested a review from a team as a code owner December 8, 2025 13:01

copy-pr-bot bot commented Dec 8, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


github-actions bot commented Dec 8, 2025

👋 Hi vladnosiv! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

@github-actions github-actions bot added the feat and external-contribution (Pull request is from an external contributor) labels Dec 8, 2025

coderabbitai bot commented Dec 8, 2025

Walkthrough

This change introduces native DeepSeek V3.2 prompt formatting support to the LLM library. A new deepseek_v32 module is added with DSML-based message encoding, tool handling, and thinking mode support. Model detection logic in template.rs routes DeepSeek V3.2 models to the new formatter. Comprehensive test data and validation tests are included.

Changes

  • DeepSeek V3.2 Module Introduction (lib/llm/src/preprocessor/prompt.rs, lib/llm/src/preprocessor/prompt/deepseek_v32.rs): Declares and implements the new deepseek_v32 module with the DeepSeekV32Formatter struct, ThinkingMode enum, and encode_messages() function. Includes DSML token constants, tool rendering with JSON schema handling, message encoding by role (system, user, developer, assistant, tool), and a trait implementation for OAIPromptFormatter.
  • Integration with Prompt Routing (lib/llm/src/preprocessor/prompt/template.rs): Adds special-case detection for DeepSeek V3.2 models in PromptFormatter::from_mdc(). If the model name contains "deepseek" and "v3.2" (excluding "exp" variants), it bypasses MDC template matching and returns an OAI formatter wrapping DeepSeekV32Formatter::new_thinking(); see the sketch after this list.
  • Test Data Files (lib/llm/tests/data/deepseek-v3.2/*): Adds four test data files: test_input.json (weather tool workflow), test_input_search_wo_date.json (document search puzzle), test_output.txt (DSML-formatted weather conversation), and test_output_search_wo_date.txt (DSML tool workflow with reasoning blocks).
  • Test Suite (lib/llm/tests/deepseek_v32_encoding.rs): Implements a comprehensive validation test module that loads test data pairs and validates encode_messages() output against the expected DSML format, covering tool calls, function results, reasoning content, and thinking mode behavior.
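
A minimal sketch of the routing rule described above for template.rs. The helper name and the main() below are illustrative assumptions; only the contains-based matching, the "exp" exclusion, and DeepSeekV32Formatter::new_thinking() come from this PR.

```rust
// Illustrative only: the real logic lives inside PromptFormatter::from_mdc()
// and may be structured differently.
fn is_deepseek_v32(display_name: &str) -> bool {
    let name = display_name.to_ascii_lowercase();
    // Match DeepSeek V3.2 (including V3.2-Speciale) but skip "exp" variants.
    name.contains("deepseek") && name.contains("v3.2") && !name.contains("exp")
}

fn main() {
    assert!(is_deepseek_v32("deepseek-ai/DeepSeek-V3.2"));
    assert!(is_deepseek_v32("DeepSeek-V3.2-Speciale"));
    assert!(!is_deepseek_v32("DeepSeek-V3.2-Exp"));
    assert!(!is_deepseek_v32("meta-llama/Llama-3.1-8B"));
}
```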

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • deepseek_v32.rs: High-density logic with multiple public APIs (struct, enum, function), trait implementation, tool rendering with JSON handling, and message encoding pipeline. Requires verification of DSML format correctness and role-specific message handling.
  • template.rs: Model name detection logic and conditional routing; ensure pattern matching is robust and precedence is correct.
  • Test data alignment: Validate that test cases comprehensively cover tool usage, thinking blocks, and expected DSML output format.

Poem

🐰 A formatter hops in with V3.2 grace,
DSML tokens light up the space,
Tools and thinking modes intertwine,
Each message encoded in rhythms divine,
DeepSeek dreams rendered, perfectly fine! 🌟

Pre-merge checks

✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title clearly and concisely describes the main change: adding DeepSeek V3.2 chat template support. It accurately reflects the primary objective of the changeset.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
  • Description check: ✅ Passed. The PR description covers all required template sections with clear details about implementation, auto-detection, testing, and reviewer entry points.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
lib/llm/tests/deepseek_v32_encoding.rs (1)

41-47: Potential panic on malformed test input.

The code uses .unwrap() on as_object_mut() which will panic if the first message isn't an object. While acceptable for tests, consider using expect() with a descriptive message for better debugging.

-            first_msg
-                .as_object_mut()
-                .unwrap()
+            first_msg
+                .as_object_mut()
+                .expect("First message should be an object")
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)

66-68: Remove unused constant or add usage.

TOOL_CALLS_TEMPLATE is marked with #[allow(dead_code)] but appears unused. If it's reserved for future use, consider adding a TODO comment explaining the intent. Otherwise, remove it to reduce code clutter.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 046229f and ab84711.

📒 Files selected for processing (8)
  • lib/llm/src/preprocessor/prompt.rs (1 hunks)
  • lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1 hunks)
  • lib/llm/src/preprocessor/prompt/template.rs (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_input.json (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_output.txt (1 hunks)
  • lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt (1 hunks)
  • lib/llm/tests/deepseek_v32_encoding.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.

Applied to files:

  • lib/llm/src/preprocessor/prompt.rs
🧬 Code graph analysis (3)
lib/llm/src/preprocessor/prompt/template.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (2)
  • new (420-422)
  • new_thinking (425-427)
lib/llm/tests/deepseek_v32_encoding.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
  • encode_messages (390-409)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
lib/llm/src/preprocessor/prompt.rs (6)
  • tools (53-55)
  • messages (52-52)
  • supports_add_generation_prompt (83-83)
  • supports_add_generation_prompt (96-98)
  • render (84-84)
  • render (100-116)
🪛 GitHub Actions: Copyright Checks
lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt

[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt.

lib/llm/tests/data/deepseek-v3.2/test_output.txt

[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output.txt.

🪛 LanguageTool
lib/llm/tests/data/deepseek-v3.2/test_output.txt

[grammar] ~17-~17: Use a hyphen to join words.
Context: ...ibute should be set to "true" for string type parameters and "false" for other ty...

(QB_NEW_EN_HYPHEN)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: clippy (.)
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (11)
lib/llm/src/preprocessor/prompt.rs (1)

26-26: LGTM!

The new deepseek_v32 module is correctly exposed as public, following the existing pattern in this file.

lib/llm/src/preprocessor/prompt/template.rs (1)

21-31: Model detection relies on display_name string matching.

The detection logic is reasonable for the current use case. However, string-based model detection can be fragile if model naming conventions change.

Consider documenting the expected model name patterns that will match this logic (e.g., "DeepSeek-V3.2", "DeepSeek-V3.2-Speciale") either in a comment here or in the module documentation.

lib/llm/tests/data/deepseek-v3.2/test_input.json (1)

1-149: LGTM!

Comprehensive test input covering the weather workflow scenario with tools, tool calls, reasoning content, and multi-turn conversation. The JSON structure is valid and exercises the key encoding features.

lib/llm/tests/deepseek_v32_encoding.rs (2)

403-408: Verify chat mode behavior.

The comment states "Chat mode should have </think>, thinking mode should have <think>" but the assertion on line 407 only checks that thinking mode contains <think> without checking it also contains </think>. This may be intentional based on the encoding logic, but worth verifying the expected behavior.


100-106: Verify that referenced test data files exist.

The test references test_input_search_w_date.json and test_output_search_w_date.txt. Confirm these test data files are present in the repository's test data directory.
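
A small sketch of an existence check for those fixtures. The paths are taken from this review thread and assumed to be relative to the lib/llm crate root; the test name is hypothetical.

```rust
// Hypothetical guard test: fails fast if the referenced fixtures are missing.
#[test]
fn search_w_date_fixtures_exist() {
    let fixtures = [
        "tests/data/deepseek-v3.2/test_input_search_w_date.json",
        "tests/data/deepseek-v3.2/test_output_search_w_date.txt",
    ];
    for path in fixtures {
        assert!(
            std::path::Path::new(path).exists(),
            "missing test fixture: {path}"
        );
    }
}
```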

lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1)

1-533: LGTM!

Comprehensive test input exercising the developer role, multi-step tool invocations (search, open, find), and extensive reasoning content. This provides good coverage for complex agentic workflows.

lib/llm/src/preprocessor/prompt/deepseek_v32.rs (5)

88-119: JSON formatting handles common cases but has edge cases.

The to_json() function adds spaces after colons and commas to match Python's json.dumps() format. The string detection logic (prev_char != '\\') handles simple escaped quotes but won't correctly handle sequences like \\" (escaped backslash followed by quote).

For test data compatibility this is likely sufficient, but if edge cases arise, consider using serde_json::to_string_pretty with custom formatting or a more robust state machine.
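
For reference, a minimal sketch of such a state machine, assuming the goal is only to insert spaces after ':' and ',' outside string literals to mimic Python's json.dumps default separators. This is not the PR's actual to_json implementation.

```rust
/// Insert a space after ':' and ',' that appear outside JSON string literals,
/// tracking escape state so sequences like \\" are classified correctly.
fn add_spaces_outside_strings(compact: &str) -> String {
    let mut out = String::with_capacity(compact.len() + compact.len() / 8);
    let mut in_string = false;
    let mut escaped = false;
    for c in compact.chars() {
        match c {
            '\\' if in_string && !escaped => {
                escaped = true;
                out.push(c);
                continue; // the next character is escaped; keep the flag set
            }
            '"' if !escaped => in_string = !in_string,
            _ => {}
        }
        escaped = false;
        out.push(c);
        if !in_string && (c == ':' || c == ',') {
            out.push(' ');
        }
    }
    out
}

fn main() {
    let compact = r#"{"a":"x\\\"y","n":[1,2]}"#;
    assert_eq!(
        add_spaces_outside_strings(compact),
        r#"{"a": "x\\\"y", "n": [1, 2]}"#
    );
}
```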


278-285: Verify thinking mode logic for assistant messages.

The condition last_user_idx.is_some_and(|idx| index > idx) correctly checks if this assistant message comes after the last user message. The comment on lines 276-277 clarifies that <think> was already added in the user message rendering.

This is correct behavior - just noting for reviewers that the thinking tag handling is split across user and assistant message rendering.


360-371: Verify tool result thinking tag logic.

The condition index >= idx (line 362) triggers thinking start when the tool result index is greater than or equal to last_user_idx. This differs from the assistant handling which uses index > idx.

Is the >= intentional? If the tool result is at exactly the same index as the last user (which shouldn't happen structurally), this could produce unexpected output. The > comparison would be more consistent with assistant handling.
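
A tiny illustration of the boundary case in question, assuming index and last_user_idx behave like the loop variables described above (the values are made up):

```rust
fn main() {
    // A tool result landing at the same index as the last user message
    // (structurally unexpected, per the comment above).
    let last_user_idx: Option<usize> = Some(3);
    let index = 3;

    let assistant_rule = last_user_idx.is_some_and(|idx| index > idx); // false
    let tool_result_rule = last_user_idx.is_some_and(|idx| index >= idx); // true

    // The two rules disagree only at exactly this boundary.
    assert!(!assistant_rule);
    assert!(tool_result_rule);
}
```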


1-3: LGTM on overall implementation.

Clean native Rust implementation of DeepSeek V3.2 prompt formatting. The code follows the official Python reference implementation structure and handles the key features: system/user/developer/assistant/tool messages, DSML token encoding, thinking mode, and tool call formatting.


440-458: Verify tool handling in render() method.

The render() method converts messages but doesn't call req.tools(). If the request provides tools separately via req.tools(), they won't be rendered unless pre-injected into messages. Confirm whether tools are always guaranteed to be embedded in messages before render() is called, or if tool injection logic is needed.

@rmccorm4 rmccorm4 requested a review from GuanLuo December 9, 2025 00:32
@dagil-nvidia dagil-nvidia added the backend::sglang Relates to the sglang backend label Dec 9, 2025
@vladnosiv vladnosiv requested a review from a team as a code owner December 9, 2025 11:05
vladnosiv added a commit to vladnosiv/dynamo that referenced this pull request Dec 9, 2025
Signed-off-by: Vladislav Nosivskoy <[email protected]>
@grahamking
Contributor

/ok to test 82faa91


copy-pr-bot bot commented Dec 9, 2025

/ok to test 82faa91

@grahamking, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@grahamking
Contributor

/ok to test 4b6b97b

@dmitry-tokarev-nv dmitry-tokarev-nv left a comment

added a comment wrt attribution

@dmitry-tokarev-nv dmitry-tokarev-nv left a comment

LGTM

@rmccorm4
Contributor

/ok to test 6652fc9

@rmccorm4
Contributor

/ok to test c462de6

@rmccorm4 rmccorm4 merged commit 1efc7d6 into ai-dynamo:main Dec 12, 2025
47 of 55 checks passed

Labels

  • backend::sglang (Relates to the sglang backend)
  • external-contribution (Pull request is from an external contributor)
  • feat
  • size/XXL
