feat: DeepSeek V3.2 chat template support #4797
Signed-off-by: Vladislav Nosivskoy <[email protected]>
Walkthrough: This change introduces native DeepSeek V3.2 prompt formatting support to the LLM library.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (2)
lib/llm/tests/deepseek_v32_encoding.rs (1)
41-47: Potential panic on malformed test input.

The code uses `.unwrap()` on `as_object_mut()`, which will panic if the first message isn't an object. While acceptable for tests, consider using `expect()` with a descriptive message for better debugging.

```diff
-    first_msg
-        .as_object_mut()
-        .unwrap()
+    first_msg
+        .as_object_mut()
+        .expect("First message should be an object")
```

lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
66-68: Remove unused constant or add usage.

`TOOL_CALLS_TEMPLATE` is marked with `#[allow(dead_code)]` but appears unused. If it's reserved for future use, consider adding a TODO comment explaining the intent. Otherwise, remove it to reduce code clutter.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- lib/llm/src/preprocessor/prompt.rs (1 hunks)
- lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1 hunks)
- lib/llm/src/preprocessor/prompt/template.rs (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_input.json (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_output.txt (1 hunks)
- lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt (1 hunks)
- lib/llm/tests/deepseek_v32_encoding.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-24T20:59:35.725Z
Learnt from: ishandhanani
Repo: ai-dynamo/dynamo PR: 1626
File: lib/llm/src/preprocessor.rs:238-239
Timestamp: 2025-06-24T20:59:35.725Z
Learning: In lib/llm/src/preprocessor.rs, the `sampling_options` call in the `preprocess_request` method is placed in the common section after the match statement on `request.prompt_input_type()`, meaning it applies to both `PromptInput::Tokens` and `PromptInput::Text` request types.
Applied to files:
lib/llm/src/preprocessor/prompt.rs
🧬 Code graph analysis (3)
lib/llm/src/preprocessor/prompt/template.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (2)
`new` (420-422), `new_thinking` (425-427)
lib/llm/tests/deepseek_v32_encoding.rs (1)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
`encode_messages` (390-409)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (1)
lib/llm/src/preprocessor/prompt.rs (6)
`tools` (53-55), `messages` (52-52), `supports_add_generation_prompt` (83-83), `supports_add_generation_prompt` (96-98), `render` (84-84), `render` (100-116)
🪛 GitHub Actions: Copyright Checks
lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt
[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output_search_wo_date.txt.
lib/llm/tests/data/deepseek-v3.2/test_output.txt
[error] 1-1: Copyright check failed: Invalid/Missing Header detected in file lib/llm/tests/data/deepseek-v3.2/test_output.txt.
🪛 LanguageTool
lib/llm/tests/data/deepseek-v3.2/test_output.txt
[grammar] ~17-~17: Use a hyphen to join words.
Context: ...ibute should be set to "true" for string type parameters and "false" for other ty...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: clippy (.)
- GitHub Check: clippy (launch/dynamo-run)
- GitHub Check: clippy (lib/bindings/python)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (11)
lib/llm/src/preprocessor/prompt.rs (1)
26-26: LGTM!

The new `deepseek_v32` module is correctly exposed as public, following the existing pattern in this file.

lib/llm/src/preprocessor/prompt/template.rs (1)
21-31: Model detection relies on `display_name` string matching.

The detection logic is reasonable for the current use case. However, string-based model detection can be fragile if model naming conventions change.
Consider documenting the expected model name patterns that will match this logic (e.g., "DeepSeek-V3.2", "DeepSeek-V3.2-Speciale") either in a comment here or in the module documentation.
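To make the suggestion concrete, a hedged sketch of such a documented predicate follows. The function name and the exact substring matched are assumptions for illustration, not the actual logic in `template.rs`:

```rust
/// Hypothetical sketch of display-name based renderer detection.
/// Documents the expected patterns right next to the matching rule.
///
/// Matches e.g. "DeepSeek-V3.2" and "DeepSeek-V3.2-Speciale"
/// (case-insensitive substring match on the display name).
fn looks_like_deepseek_v32(display_name: &str) -> bool {
    display_name.to_ascii_lowercase().contains("deepseek-v3.2")
}
```

Keeping the doc comment and the match rule in one place makes it harder for the two to drift apart when naming conventions change.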
lib/llm/tests/data/deepseek-v3.2/test_input.json (1)
1-149: LGTM!

Comprehensive test input covering the weather workflow scenario with tools, tool calls, reasoning content, and multi-turn conversation. The JSON structure is valid and exercises the key encoding features.
lib/llm/tests/deepseek_v32_encoding.rs (2)
403-408: Verify chat mode behavior.

The comment states "Chat mode should have `</think>`, thinking mode should have `<think>`" but the assertion on line 407 only checks that thinking mode contains `<think>` without checking it also contains `</think>`. This may be intentional based on the encoding logic, but worth verifying the expected behavior.
100-106: Verify that referenced test data files exist.

The test references `test_input_search_w_date.json` and `test_output_search_w_date.txt`. Confirm these test data files are present in the repository's test data directory.

lib/llm/tests/data/deepseek-v3.2/test_input_search_wo_date.json (1)
1-533: LGTM!

Comprehensive test input exercising the developer role, multi-step tool invocations (search, open, find), and extensive reasoning content. This provides good coverage for complex agentic workflows.

lib/llm/src/preprocessor/prompt/deepseek_v32.rs (5)
lib/llm/src/preprocessor/prompt/deepseek_v32.rs (5)
88-119: JSON formatting handles common cases but has edge cases.

The `to_json()` function adds spaces after colons and commas to match Python's `json.dumps()` format. The string detection logic (`prev_char != '\\'`) handles simple escaped quotes but won't correctly handle sequences like `\\"` (escaped backslash followed by quote).

For test data compatibility this is likely sufficient, but if edge cases arise, consider using `serde_json::to_string_pretty` with custom formatting or a more robust state machine.
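For reference, a hedged sketch of the state-machine alternative follows. The function name `pythonish_spacing` is invented for illustration and is not the `to_json()` in this PR; the point is that tracking escape state as a boolean (rather than peeking at the previous character) handles `\\"` correctly:

```rust
// Hypothetical sketch: insert a space after ':' and ',' outside JSON
// strings, mimicking Python json.dumps' default separators. A boolean
// escape flag means an escaped backslash consumes its escape, so the
// quote after `\\` correctly terminates the string.
fn pythonish_spacing(json: &str) -> String {
    let mut out = String::with_capacity(json.len() + json.len() / 4);
    let mut in_string = false;
    let mut escaped = false;
    for c in json.chars() {
        out.push(c);
        if in_string {
            if escaped {
                escaped = false; // this char was escaped, whatever it is
            } else if c == '\\' {
                escaped = true; // next char is escaped
            } else if c == '"' {
                in_string = false; // unescaped quote closes the string
            }
        } else if c == '"' {
            in_string = true;
        } else if c == ':' || c == ',' {
            out.push(' '); // structural separator: add the space
        }
    }
    out
}
```

A separator inside a string value (e.g. `"a:b,c"`) is left untouched, and a value ending in an escaped backslash (e.g. `"x\\"`) no longer confuses the string-boundary detection.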
278-285: Verify thinking mode logic for assistant messages.

The condition `last_user_idx.is_some_and(|idx| index > idx)` correctly checks if this assistant message comes after the last user message. The comment on lines 276-277 clarifies that `<think>` was already added in the user message rendering.

This is correct behavior - just noting for reviewers that the thinking tag handling is split across user and assistant message rendering.
360-371: Verify tool result thinking tag logic.

The condition `index >= idx` (line 362) triggers thinking start when the tool result index is greater than or equal to `last_user_idx`. This differs from the assistant handling, which uses `index > idx`.

Is the `>=` intentional? If the tool result is at exactly the same index as the last user (which shouldn't happen structurally), this could produce unexpected output. The `>` comparison would be more consistent with the assistant handling.
1-3: LGTM on overall implementation.

Clean native Rust implementation of DeepSeek V3.2 prompt formatting. The code follows the official Python reference implementation structure and handles the key features: system/user/developer/assistant/tool messages, DSML token encoding, thinking mode, and tool call formatting.
440-458: Verify tool handling in `render()` method.

The `render()` method converts messages but doesn't call `req.tools()`. If the request provides tools separately via `req.tools()`, they won't be rendered unless pre-injected into messages. Confirm whether tools are always guaranteed to be embedded in messages before `render()` is called, or if tool injection logic is needed.
/ok to test 82faa91
@grahamking, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/ |
/ok to test 4b6b97b
dmitry-tokarev-nv
left a comment
added a comment wrt attribution
dmitry-tokarev-nv
left a comment
LGTM
/ok to test 6652fc9
/ok to test c462de6
Overview:
Add DeepSeek V3.2 chat template support (tool call parser support is not included yet)
NOTE: ~5k lines here are the official test files (json|txt) from DeepSeek: https://huggingface.co/deepseek-ai/DeepSeek-V3.2/tree/main/encoding
Details:
- `from_mdc` for dsv 3.2 and dsv 3.2-Speciale

Where should the reviewer start?

- `lib/llm/src/preprocessor/prompt/deepseek_v32.rs`: main implementation
- `lib/llm/tests/deepseek_v32_encoding.rs`: tests with official test data
- `lib/llm/src/preprocessor/prompt/template.rs`: renderer auto-detection

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Relates to #4796