-
Notifications
You must be signed in to change notification settings - Fork 767
fix: Support for msg[content] as a list #4380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
879b1d5
d5eaec9
7e662ce
fdfc7da
ef0165f
2e14a80
9b82304
fbe3d68
3decf92
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
Signed-off-by: Krishnan Prashanth <[email protected]>
- Loading branch information
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -70,6 +70,44 @@ fn replace_non_standard_blocks(template: &str) -> String { | |
| result | ||
| } | ||
|
|
||
| /// Detects whether a chat template requires message content as arrays (multimodal) | ||
| /// or accepts simple strings (standard text-only templates). | ||
| /// | ||
| /// This function test-renders the template with both formats: | ||
| /// - Array format: `[{"type": "text", "text": "X"}]` | ||
| /// - String format: `"X"` | ||
| /// | ||
| /// If the array format works but string format doesn't produce output, | ||
| /// the template requires arrays (e.g., llava, Qwen-VL multimodal templates). | ||
| fn detect_content_array_usage(env: &Environment) -> bool { | ||
| use minijinja::context; | ||
| use serde_json::json; | ||
|
|
||
| // Test with array format | ||
| let test_array = context! { | ||
| messages => json!([{"role": "user", "content": [{"type": "text", "text": "X"}]}]), | ||
| add_generation_prompt => false, | ||
| }; | ||
|
|
||
| // Test with string format | ||
| let test_string = context! { | ||
| messages => json!([{"role": "user", "content": "X"}]), | ||
| add_generation_prompt => false, | ||
| }; | ||
|
|
||
| let out_array = env | ||
| .get_template("default") | ||
| .and_then(|t| t.render(&test_array)) | ||
| .unwrap_or_default(); | ||
| let out_string = env | ||
| .get_template("default") | ||
| .and_then(|t| t.render(&test_string)) | ||
| .unwrap_or_default(); | ||
|
|
||
| // If array works but string doesn't, template requires arrays | ||
| out_array.contains("X") && !out_string.contains("X") | ||
| } | ||
|
|
||
| impl JinjaEnvironment { | ||
| fn env(self) -> Environment<'static> { | ||
| self.env | ||
|
|
@@ -165,11 +203,20 @@ impl HfTokenizerConfigJsonFormatter { | |
| } | ||
| } | ||
|
|
||
| // Detect at model load time whether this template requires content arrays | ||
| let requires_content_arrays = detect_content_array_usage(&env); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How does vllm-serve, sglang, trtllm-serve, etc. solve this problem?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From previous investigations, vLLM's pre-processor does something like:
And from my limited investigation into |
||
|
|
||
| tracing::info!( | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. info is too noisy for this, maybe debug or trace instead if we need this log |
||
| "Template analysis: requires_content_arrays = {}", | ||
| requires_content_arrays | ||
| ); | ||
|
|
||
| Ok(HfTokenizerConfigJsonFormatter { | ||
| env, | ||
| config, | ||
| mixins: Arc::new(mixins), | ||
| supports_add_generation_prompt: supports_add_generation_prompt.unwrap_or(false), | ||
| requires_content_arrays, | ||
| }) | ||
| } | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be an indepdendent fix targeting
maininstead of keivenchang/MDC-fix-on-main-nvbugs5662072? Or does it need to target Keiven's branch?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we merge Keiven's PR (without this one), special models (with custom block tags) would still be usable with dynamo+vLLM, but could fail if their inference request's are malformed (msg[content] is a string, but the model chat template wants a list). I will leave it up to you on what branch this fix should target. I guess it's more just a question of model support.