-
Notifications
You must be signed in to change notification settings - Fork 738
docs: Add multimodal documentation vllm, sglang, and trtllm backends #4510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Indrajit Bhosale <[email protected]>
rmccorm4
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update https://github.com/ai-dynamo/dynamo/blob/main/docs/multimodal/multimodal_intro.md at the bottom with links to each of the backend specific docs as a central location?
Signed-off-by: Indrajit Bhosale <[email protected]>
rmccorm4
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: https://github.com/ai-dynamo/dynamo/actions/runs/19580969130/job/56078496698?pr=4510
#15 5.473 checking consistency... /workspace/dynamo/docs/backends/sglang/multimodal_sglang_guide.md: WARNING: document isn't included in any toctree [toc.not_included]
#15 5.474 /workspace/dynamo/docs/backends/trtllm/multimodal_trtllm_guide.md: WARNING: document isn't included in any toctree [toc.not_included]
#15 5.474 /workspace/dynamo/docs/backends/vllm/multimodal_vllm_guide.md: WARNING: document isn't included in any toctree [toc.not_included]
Needs these files added to docs/hidden_toctree.rst
|
@krishung5 can you help review the docs here? Main point is to clearly document what is supported in each backend today with relation to multimodality, and highlight at least 1 key example or model for each. |
krishung5
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WalkthroughAdds three new comprehensive multimodal backend guides for SGLang, TRT-LLM, and vLLM, documenting support matrices, deployment architectures, component configurations, inter-component communication patterns, and known limitations. Updates documentation registry to include these new guides. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
docs/backends/sglang/multimodal_sglang_guide.md(1 hunks)docs/backends/trtllm/multimodal_trtllm_guide.md(1 hunks)docs/backends/vllm/multimodal_vllm_guide.md(1 hunks)docs/hidden_toctree.rst(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-16T19:47:30.312Z
Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3067
File: lib/llm/src/preprocessor/prompt/template/oai.rs:87-134
Timestamp: 2025-09-16T19:47:30.312Z
Learning: In Dynamo, multimodal requests (containing image_url or other non-text content) are processed through a completely different workflow than text-only requests, so the may_be_fix_msg_content function in lib/llm/src/preprocessor/prompt/template/oai.rs will only encounter text-only content arrays.
Applied to files:
docs/backends/trtllm/multimodal_trtllm_guide.mddocs/backends/vllm/multimodal_vllm_guide.mddocs/backends/sglang/multimodal_sglang_guide.md
📚 Learning: 2025-10-28T05:48:37.621Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_utils/model.py:39-42
Timestamp: 2025-10-28T05:48:37.621Z
Learning: In components/src/dynamo/vllm/multimodal_utils/model.py, the AutoModel.from_pretrained call with trust_remote_code=True in the load_vision_model function is intentional and expected for the vLLM multimodal implementation.
Applied to files:
docs/backends/trtllm/multimodal_trtllm_guide.md
🪛 LanguageTool
docs/backends/trtllm/multimodal_trtllm_guide.md
[uncategorized] ~175-~175: Do not mix variants of the same word (‘precompute’ and ‘pre-compute’) within a single text.
Context: ...dding path | No | | Encode → Prefill (Precomputed Embeddings) | NIXL metadata (pre-comp...
(EN_WORD_COHERENCY)
[uncategorized] ~186-~186: Do not mix variants of the same word (‘precompute’ and ‘pre-compute’) within a single text.
Context: ...UCX or NIXL) | | E->P->D Disaggregated (Precomputed Embeddings) | [`examples/backends/trtll...
(EN_WORD_COHERENCY)
docs/backends/sglang/multimodal_sglang_guide.md
[style] ~464-~464: To elevate your writing, try using a synonym here.
Context: ...ng difficulty**: Token count mismatches hard to diagnose Comparison with vLLM: ...
(HARD_TO)
🪛 markdownlint-cli2 (0.18.1)
docs/backends/trtllm/multimodal_trtllm_guide.md
33-33: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
62-62: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
86-86: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
115-115: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
224-224: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
235-235: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
295-295: Multiple headings with the same content
(MD024, no-duplicate-heading)
docs/backends/vllm/multimodal_vllm_guide.md
35-35: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
51-51: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
84-84: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
112-112: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
177-177: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
183-183: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
198-198: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
docs/backends/sglang/multimodal_sglang_guide.md
35-35: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
51-51: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
84-84: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
112-112: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
177-177: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
183-183: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
198-198: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
222-222: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
375-375: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build and Test - dynamo
- GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (2)
docs/hidden_toctree.rst (2)
78-78: Verify LMCache_Integration.md file exists.Line 78 adds
backends/vllm/LMCache_Integration.mdto the toctree, but this file was not included in the PR files for review. Confirm that this file exists or is intended to be added in this PR.
54-54: Toctree entries are well-structured.The four new toctree entries (lines 54, 65, 78, 81) are correctly formatted and logically placed within their respective backend sections. They follow the existing naming and ordering conventions.
Also applies to: 65-65, 78-78, 81-81
|
Originally we have two TRTLLM multimodal docs, multimodal_support.md and multimodal_epd.md. I think it will be confusing if we have total 3 docs for multimodal(including this new guide) so I consolidated the first two into the |
|
/ok to test b2870e2 |
rmccorm4
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up Kris!
-
Only one small comment left for this PR: #4510 (comment)
-
For a future PR maybe, I think it's a little confusing to users that sglang has a multimodal_epd.md but vllm/trtllm describe it in separate doc. Maybe we can move sglang EPD info into single doc, or split out the other backend EPDs to separate doc -- either way is fine, just to be consistent for all 3.
I think we could maybe just rename the sglang |
…4510) Signed-off-by: Indrajit Bhosale <[email protected]> Co-authored-by: krishung5 <[email protected]>
Merged 238 commits from main branch to bring the feature branch up to date. Key conflicts resolved: - Removed lib/kvbm-kernels references (deleted in main) - Kept nova/nova-backend/kvbm workspace members from feature branch - Maintained v2 module API refactoring from feature branch - Updated Cargo.lock files to reflect new dependencies Major updates from main include: - LoRA support for vLLM (#4810) - Multimodal documentation (#4510) - Scaling adapter features (#4699, #4825) - Tool calling support (#4822, #4722) - NIXL connect improvements (#4433) Signed-off-by: Ryan Olson <[email protected]>
…i-dynamo#4510) Signed-off-by: Indrajit Bhosale <[email protected]> Co-authored-by: krishung5 <[email protected]>
Overview:
Add comprehensive multimodal guides for vLLM, Sglang and TRT-LLM backends documenting architectures, deployment modes, input formats, and known limitations.
Details:
docs/backends/vllm/multimodal_vllm_guide.md- Complete vLLM multimodal reference-New:
docs/backends/trtllm/multimodal_trtllm_guide.md- Complete TRT-LLM multimodal reference-New:
dynamo/docs/backends/sglang/multimodal_sglang_guide.md- Complete SGlang multimodal referenceSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.