
@indrajit96 indrajit96 commented Nov 20, 2025

Overview:

Add comprehensive multimodal guides for the vLLM, SGLang, and TRT-LLM backends, documenting architectures, deployment modes, input formats, and known limitations.

Details:

  • New: docs/backends/vllm/multimodal_vllm_guide.md - Complete vLLM multimodal reference
  • New: docs/backends/trtllm/multimodal_trtllm_guide.md - Complete TRT-LLM multimodal reference
  • New: docs/backends/sglang/multimodal_sglang_guide.md - Complete SGLang multimodal reference

Summary by CodeRabbit

  • Documentation
    • Added comprehensive multimodal backend guides for SGLang, TRT-LLM, and vLLM inference backends
    • Guides document architecture patterns, deployment modes, component responsibilities, and known limitations
    • Includes practical configuration examples and architectural diagrams
    • Updated documentation index to surface new guides


Signed-off-by: Indrajit Bhosale <[email protected]>
@rmccorm4 rmccorm4 left a comment


Can you update https://github.com/ai-dynamo/dynamo/blob/main/docs/multimodal/multimodal_intro.md at the bottom with links to each of the backend specific docs as a central location?
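A sketch of what that central links section might look like at the bottom of multimodal_intro.md (the heading and link text are illustrative; the paths are the guides added in this PR, relative to docs/multimodal/):

```markdown
## Backend-Specific Multimodal Guides

- [vLLM Multimodal Guide](../backends/vllm/multimodal_vllm_guide.md)
- [SGLang Multimodal Guide](../backends/sglang/multimodal_sglang_guide.md)
- [TRT-LLM Multimodal Guide](../backends/trtllm/multimodal_trtllm_guide.md)
```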

Signed-off-by: Indrajit Bhosale <[email protected]>
@rmccorm4 rmccorm4 changed the title 2/3 Done docs: Add multimodal documentation vllm, sglang, and trtllm backends Nov 21, 2025
@github-actions github-actions bot added the docs label Nov 21, 2025
@rmccorm4 rmccorm4 left a comment


re: https://github.com/ai-dynamo/dynamo/actions/runs/19580969130/job/56078496698?pr=4510

#15 5.473 checking consistency... /workspace/dynamo/docs/backends/sglang/multimodal_sglang_guide.md: WARNING: document isn't included in any toctree [toc.not_included]
#15 5.474 /workspace/dynamo/docs/backends/trtllm/multimodal_trtllm_guide.md: WARNING: document isn't included in any toctree [toc.not_included]
#15 5.474 /workspace/dynamo/docs/backends/vllm/multimodal_vllm_guide.md: WARNING: document isn't included in any toctree [toc.not_included]

Needs these files added to docs/hidden_toctree.rst
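Sphinx emits `toc.not_included` for any document not reachable from a toctree; listing the pages in a hidden toctree silences it without surfacing them in the navigation. A sketch of the entries (the actual hidden_toctree.rst interleaves them into its existing per-backend sections, so exact placement may differ):

```rst
.. toctree::
   :hidden:

   backends/sglang/multimodal_sglang_guide.md
   backends/trtllm/multimodal_trtllm_guide.md
   backends/vllm/multimodal_vllm_guide.md
```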

@rmccorm4

@krishung5 can you help review the docs here? The main point is to clearly document what is supported in each backend today with respect to multimodality, and to highlight at least one key example or model for each.


@krishung5 krishung5 left a comment


Thanks for putting up this doc, great work! One minor comment: we have multimodal docs for all three frameworks, so maybe we can link them in the guide here somehow? i.e. vllm, trtllm here and here, and sglang

@krishung5 krishung5 marked this pull request as ready for review December 4, 2025 22:58
@coderabbitai

coderabbitai bot commented Dec 4, 2025

Walkthrough

Adds three new comprehensive multimodal backend guides for SGLang, TRT-LLM, and vLLM, documenting support matrices, deployment architectures, component configurations, inter-component communication patterns, and known limitations. Updates documentation registry to include these new guides.

Changes

Cohort / File(s) Summary
Multimodal Backend Guides
docs/backends/sglang/multimodal_sglang_guide.md, docs/backends/trtllm/multimodal_trtllm_guide.md, docs/backends/vllm/multimodal_vllm_guide.md
Adds three new documentation files covering multimodal support architecture, deployment patterns (aggregated/disaggregated modes), component responsibilities, NATS/NIXL inter-component communication, vision encoding specifics, model registration patterns, supported models, and comprehensive gaps/limitations documentation for each backend.
Documentation Registry
docs/hidden_toctree.rst
Registers four new documentation entries in the hidden toctree: three multimodal backend guides (TRT-LLM, SGLang, vLLM) and one vLLM LMCache integration guide.
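The input formats these guides document follow the OpenAI chat-completions shape, where message content is a list of typed parts. A minimal sketch of such a multimodal request payload in Python (the model name and image URL are placeholders, not values taken from the guides):

```python
import json

# Hypothetical OpenAI-style multimodal chat request. The content array mixes
# a text part with an image_url part; the frontend routes such requests
# through the multimodal workflow rather than the text-only path.
payload = {
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "http://example.com/cat.png"},
                },
            ],
        }
    ],
    "max_tokens": 128,
}

# Serialize as it would be sent in the HTTP request body.
body = json.dumps(payload)
```

The presence of a non-text content part is what distinguishes this from a plain text request at the preprocessor level.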

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Specific areas requiring attention:
    • Architectural consistency across the three backend guides (ensure deployment pattern descriptions align conceptually)
    • Accuracy of component responsibility descriptions and command-line flags
    • Completeness and correctness of gaps/limitations sections, particularly regarding current vs. planned support
    • Technical accuracy of NATS/NIXL communication patterns and token expansion mechanics described in each guide
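One mechanic flagged above, token expansion, can be sketched as follows: the tokenized prompt's single image placeholder is expanded into N positions that the vision embeddings later occupy, so token count and embedding count line up. The sentinel ID and expansion count below are illustrative, not any backend's actual values:

```python
from typing import List

IMAGE_PLACEHOLDER = -200   # illustrative sentinel for the <image> token
NUM_IMAGE_TOKENS = 576     # illustrative patch count (e.g. a 24x24 grid)

def expand_image_tokens(token_ids: List[int]) -> List[int]:
    """Replace each image placeholder with NUM_IMAGE_TOKENS copies so the
    prompt length matches the number of vision embeddings injected later."""
    expanded: List[int] = []
    for tok in token_ids:
        if tok == IMAGE_PLACEHOLDER:
            expanded.extend([IMAGE_PLACEHOLDER] * NUM_IMAGE_TOKENS)
        else:
            expanded.append(tok)
    return expanded

# 4 text tokens + 1 placeholder -> 4 text tokens + 576 image slots
prompt = [101, IMAGE_PLACEHOLDER, 2054, 2003, 102]
expanded = expand_image_tokens(prompt)
```

A mismatch between the expansion count and the encoder's actual output length is exactly the kind of hard-to-diagnose error the sglang guide's limitations section mentions.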


Pre-merge checks

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'docs: Add multimodal documentation vllm, sglang, and trtllm backends' directly and specifically describes the main change: adding multimodal documentation guides for three backends.
Description check ✅ Passed The PR description covers the required template sections: Overview (clear purpose), Details (specific files added), but lacks 'Where should the reviewer start?' and 'Related Issues' sections.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

📜 Review details

Configuration used: Path: .coderabbit.yaml


📥 Commits

Reviewing files that changed from the base of the PR and between b8c4a5f and 81402c2.

📒 Files selected for processing (4)
  • docs/backends/sglang/multimodal_sglang_guide.md (1 hunks)
  • docs/backends/trtllm/multimodal_trtllm_guide.md (1 hunks)
  • docs/backends/vllm/multimodal_vllm_guide.md (1 hunks)
  • docs/hidden_toctree.rst (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-16T19:47:30.312Z
Learnt from: KrishnanPrash
Repo: ai-dynamo/dynamo PR: 3067
File: lib/llm/src/preprocessor/prompt/template/oai.rs:87-134
Timestamp: 2025-09-16T19:47:30.312Z
Learning: In Dynamo, multimodal requests (containing image_url or other non-text content) are processed through a completely different workflow than text-only requests, so the may_be_fix_msg_content function in lib/llm/src/preprocessor/prompt/template/oai.rs will only encounter text-only content arrays.

Applied to files:

  • docs/backends/trtllm/multimodal_trtllm_guide.md
  • docs/backends/vllm/multimodal_vllm_guide.md
  • docs/backends/sglang/multimodal_sglang_guide.md
📚 Learning: 2025-10-28T05:48:37.621Z
Learnt from: ayushag-nv
Repo: ai-dynamo/dynamo PR: 3634
File: components/src/dynamo/vllm/multimodal_utils/model.py:39-42
Timestamp: 2025-10-28T05:48:37.621Z
Learning: In components/src/dynamo/vllm/multimodal_utils/model.py, the AutoModel.from_pretrained call with trust_remote_code=True in the load_vision_model function is intentional and expected for the vLLM multimodal implementation.

Applied to files:

  • docs/backends/trtllm/multimodal_trtllm_guide.md
🪛 LanguageTool
docs/backends/trtllm/multimodal_trtllm_guide.md

[uncategorized] ~175-~175: Do not mix variants of the same word (‘precompute’ and ‘pre-compute’) within a single text.
Context: ...dding path | No | | Encode → Prefill (Precomputed Embeddings) | NIXL metadata (pre-comp...

(EN_WORD_COHERENCY)


[uncategorized] ~186-~186: Do not mix variants of the same word (‘precompute’ and ‘pre-compute’) within a single text.
Context: ...UCX or NIXL) | | E->P->D Disaggregated (Precomputed Embeddings) | [`examples/backends/trtll...

(EN_WORD_COHERENCY)

docs/backends/sglang/multimodal_sglang_guide.md

[style] ~464-~464: To elevate your writing, try using a synonym here.
Context: ...ng difficulty**: Token count mismatches hard to diagnose Comparison with vLLM: ...

(HARD_TO)

🪛 markdownlint-cli2 (0.18.1)

docs/backends/trtllm/multimodal_trtllm_guide.md

  • MD040 (fenced-code-language), "Fenced code blocks should have a language specified": lines 33, 62, 86, 115
  • MD036 (no-emphasis-as-heading), "Emphasis used instead of a heading": lines 224, 235
  • MD024 (no-duplicate-heading), "Multiple headings with the same content": line 295

docs/backends/vllm/multimodal_vllm_guide.md

  • MD040 (fenced-code-language), "Fenced code blocks should have a language specified": lines 35, 51, 84, 112, 177, 183, 198

docs/backends/sglang/multimodal_sglang_guide.md

  • MD040 (fenced-code-language), "Fenced code blocks should have a language specified": lines 35, 51, 84, 112, 177, 183, 198, 222, 375
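Every MD040 finding has the same fix: add a language to the opening fence. A sketch (the block contents here are illustrative, not taken from the guides):

````markdown
Before, flagged by MD040:

```
curl http://localhost:8000/v1/chat/completions
```

After, with a language on the fence:

```bash
curl http://localhost:8000/v1/chat/completions
```
````

Besides silencing the lint, the language tag enables syntax highlighting in the rendered docs.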

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (2)
docs/hidden_toctree.rst (2)

78-78: Verify LMCache_Integration.md file exists.

Line 78 adds backends/vllm/LMCache_Integration.md to the toctree, but this file was not included in the PR files for review. Confirm that this file exists or is intended to be added in this PR.


54-54: Toctree entries are well-structured.

The four new toctree entries (lines 54, 65, 78, 81) are correctly formatted and logically placed within their respective backend sections. They follow the existing naming and ordering conventions.

Also applies to: 65-65, 78-78, 81-81

@krishung5

Originally we had two TRT-LLM multimodal docs, multimodal_support.md and multimodal_epd.md. I think it would be confusing to have three multimodal docs in total (including this new guide), so I consolidated the first two into multimodal_support.md.

@krishung5

/ok to test b2870e2

@krishung5 krishung5 requested a review from rmccorm4 December 6, 2025 06:50

@rmccorm4 rmccorm4 left a comment


Thanks for picking this up Kris!

  1. Only one small comment left for this PR: #4510 (comment)

  2. For a future PR maybe, I think it's a little confusing to users that sglang has a multimodal_epd.md but vllm/trtllm describe it in separate doc. Maybe we can move sglang EPD info into single doc, or split out the other backend EPDs to separate doc -- either way is fine, just to be consistent for all 3.

@krishung5
Copy link
Contributor

> For a future PR maybe, I think it's a little confusing to users that sglang has a multimodal_epd.md but vllm/trtllm describe it in separate doc. Maybe we can move sglang EPD info into single doc, or split out the other backend EPDs to separate doc -- either way is fine, just to be consistent for all 3.

I think we could maybe just rename the sglang multimodal_epd.md to multimodal.md. For the other two backends, there's only one multimodal doc (we'd need to rename multimodal_support.md to multimodal.md for TRTLLM as well to have consistent naming across all the backends). Sglang didn't have the non-EPD workflow, which is why that doc was named with the epd keyword.

@krishung5 krishung5 merged commit 94d145a into main Dec 9, 2025
26 of 27 checks passed
@krishung5 krishung5 deleted the mm_docs branch December 9, 2025 15:38
karen-sy pushed a commit that referenced this pull request Dec 9, 2025
ryanolson added a commit that referenced this pull request Dec 10, 2025
Merged 238 commits from main branch to bring the feature branch up to date.

Key conflicts resolved:
- Removed lib/kvbm-kernels references (deleted in main)
- Kept nova/nova-backend/kvbm workspace members from feature branch
- Maintained v2 module API refactoring from feature branch
- Updated Cargo.lock files to reflect new dependencies

Major updates from main include:
- LoRA support for vLLM (#4810)
- Multimodal documentation (#4510)
- Scaling adapter features (#4699, #4825)
- Tool calling support (#4822, #4722)
- NIXL connect improvements (#4433)

Signed-off-by: Ryan Olson <[email protected]>
zxue2 pushed a commit to zxue2/dynamo that referenced this pull request Dec 11, 2025