
Conversation

@tanmayv25 (Contributor) commented Aug 21, 2025

Overview:

There are known issues with DeepGEMM on SBSA when VSWA is enabled; upgrading the pinned TensorRT-LLM version should resolve them. Relevant upstream fix:

NVIDIA/TensorRT-LLM@0ff8df9

Summary by CodeRabbit

  • Documentation
    • Removed outdated Multi-Token Prediction guidance and build flags from deployment notes, including DeepSeek R1 and Gemma 3 specifics. Latency and MTP caveats remain.
  • Chores
    • Updated defaults to TensorRT-LLM 1.0.0rc6 for builds and optional dependencies.
    • Refreshed experimental baseline used when building without an explicit wheel/commit, so unattended builds use the newer default.

coderabbitai bot (Contributor) commented Aug 21, 2025

Walkthrough

Documentation references to experimental TensorRT-LLM build requirements for MTP/VSWA were removed across trtllm docs. The container build script updated the default experimental TensorRT-LLM commit and default wheel to 1.0.0rc6. The Python optional dependency for trtllm was bumped to tensorrt-llm==1.0.0rc6.

Changes

  • Docs cleanup: remove MTP/VSWA build notes
    Files: components/backends/trtllm/README.md, components/backends/trtllm/deploy/README.md, components/backends/trtllm/gemma3_sliding_window_attention.md
    Deleted references to the experimental TensorRT-LLM commit, associated build flags/commands, and VSWA compatibility notes. Content about aggregation/serving is otherwise unchanged.
  • Build defaults update
    Files: container/build.sh
    Updated DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT to a16ba64… and DEFAULT_TENSORRTLLM_PIP_WHEEL to tensorrt-llm==1.0.0rc6; affects default selection when no wheel/commit is provided.
  • Dependency bump
    Files: pyproject.toml
    Updated the optional dependency trtllm: tensorrt-llm from 1.0.0rc4 to 1.0.0rc6; no other changes in that block.
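The default-selection behavior described above can be sketched as follows. This is a minimal illustration, not the actual contents of container/build.sh: the variable names for the pinned defaults mirror the script, but the caller-override flow and the TRTLLM_COMMIT name are assumptions.

```shell
# Pinned defaults after this PR (values taken from the change summary).
DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT="a16ba6445c61ed70e7aadfe787d6f316bb422652"
DEFAULT_TENSORRTLLM_PIP_WHEEL="tensorrt-llm==1.0.0rc6"

# Caller may pre-set these; default to empty if unset.
TENSORRTLLM_PIP_WHEEL="${TENSORRTLLM_PIP_WHEEL:-}"
TRTLLM_COMMIT="${TRTLLM_COMMIT:-}"

# When neither a wheel nor a commit is provided, fall back to the pinned
# defaults so unattended builds pick up the newer baseline.
if [ -z "$TENSORRTLLM_PIP_WHEEL" ]; then
    TENSORRTLLM_PIP_WHEEL="$DEFAULT_TENSORRTLLM_PIP_WHEEL"
fi
if [ -z "$TRTLLM_COMMIT" ]; then
    TRTLLM_COMMIT="$DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT"
fi

echo "wheel=$TENSORRTLLM_PIP_WHEEL"
echo "commit=$TRTLLM_COMMIT"
```

Running this with no overrides prints the rc6 wheel and the a16ba64… commit, which is the "unattended build" case the summary refers to.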

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

A hare with a wrench and a version tag,
Hops past rc4 with a jaunty wag.
Docs trimmed neat, no flags to keep,
Build wheels roll to rc6 deep.
Commit carrots fresh—a crunchy heap! 🥕🐇




coderabbitai bot (Contributor) left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
container/build.sh (1)

101-103: Defaulting to a pip wheel conflicts with the earlier "build from source" guidance and the ABI note.

Comments above state the default should be option 1 (local wheel/build from source) due to ABI incompatibility between upstream wheels and NGC PyTorch, yet the logic defaults to installing a versioned pip wheel when neither a wheel nor commit is provided. If TENSORRTLLM_INDEX_URL remains the public PyPI (line 98), this default can be brittle.

Two options:

  • Keep default = build from source (safer with NGC PyTorch):

      -DEFAULT_TENSORRTLLM_PIP_WHEEL="tensorrt-llm==1.0.0rc6"
      +DEFAULT_TENSORRTLLM_PIP_WHEEL=""
      +# When empty, the logic falls back to a commit-based local wheel build.

  • Or, if defaulting to a wheel is now intended, update the surrounding comments to state explicitly that the default is option 2 (wheel), and consider switching TENSORRTLLM_INDEX_URL to a vetted internal index.

Also consider following through on the TODO (lines 99–101) to install ai-dynamo[trtllm] in Dockerfile.trtllm so one source of truth drives the version. I can help wire this up in a follow-up.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL


📥 Commits

Reviewing files that changed from the base of the PR and between 7d27033 and cdc657d.

📒 Files selected for processing (5)
  • components/backends/trtllm/README.md (0 hunks)
  • components/backends/trtllm/deploy/README.md (0 hunks)
  • components/backends/trtllm/gemma3_sliding_window_attention.md (0 hunks)
  • container/build.sh (2 hunks)
  • pyproject.toml (1 hunks)
💤 Files with no reviewable changes (3)
  • components/backends/trtllm/deploy/README.md
  • components/backends/trtllm/README.md
  • components/backends/trtllm/gemma3_sliding_window_attention.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: pre-merge-rust (lib/runtime/examples)
  • GitHub Check: pre-merge-rust (lib/bindings/python)
  • GitHub Check: pre-merge-rust (.)
🔇 Additional comments (2)
pyproject.toml (1)

52-52: Bump to tensorrt-llm 1.0.0rc6 — looks good.

Aligns with the PR objective to move to rc6. No functional concerns here.
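For reference, the changed block in pyproject.toml presumably has this shape. The "trtllm" extra name and the pinned version come from this PR; the surrounding table layout is an assumption about how the project declares optional dependencies.

```toml
# Hypothetical shape of the optional-dependency block; only the "trtllm"
# extra name and the tensorrt-llm pin are taken from the diff.
[project.optional-dependencies]
trtllm = [
    "tensorrt-llm==1.0.0rc6",
]
```

With this layout, `pip install "ai-dynamo[trtllm]"` (or whatever the project name resolves to) would pull in the pinned rc6 wheel, which is why the review suggests making this block the single source of truth for the version.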

container/build.sh (1)

92-96: Verify DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT — it doesn’t match the DeepGEMM SBSA fix commit cited in the PR.

PR description references upstream fix commit 0ff8df95b7ccf0412b32be7befddbec3503115b6 (“[fix] DeepGEMM installation on SBSA”), but DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT is set to a16ba6445c61ed70e7aadfe787d6f316bb422652 (a docs-only change). If users pass --use-default-experimental-tensorrtllm-commit, they won’t pick up the DeepGEMM fix.

If the intent is to default to the DeepGEMM fix, change to the cited commit:

-DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT="a16ba6445c61ed70e7aadfe787d6f316bb422652"
+DEFAULT_EXPERIMENTAL_TRTLLM_COMMIT="0ff8df95b7ccf0412b32be7befddbec3503115b6"

Please confirm the desired commit/tag for experimental builds by comparing the DeepGEMM SBSA fix commit against the currently configured one on GitHub.

@nv-kmcgill53 (Contributor) left a comment


why do we need to remove the MTP sections from the docs? Should I trust coderabbit when it says they are outdated?

@indrajit96 (Contributor) left a comment


LGTM on the Multimodal-side instructions.

@tanmayv25 (Contributor, Author) commented

why do we need to remove the MTP sections from the docs? Should I trust coderabbit when it says they are outdated?

These docs are outdated; MTP support should be available in the trtllm version we are using.

@tanmayv25 tanmayv25 merged commit 9ab37d9 into main Aug 22, 2025
14 of 16 checks passed
@tanmayv25 tanmayv25 deleted the tanmayv-update branch August 22, 2025 00:59