Conversation

@xinhe-nv
Collaborator

@xinhe-nv xinhe-nv commented Oct 27, 2025

Waive the failing case test_e2e.test_ptp_quickstart_advanced_multi_gpus[DeepSeek-V3-671B-FP8-DeepSeek-V3-0324-8], which hits an out-of-memory (OOM) error during model creation on H100.

Summary by CodeRabbit

  • Tests
    • Skipped a test case for advanced multi-GPU configurations due to a known issue.

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

@jieli-matrix jieli-matrix marked this pull request as ready for review October 27, 2025 10:10
@jieli-matrix jieli-matrix enabled auto-merge (squash) October 27, 2025 10:10
@jieli-matrix
Collaborator

/bot run --skip-test

@coderabbitai
Contributor

coderabbitai bot commented Oct 27, 2025

Walkthrough

A single skip entry was added to the test waives file for test_ptp_quickstart_advanced_multi_gpus[DeepSeek-V3-671B-FP8-DeepSeek-V3-0324-8] with a reference to an issue tracker ID. No functional or logic changes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Test waives list: tests/integration/test_lists/waives.txt | Added skip entry for DeepSeek-V3 multi-GPU test case (issue reference: nvbugs/5613456) |
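
As a rough illustration of what a waive entry of this kind could look like to a tool that reads the list, here is a minimal Python sketch. It assumes each entry has the shape "<test id> SKIP (<reason>)"; that shape, the helper names parse_waive and is_waived, and the EXAMPLE line are assumptions made for illustration, since the exact text added to waives.txt is not shown in this conversation.

```python
import re
from typing import Iterable, NamedTuple, Optional


class Waive(NamedTuple):
    test_id: str
    reason: str


# Assumed line shape: "<test id> SKIP (<reason>)". This is an illustration,
# not a confirmed specification of the waives.txt format.
_WAIVE_RE = re.compile(r"^(?P<test_id>\S.*?)\s+SKIP\s+\((?P<reason>.*)\)\s*$")


def parse_waive(line: str) -> Optional[Waive]:
    """Return the parsed entry, or None for blank, comment, or unmatched lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    match = _WAIVE_RE.match(stripped)
    if match is None:
        return None
    return Waive(match.group("test_id"), match.group("reason"))


def is_waived(test_id: str, lines: Iterable[str]) -> bool:
    """True if any parsed entry names exactly this test id."""
    return any(
        entry is not None and entry.test_id == test_id
        for entry in map(parse_waive, lines)
    )


# Hypothetical entry modeled on this PR; the real line may differ.
EXAMPLE = (
    "test_e2e.py::test_ptp_quickstart_advanced_multi_gpus"
    "[DeepSeek-V3-671B-FP8-DeepSeek-V3-0324-8] SKIP (https://nvbugs/5613456)"
)

if __name__ == "__main__":
    print(parse_waive(EXAMPLE))
```

Matching on the exact test id, rather than on a prefix, keeps a waive scoped to the single parameterized case named in the entry, which fits the intent here of skipping only the DeepSeek-V3 variant.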

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Suggested reviewers

  • crazydemo
  • LarryXFly

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Description Check | ⚠️ Warning | The pull request description is incomplete relative to the repository template. While the PR does include a brief description ("waive failed cases for model creation OOM in H100...") and a PR checklist, the required "Test Coverage" section is entirely missing from the submission. The description section is also minimal, providing only a single sentence rather than explaining the issue and solution in sufficient detail. The template explicitly calls for this information to ensure clear communication of the change's scope and validation. | Add a "Test Coverage" section explaining which tests validate this waive addition and confirm there are no affected test paths. Additionally, expand the description section to better explain why this specific test needs to be waived (the OOM issue details) and any relevant context about the DeepSeek-V3 model testing on H100 hardware. Even for routine maintenance PRs like test waivers, following the template ensures consistency and clarity for future reviewers. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The pull request title "[https://nvbugs/5613456][chore] Skip test_ptp_quickstart_advanced_multi_gpus[DeepSeek-V3-671B-FP8-DeepSeek-V3-0324-8] due to Model Creation OOM" directly and clearly describes the main change in the changeset. The changeset consists of a single modification: adding an entry to waives.txt that skips the specified test due to an out-of-memory error during model creation. The title precisely identifies which test is being skipped and provides the reason for the skip, making it fully related to and a clear summary of the primary change. |

@jieli-matrix jieli-matrix changed the title from "[None][chore] Add failed cases into waives.txt" to "[https://nvbugspro.nvidia.com/bug/5613456][chore] Skip test_ptp_quickstart_advanced_multi_gpus[DeepSeek-V3-671B-FP8-DeepSeek-V3-0324-8] due to Model Creation OOM" Oct 27, 2025
@jieli-matrix jieli-matrix changed the title from "[https://nvbugspro.nvidia.com/bug/5613456][chore] Skip test_ptp_quickstart_advanced_multi_gpus[DeepSeek-V3-671B-FP8-DeepSeek-V3-0324-8] due to Model Creation OOM" to "[https://nvbugs/5613456][chore] Skip test_ptp_quickstart_advanced_multi_gpus[DeepSeek-V3-671B-FP8-DeepSeek-V3-0324-8] due to Model Creation OOM" Oct 27, 2025
@xinhe-nv
Collaborator Author

/bot run --skip-test

@tensorrt-cicd
Collaborator

PR_Github #22637 [ run ] triggered by Bot. Commit: 1cca99f

Signed-off-by: xinhe-nv <[email protected]>
@xinhe-nv xinhe-nv force-pushed the user/qa/post_update_waive_20251027_LLM_FUNCTION_TEST_1569 branch from 1cca99f to 9780f25 Compare October 27, 2025 11:48
@tensorrt-cicd
Collaborator

PR_Github #22637 [ run ] completed with state SUCCESS. Commit: 1cca99f
/LLM/main/L0_MergeRequest_PR pipeline #17064 (Partly Tested) completed with status: 'SUCCESS'
Pipeline passed with automatically retried tests. Check the rerun report for details.

@xinhe-nv
Collaborator Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #22832 [ reuse-pipeline ] triggered by Bot. Commit: 629f805

@tensorrt-cicd
Collaborator

PR_Github #22832 [ reuse-pipeline ] completed with state SUCCESS. Commit: 629f805
Reusing PR_Github #22637 (Partly Tested) for commit 629f805

@jieli-matrix jieli-matrix merged commit 7ba98a6 into NVIDIA:main Oct 29, 2025
5 checks passed
@xinhe-nv xinhe-nv deleted the user/qa/post_update_waive_20251027_LLM_FUNCTION_TEST_1569 branch October 29, 2025 03:30

4 participants