Skip to content

Conversation

@Funatiq
Copy link
Collaborator

@Funatiq Funatiq commented Aug 8, 2025

Summary by CodeRabbit

  • Tests

    • Reorganized tests into unit and end-to-end groups; moved and rebalanced multi‑GPU, kernel, layer, runtime, batch-manager, and e2e coverage. Added and removed multiple test targets, adjusted discovery/invocation paths, added new multi‑GPU and e2e tests (including userBuffer), shortened some L0 timeouts, and changed several tests from silent skips to assertion failures.
  • Chores

    • Centralized build-type propagation and unified Google‑tests build step; improved test logging, artifact backup, and CI test-path triggers.

Description

Test Coverage

Cpp tests that don’t seem to run in ci:

  • allReduceKernelTest
  • allReduceFusionTest
  • moeAllReduceFusionTest

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md
and the scripts/test_to_stage_mapping.py helper.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 8, 2025

Caution

Review failed

The head commit changed during the review from 9289779 to 06e5dd6.

📝 Walkthrough

Walkthrough

Reorganizes C++ test layout by introducing e2e_tests, adding a multi_gpu unit-test subtree, moving/regrouping many GoogleTest targets, updating several test sources/assertions, and propagating a new build_type fixture to Python integration fixtures while consolidating google-tests builds and log backup handling.

Changes

Cohort / File(s) Summary of Changes
Top-level tests CMake
cpp/tests/CMakeLists.txt
Replaced individual add_gtest entries and conditional subdirs with add_subdirectory(unit_tests) and add_subdirectory(e2e_tests).
E2E tests tree
cpp/tests/e2e_tests/CMakeLists.txt, cpp/tests/e2e_tests/.../CMakeLists.txt
Added e2e_tests CMakeLists and subdirs; reorganized batch_manager gtest targets (added/removed/reordered targets like guidedDecoderTest, trtEncoderModelTest, retained trtGptModelTest).
Unit tests — batch_manager
cpp/tests/unit_tests/batch_manager/CMakeLists.txt
Added blockKeyTest, peftCacheManagerTest, reordered cacheTransBufferTest and other targets.
Unit tests — kernels
cpp/tests/unit_tests/kernels/CMakeLists.txt
Removed several kernel tests (allReduce*, gemmAllReduceTest, samplingKernelsTest, routingKernelsTest); added moeLoadBalanceKernelTest and eaglePackDataTest.
Unit tests — layers
cpp/tests/unit_tests/layers/CMakeLists.txt, cpp/tests/unit_tests/layers/eagleLayerTest.cpp, .../medusaDecodeLayerTest.cpp
Added eagleLayerTest and medusaDecodeLayerTest targets; adjusted include paths in the two test sources.
Unit tests — runtime
cpp/tests/unit_tests/runtime/CMakeLists.txt
Added gptDecoderBatchedTest, gptDecoderTest, medusaModuleTest, moeLoadBalancerTest, sanitizerTest; removed userBufferTest (moved to multi_gpu).
Unit tests — multi_gpu
cpp/tests/unit_tests/multi_gpu/CMakeLists.txt, .../kernels/CMakeLists.txt, .../kernels/allReduce/*
Introduced multi_gpu tree and kernels subdir; added allReduce* and gemmAllReduceTest (conditionally); replaced early-return skips with ASSERT/GTEST_SKIP and improved perf logging in fusion perf test.
Multi-GPU user buffer & tests
cpp/tests/unit_tests/multi_gpu/userBufferTest.cpp, cpp/tests/unit_tests/multi_gpu/CMakeLists.txt
Added userBufferTest and three multi-GPU tests; replaced early-return branches with GTEST_SKIP() or ASSERT to enforce supported/odd-world-size behavior.
Integration tests / Python fixtures
tests/integration/defs/cpp/conftest.py, tests/integration/defs/cpp/cpp_common.py, tests/integration/defs/cpp/test_e2e.py
Added build_type fixture; changed find_build_dir signature; consolidated google-tests build via build_trt_llm(..., extra_make_targets=["google-tests"]); updated build_dir usage and XML log backup (Path.rglob), and changed e2e test working directories to build_dir/"tests"/"e2e_tests".
Multi-GPU test routing & discovery
tests/integration/defs/cpp/test_multi_gpu.py
Updated test discovery paths to unit_tests/multi_gpu and e2e_tests; adjusted binary paths; added run_user_buffer_tests and parameterized test_user_buffer.
CI / test metadata & timeouts
jenkins/L0_MergeRequest.groovy, tests/integration/defs/.test_durations, tests/integration/test_lists/.../l0_a30.yml, tests/integration/test_lists/.../l0_dgx_h100.yml
Updated watched paths for multi-GPU file-change detection; rebalanced test duration keys (GPT-80 / 90-series adjustments); reduced a couple of TIMEOUTs and added a new multi-GPU test entry in CI lists.

Sequence Diagram(s)

sequenceDiagram
  participant PyTest
  participant Conftest as conftest.py
  participant Builder as build_trt_llm
  participant CMake

  PyTest->>Conftest: request build_google_tests
  Conftest->>Conftest: read build_type (env or default)
  Conftest->>Builder: build_trt_llm(build_type, extra_make_targets=["google-tests"])
  Builder->>CMake: configure & build google-tests target
  CMake-->>Builder: build complete
  Builder-->>Conftest: build status
  Conftest-->>PyTest: tests ready
Loading
sequenceDiagram
  participant PyTest
  participant Conftest as conftest.py
  participant FS as Filesystem

  PyTest->>Conftest: session end -> keep_log_files(build_dir)
  Conftest->>FS: locate XMLs via Path.rglob("*.xml") in build_dir
  alt XML files found
    Conftest->>FS: copy XMLs -> cpp/build_backup
    FS-->>Conftest: copy results
  else
    Conftest->>Conftest: log no XML found
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • chzblych
  • tomeras91
  • niukuo
  • Shixiaowei02
  • litaotju
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot changed the title [None] @coderabbitai title [None] [refactor] Reorganize CMake and Python integration test infrastructure for C++ tests Aug 8, 2025
@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 8, 2025

/bot run --stage-list "A30-CPP-1, A30-CPP-2, A30-CPP-3"

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
tests/integration/defs/cpp/cpp_common.py (1)

85-89: Add type hints, avoid shadowing, set a safe default, and verify get_trt_llm_build_dir signature

tests/integration/defs/cpp/cpp_common.py (around lines 85–89):

  • Rename the local dir variable to build_dir to avoid shadowing Python’s built-in.
  • Add a default build_type="Release" for backward compatibility.
  • Annotate the signature and return type with pathlib.Path.
  • Manually confirm that get_trt_llm_build_dir(None, build_type) accepts None (or Optional[str]) and returns a pathlib.Path.

Suggested diff:

-def find_build_dir(build_type):
-    root_dir = find_root_dir()
-    dir = get_trt_llm_build_dir(None, build_type)
-
-    return dir if dir.is_absolute() else root_dir / dir
+def find_build_dir(build_type: str = "Release") -> pathlib.Path:
+    """Resolve the TRT-LLM CMake build directory for the given build type."""
+    root_dir = find_root_dir()
+    build_dir = get_trt_llm_build_dir(None, build_type)
+    return build_dir if build_dir.is_absolute() else root_dir / build_dir
tests/integration/defs/cpp/conftest.py (3)

22-26: Make build_type configurable via env var (no code edits needed to toggle).

Let CI/users override via TLLM_BUILD_TYPE while defaulting to Release.

 @pytest.fixture(scope="session")
 def build_type():
     # For debugging purposes, we can use the RelWithDebInfo build type.
     # return "RelWithDebInfo"
-    return "Release"
+    # Allow overriding via environment: TLLM_BUILD_TYPE=RelWithDebInfo
+    return _os.environ.get("TLLM_BUILD_TYPE", "Release")

28-31: Nit: annotate the return type for build_dir.

Minor clarity/readability improvement.

 @pytest.fixture(scope="session")
-def build_dir(build_type):
+def build_dir(build_type) -> _pl.Path:
     return _cpp.find_build_dir(build_type)

225-228: Fix docstring punctuation (Ruff D415).

End the first line with punctuation.

-    "Backup previous cpp test results when run multiple ctest"
+    "Backup previous cpp test results when run multiple ctest."
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d45236b and 1b7aeab.

📒 Files selected for processing (12)
  • cpp/tests/CMakeLists.txt (0 hunks)
  • cpp/tests/e2e_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/kernels/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/runtime/CMakeLists.txt (1 hunks)
  • tests/integration/defs/cpp/conftest.py (3 hunks)
  • tests/integration/defs/cpp/cpp_common.py (1 hunks)
💤 Files with no reviewable changes (1)
  • cpp/tests/CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{cpp,h,hpp,cc,cxx}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.{cpp,h,hpp,cc,cxx}: Closing braces of namespaces should have a comment saying the namespace it closes (e.g., } // namespace foo).
Prefer const or constexpr variables over #defines whenever possible.
A variable that is not modified after its initialization should be declared as const.
Except 0 (used for checking signness/existence/emptiness), nullptr, true, false, all other literals should only be used for variable initialization.
Use the Allman indentation style for braces in C++ code.
Put the semicolon for an empty for or while loop in a new line.
The statement forming the body of a switch, while, do..while, or for statement shall be a compound statement (use brace-delimited statements).
If and else should always be followed by brace-delimited statements, even if empty or a single statement.
C++ filenames should use camel case with the first letter lowercase (e.g., thisIsAFilename.cpp), and all files involved in a compilation target must have case-insensitive unique filenames.
All types (including class names) should use camel case with uppercase first letter (e.g., FooBarClass).
Local variables, methods, and namespaces should use camel case with first letter lowercase (e.g., localFooBar).
Non-magic-number global variables that are non-static and not defined in anonymous namespace should use camel case prefixed by 'g' (e.g., gDontUseGlobalFoos).
Non-magic-number global variables that are static or defined in an anonymous namespace should use camel case prefixed by 's' (e.g., sMutableStaticGlobal).
Locally visible static variables should use camel case with lowercase prefix 's' as the first letter (e.g., static std::once_flag sFlag;).
Class member variables should use camel case prefixed with 'm' (e.g., mNbFooValues). Public member variables do not require the 'm' prefix but it is encouraged for clarity.
Enumerations, global constants, static constants at class-scope, and function-scope magic-number/literal constants should be uppercase snake case with prefix...

Files:

  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp
  • tests/integration/defs/cpp/cpp_common.py
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp
  • tests/integration/defs/cpp/conftest.py
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the class docstring.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/cpp/cpp_common.py
  • tests/integration/defs/cpp/conftest.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
📚 Learning: 2025-08-06T13:58:07.506Z
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.

Applied to files:

  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp
  • cpp/tests/e2e_tests/CMakeLists.txt
  • tests/integration/defs/cpp/cpp_common.py
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
  • cpp/tests/unit_tests/runtime/CMakeLists.txt
📚 Learning: 2025-08-06T21:22:55.018Z
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T21:22:55.018Z
Learning: Applies to **/*.{cpp,h,hpp,cc,cxx,cu,py} : All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Applied to files:

  • cpp/tests/e2e_tests/CMakeLists.txt
  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
📚 Learning: 2025-08-08T05:06:31.537Z
Learnt from: sklevtsov-nvidia
PR: NVIDIA/TensorRT-LLM#3294
File: cpp/tensorrt_llm/cutlass_extensions/include/cutlass_extensions/epilogue/fusion/sm90_visitor_scatter.hpp:36-36
Timestamp: 2025-08-08T05:06:31.537Z
Learning: CUTLASS extension files (under cpp/tensorrt_llm/cutlass_extensions/) follow CUTLASS coding style conventions, including using #pragma once instead of TRTLLM_ prefixed header guards, even though they are .hpp files.

Applied to files:

  • cpp/tests/e2e_tests/CMakeLists.txt
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
📚 Learning: 2025-07-28T17:06:08.621Z
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.

Applied to files:

  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
  • cpp/tests/unit_tests/runtime/CMakeLists.txt
  • tests/integration/defs/cpp/conftest.py
📚 Learning: 2025-08-06T08:18:28.669Z
Learnt from: zhengd-nv
PR: NVIDIA/TensorRT-LLM#6633
File: cpp/tensorrt_llm/batch_manager/dataTransceiverImpl.cpp:145-155
Timestamp: 2025-08-06T08:18:28.669Z
Learning: In cpp/tensorrt_llm/batch_manager/dataTransceiverImpl.cpp, the existing `mMtxForMap` mutex in DataSenderImpl is sufficient to synchronize measurement file operations in the `release` method, as all file operations occur within the same critical section that protects the `mRequestToSession` map access.

Applied to files:

  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
📚 Learning: 2025-08-06T21:22:55.018Z
Learnt from: CR
PR: NVIDIA/TensorRT-LLM#0
File: CODING_GUIDELINES.md:0-0
Timestamp: 2025-08-06T21:22:55.018Z
Learning: Applies to **/*.{h,hpp} : Use a preprocessor guard in header files. The guard name must have prefix TRTLLM_ followed by the filename, all in caps, and no trailing underscore.

Applied to files:

  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
🪛 Ruff (0.12.2)
tests/integration/defs/cpp/conftest.py

226-226: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

🔇 Additional comments (15)
cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp (1)

17-17: Include path verification complete — ready to approve

All checks passed:

  • medusaDecodeLayerTest.h is located at cpp/tests/unit_tests/layers/medusaDecodeLayerTest.h.
  • No lingering old-style includes found.

The simplified local include is correct.

cpp/tests/unit_tests/layers/eagleLayerTest.cpp (1)

17-17: Include path simplification verified

  • Header located at cpp/tests/unit_tests/layers/eagleLayerTest.h
  • No lingering old‐style includes of "tests/layers/eagleLayerTest.h"

LGTM.

cpp/tests/unit_tests/CMakeLists.txt (1)

12-13: No configure fragility: required test subdirs always exist
A recursive search confirms that both cpp/tests/unit_tests/batch_manager/ and cpp/tests/unit_tests/executor/ are present in the repo, so the unconditional add_subdirectory() calls will never error. No existence guards are needed here.

cpp/tests/e2e_tests/CMakeLists.txt (2)

1-10: SPDX header present and correct

License header is in place and consistent with repository conventions.


12-13: No conditional guards needed for add_subdirectory()

Both cpp/tests/e2e_tests/batch_manager and cpp/tests/e2e_tests/executor are guaranteed to exist in the repository, so the existing

add_subdirectory(batch_manager)
add_subdirectory(executor)

can remain unchanged without risking configure‐time errors.

cpp/tests/unit_tests/layers/CMakeLists.txt (1)

34-38: Adding layer-local gtest targets is consistent with the CMake reorg

Registering eagleLayerTest and medusaDecodeLayerTest here aligns with the include path simplifications and improves locality. LGTM.

cpp/tests/unit_tests/kernels/CMakeLists.txt (2)

95-98: LGTM: kernel unit test targets added as expected

Moving moeLoadBalanceKernelTest and eaglePackDataTest into kernels/unit scope aligns with the reorg.


95-98: No missing test sources; please confirm no duplicate add_gtest() declarations

I ran the verification script and found all 14 .cpp files under cpp/tests/..., but the duplicate-check didn’t pick up any add_gtest() calls (likely a regex escape issue). To be safe, please manually ensure each of these targets appears exactly once across your CMakeLists:

• moeLoadBalanceKernelTest
• eaglePackDataTest
• blockKeyTest
• cacheTransBufferTest
• cacheTransceiverTest
• peftCacheManagerTest
• guidedDecoderTest
• trtEncoderModelTest
• gptDecoderBatchedTest
• gptDecoderTest
• medusaModuleTest
• moeLoadBalancerTest
• mpiUtilsTest
• sanitizerTest

If you prefer to re-run the check with a simpler pattern, for example:

rg -n "add_gtest\\(" -S

…then verify no target name appears twice.

cpp/tests/unit_tests/batch_manager/CMakeLists.txt (2)

12-14: LGTM: promote batch_manager tests to unit scope

blockKeyTest, cacheTransBufferTest, and cacheTransceiverTest belong in unit tests; this aligns with the split from e2e.


22-22: No duplicate add_gtest registrations in e2e
All three tests are only defined here and not re-registered elsewhere:

  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt:
    • add_gtest(blockKeyTest blockKeyTest.cpp)
    • add_gtest(cacheTransceiverTest cacheTransceiverTest.cpp)
    • add_gtest(peftCacheManagerTest peftCacheManagerTest.cpp)

No instances found in e2e configurations.

cpp/tests/e2e_tests/batch_manager/CMakeLists.txt (1)

12-15: LGTM: guidedDecoderTest and trtEncoderModelTest are appropriately scoped as e2e

Comment explains guidedDecoderTest’s dependency on tokenizer info; placement here makes sense.

cpp/tests/unit_tests/runtime/CMakeLists.txt (3)

16-17: LGTM: decoder tests re-registered under runtime unit tests

gptDecoderBatchedTest and gptDecoderTest targets look consistent with the reorg.


25-27: LGTM: runtime tests (medusa, MOE load balancer, MPI utils) added

Names and placement look correct for unit tests after the split.


31-31: LGTM: sanitizerTest added under runtime unit tests

If this test depends on sanitizer builds, ensure CI labels/filters run it under the right configurations.

Please confirm CI includes/excludes sanitizerTest appropriately (e.g., via ctest -L or Jenkins filters).

tests/integration/defs/cpp/conftest.py (1)

179-187: LGTM: honors the new build_type fixture.

The ctest build now aligns with the unified build_type across fixtures. Looks good.

@Funatiq Funatiq force-pushed the dev/ci/restructure_cpp_tests branch from 1b7aeab to 916b786 Compare August 8, 2025 14:59
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/integration/defs/cpp/conftest.py (1)

158-175: Forward build_dir, avoid forced clean, and parameterize TRT/NIXL roots

Not passing build_dir may lead to mismatched build layouts vs. other fixtures. Hardcoded TRT/NIXL roots reduce portability, and clean=True slows CI. Also prefer logger parameter interpolation over f-strings.

Apply:

-@pytest.fixture(scope="session")
-def build_google_tests(request, build_type):
+@pytest.fixture(scope="session")
+def build_google_tests(request, build_dir, build_type):
@@
-    _logger.info(f"Using CUDA arch: {cuda_arch}")
+    _logger.info("Using CUDA arch: %s", cuda_arch)
@@
-    build_trt_llm(
-        build_type=build_type,
+    build_trt_llm(
+        build_dir=build_dir,
+        build_type=build_type,
         cuda_architectures=cuda_arch,
         job_count=12,
         use_ccache=True,
-        clean=True,
+        # Allow opting in to clean builds to reduce CI time by default.
+        clean=_os.environ.get("TRTLLM_CLEAN", "0") == "1",
         generator="Ninja",
-        trt_root="/usr/local/tensorrt",
-        nixl_root="/opt/nvidia/nvda_nixl",
+        trt_root=_os.environ.get("TENSORRT_ROOT", "/usr/local/tensorrt"),
+        nixl_root=_os.environ.get("NIXL_ROOT", "/opt/nvidia/nvda_nixl"),
         skip_building_wheel=True,
         extra_make_targets=["google-tests"],
     )

Optionally parameterize parallelism:

  • job_count=int(_os.environ.get("TRTLLM_JOBS", "12"))

Verification script (expected: build_wheel.main accepts a build_dir argument):

#!/bin/bash
set -euo pipefail

# Show main signature and check for build_dir support
rg -n "def main\\(" scripts/build_wheel.py -A 10 || true
rg -n "build_dir" scripts/build_wheel.py -n -A 2 || true
🧹 Nitpick comments (2)
tests/integration/defs/cpp/conftest.py (2)

22-26: Make build_type configurable via env; default to Release

Allow overriding the build type in CI/dev without editing code.

 @pytest.fixture(scope="session")
-def build_type():
-    # For debugging purposes, we can use the RelWithDebInfo build type.
-    # return "RelWithDebInfo"
-    return "Release"
+def build_type():
+    # Allow override via env var for CI/dev flexibility.
+    return _os.environ.get("TRTLLM_BUILD_TYPE", "Release")

225-227: Minor docstring punctuation fix (Ruff D415)

End the first line with a period to satisfy docstring style.

-def keep_log_files(llm_root, build_dir):
-    "Backup previous cpp test results when run multiple ctest"
+def keep_log_files(llm_root, build_dir):
+    "Backup previous cpp test results when run multiple ctest."
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b7aeab and 916b786.

📒 Files selected for processing (2)
  • tests/integration/defs/cpp/conftest.py (3 hunks)
  • tests/integration/defs/cpp/cpp_common.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/integration/defs/cpp/cpp_common.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the class docstring.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/cpp/conftest.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/cpp/conftest.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
📚 Learning: 2025-07-28T17:06:08.621Z
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.

Applied to files:

  • tests/integration/defs/cpp/conftest.py
🪛 Ruff (0.12.2)
tests/integration/defs/cpp/conftest.py

226-226: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

🔇 Additional comments (2)
tests/integration/defs/cpp/conftest.py (2)

29-30: LGTM: build_dir now tied to build_type

Returning the build dir via _cpp.find_build_dir(build_type) keeps artifacts consistent across fixtures.


179-186: LGTM: benchmarks honor build_type

Using the fixture-provided build_type with CMake multi-config is correct and consistent.

@Funatiq Funatiq force-pushed the dev/ci/restructure_cpp_tests branch from 916b786 to 748e64e Compare August 11, 2025 08:16
@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 11, 2025

/bot run --stage-list "A30-CPP-1, A30-CPP-2, A30-CPP-3"

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14779 [ run ] triggered by Bot

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
tests/integration/defs/cpp/conftest.py (1)

1-1: Missing NVIDIA copyright header (required by repository guidelines).

Add the current-year NVIDIA header at the top of this file.

+# Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+
 import glob
♻️ Duplicate comments (1)
tests/integration/defs/cpp/conftest.py (1)

158-175: Forward build_dir, use env-configurable roots/jobs/clean, and prefer parameterized logging.

  • Forward build_dir to build_trt_llm to keep a single source of truth for where artifacts land.
  • Read TENSORRT_ROOT/NIXL_ROOT from env with sensible defaults.
  • Make clean optional via env (default off) to avoid full cleans in CI unless explicitly requested.
  • Make job_count overridable via env.
  • Prefer parameterized logging to avoid premature string formatting.
-@pytest.fixture(scope="session")
-def build_google_tests(request, build_type):
+@pytest.fixture(scope="session")
+def build_google_tests(request, build_dir, build_type):
@@
-    _logger.info(f"Using CUDA arch: {cuda_arch}")
+    _logger.info("Using CUDA arch: %s", cuda_arch)
@@
-    build_trt_llm(
-        build_type=build_type,
-        cuda_architectures=cuda_arch,
-        job_count=12,
-        use_ccache=True,
-        clean=True,
-        generator="Ninja",
-        trt_root="/usr/local/tensorrt",
-        nixl_root="/opt/nvidia/nvda_nixl",
-        skip_building_wheel=True,
-        extra_make_targets=["google-tests"],
-    )
+    build_trt_llm(
+        build_dir=build_dir,
+        build_type=build_type,
+        cuda_architectures=cuda_arch,
+        job_count=int(_os.environ.get("TRT_LLM_JOBS", "12")),
+        use_ccache=True,
+        clean=_os.environ.get("TRT_LLM_CLEAN", "0") == "1",
+        generator="Ninja",
+        trt_root=_os.environ.get("TENSORRT_ROOT", "/usr/local/tensorrt"),
+        nixl_root=_os.environ.get("NIXL_ROOT", "/opt/nvidia/nvda_nixl"),
+        skip_building_wheel=True,
+        extra_make_targets=["google-tests"],
+    )

Run this to verify build_trt_llm supports build_dir and to locate callers:

#!/bin/bash
set -euo pipefail
# Find build_wheel.py and show the main signature/context
fd -a build_wheel.py | xargs -I{} rg -n "def main\\(" {} -A 20
# Confirm build_dir appears in parameters
fd -a build_wheel.py | xargs -I{} rg -n "def main\\(.*build_dir" {} || true
# Show build_trt_llm call sites
rg -n "build_trt_llm\\(" -A 3 -B 2
🧹 Nitpick comments (2)
tests/integration/defs/cpp/conftest.py (2)

22-26: Make build_type configurable via environment.

Allow overriding the build type without code edits.

 @pytest.fixture(scope="session")
 def build_type():
   # For debugging purposes, we can use the RelWithDebInfo build type.
   # return "RelWithDebInfo"
-    return "Release"
+    return _os.environ.get("TRT_LLM_BUILD_TYPE", "Release")

Optionally, we can add a pytest option (e.g., --build-type) in a follow-up.


225-244: Fix docstring punctuation (Ruff D415) and consider pathlib-based paths.

  • Add terminal punctuation and improve wording.
  • Optional: use pathlib for portability and clarity.

Minimal D415 fix:

-    "Backup previous cpp test results when run multiple ctest"
+    "Back up previous C++ test results when running multiple ctest invocations."

Optional pathlib refactor for the whole fixture (outside the selected range):

@pytest.fixture(scope="function", autouse=True)
def keep_log_files(llm_root, build_dir):
    """Back up previous C++ test results when running multiple ctest invocations.

    Args:
        llm_root: Repository root (str or Path).
        build_dir: CMake build directory (Path-like).
    """
    results_dir = _pl.Path(build_dir)
    backup_dir = _pl.Path(llm_root) / "cpp" / "build_backup"
    backup_dir.mkdir(parents=True, exist_ok=True)

    yield

    xml_files = list(results_dir.glob("*.xml"))
    if not xml_files:
        _logger.info("No XML files found in the build directory.")
        return

    for xml in xml_files:
        try:
            shutil.copy(xml, backup_dir)
            _logger.info("Copied %s to %s", xml, backup_dir)
        except Exception as exc:
            _logger.error("Error copying %s: %s", xml, exc)
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 916b786 and 748e64e.

📒 Files selected for processing (12)
  • cpp/tests/CMakeLists.txt (0 hunks)
  • cpp/tests/e2e_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/kernels/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/runtime/CMakeLists.txt (1 hunks)
  • tests/integration/defs/cpp/conftest.py (3 hunks)
  • tests/integration/defs/cpp/cpp_common.py (1 hunks)
💤 Files with no reviewable changes (1)
  • cpp/tests/CMakeLists.txt
✅ Files skipped from review due to trivial changes (1)
  • cpp/tests/unit_tests/layers/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (9)
  • cpp/tests/unit_tests/runtime/CMakeLists.txt
  • cpp/tests/unit_tests/CMakeLists.txt
  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp
  • tests/integration/defs/cpp/cpp_common.py
  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp
  • cpp/tests/e2e_tests/CMakeLists.txt
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
  • cpp/tests/unit_tests/kernels/CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the class docstring.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/cpp/conftest.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/cpp/conftest.py
🪛 Ruff (0.12.2)
tests/integration/defs/cpp/conftest.py

226-226: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

🔇 Additional comments (2)
tests/integration/defs/cpp/conftest.py (2)

28-30: LGTM: build_dir derives from build_type consistently.


179-187: LGTM: benchmarks target honors build_type and build_dir.

Using the same build_dir keeps the build consistent across fixtures.

@Funatiq Funatiq changed the title [None] [refactor] Reorganize CMake and Python integration test infrastructure for C++ tests [None] [ci] Reorganize CMake and Python integration test infrastructure for C++ tests Aug 11, 2025
@tensorrt-cicd
Copy link
Collaborator

PR_Github #14779 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #11158 (Partly Tested) completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/ci/restructure_cpp_tests branch from 748e64e to d142c3c Compare August 11, 2025 12:07
@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 11, 2025

/bot run --stage-list "A30-CPP-1, A30-CPP-2, A30-CPP-3"

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14808 [ run ] triggered by Bot

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/integration/defs/cpp/conftest.py (1)

158-175: Forward build_dir to the builder, make roots configurable, and guard request.param usage.

  • Ensure the same build_dir is used for both building and later cmake --build steps to avoid mismatches.
  • Parameterize TensorRT and NIXL roots via env vars.
  • Using request.param requires this fixture to be parametrized; add a safe fallback or fail fast with a helpful message.
-@pytest.fixture(scope="session")
-def build_google_tests(request, build_type):
+@pytest.fixture(scope="session")
+def build_google_tests(request, build_dir, build_type):
 
-    cuda_arch = f"{request.param}-real"
+    cuda_arch_base = getattr(request, "param", None) or _os.environ.get("CUDA_ARCH")
+    if not cuda_arch_base:
+        raise pytest.UsageError(
+            "Parametrize the 'build_google_tests' fixture with a CUDA arch "
+            "(e.g., '80') or set CUDA_ARCH env var."
+        )
+    cuda_arch = f"{cuda_arch_base}-real" if not str(cuda_arch_base).endswith("-real") else str(cuda_arch_base)
 
-    _logger.info(f"Using CUDA arch: {cuda_arch}")
+    _logger.info("Using CUDA arch: %s", cuda_arch)
 
     build_trt_llm(
+        build_dir=build_dir,
         build_type=build_type,
         cuda_architectures=cuda_arch,
-        job_count=12,
+        job_count=int(_os.environ.get("TLLM_BUILD_JOBS", "12")),
         use_ccache=True,
-        clean=True,
-        generator="Ninja",
-        trt_root="/usr/local/tensorrt",
-        nixl_root="/opt/nvidia/nvda_nixl",
+        clean=_os.environ.get("TLLM_CLEAN_BUILD", "0") == "1",
+        generator=_os.environ.get("TLLM_CMAKE_GENERATOR", "Ninja"),
+        trt_root=_os.environ.get("TENSORRT_ROOT", "/usr/local/tensorrt"),
+        nixl_root=_os.environ.get("NIXL_ROOT", "/opt/nvidia/nvda_nixl"),
         skip_building_wheel=True,
         extra_make_targets=["google-tests"],
     )

Run this to verify the fixture is actually parametrized somewhere (or used with indirect parametrization):

#!/bin/bash
# 1) Is the fixture itself parametrized?
rg -n $'@pytest\\.fixture\\([^)]*params=' tests

# 2) Is it parametrized indirectly via parametrize?
rg -n $'pytest\\.mark\\.parametrize\\([^)]*["\\\']build_google_tests["\\\']' -A2 -B2

# 3) Any custom pytest_generate_tests hooking it?
rg -n 'pytest_generate_tests' tests -A20 -B5
🧹 Nitpick comments (5)
tests/integration/defs/cpp/cpp_common.py (1)

85-89: Add type hints, docstring, and avoid shadowing builtins in find_build_dir.

Rename local variable dir to build_dir, and add annotations + a short docstring for clarity and maintainability.

-def find_build_dir(build_type):
+def find_build_dir(build_type: str) -> _pl.Path:
+    """Return the CMake build directory for the given build type."""
     root_dir = find_root_dir()
-    dir = get_trt_llm_build_dir(None, build_type)
-
-    return dir if dir.is_absolute() else root_dir / dir
+    build_dir = get_trt_llm_build_dir(None, build_type)
+    return build_dir if build_dir.is_absolute() else root_dir / build_dir
tests/integration/defs/cpp/conftest.py (4)

22-26: Make build_type configurable and add a docstring.

Read the build type from an env var with a sensible default, and document the fixture.

 @pytest.fixture(scope="session")
-def build_type():
-    # For debugging purposes, we can use the RelWithDebInfo build type.
-    # return "RelWithDebInfo"
-    return "Release"
+def build_type():
+    """CMake build type for C++ builds."""
+    # For debugging purposes, we can use the RelWithDebInfo build type.
+    # return "RelWithDebInfo"
+    return _os.environ.get("TLLM_BUILD_TYPE", "Release")

28-31: Document build_dir fixture for clarity.

Minor readability improvement.

 @pytest.fixture(scope="session")
 def build_dir(build_type):
+    """Resolved build directory for the current build_type."""
     return _cpp.find_build_dir(build_type)

179-187: Confirm correctness of --config with Ninja single-config generators.

If the configuration step used "Ninja" (single-config), the --config argument is ignored by CMake --build. It's harmless but can be confusing. Consider omitting it for Ninja, or detecting a multi-config generator before appending it.


225-244: Polish docstring and use pathlib for paths; preserve metadata when copying.

  • Satisfy D415 by ending the first docstring line with punctuation.
  • Prefer pathlib joins over manual string concatenation.
  • Use copy2 to preserve file metadata; log exceptions with traceback.
 @pytest.fixture(scope="function", autouse=True)
 def keep_log_files(llm_root, build_dir):
-    "Backup previous cpp test results when run multiple ctest"
-    results_dir = build_dir
+    "Back up previous C++ test results when running multiple ctest invocations."
+    results_dir = _pl.Path(build_dir)
 
     yield
 
-    backup_dir = f"{llm_root}/cpp/build_backup"
-    _os.makedirs(backup_dir, exist_ok=True)
-    # Copy XML files to backup directory
-    xml_files = glob.glob(f"{results_dir}/*.xml")
+    backup_dir = _pl.Path(llm_root) / "cpp" / "build_backup"
+    backup_dir.mkdir(parents=True, exist_ok=True)
+    # Copy XML files to backup directory
+    xml_files = list(results_dir.glob("*.xml"))
     if xml_files:
         for xml_file in xml_files:
             try:
-                shutil.copy(xml_file, backup_dir)
-                _logger.info(f"Copied {xml_file} to {backup_dir}")
-            except Exception as e:
-                _logger.error(f"Error copying {xml_file}: {str(e)}")
+                shutil.copy2(xml_file, backup_dir / xml_file.name)
+                _logger.info("Copied %s to %s", xml_file, backup_dir)
+            except Exception:
+                _logger.exception("Error copying %s", xml_file)
     else:
         _logger.info("No XML files found in the build directory.")
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 748e64e and d142c3c.

📒 Files selected for processing (12)
  • cpp/tests/CMakeLists.txt (0 hunks)
  • cpp/tests/e2e_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/kernels/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/runtime/CMakeLists.txt (1 hunks)
  • tests/integration/defs/cpp/conftest.py (3 hunks)
  • tests/integration/defs/cpp/cpp_common.py (1 hunks)
💤 Files with no reviewable changes (1)
  • cpp/tests/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (6)
  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp
  • cpp/tests/unit_tests/runtime/CMakeLists.txt
  • cpp/tests/unit_tests/CMakeLists.txt
  • cpp/tests/unit_tests/kernels/CMakeLists.txt
  • cpp/tests/e2e_tests/CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the class docstring.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/cpp/cpp_common.py
  • tests/integration/defs/cpp/conftest.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/cpp/cpp_common.py
  • tests/integration/defs/cpp/conftest.py
🧬 Code Graph Analysis (2)
tests/integration/defs/cpp/cpp_common.py (1)
tests/integration/defs/cpp/conftest.py (2)
  • build_type (22-25)
  • root_dir (54-55)
tests/integration/defs/cpp/conftest.py (1)
tests/integration/defs/cpp/cpp_common.py (1)
  • find_build_dir (85-89)
🪛 Ruff (0.12.2)
tests/integration/defs/cpp/conftest.py

226-226: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (4)
cpp/tests/unit_tests/batch_manager/CMakeLists.txt (2)

22-22: peftCacheManagerTest CMake target and source verified
No duplicate add_gtest(peftCacheManagerTest…) definitions were found, and peftCacheManagerTest.cpp exists at cpp/tests/unit_tests/batch_manager/peftCacheManagerTest.cpp.


12-14: Unit test targets are unique and all source files are present

No duplicates of add_gtest for blockKeyTest, cacheTransBufferTest, or cacheTransceiverTest were found, and each corresponding .cpp file exists under cpp/tests/unit_tests/batch_manager/. Everything looks good to merge.

cpp/tests/unit_tests/layers/CMakeLists.txt (1)

34-34: Confirmed: Layer tests correctly localized, no duplicates, headers present

  • No additional add_gtest(eagleLayerTest…) or add_gtest(medusaDecodeLayerTest…) found outside cpp/tests/unit_tests/layers/CMakeLists.txt.
  • Both .cpp and corresponding .h files exist under cpp/tests/unit_tests/layers/.
  • Include paths appear correct for tests; no further CMake adjustments required.
cpp/tests/e2e_tests/batch_manager/CMakeLists.txt (1)

12-16: E2E Tests Confirmed: Names & Linking OK

  • Both guidedDecoderTest and trtEncoderModelTest appear only once and their source files exist.
  • Neither test has any unresolved symbols—no testingUtils dependency was detected in the code.
  • The existing add_gtest(…) invocations are sufficient; no extra target_link_libraries(…, testingUtils) is needed.

Optional: if you’d like to filter these as “e2e” in CTest, you can add after each add_gtest:

set_property(TEST guidedDecoderTest       PROPERTY LABELS e2e)
set_property(TEST trtEncoderModelTest     PROPERTY LABELS e2e)

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14808 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11180 (Partly Tested) completed with status: 'FAILURE'

@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 11, 2025

/bot run --stage-list "A30-CPP-1, A30-CPP-2, A30-CPP-3"

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14811 [ run ] triggered by Bot

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/integration/defs/cpp/conftest.py (1)

158-176: Parameterize trt_root and nixl_root paths as previously suggested.

The hardcoded paths for trt_root and nixl_root were flagged in a previous review but haven't been addressed. This limits flexibility in different CI/development environments.

Apply this diff to make the paths configurable via environment variables:

 build_trt_llm(
     build_type=build_type,
     cuda_architectures=cuda_arch,
     job_count=12,
     use_ccache=True,
     clean=True,
     generator="Ninja",
-    trt_root="/usr/local/tensorrt",
-    nixl_root="/opt/nvidia/nvda_nixl",
+    trt_root=_os.environ.get("TENSORRT_ROOT", "/usr/local/tensorrt"),
+    nixl_root=_os.environ.get("NIXL_ROOT", "/opt/nvidia/nvda_nixl"),
     skip_building_wheel=True,
     extra_make_targets=["google-tests"],
 )
🧹 Nitpick comments (1)
tests/integration/defs/cpp/conftest.py (1)

225-244: Ensure consistent Path handling for llm_root.

The implementation looks good with proper error handling and logging. However, consider ensuring llm_root is consistently handled as a Path object.

If llm_root fixture returns a Path object, avoid redundant conversion:

-    backup_dir = _pl.Path(llm_root) / "cpp" / "build_backup"
+    backup_dir = (llm_root if isinstance(llm_root, _pl.Path) else _pl.Path(llm_root)) / "cpp" / "build_backup"

Or better yet, ensure the llm_root fixture always returns a Path object to avoid this check.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d142c3c and 7c7dc55.

📒 Files selected for processing (1)
  • tests/integration/defs/cpp/conftest.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the class docstring.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/cpp/conftest.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/cpp/conftest.py
🧬 Code Graph Analysis (1)
tests/integration/defs/cpp/conftest.py (1)
tests/integration/defs/cpp/cpp_common.py (1)
  • find_build_dir (85-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (3)
tests/integration/defs/cpp/conftest.py (3)

21-25: LGTM!

Good addition of the build_type fixture with environment variable override capability and helpful debugging comment.


28-31: LGTM!

Proper refactoring to use the configurable build_type parameter instead of hardcoded values.


179-192: LGTM!

Correctly updated to use the configurable build_type parameter in the cmake build command.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14811 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11183 (Partly Tested) completed with status: 'FAILURE'

@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 11, 2025

/bot run --stage-list "A30-CPP-1, A30-CPP-2, A30-CPP-3"

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14822 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14822 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #11194 (Partly Tested) completed with status: 'FAILURE'

@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 11, 2025

/bot run --stage-list "A30-CPP-1, A30-CPP-2, A30-CPP-3" --disable-fail-fast

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14830 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14830 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #11201 (Partly Tested) completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/ci/restructure_cpp_tests branch from 7c7dc55 to 1bd733e Compare August 12, 2025 10:27
@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 12, 2025

/bot run --stage-list "A30-CPP-1, A30-CPP-2, A30-CPP-3"

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
cpp/tests/unit_tests/CMakeLists.txt (1)

12-19: Maintain consistent ordering and grouping for subdirectories

For readability and fewer merge conflicts, keep subdirectories alphabetically sorted or grouped by component hierarchy. Current order is mixed (component + common areas). Optional: reorder or add a short comment explaining the grouping rationale.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7c7dc55 and 1bd733e.

📒 Files selected for processing (12)
  • cpp/tests/CMakeLists.txt (0 hunks)
  • cpp/tests/e2e_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/kernels/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/CMakeLists.txt (1 hunks)
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp (1 hunks)
  • cpp/tests/unit_tests/runtime/CMakeLists.txt (1 hunks)
  • tests/integration/defs/cpp/conftest.py (3 hunks)
  • tests/integration/defs/cpp/cpp_common.py (1 hunks)
💤 Files with no reviewable changes (1)
  • cpp/tests/CMakeLists.txt
🚧 Files skipped from review as they are similar to previous changes (10)
  • cpp/tests/unit_tests/layers/eagleLayerTest.cpp
  • cpp/tests/unit_tests/runtime/CMakeLists.txt
  • cpp/tests/unit_tests/batch_manager/CMakeLists.txt
  • tests/integration/defs/cpp/cpp_common.py
  • cpp/tests/unit_tests/layers/CMakeLists.txt
  • cpp/tests/unit_tests/layers/medusaDecodeLayerTest.cpp
  • cpp/tests/unit_tests/kernels/CMakeLists.txt
  • cpp/tests/e2e_tests/CMakeLists.txt
  • cpp/tests/e2e_tests/batch_manager/CMakeLists.txt
  • tests/integration/defs/cpp/conftest.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check

@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 22, 2025

/bot run --disable-fail-fast

@Funatiq Funatiq self-assigned this Aug 22, 2025
@tensorrt-cicd
Copy link
Collaborator

PR_Github #16172 [ run ] triggered by Bot

@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 23, 2025

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #16251 [ run ] triggered by Bot

@Funatiq Funatiq force-pushed the dev/ci/restructure_cpp_tests branch from 12622be to 2653ed0 Compare August 23, 2025 06:45
@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 23, 2025

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #16253 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #16251 [ run ] completed with state ABORTED

@tensorrt-cicd
Copy link
Collaborator

PR_Github #16253 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #12218 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

Funatiq added 11 commits August 24, 2025 17:31
…test_list function in cpp_common.py

This check was redundant because GptSession has been removed.

Signed-off-by: Robin Kobus <[email protected]>
- Move several test cases files from the tests directory to the unit_tests directory.
- Update CMakeLists.txt files to reflect the new directory structure.

Signed-off-by: Robin Kobus <[email protected]>
- Move end-to-end test files for batch manager and executor to the e2e_tests directory.
- Introduce new CMakeLists.txt files to organize end-to-end tests for the batch manager and executor components.

Signed-off-by: Robin Kobus <[email protected]>
- Introduced a new fixture to specify build type for C++ tests, defaulting to "Release".
- Updated existing fixtures to utilize the new build type parameter.
- Modified build commands to accommodate the specified build type, improving flexibility in test configurations.

Signed-off-by: Robin Kobus <[email protected]>
- Added documentation for the build_type and build_dir fixtures to clarify their purpose.
- Enhanced the keep_log_files fixture to use pathlib for backup directory creation and improved XML file handling.

Signed-off-by: Robin Kobus <[email protected]>
- Updated test duration entries for C++ tests in .test_durations.
- Adjusted timeout settings for specific tests in l0_a30.yml to improve test reliability.

Signed-off-by: Robin Kobus <[email protected]>
- Moved multi-GPU test files to unit_tests/multi_gpu directory, including cacheTransceiverTest, mpiUtilsTest, and various allReduce tests.
- Updated CMakeLists.txt files to include the new multi-GPU test directories and ensure proper test organization.
- Updated test_multi_gpu.py to reflect the new test directory structure.

Signed-off-by: Robin Kobus <[email protected]>
- Moved userBufferTest to unit_tests/multi_gpu directory.
- Updated CMakeLists.txt and test_multi_gpu.py to reflect the new test structure.
- Added a new test function run_user_buffer_tests to test_multi_gpu.py.

Signed-off-by: Robin Kobus <[email protected]>
- Updated the `keep_log_files` fixture to dynamically create backup directories based on the build directory name.
- Enhanced the `find_build_dir` function with detailed docstring and improved return logic for resolving the C++ build directory.

Signed-off-by: Robin Kobus <[email protected]>
@Funatiq Funatiq force-pushed the dev/ci/restructure_cpp_tests branch from 2653ed0 to 57e2fa8 Compare August 24, 2025 17:32
@Funatiq
Copy link
Collaborator Author

Funatiq commented Aug 24, 2025

/bot reuse-pipeline

@Funatiq Funatiq enabled auto-merge (squash) August 24, 2025 17:42
@tensorrt-cicd
Copy link
Collaborator

PR_Github #16324 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #16324 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #16253 for commit cf1d790

@Funatiq Funatiq merged commit 31979ae into NVIDIA:main Aug 24, 2025
4 checks passed
@Funatiq Funatiq deleted the dev/ci/restructure_cpp_tests branch August 25, 2025 08:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants