Conversation

@cheshirekow cheshirekow commented Nov 21, 2025

Description

In this change we add a new Dockerfile step that copies third-party source files into the container image. This is necessary for compliance with certain open-source licenses that require source code availability. Distributing the sources inside the container image itself is a simple and direct way of guaranteeing that the distributed sources correspond exactly to the versions that were built into that container.

The script that performs this step relies on the C++ third-party process, which requires third-party libraries to be integrated via CMake's FetchContent module. The script scans the _deps/ directory, collects all of the *-src directories that FetchContent creates, archives each one into a same-named tarball, and places the tarballs in /third-party-sources/. A minimal sketch of this scan-and-archive pattern is shown below.
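
The following is an illustrative sketch only; the actual scripts/copy_third_party_sources.py may differ in argument names, logging, and error handling.

# Sketch only: assumes FetchContent has populated <deps-dir> with *-src folders.
import argparse
import subprocess
from pathlib import Path

def main() -> None:
    parser = argparse.ArgumentParser(description="Archive FetchContent source trees.")
    parser.add_argument("--deps-dir", type=Path, required=True,
                        help="CMake _deps directory populated by FetchContent")
    parser.add_argument("--output-dir", type=Path, required=True,
                        help="Directory that receives the source tarballs")
    args = parser.parse_args()

    src_dirs = sorted(p for p in args.deps_dir.glob("*-src") if p.is_dir())
    if not src_dirs:
        raise ValueError(f"No *-src directories found in {args.deps_dir}")

    args.output_dir.mkdir(parents=True, exist_ok=True)
    for src_dir in src_dirs:
        tarball = args.output_dir / f"{src_dir.name}.tar.gz"
        # -C roots the archive at the dependency directory name.
        subprocess.run(
            ["tar", "-czf", str(tarball), "-C", str(args.deps_dir), src_dir.name],
            check=True,
        )

if __name__ == "__main__":
    main()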

Summary by CodeRabbit

  • Chores
    • Enhanced build process to package and distribute third-party sources with release artifacts, enabling dependency access for benchmarking tools and supporting license compliance requirements.


Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update the tava architecture diagram if the PR introduces a significant design change.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline, or from the last pipeline if no pipeline-id is given. If the Git commit ID has changed, this option will always be ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md
and the scripts/test_to_stage_mapping.py helper.
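
For example, to run only a single test stage with fail-fast disabled, a developer might comment:

/bot run --stage-list "A10-PyTorch-1" --disable-fail-fast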

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause the top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause the top of tree to break.

@cheshirekow cheshirekow requested review from a team as code owners November 21, 2025 18:23

coderabbitai bot commented Nov 21, 2025

📝 Walkthrough

Adds a post-build step to the Docker wheel stage that executes a new Python script to extract and package third-party source dependencies as tarballs into /third-party-sources. The final release stage then mounts these sources into the CMake build directory for use in benchmarks.

Changes

Cohort: Third-party sources extraction
File(s): docker/Dockerfile.multi, scripts/copy_third_party_sources.py
Summary: Added a post-build step in the wheel stage that runs copy_third_party_sources.py; the new script scans the _deps directory for source folders (ending in -src), creates compressed tarballs, and writes them out; the final stage mounts /third-party-sources into CPP_BUILD_DIR/_deps for benchmark usage.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • The Dockerfile modification is a straightforward integration of an additional build step with clear artifact propagation.
  • The new Python script uses standard libraries and follows a simple pattern: argument parsing, directory scanning, tarball creation, and basic logging.
  • Minimal logic density; changes are additive and follow conventional patterns.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 0.00%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
  • Description check (❓ Inconclusive): The description explains the what (copying third-party sources), why (license compliance), and how (scanning _deps/ and archiving -src directories). However, the Test Coverage section is empty, which is a required section per the template. Please add details to the Test Coverage section explaining what tests safeguard these changes, or clarify that the changes are infrastructure-only with no direct test coverage needed.
✅ Passed checks (1 passed)
  • Title check (✅ Passed): The title accurately reflects the main change: adding a Dockerfile step to copy third-party sources for license compliance.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
docker/Dockerfile.multi (1)

161-162: Clarify the purpose of copying the _deps directory alongside tarballs.

The /third-party-sources directory already contains tarballs of the source code. Copying the entire ${CPP_BUILD_DIR}/_deps directory may include build artifacts, intermediate files, or other non-source content. Consider whether the _deps directory copy is necessary, or if only the tarballs should be distributed for license compliance.

If the intent is solely to distribute sources for compliance, consider removing ${CPP_BUILD_DIR}/_deps from the COPY command and rely only on the /third-party-sources tarballs.

scripts/copy_third_party_sources.py (1)

35-37: Consider whether empty _deps directories should fail the build.

The script raises a ValueError if no -src directories are found. This is appropriate for catching configuration issues, but consider whether there are valid scenarios where the dependencies directory might be empty or not yet populated. If the script is always run post-build, this check is correct.

If there are scenarios where an empty _deps directory is acceptable (e.g., builds with no third-party dependencies), consider making this check optional via a command-line flag like --allow-empty.
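
If that flexibility is wanted, one hypothetical shape for the flag (the --allow-empty name and the helper below are assumptions, not code from this PR):

import argparse
from pathlib import Path

def collect_src_dirs(deps_dir: Path, allow_empty: bool) -> list[Path]:
    # Return FetchContent *-src directories; tolerate an empty _deps only when asked to.
    src_dirs = sorted(p for p in deps_dir.glob("*-src") if p.is_dir())
    if not src_dirs and not allow_empty:
        raise ValueError(f"No *-src directories found in {deps_dir}")
    return src_dirs

def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument("--deps-dir", type=Path, required=True)
    parser.add_argument("--output-dir", type=Path, required=True)
    parser.add_argument("--allow-empty", action="store_true",
                        help="Succeed (producing no tarballs) when _deps has no *-src dirs")
    return parser.parse_args()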

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b2abb8 and 7025d14.

📒 Files selected for processing (2)
  • docker/Dockerfile.multi (2 hunks)
  • scripts/copy_third_party_sources.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-20T15:04:42.885Z
Learnt from: dbari
Repo: NVIDIA/TensorRT-LLM PR: 7095
File: docker/Dockerfile.multi:168-168
Timestamp: 2025-08-20T15:04:42.885Z
Learning: In docker/Dockerfile.multi, wildcard COPY for benchmarks (${CPP_BUILD_DIR}/benchmarks/*Benchmark) is intentionally used instead of directory copy because the benchmarks directory contains various other build artifacts during C++ builds, and only specific benchmark executables should be copied to the final image.

Applied to files:

  • docker/Dockerfile.multi
🪛 Ruff (0.14.5)
scripts/copy_third_party_sources.py

37-37: Avoid specifying long messages outside the exception class

(TRY003)


44-44: subprocess call: check for execution of untrusted input

(S603)


45-45: Starting a process with a partial executable path

(S607)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (1)
scripts/copy_third_party_sources.py (1)

18-46: LGTM! Implementation is correct and handles errors appropriately.

The script correctly:

  • Locates all FetchContent -src directories via glob
  • Fails fast with a clear error if no source directories are found
  • Creates compressed tarballs with proper directory structure using tar -czf
  • Ensures the output directory exists before writing
  • Propagates errors via check=True in subprocess.run

Note: The static analysis warnings (S603, S607) are false positives—the inputs are derived from filesystem scanning, not user input, and tar is a standard system utility.
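
As a small illustration of the error propagation mentioned above (standard-library behavior, not code from this PR): with check=True, a failing tar invocation raises CalledProcessError, which aborts the Docker RUN step instead of silently producing an empty archive.

import subprocess

try:
    # tar exits non-zero when asked to change into a directory that does not exist;
    # check=True converts that into an exception.
    subprocess.run(
        ["tar", "-czf", "/tmp/out.tar.gz", "-C", "/nonexistent", "."],
        check=True,
    )
except subprocess.CalledProcessError as err:
    print(f"tar failed with exit code {err.returncode}")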

Comment on lines +131 to +134
GITHUB_MIRROR=$GITHUB_MIRROR python3 ${BUILD_WHEEL_SCRIPT} ${BUILD_WHEEL_ARGS} \
&& python3 scripts/copy_third_party_sources.py \
--deps-dir ./build/_deps \
--output-dir /third-party-sources

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the deps directory path matches the CMake build location.

The script is invoked with --deps-dir ./build/_deps, which resolves to /src/tensorrt_llm/build/_deps. However, line 162 copies ${CPP_BUILD_DIR}/_deps, which expands to /src/tensorrt_llm/cpp/build/_deps. These paths don't match—either the script should use --deps-dir ./cpp/build/_deps, or the COPY command should reference the correct path.

Run the following script to verify which _deps directory is created by the wheel build:


🏁 Script executed:

#!/bin/bash
# Description: Check which _deps directories exist after a typical CMake/wheel build

# Search for _deps directories that would be created during build
fd -t d -a '^_deps$' --max-depth 5

# Also check build scripts to see where CMake is configured
rg -n "CMAKE_BINARY_DIR|cmake.*-B" scripts/build_wheel.py

Length of output: 0


The initial script produced no output, which is inconclusive. Let me gather more information about the build structure and scripts:


🏁 Script executed:

#!/bin/bash
# Locate build_wheel.py and related build scripts
echo "=== Finding build scripts ==="
fd -t f "build_wheel" .
fd -t f "CMakeLists.txt" . | head -20

echo -e "\n=== Checking Dockerfile structure ==="
# Look at the full context around lines 131-134 and 161-162
sed -n '120,170p' docker/Dockerfile.multi | cat -n

echo -e "\n=== Checking for _deps references in Dockerfile ==="
rg -n "_deps" docker/Dockerfile.multi

echo -e "\n=== Checking for BUILD_WHEEL related vars ==="
rg -n "BUILD_WHEEL|CPP_BUILD_DIR" docker/Dockerfile.multi

Length of output: 3939


Now I need to examine the build scripts to understand where the _deps directory is actually created:


🏁 Script executed:

#!/bin/bash
echo "=== Checking build_wheel.py for cmake configuration ==="
rg -n -A5 -B5 "cmake|CMAKE_BINARY_DIR|-B" scripts/build_wheel.py | head -80

echo -e "\n=== Checking CMakeLists.txt for _deps handling ==="
rg -n "_deps|ExternalProject|FetchContent" cpp/CMakeLists.txt | head -30

echo -e "\n=== Checking top-level CMakeLists for build dir config ==="
rg -n "CMAKE_BINARY_DIR|BINARY_DIR" cpp/CMakeLists.txt | head -20

echo -e "\n=== Looking for where wheel build outputs ==="
rg -n "build.*_deps|cpp.*build" scripts/build_wheel.py | head -20

echo -e "\n=== Check copy_third_party_sources.py to understand input/output ==="
head -50 scripts/copy_third_party_sources.py

Length of output: 6892


🏁 Script executed:

#!/bin/bash
echo "=== Checking build_wheel.py for cmake configuration ==="
rg -n -A5 -B5 "cmake|CMAKE_BINARY_DIR|-B" scripts/build_wheel.py | head -100

echo -e "\n=== Checking CMakeLists.txt for _deps handling ==="
rg -n "_deps|ExternalProject|FetchContent" cpp/CMakeLists.txt | head -30

echo -e "\n=== Looking for where wheel build outputs ==="
rg -n "build.*_deps|cpp.*build" scripts/build_wheel.py | head -20

echo -e "\n=== Check copy_third_party_sources.py to understand input/output ==="
head -80 scripts/copy_third_party_sources.py

Length of output: 6712


I need to verify how cmake is actually invoked in build_wheel.py to confirm the CMAKE_BINARY_DIR location:


🏁 Script executed:

#!/bin/bash
echo "=== Find cmake invocation in build_wheel.py ==="
rg -n -A10 "cmake.*-B|subprocess.*cmake" scripts/build_wheel.py | head -50

echo -e "\n=== Check get_build_dir function ==="
rg -n -B5 -A15 "def get_build_dir" scripts/build_wheel.py

echo -e "\n=== Look for where cmake command is built and executed ==="
rg -n "cmake_cmd|run_command.*cmake|Popen.*cmake" scripts/build_wheel.py | head -20

Length of output: 824


Fix deps directory path: change ./build/_deps to ./cpp/build/_deps.

The build configuration in build_wheel.py defaults to creating the CMake binary directory at cpp/build (line 64). CMake then creates third-party dependencies at ${CMAKE_BINARY_DIR}/_deps, which resolves to cpp/build/_deps. However, line 133 invokes the copy script with --deps-dir ./build/_deps, pointing to a non-existent directory. This causes copy_third_party_sources.py to fail with "No source directories found" since it searches for *-src directories in the wrong location.

Line 133 should be updated to: --deps-dir ./cpp/build/_deps

🤖 Prompt for AI Agents
In docker/Dockerfile.multi around lines 131 to 134, the
copy_third_party_sources.py call uses the wrong deps directory path (--deps-dir
./build/_deps) which doesn't match CMake's CMAKE_BINARY_DIR (cpp/build) so the
actual deps live at ./cpp/build/_deps; update the argument to --deps-dir
./cpp/build/_deps so the script points to the correct third-party deps directory
and can find the *-src folders.

