[TRTLLM-9227][infra] Add Dockerfile step to copy sources #9368
Conversation
In this change we add a new Dockerfile step that copies third-party source files into the container image. This is necessary for compliance with certain open source licenses that require source code availability. Distributing the sources as-used in the container image is a simple and direct way of ensuring the distributed sources match the exact versions that were built into the container.

The script that performs this step relies on the C++ third-party process, which requires third-party libraries to be integrated via CMake's FetchContent module. The script scans the `_deps/` directory and collects all the `*-src` directories created by FetchContent, archives each into a same-named tarball, and places the tarballs in `/third_party_sources/`.

Signed-off-by: Josh Bialkowski <[email protected]>
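For context, a minimal sketch of how a FetchContent integration produces those `*-src` directories; the library name and tag here are illustrative, not taken from this repository:

```cmake
include(FetchContent)

# At configure time, FetchContent clones the sources into
# ${CMAKE_BINARY_DIR}/_deps/json-src, which is exactly the kind
# of directory the copy script later archives.
FetchContent_Declare(
  json
  GIT_REPOSITORY https://github.com/nlohmann/json.git
  GIT_TAG v3.11.3
)
FetchContent_MakeAvailable(json)
```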
📝 Walkthrough

Adds a post-build step to the Docker wheel stage that executes a new Python script to extract and package third-party source dependencies as tarballs into `/third-party-sources`.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
docker/Dockerfile.multi (1)
161-162: Clarify the purpose of copying the `_deps` directory alongside tarballs.

The `/third-party-sources` directory already contains tarballs of the source code. Copying the entire `${CPP_BUILD_DIR}/_deps` directory may include build artifacts, intermediate files, or other non-source content. Consider whether the `_deps` directory copy is necessary, or if only the tarballs should be distributed for license compliance.

If the intent is solely to distribute sources for compliance, consider removing `${CPP_BUILD_DIR}/_deps` from the COPY command and relying only on the `/third-party-sources` tarballs.

scripts/copy_third_party_sources.py (1)
35-37: Consider whether empty `_deps` directories should fail the build.

The script raises a `ValueError` if no `-src` directories are found. This is appropriate for catching configuration issues, but consider whether there are valid scenarios where the dependencies directory might be empty or not yet populated. If the script is always run post-build, this check is correct.

If there are scenarios where an empty `_deps` directory is acceptable (e.g., builds with no third-party dependencies), consider making this check optional via a command-line flag like `--allow-empty`.
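A hedged sketch of how such a flag could be wired in, assuming the argparse-based CLI described in this review; the surrounding code structure is illustrative, not the actual implementation:

```python
parser.add_argument(
    "--allow-empty",
    action="store_true",
    help="Do not fail the build when no *-src directories are found",
)
args = parser.parse_args()

src_dirs = sorted(args.deps_dir.glob("*-src"))
if not src_dirs and not args.allow_empty:
    raise ValueError(f"No *-src directories found in {args.deps_dir}")
```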
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- docker/Dockerfile.multi (2 hunks)
- scripts/copy_third_party_sources.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-20T15:04:42.885Z
Learnt from: dbari
Repo: NVIDIA/TensorRT-LLM PR: 7095
File: docker/Dockerfile.multi:168-168
Timestamp: 2025-08-20T15:04:42.885Z
Learning: In docker/Dockerfile.multi, wildcard COPY for benchmarks (${CPP_BUILD_DIR}/benchmarks/*Benchmark) is intentionally used instead of directory copy because the benchmarks directory contains various other build artifacts during C++ builds, and only specific benchmark executables should be copied to the final image.
Applied to files:
docker/Dockerfile.multi
🪛 Ruff (0.14.5)
scripts/copy_third_party_sources.py
37-37: Avoid specifying long messages outside the exception class
(TRY003)
44-44: subprocess call: check for execution of untrusted input
(S603)
45-45: Starting a process with a partial executable path
(S607)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (1)
scripts/copy_third_party_sources.py (1)
18-46: LGTM! Implementation is correct and handles errors appropriately.

The script correctly:

- Locates all FetchContent `-src` directories via glob
- Fails fast with a clear error if no source directories are found
- Creates compressed tarballs with proper directory structure using `tar -czf`
- Ensures the output directory exists before writing
- Propagates errors via `check=True` in `subprocess.run`

Note: The static analysis warnings (S603, S607) are false positives; the inputs are derived from filesystem scanning, not user input, and `tar` is a standard system utility.
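Based on the behavior summarized above, a minimal sketch of what such a script could look like; the `--deps-dir` and `--output-dir` arguments match the Dockerfile invocation below, while the overall structure is illustrative rather than the actual implementation:

```python
#!/usr/bin/env python3
"""Archive FetchContent source trees (*-src) into per-library tarballs."""
import argparse
import subprocess
from pathlib import Path


def main() -> None:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--deps-dir", type=Path, required=True,
                        help="CMake _deps directory populated by FetchContent")
    parser.add_argument("--output-dir", type=Path, required=True,
                        help="Directory where the source tarballs are written")
    args = parser.parse_args()

    src_dirs = sorted(args.deps_dir.glob("*-src"))
    if not src_dirs:
        raise ValueError(f"No *-src directories found in {args.deps_dir}")

    args.output_dir.mkdir(parents=True, exist_ok=True)
    for src in src_dirs:
        tarball = args.output_dir / f"{src.name}.tar.gz"
        # -C roots the archive at the parent so each tarball unpacks
        # to a single <name>-src directory.
        subprocess.run(
            ["tar", "-czf", str(tarball), "-C", str(src.parent), src.name],
            check=True,
        )


if __name__ == "__main__":
    main()
```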
GITHUB_MIRROR=$GITHUB_MIRROR python3 ${BUILD_WHEEL_SCRIPT} ${BUILD_WHEEL_ARGS} \
    && python3 scripts/copy_third_party_sources.py \
    --deps-dir ./build/_deps \
    --output-dir /third-party-sources
🧩 Analysis chain
Verify the deps directory path matches the CMake build location.
The script is invoked with `--deps-dir ./build/_deps`, which resolves to `/src/tensorrt_llm/build/_deps`. However, line 162 copies `${CPP_BUILD_DIR}/_deps`, which expands to `/src/tensorrt_llm/cpp/build/_deps`. These paths don't match: either the script should use `--deps-dir ./cpp/build/_deps`, or the COPY command should reference the correct path.
Run the following script to verify which _deps directory is created by the wheel build:
🏁 Script executed:
#!/bin/bash
# Description: Check which _deps directories exist after a typical CMake/wheel build
# Search for _deps directories that would be created during build
fd -t d -a '^_deps$' --max-depth 5
# Also check build scripts to see where CMake is configured
rg -n "CMAKE_BINARY_DIR|cmake.*-B" scripts/build_wheel.pyLength of output: 0
The initial script produced no output, which is inconclusive. Let me gather more information about the build structure and scripts:
🏁 Script executed:
#!/bin/bash
# Locate build_wheel.py and related build scripts
echo "=== Finding build scripts ==="
fd -t f "build_wheel" .
fd -t f "CMakeLists.txt" . | head -20
echo -e "\n=== Checking Dockerfile structure ==="
# Look at the full context around lines 131-134 and 161-162
sed -n '120,170p' docker/Dockerfile.multi | cat -n
echo -e "\n=== Checking for _deps references in Dockerfile ==="
rg -n "_deps" docker/Dockerfile.multi
echo -e "\n=== Checking for BUILD_WHEEL related vars ==="
rg -n "BUILD_WHEEL|CPP_BUILD_DIR" docker/Dockerfile.multiLength of output: 3939
Now I need to examine the build scripts to understand where the _deps directory is actually created:
🏁 Script executed:
#!/bin/bash
echo "=== Checking build_wheel.py for cmake configuration ==="
rg -n -A5 -B5 "cmake|CMAKE_BINARY_DIR|-B" scripts/build_wheel.py | head -80
echo -e "\n=== Checking CMakeLists.txt for _deps handling ==="
rg -n "_deps|ExternalProject|FetchContent" cpp/CMakeLists.txt | head -30
echo -e "\n=== Checking top-level CMakeLists for build dir config ==="
rg -n "CMAKE_BINARY_DIR|BINARY_DIR" cpp/CMakeLists.txt | head -20
echo -e "\n=== Looking for where wheel build outputs ==="
rg -n "build.*_deps|cpp.*build" scripts/build_wheel.py | head -20
echo -e "\n=== Check copy_third_party_sources.py to understand input/output ==="
head -50 scripts/copy_third_party_sources.py

Length of output: 6892
🏁 Script executed:
#!/bin/bash
echo "=== Checking build_wheel.py for cmake configuration ==="
rg -n -A5 -B5 "cmake|CMAKE_BINARY_DIR|-B" scripts/build_wheel.py | head -100
echo -e "\n=== Checking CMakeLists.txt for _deps handling ==="
rg -n "_deps|ExternalProject|FetchContent" cpp/CMakeLists.txt | head -30
echo -e "\n=== Looking for where wheel build outputs ==="
rg -n "build.*_deps|cpp.*build" scripts/build_wheel.py | head -20
echo -e "\n=== Check copy_third_party_sources.py to understand input/output ==="
head -80 scripts/copy_third_party_sources.py

Length of output: 6712
I need to verify how cmake is actually invoked in build_wheel.py to confirm the CMAKE_BINARY_DIR location:
🏁 Script executed:
#!/bin/bash
echo "=== Find cmake invocation in build_wheel.py ==="
rg -n -A10 "cmake.*-B|subprocess.*cmake" scripts/build_wheel.py | head -50
echo -e "\n=== Check get_build_dir function ==="
rg -n -B5 -A15 "def get_build_dir" scripts/build_wheel.py
echo -e "\n=== Look for where cmake command is built and executed ==="
rg -n "cmake_cmd|run_command.*cmake|Popen.*cmake" scripts/build_wheel.py | head -20Length of output: 824
Fix deps directory path: change `./build/_deps` to `./cpp/build/_deps`.

The build configuration in `build_wheel.py` defaults to creating the CMake binary directory at `cpp/build` (line 64). CMake then creates third-party dependencies at `${CMAKE_BINARY_DIR}/_deps`, which resolves to `cpp/build/_deps`. However, line 133 invokes the copy script with `--deps-dir ./build/_deps`, pointing to a non-existent directory. This causes `copy_third_party_sources.py` to fail with "No source directories found" since it searches for `*-src` directories in the wrong location.

Line 133 should be updated to: `--deps-dir ./cpp/build/_deps`
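Applied to the diff shown earlier, the corrected invocation would look roughly like this; the enclosing RUN instruction is assumed from context rather than quoted from the Dockerfile:

```dockerfile
RUN GITHUB_MIRROR=$GITHUB_MIRROR python3 ${BUILD_WHEEL_SCRIPT} ${BUILD_WHEEL_ARGS} \
    && python3 scripts/copy_third_party_sources.py \
       --deps-dir ./cpp/build/_deps \
       --output-dir /third-party-sources
```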
🤖 Prompt for AI Agents
In docker/Dockerfile.multi around lines 131 to 134, the
copy_third_party_sources.py call uses the wrong deps directory path (--deps-dir
./build/_deps) which doesn't match CMake's CMAKE_BINARY_DIR (cpp/build) so the
actual deps live at ./cpp/build/_deps; update the argument to --deps-dir
./cpp/build/_deps so the script points to the correct third-party deps directory
and can find the *-src folders.
Description
In this change we add a new Dockerfile step that copies third-party source files into the container image. This is necessary for compliance with certain open source licenses that require source code availability. Distributing the sources as-used in the container image is a simple and direct way of ensuring the distributed sources match the exact versions that were built into the container.

The script that performs this step relies on the C++ third-party process, which requires third-party libraries to be integrated via CMake's FetchContent module. The script scans the `_deps/` directory and collects all the `*-src` directories created by FetchContent, archives each into a same-named tarball, and places the tarballs in `/third_party_sources/`.

Summary by CodeRabbit
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in the PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
`/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...`

Provide a user friendly way for developers to interact with a Jenkins server.

Run `/bot [-h|--help]` to print this help message. See details below for each supported subcommand.

run

`run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]`

Launch build/test pipelines. All previously running jobs will be killed.

- `--reuse-test (optional)pipeline-id` (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.
- `--disable-reuse-test` (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.
- `--disable-fail-fast` (OPTIONAL) : Disable fail fast on build/tests/infra failures.
- `--skip-test` (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.
- `--stage-list "A10-PyTorch-1, xxx"` (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.
- `--gpu-type "A30, H100_PCIe"` (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.
- `--test-backend "pytorch, cpp"` (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.
- `--only-multi-gpu-test` (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.
- `--disable-multi-gpu-test` (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.
- `--add-multi-gpu-test` (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.
- `--post-merge` (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
- `--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx"` (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".
- `--detailed-log` (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.
- `--debug` (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the `stage-list` parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see `docs/source/reference/ci-overview.md` and the `scripts/test_to_stage_mapping.py` helper.

kill

`kill` : Kill all running builds associated with pull request.

skip

`skip --comment COMMENT` : Skip testing for latest commit on pull request. `--comment "Reason for skipping build/test"` is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

`reuse-pipeline` : Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
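For example, a typical pre-merge invocation combining a few of the flags documented above might be:

```
/bot run --disable-fail-fast --stage-list "A10-PyTorch-1"
```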