Conversation


@nv-nmailhot nv-nmailhot commented Nov 24, 2025

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • Chores
    • Enhanced build diagnostics with detailed logging and metrics collection
    • Improved build reporting capabilities for better insights into build performance and resource usage


@nv-nmailhot nv-nmailhot requested review from a team as code owners November 24, 2025 20:14

coderabbitai bot commented Nov 24, 2025

Walkthrough

These changes introduce comprehensive BuildKit integration and build observability. Updates include enabling BuildKit debugging, propagating GitHub identifiers, recording build timing metrics, creating per-build logs, and adding automated report generation via a new Python parser that extracts step-level metadata and produces JSON and text artifacts.

Changes

Cohort / File(s) Summary
GitHub Actions Workflow
.github/actions/docker-build/action.yml
Adds BuildKit debugging with buildkitd-flags: --debug; propagates GITHUB_RUN_ID and GITHUB_JOB into build context; introduces build timing metrics (BUILD_START_EPOCH, BUILD_END_EPOCH); captures build output to dedicated log files (build-logs/<image>-<platform>.log); records BUILD_EXIT_CODE; computes duration and image size metrics; generates JSON metrics and BuildKit reports via new parser; uploads BuildKit reports as artifacts.
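The log-capture and timing mechanics this row describes can be sketched as follows. This is a minimal illustration of the pattern (tee the build output while preserving the build's own exit code via PIPESTATUS), not the actual action.yml contents; the image name, log path, and the `false` stand-in for the docker invocation are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: stream build output to console and a per-build log file while
# keeping the build's exit code (tee would otherwise mask it).
set -u

mkdir -p build-logs
LOG_FILE="build-logs/example-image-amd64.log"   # <image>-<platform>.log

BUILD_START_EPOCH=$(date +%s)

# Stand-in for `docker buildx build ...`; `false` simulates a failing build.
{ echo "#1 [internal] load build definition"; false; } 2>&1 | tee "${LOG_FILE}"
BUILD_EXIT_CODE=${PIPESTATUS[0]}   # exit code of the build, not of tee

BUILD_END_EPOCH=$(date +%s)
BUILD_DURATION=$((BUILD_END_EPOCH - BUILD_START_EPOCH))

echo "exit=${BUILD_EXIT_CODE} duration=${BUILD_DURATION}s log=${LOG_FILE}"
```

Because `$?` after a pipeline reports tee's status, PIPESTATUS[0] is what lets the action record BUILD_EXIT_CODE and still fail correctly.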
Build Observability Parser
.github/scripts/parse_buildkit_output.py
New Python script with BuildKitParser class that parses BuildKit logs to extract step-level metadata (number, name, command, status, duration, cache usage, size transferred); aggregates steps with computed summary metrics (total steps, cache hit rate, total duration); generates human-readable reports with per-step details, slowest steps, and top transfers; outputs to JSON and text formats; includes command-line interface with file I/O and error handling.
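The parsing approach this row describes can be reduced to a short sketch: match `#N [...]` step headers, then `#N DONE <t>s` / `#N CACHED` status lines, and aggregate a cache-hit rate. This is an illustrative miniature with a made-up log, not the full parser, which tracks many more fields.

```python
import re

# Tiny sample of `--progress=plain` BuildKit output (hypothetical log).
LOG = """\
#1 [internal] load build definition from Dockerfile
#1 DONE 0.1s
#2 [1/5] FROM nvcr.io/nvidia/cuda:12.8
#2 CACHED
#3 [2/5] RUN apt-get update
#3 DONE 45.2s
"""

steps = {}
for line in LOG.splitlines():
    # Step header: "#N [name] command"
    header = re.match(r"^#(\d+)\s+\[(.*?)\](.*)$", line)
    if header:
        steps[int(header.group(1))] = {
            "name": header.group(2),
            "status": "unknown",
            "duration_sec": 0.0,
            "cached": False,
        }
        continue
    # Completion with timing: "#N DONE 1.2s"
    done = re.match(r"^#(\d+)\s+DONE\s+([\d.]+)s", line)
    if done:
        step = steps[int(done.group(1))]
        step["status"] = "done"
        step["duration_sec"] = float(done.group(2))
        continue
    # Cache hit: "#N CACHED"
    cached = re.match(r"^#(\d+)\s+CACHED", line)
    if cached:
        steps[int(cached.group(1))].update(status="cached", cached=True)

total = len(steps)
cache_hit_rate = 100.0 * sum(s["cached"] for s in steps.values()) / total
print(f"{total} steps, cache hit rate {cache_hit_rate:.1f}%")
```

The summary metrics (total steps, cached steps, cache hit rate) fall directly out of this per-step dictionary.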
Container Build Automation
container/build.sh
Replaces standard docker build with BuildKit-enabled docker buildx (or fallback with DOCKER_BUILDKIT=1); adds per-build logging to centralized build-logs/ directory (base image, framework image, single-stage variants); implements exit code handling for build failures; adds build summary section with image tags and sizes; integrates report generation by invoking Python parser on all logs to produce JSON and report artifacts under build-logs/reports/.
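The buildx-or-fallback selection this row describes follows a simple shape: probe for `docker buildx`, and fall back to classic `docker build` with `DOCKER_BUILDKIT=1` when it is unavailable. The sketch below shows only that selection logic with placeholder flags, not the actual build.sh invocation.

```shell
#!/usr/bin/env bash
# Sketch: pick a BuildKit-enabled build command, with a fallback.
set -u

if docker buildx version &>/dev/null; then
    # buildx available: plain progress output, load result into local daemon
    BUILD_CMD="docker buildx build --progress=plain --load"
else
    # Fallback: classic docker build with BuildKit enabled via env var
    BUILD_CMD="DOCKER_BUILDKIT=1 docker build --progress=plain"
fi

echo "Using: ${BUILD_CMD}"
```

Either branch emits `--progress=plain` output, which is what the parser above expects to find in the per-build logs.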

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • BuildKit parser logic: The parse_buildkit_output.py introduces substantial parsing logic with regex patterns, state management across step transitions, and size/duration extraction. Requires careful validation of step parsing correctness and edge case handling.
  • GitHub Actions YAML complexity: The action.yml file orchestrates multiple sequential steps with environment variable propagation, timing logic, and artifact uploads. Verify correct order of operations and environment variable availability across steps.
  • Shell script integration points: The container/build.sh changes involve complex branching for BuildKit availability, log file management, and report generation invocation. Ensure proper error handling and fallback behavior.
  • Cross-component dependencies: The three files form an integrated pipeline where parsing, logging, and report generation depend on consistent log formats and environment variable contracts.

Poem

🐰 BuildKit logs now dance in debug light,
Step by step we capture the build's flight,
JSON reports bloom where metrics shine bright,
Timing and transfers, all parsed just right!
From container to cloud, observability's might! 🚀

Pre-merge checks

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description only contains template placeholders with no substantive content. All required sections (Overview, Details, Where to start, Related Issues) are empty or unfilled. Complete all required description sections: describe the purpose, list specific files to review, clarify related issues, and explain the key changes being made.
Title check ❓ Inconclusive The title is vague and unclear. 'Nmailhot/docker layer metrics simple' uses generic phrasing without clearly conveying the main change; it appears to be a branch name rather than a descriptive PR title. Revise the title to be more descriptive of the primary change, e.g., 'Add BuildKit metrics collection and reporting to Docker build workflow' or similar.
✅ Passed checks (1 passed)
Check name Status Explanation
Docstring Coverage ✅ Passed Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
.github/actions/docker-build/action.yml (1)

1-293: Fix pre-commit formatting violations.

The pipeline shows trailing whitespace and other formatting issues that were auto-fixed by pre-commit hooks. You'll need to commit the auto-fixed changes.

Run the following to apply formatting fixes locally:

#!/bin/bash
# Run pre-commit on modified files
pre-commit run --files .github/actions/docker-build/action.yml
container/build.sh (1)

1-1069: Fix pre-commit formatting violations.

The pipeline shows trailing whitespace issues that were auto-fixed by pre-commit hooks. Commit the auto-fixed changes.

Run pre-commit locally:

#!/bin/bash
# Run pre-commit on modified files
pre-commit run --files container/build.sh
🧹 Nitpick comments (3)
container/build.sh (2)

1006-1034: Prefer while read over for loop on find output.

Line 1006 and a similar construct below iterate over find output with a for loop, which is fragile and mishandles filenames containing spaces or special characters.

Apply this pattern for better robustness:

-    for log_file in $(find "${BUILD_LOG_DIR}" -name "*.log" -type f 2>/dev/null); do
+    find "${BUILD_LOG_DIR}" -name "*.log" -type f 2>/dev/null | while IFS= read -r log_file; do
         LOG_BASENAME=$(basename "${log_file}" .log)
         
         if [ -n "${GITHUB_RUN_ID:-}" ] && [ -n "${GITHUB_JOB:-}" ]; then

Based on static analysis hints from Shellcheck.


1060-1067: Prefer while read over for loop on find output.

This for loop over find output has the same fragility issue as the earlier loop.

Apply this pattern:

-        for report in $(find "${REPORTS_DIR}" -name "*-report.txt" ! -name "*base-image*" 2>/dev/null); do
+        find "${REPORTS_DIR}" -name "*-report.txt" ! -name "*base-image*" 2>/dev/null | while IFS= read -r report; do
             if [ -f "${report}" ]; then
                 cat "${report}"
                 echo ""

Based on static analysis hints from Shellcheck.

.github/scripts/parse_buildkit_output.py (1)

320-327: Consider adding a comment to justify broad exception handling.

Line 324 catches Exception which is flagged by static analysis (BLE001). In this case, the broad exception is justified for handling various I/O errors, but adding a comment would clarify the intent.

     # Read build log
     try:
         with open(log_file, "r") as f:
             log_content = f.read()
+    # Catch all I/O related errors (file not found, permissions, encoding, etc.)
     except Exception as e:
         print(f"Error reading log file: {e}", file=sys.stderr)
         sys.exit(1)

Based on static analysis hints from Ruff (BLE001).

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e75bcf6 and 744b9b3.

📒 Files selected for processing (3)
  • .github/actions/docker-build/action.yml (7 hunks)
  • .github/scripts/parse_buildkit_output.py (1 hunks)
  • container/build.sh (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.

Applied to files:

  • container/build.sh
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.

Applied to files:

  • container/build.sh
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project's devcontainer setup, hard-coded container names in devcontainer.json files serve as templates that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.

Applied to files:

  • container/build.sh
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/4559/merge) by nv-nmailhot.
.github/actions/docker-build/action.yml

[error] 1-1: Trailing whitespace / formatting issue fixed by pre-commit hook.


[error] 1-1: pre-commit: ruff/other hooks modified action.yml.


[error] 1-1: Command step 'pre-commit run --show-diff-on-failure --color=always --all-files' failed. See modified files above.

.github/scripts/parse_buildkit_output.py

[error] 1-1: Black formatting check failed. Reformatted parse_buildkit_output.py.


[error] 1-1: ruff: code style adjustments applied to parse_buildkit_output.py.


[error] 1-1: pre-commit: multiple hooks made changes (Black, Ruff, Trailing Whitespace).


[error] 1-1: Command step 'pre-commit run --show-diff-on-failure --color=always --all-files' failed. See modified files above.

container/build.sh

[error] 1-1: Trailing whitespace / formatting issue fixed by pre-commit hook.


[error] 1-1: Command step 'pre-commit run --show-diff-on-failure --color=always --all-files' failed. See modified files above.

🪛 Ruff (0.14.5)
.github/scripts/parse_buildkit_output.py

171-171: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


324-324: Do not catch blind exception: Exception

(BLE001)

🪛 Shellcheck (0.11.0)
container/build.sh

[warning] 1006-1006: For loops over find output are fragile. Use find -exec or a while read loop.

(SC2044)


[warning] 1061-1061: For loops over find output are fragile. Use find -exec or a while read loop.

(SC2044)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: sglang (arm64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: sglang (amd64)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: operator (arm64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: operator (amd64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (14)
.github/actions/docker-build/action.yml (7)

65-66: LGTM! BuildKit debugging enabled.

Enabling --debug for BuildKit will provide enhanced metadata useful for observability and troubleshooting.


93-94: LGTM! GitHub context properly propagated.

Propagating GITHUB_RUN_ID and GITHUB_JOB enables correlation of builds with CI runs.


103-115: LGTM! Build timing and logging setup is sound.

The epoch timestamp capture and log file path construction correctly support the new observability features.


131-150: LGTM! Build output capture and exit handling are correct.

Using PIPESTATUS[0] correctly captures the build's exit code while tee streams output to both console and log file. The early exit on failure ensures the action fails when the build fails, which is appropriate.


152-209: LGTM! Metrics capture is well-structured.

The metrics collection correctly computes duration, captures image size, and generates a structured JSON artifact.


210-277: LGTM! BuildKit report generation with robust error handling.

The step correctly handles missing logs and parser failures gracefully with set +e and conditional exit checks. The parser is invoked with appropriate parameters and failures won't block the build.


285-293: LGTM! Artifact upload configuration is appropriate.

Using if: always() ensures BuildKit reports are captured even when builds fail, which is valuable for troubleshooting.

container/build.sh (3)

877-901: LGTM! Framework build follows consistent BuildKit pattern.

The framework image build correctly mirrors the base image build structure with proper exit code handling and logging.


902-924: LGTM! Single-stage build path properly implemented.

The FRAMEWORK=NONE path correctly implements BuildKit logging and exit handling consistent with the two-stage build.


971-995: LGTM! Build summary enhances user experience.

The final build summary with image sizes provides valuable feedback and proper error handling for cases where inspection fails.

.github/scripts/parse_buildkit_output.py (4)

17-24: LGTM! Class initialization is clean.

The BuildKitParser initialization correctly sets up the necessary state for parsing.


25-149: LGTM! Parsing logic is comprehensive and well-structured.

The parse_log method correctly handles BuildKit's output format with appropriate regex patterns for step headers, status lines, and substeps. The aggregate statistics calculation provides valuable summary data.


172-284: LGTM! Report formatting is user-friendly and informative.

The report generation produces well-structured output with summary statistics, detailed steps, and helpful rankings. The truncation of long commands and substeps keeps reports readable.


328-368: LGTM! Output handling is well-structured.

The main function correctly writes JSON data and reports to files while providing immediate feedback via stderr. The dual output approach is appropriate for a CI tool.

Comment on lines +1 to +369
#!/usr/bin/env python3
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
"""
Parse BuildKit output to extract detailed step-by-step metadata.

BuildKit provides rich information about each build step including timing,
cache status, sizes, and layer IDs.
"""

import json
import re
import sys
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class BuildKitParser:
    """Parser for BuildKit output logs"""

    def __init__(self):
        self.steps: List[Dict[str, Any]] = []
        self.current_step = None
        self.step_counter = 0

    def parse_log(self, log_content: str) -> Dict[str, Any]:
        """
        Parse BuildKit log output and extract step metadata.

        BuildKit output format (with --progress=plain):

        #1 [internal] load build definition from Dockerfile
        #1 transferring dockerfile: 2.34kB done
        #1 DONE 0.1s

        #2 [internal] load metadata for nvcr.io/nvidia/cuda:12.8...
        #2 DONE 2.3s

        #3 [1/5] FROM nvcr.io/nvidia/cuda:12.8...
        #3 resolve nvcr.io/nvidia/cuda:12.8... done
        #3 CACHED

        #4 [2/5] RUN apt-get update && apt-get install...
        #4 0.234 Reading package lists...
        #4 DONE 45.2s
        """
        lines = log_content.split("\n")
        step_data = {}
        current_step_num = None

        for line in lines:
            line = line.strip()
            if not line:
                continue

            # Match step headers: #N [...]
            step_match = re.match(r"^#(\d+)\s+\[(.*?)\](.*)$", line)
            if step_match:
                step_num = step_match.group(1)
                step_name = step_match.group(2).strip()
                step_command = step_match.group(3).strip()

                if step_num not in step_data:
                    step_data[step_num] = {
                        "step_number": int(step_num),
                        "step_name": step_name,
                        "command": step_command,
                        "status": "unknown",
                        "cached": False,
                        "duration_sec": 0.0,
                        "size_transferred": 0,
                        "logs": [],
                        "substeps": [],
                    }
                current_step_num = step_num
                continue

            # Match step status lines: #N DONE 1.2s, #N CACHED, #N ERROR
            if current_step_num:
                # DONE with timing
                done_match = re.match(
                    rf"^#{current_step_num}\s+DONE\s+([\d.]+)s?", line
                )
                if done_match:
                    step_data[current_step_num]["status"] = "done"
                    step_data[current_step_num]["duration_sec"] = float(
                        done_match.group(1)
                    )
                    continue

                # CACHED
                if re.match(rf"^#{current_step_num}\s+CACHED", line):
                    step_data[current_step_num]["status"] = "cached"
                    step_data[current_step_num]["cached"] = True
                    continue

                # ERROR
                if re.match(rf"^#{current_step_num}\s+ERROR", line):
                    step_data[current_step_num]["status"] = "error"
                    continue

                # Substep information (timing and progress)
                substep_match = re.match(
                    rf"^#{current_step_num}\s+([\d.]+)\s+(.*)", line
                )
                if substep_match:
                    timestamp = substep_match.group(1)
                    message = substep_match.group(2)
                    step_data[current_step_num]["substeps"].append(
                        {"timestamp": float(timestamp), "message": message}
                    )

                    # Extract size information
                    size_match = re.search(r"([\d.]+)\s*([KMGT]?i?B)", message)
                    if size_match:
                        size_bytes = self._parse_size(
                            size_match.group(1), size_match.group(2)
                        )
                        step_data[current_step_num]["size_transferred"] += size_bytes
                    continue

                # Other step-related information
                if re.match(rf"^#{current_step_num}\s+", line):
                    step_data[current_step_num]["logs"].append(
                        line.replace(f"#{current_step_num} ", "")
                    )

        # Convert to sorted list
        steps = [step_data[num] for num in sorted(step_data.keys(), key=int)]

        # Calculate aggregate statistics
        total_duration = sum(s["duration_sec"] for s in steps)
        cached_steps = sum(1 for s in steps if s["cached"])
        total_steps = len(steps)
        cache_hit_rate = (
            (cached_steps / total_steps * 100) if total_steps > 0 else 0.0
        )
        total_size = sum(s["size_transferred"] for s in steps)

        return {
            "steps": steps,
            "summary": {
                "total_steps": total_steps,
                "cached_steps": cached_steps,
                "built_steps": total_steps - cached_steps,
                "cache_hit_rate_percent": round(cache_hit_rate, 2),
                "total_duration_sec": round(total_duration, 2),
                "total_size_transferred_bytes": total_size,
            },
            "parsed_at": datetime.now(timezone.utc).isoformat(),
        }

    def _parse_size(self, value: str, unit: str) -> int:
        """Convert size string to bytes"""
        try:
            val = float(value)
        except ValueError:
            return 0

        # Normalize unit
        unit = unit.upper().replace("I", "")  # Remove 'i' from KiB, MiB, etc.

        multipliers = {
            "B": 1,
            "KB": 1024,
            "MB": 1024**2,
            "GB": 1024**3,
            "TB": 1024**4,
        }

        return int(val * multipliers.get(unit, 1))

    def generate_report(self, build_data: Dict[str, Any], image_name: str = None) -> str:
        """Generate a human-readable report from parsed build data"""
        report = []
        report.append("=" * 80)
        report.append("📊 BUILDKIT DETAILED BUILD REPORT")
        if image_name:
            report.append(f"🐳 Image: {image_name}")
        report.append("=" * 80)
        report.append("")

        # Summary section
        summary = build_data["summary"]
        report.append("📈 BUILD SUMMARY")
        report.append("-" * 80)
        report.append(f"Total Steps: {summary['total_steps']}")
        report.append(f"Cached Steps: {summary['cached_steps']}")
        report.append(f"Built Steps: {summary['built_steps']}")
        report.append(f"Cache Hit Rate: {summary['cache_hit_rate_percent']:.1f}%")
        report.append(f"Total Duration: {summary['total_duration_sec']:.2f}s")

        total_size = summary["total_size_transferred_bytes"]
        if total_size > 0:
            report.append(
                f"Data Transferred: {self._format_size(total_size)}"
            )
        report.append("")

        # Steps section
        report.append("🔨 DETAILED STEPS")
        report.append("-" * 80)
        report.append("")

        for step in build_data["steps"]:
            step_num = step["step_number"]
            status_icon = self._get_status_icon(step)
            cached_indicator = " [CACHED]" if step["cached"] else ""

            report.append(
                f"{status_icon} Step #{step_num}: {step['step_name']}{cached_indicator}"
            )

            if step["command"]:
                # Truncate very long commands
                cmd = step["command"]
                if len(cmd) > 100:
                    cmd = cmd[:97] + "..."
                report.append(f" Command: {cmd}")

            report.append(f" Status: {step['status'].upper()}")

            if step["duration_sec"] > 0:
                report.append(f" Duration: {step['duration_sec']:.2f}s")

            if step["size_transferred"] > 0:
                report.append(
                    f" Size: {self._format_size(step['size_transferred'])}"
                )

            # Show interesting substeps
            if step["substeps"]:
                interesting = [
                    s
                    for s in step["substeps"]
                    if any(
                        keyword in s["message"].lower()
                        for keyword in ["done", "complete", "extracting", "pulling"]
                    )
                ]
                if interesting and len(interesting) <= 5:
                    for substep in interesting[:3]:  # Show max 3
                        msg = substep["message"]
                        if len(msg) > 70:
                            msg = msg[:67] + "..."
                        report.append(f" └─ {msg}")

            report.append("")

        # Top 5 slowest steps
        slowest = sorted(
            build_data["steps"], key=lambda s: s["duration_sec"], reverse=True
        )[:5]
        slowest = [s for s in slowest if s["duration_sec"] > 0]

        if slowest:
            report.append("⏱️ TOP 5 SLOWEST STEPS")
            report.append("-" * 80)
            for i, step in enumerate(slowest, 1):
                report.append(
                    f"{i}. Step #{step['step_number']}: {step['step_name']} "
                    f"({step['duration_sec']:.2f}s)"
                )
            report.append("")

        # Top 5 largest transfers
        largest = sorted(
            build_data["steps"], key=lambda s: s["size_transferred"], reverse=True
        )[:5]
        largest = [s for s in largest if s["size_transferred"] > 0]

        if largest:
            report.append("📦 TOP 5 LARGEST DATA TRANSFERS")
            report.append("-" * 80)
            for i, step in enumerate(largest, 1):
                report.append(
                    f"{i}. Step #{step['step_number']}: {step['step_name']} "
                    f"({self._format_size(step['size_transferred'])})"
                )
            report.append("")

        report.append("=" * 80)
        report.append(f"Report generated at: {build_data['parsed_at']}")
        report.append("=" * 80)

        return "\n".join(report)

    def _get_status_icon(self, step: Dict[str, Any]) -> str:
        """Get emoji icon for step status"""
        if step["cached"]:
            return "⚡"
        elif step["status"] == "done":
            return "✅"
        elif step["status"] == "error":
            return "❌"
        else:
            return "⏳"

    def _format_size(self, bytes_size: int) -> str:
        """Format bytes to human-readable size"""
        for unit in ["B", "KB", "MB", "GB", "TB"]:
            if bytes_size < 1024.0:
                return f"{bytes_size:.2f} {unit}"
            bytes_size /= 1024.0
        return f"{bytes_size:.2f} PB"


def main():
    """Main entry point"""
    if len(sys.argv) < 2:
        print(
            "Usage: parse_buildkit_output.py <build_log_file> [output_json] [output_report] [image_name]",
            file=sys.stderr,
        )
        sys.exit(1)

    log_file = sys.argv[1]
    output_json = sys.argv[2] if len(sys.argv) > 2 else None
    output_report = sys.argv[3] if len(sys.argv) > 3 else None
    image_name = sys.argv[4] if len(sys.argv) > 4 else None

    # Read build log
    try:
        with open(log_file, "r") as f:
            log_content = f.read()
    except Exception as e:
        print(f"Error reading log file: {e}", file=sys.stderr)
        sys.exit(1)

    # Parse BuildKit output
    parser = BuildKitParser()
    build_data = parser.parse_log(log_content)

    # Output JSON
    if output_json:
        with open(output_json, "w") as f:
            json.dump(build_data, f, indent=2)
        print(f"✅ Build data written to: {output_json}", file=sys.stderr)

    # Output report
    report_text = parser.generate_report(build_data, image_name)
    if output_report:
        with open(output_report, "w") as f:
            f.write(report_text)
        print(f"✅ Build report written to: {output_report}", file=sys.stderr)
    else:
        # Print to stdout if no output file specified
        print(report_text)

    # Also print summary to stderr for immediate feedback
    print("", file=sys.stderr)
    print("📊 Build Summary:", file=sys.stderr)
    print(
        f" Steps: {build_data['summary']['total_steps']} total, "
        f"{build_data['summary']['cached_steps']} cached, "
        f"{build_data['summary']['built_steps']} built",
        file=sys.stderr,
    )
    print(
        f" Cache Hit Rate: {build_data['summary']['cache_hit_rate_percent']:.1f}%",
        file=sys.stderr,
    )
    print(
        f" Total Duration: {build_data['summary']['total_duration_sec']:.2f}s",
        file=sys.stderr,
    )


if __name__ == "__main__":
    main()


⚠️ Potential issue | 🔴 Critical

Fix pre-commit formatting violations.

The pipeline shows Black and Ruff formatting changes were auto-applied. Commit the auto-fixed changes to resolve the pipeline failure.

Run pre-commit locally:

#!/bin/bash
# Run pre-commit on modified files
pre-commit run --files .github/scripts/parse_buildkit_output.py
🧰 Tools
🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/4559/merge) by nv-nmailhot.

[error] 1-1: Black formatting check failed. Reformatted parse_buildkit_output.py.


[error] 1-1: ruff: code style adjustments applied to parse_buildkit_output.py.


[error] 1-1: pre-commit: multiple hooks made changes (Black, Ruff, Trailing Whitespace).


[error] 1-1: Command step 'pre-commit run --show-diff-on-failure --color=always --all-files' failed. See modified files above.

🪛 Ruff (0.14.5)

171-171: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


324-324: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
.github/scripts/parse_buildkit_output.py lines 1-369: pre-commit detected
Black/Ruff formatting fixes that need to be committed; run the pre-commit hook
locally (pre-commit run --files .github/scripts/parse_buildkit_output.py),
review and stage the modified file, and commit the auto-fixed changes so the
pipeline passes (ensure no other unrelated edits are included).


return int(val * multipliers.get(unit, 1))

def generate_report(self, build_data: Dict[str, Any], image_name: str = None) -> str:

⚠️ Potential issue | 🟡 Minor

Fix implicit Optional type hint.

Line 171 uses str = None which should be str | None or Optional[str] to properly indicate the parameter can be None.

Apply this diff:

-    def generate_report(self, build_data: Dict[str, Any], image_name: str = None) -> str:
+    def generate_report(self, build_data: Dict[str, Any], image_name: Optional[str] = None) -> str:

Based on static analysis hints from Ruff (RUF013).

🧰 Tools
🪛 Ruff (0.14.5)

171-171: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

🤖 Prompt for AI Agents
In .github/scripts/parse_buildkit_output.py around line 171, the parameter
annotation uses `image_name: str = None` which incorrectly implies a
non-optional str; change it to an explicit optional type (`image_name: str |
None = None` or `image_name: Optional[str] = None`) and, if you choose
`Optional`, add `from typing import Optional` at the top of the file (or switch
to the PEP 604 `str | None` form if running on Python 3.10+). Ensure the default
remains `= None`.

Comment on lines +845 to +876
    # Define base image tag before using it
    DYNAMO_BASE_IMAGE="dynamo-base:${VERSION}"

    # Define base image tag before using it
    DYNAMO_BASE_IMAGE="dynamo-base:${VERSION}"
    # Start base image build
    echo "======================================"
    echo "Starting Build 1: Base Image"
    echo "======================================"
    $RUN_PREFIX docker build -f "${SOURCE_DIR}/Dockerfile" --target dev $PLATFORM $BUILD_ARGS $CACHE_FROM $CACHE_TO --tag $DYNAMO_BASE_IMAGE $BUILD_CONTEXT_ARG $BUILD_CONTEXT $NO_CACHE

    # Create build log directory for BuildKit reports
    BUILD_LOG_DIR="${BUILD_CONTEXT}/build-logs"
    mkdir -p "${BUILD_LOG_DIR}"
    BASE_BUILD_LOG="${BUILD_LOG_DIR}/base-image-build.log"

    # Use BuildKit for enhanced metadata
    if [ -z "$RUN_PREFIX" ]; then
        if docker buildx version &>/dev/null; then
            docker buildx build --progress=plain --load -f "${SOURCE_DIR}/Dockerfile" --target dev $PLATFORM $BUILD_ARGS $CACHE_FROM $CACHE_TO --tag $DYNAMO_BASE_IMAGE $BUILD_CONTEXT_ARG $BUILD_CONTEXT $NO_CACHE 2>&1 | tee "${BASE_BUILD_LOG}"
            BUILD_EXIT_CODE=${PIPESTATUS[0]}
        else
            DOCKER_BUILDKIT=1 docker build --progress=plain -f "${SOURCE_DIR}/Dockerfile" --target dev $PLATFORM $BUILD_ARGS $CACHE_FROM $CACHE_TO --tag $DYNAMO_BASE_IMAGE $BUILD_CONTEXT_ARG $BUILD_CONTEXT $NO_CACHE 2>&1 | tee "${BASE_BUILD_LOG}"
            BUILD_EXIT_CODE=${PIPESTATUS[0]}
        fi

        if [ ${BUILD_EXIT_CODE} -ne 0 ]; then
            exit ${BUILD_EXIT_CODE}
        fi
    else
        $RUN_PREFIX docker build -f "${SOURCE_DIR}/Dockerfile" --target dev $PLATFORM $BUILD_ARGS $CACHE_FROM $CACHE_TO --tag $DYNAMO_BASE_IMAGE $BUILD_CONTEXT_ARG $BUILD_CONTEXT $NO_CACHE
    fi


⚠️ Potential issue | 🟡 Minor

Remove duplicate variable definition.

Lines 846 and 849 both define DYNAMO_BASE_IMAGE with the same value. The second definition is redundant.

Apply this diff:

     # Define base image tag before using it
     DYNAMO_BASE_IMAGE="dynamo-base:${VERSION}"
     
-    # Define base image tag before using it
-    DYNAMO_BASE_IMAGE="dynamo-base:${VERSION}"
     # Start base image build
🤖 Prompt for AI Agents
In container/build.sh around lines 845 to 876 there are two identical
definitions of DYNAMO_BASE_IMAGE (duplicate at line ~849); remove the redundant
second definition so DYNAMO_BASE_IMAGE is defined once before it is used,
leaving the first assignment and deleting the subsequent duplicate line.
