ci: Reduce test time by moving compilation off-line #2089
Walkthrough
Adds non-dry-run logic to detect the CUDA stream, map the JIT architecture, install prebuilt kernel wheels, and verify the local Python install; also simplifies the MAX_JOBS calculation for JIT-cache wheel builds by using a constant memory divisor.
Sequence Diagram(s)
sequenceDiagram
participant S as task_test_blackwell_kernels.sh
participant E as Environment
participant FS as dist/ (artifacts)
participant Pip as pip
participant Py as python
rect rgb(248,249,255)
Note over S,E: Entry — only proceed when DRY_RUN unset
S->>E: check DRY_RUN
alt DRY_RUN not set
S->>E: echo CUDA_VERSION
S->>S: derive CUDA_STREAM from CUDA_VERSION
S->>S: compute JIT_ARCH_EFFECTIVE (special-case 12.0/cu129)
S->>S: set DIST_CUBIN_DIR & DIST_JIT_CACHE_DIR
else DRY_RUN set
S-->>E: skip installs
end
end
rect rgb(237,255,240)
Note over S,FS: Install prebuilt artifacts
S->>FS: look for cubin wheel in DIST_CUBIN_DIR
alt found
S->>Pip: pip install <flashinfer-cubin.whl>
else missing
S-->>E: exit error ("missing flashinfer-cubin artifact")
end
S->>FS: look for jit-cache wheel in DIST_JIT_CACHE_DIR
alt found
S->>Pip: pip install <flashinfer-jit-cache.whl>
else missing
S-->>E: exit error ("missing flashinfer-jit-cache artifact")
end
end
rect rgb(255,250,235)
Note over S,Py: Local package install & verify
S->>Pip: pip install -e . -v --no-deps
S->>Py: (cd /tmp && python -m flashinfer show-config)
Py-->>S: config output
end
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
c9c6768 to 375ca18
/bot run
Actionable comments posted: 1
🧹 Nitpick comments (2)
scripts/task_test_blackwell_kernels.sh (2)
41-50: Inconsistent verbosity flags in sequential pip installations.
Lines 43 and 45 use -q (quiet) for kernel installations, while line 49 uses -v (verbose) for local source installation. This inconsistency makes it unclear whether the verbosity change is intentional and may make output harder to parse in CI logs.
Standardize the verbosity flags across all pip installations in this initialization block:
  # Install precompiled kernels
  echo "Installing flashinfer-cubin from PyPI/index..."
- pip install -q flashinfer-cubin
+ pip install -q flashinfer-cubin
  echo "Installing flashinfer-jit-cache for ${CUDA_STREAM} from https://flashinfer.ai/whl/${CUDA_STREAM} ..."
- pip install -q --extra-index-url "https://flashinfer.ai/whl/${CUDA_STREAM}" flashinfer-jit-cache
+ pip install -q --extra-index-url "https://flashinfer.ai/whl/${CUDA_STREAM}" flashinfer-jit-cache
  echo ""
  # Install local python sources
- pip install -e . -v --no-deps
+ pip install -e . -q --no-deps
Alternatively, if verbose output is intentional for debugging local installs, add a comment explaining the choice.
41-50: Verify that the custom PyPI index URL for flashinfer-jit-cache is reliable.
The script hardcodes the index URL https://flashinfer.ai/whl/${CUDA_STREAM} and expects it to always be available and contain the flashinfer-jit-cache package for the detected CUDA stream. If this URL becomes unavailable or if a CUDA stream version is not published, the pip install will fail and halt all subsequent tests.
Add error handling and diagnostics to surface issues clearly:
  echo "Installing flashinfer-jit-cache for ${CUDA_STREAM} from https://flashinfer.ai/whl/${CUDA_STREAM} ..."
- pip install -q --extra-index-url "https://flashinfer.ai/whl/${CUDA_STREAM}" flashinfer-jit-cache
+ if ! pip install -q --extra-index-url "https://flashinfer.ai/whl/${CUDA_STREAM}" flashinfer-jit-cache; then
+   echo "❌ ERROR: Failed to install flashinfer-jit-cache for CUDA stream ${CUDA_STREAM}"
+   echo "   Index URL: https://flashinfer.ai/whl/${CUDA_STREAM}"
+   exit 1
+ fi
Can you confirm that the custom index URL is stable and that all supported CUDA streams (cu128, cu129, cu130) are consistently published with the corresponding flashinfer-jit-cache package?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
scripts/task_test_blackwell_kernels.sh(1 hunks)
🔇 Additional comments (1)
scripts/task_test_blackwell_kernels.sh (1)
52-55: Verify that python -m flashinfer show-config is an appropriate verification step.
The verification runs python -m flashinfer show-config to confirm successful installation. However, this assumes:
- The show-config subcommand exists in the flashinfer module
- The command is idempotent and doesn't modify the environment
- The command completes quickly without external dependencies
If this command fails (e.g., due to missing dependencies, invalid environment, or a transient issue), the entire test run is aborted before any tests can run, which may be overly strict for a verification step.
Can you confirm:
- That python -m flashinfer show-config is a lightweight, read-only command that verifies the installation without side effects?
- What the expected output is and whether it should be validated beyond the exit code?
- Whether a failed verification should block all tests or only warn/skip?
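One way to make the block-or-warn choice explicit is a small wrapper around the verification command. This is a minimal sketch, not the script's actual code: the function name run_verification and its strict/warn modes are hypothetical, and only the python -m flashinfer show-config command and the /tmp working directory come from the review above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a verification command from /tmp and decide
# whether a failure is fatal ("strict") or only reported ("warn").
run_verification() {
  local mode="$1"
  shift
  # The subshell keeps the caller's working directory unchanged.
  if ( cd /tmp && "$@" ); then
    return 0
  fi
  if [ "${mode}" = "strict" ]; then
    echo "ERROR: verification failed: $*" >&2
    return 1
  fi
  echo "WARNING: verification failed, continuing: $*" >&2
  return 0
}

# Intended usage (not executed here, since it needs flashinfer installed):
#   run_verification strict python -m flashinfer show-config || exit 1
```

In strict mode the caller decides what to do with the non-zero status, so the same helper would cover both behaviors the reviewer asks about.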
[FAILED] Pipeline #38459095: 3/17 passed
Thanks @kahyunnam! Left a comment about the behavior when the jit-cache and cubin wheels are not found.
0a7f5c7 to 3f6dc1c
Actionable comments posted: 0
🧹 Nitpick comments (3)
scripts/task_test_blackwell_kernels.sh (3)
44-45: Validate that relative paths are robust to working directory assumptions.
Lines 44–45 construct distribution paths as ../dist/${CUDA_VERSION}/${JIT_ARCH_EFFECTIVE}/.... This assumes the script is invoked from a specific directory (likely the repository root). If the script is called from a different directory, these paths will fail silently or point to unintended locations.
Consider either:
- Using $(dirname "${BASH_SOURCE[0]}") to anchor paths relative to the script location.
- Adding explicit validation that DIST_CUBIN_DIR and DIST_JIT_CACHE_DIR are accessible before attempting installation.
- Documenting the expected working directory requirement in a comment.
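The first two options can be sketched together as follows. This is an illustration under assumptions: SCRIPT_DIR, DIST_ROOT, and has_wheel are hypothetical names, and the path below stops at the dist/ prefix because the rest of the layout is elided in the comment above.

```shell
#!/usr/bin/env bash
# Directory containing this script, independent of the caller's working
# directory. Falls back to $0 when BASH_SOURCE is empty.
SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]:-$0}")" && pwd)"

# Hypothetical anchor for artifact lookup: one level above the script.
DIST_ROOT="${SCRIPT_DIR}/../dist"

# Hypothetical check: succeed only if the directory exists and holds
# at least one wheel file.
has_wheel() {
  local dir="$1"
  [ -d "${dir}" ] && ls "${dir}"/*.whl >/dev/null 2>&1
}

# Intended usage (the directory variable is set elsewhere in the script):
#   has_wheel "${DIST_CUBIN_DIR}" || echo "no cubin wheel in ${DIST_CUBIN_DIR}" >&2
```

Anchoring on the script location means the lookup no longer depends on where CI happens to invoke the script from.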
33-42: Simplify JIT_ARCH mapping logic for clarity.
The nested conditional on lines 34–39 is difficult to follow. The logic maps only 12.0 to architecture-specific suffixes (12.0a for cu129, 12.0f otherwise), while other values pass through unchanged. Consider extracting this into a helper function or adding comments to explain the mapping rules and why 12.0 is special.
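A helper along these lines would capture the mapping rule described above. It is a sketch only: the function name map_jit_arch and its argument order are hypothetical, and it assumes the cu129 comparison is made against a cu-style string, which the comment does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map_jit_arch JIT_ARCH CUDA_TAG
# Only 12.0 is special-cased; every other value passes through unchanged.
map_jit_arch() {
  local arch="$1"
  local cuda_tag="$2"
  if [ "${arch}" != "12.0" ]; then
    echo "${arch}"
  elif [ "${cuda_tag}" = "cu129" ]; then
    echo "12.0a"
  else
    echo "12.0f"
  fi
}

# Intended usage:
#   JIT_ARCH_EFFECTIVE="$(map_jit_arch "${JIT_ARCH}" "${CUDA_VERSION}")"
```

Flattening the nested conditional into one if/elif/else chain keeps the pass-through case first, which makes the special case easier to spot.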
28-29: Add validation or explicit handling for CUDA_VERSION.
Line 28 echoes CUDA_VERSION but does not validate that it is set to an expected value (cu128, cu129, cu130, etc.). If CUDA_VERSION is unset or malformed, the script will still proceed and construct invalid paths. Consider adding a check to fail fast if the value is unexpected, or document the assumption that CUDA_VERSION is always set by the caller.
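A fail-fast check could look roughly like this. The function name validate_cuda_version is hypothetical, and the accepted values are only the three named in the comment above; the real list may be longer.

```shell
#!/usr/bin/env bash
# Hypothetical validator: return 0 for a recognized value, 1 otherwise.
# The accepted list is an assumption based on the values named in the review.
validate_cuda_version() {
  case "${1:-}" in
    cu128|cu129|cu130)
      return 0
      ;;
    "")
      echo "ERROR: CUDA_VERSION is not set" >&2
      return 1
      ;;
    *)
      echo "ERROR: unexpected CUDA_VERSION '${1}'" >&2
      return 1
      ;;
  esac
}

# Intended usage near the top of the script:
#   validate_cuda_version "${CUDA_VERSION:-}" || exit 1
```

A case statement keeps the accepted values in one place, so adding a new CUDA stream is a one-line change.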
📜 Review details
📒 Files selected for processing (2)
scripts/build_flashinfer_jit_cache_whl.sh (1 hunks)
scripts/task_test_blackwell_kernels.sh (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Deploy Docs
🔇 Additional comments (3)
scripts/build_flashinfer_jit_cache_whl.sh (1)
14-15: Clarify rationale for changing MAX_JOBS divisor from architecture-dependent to constant.
The change removes the conditional logic and always divides by 8, whereas the original divided by 4 on x86_64. While this simplifies the calculation, it may result in fewer parallel jobs on non-aarch64 systems, potentially increasing build time.
Was this change validated on both architectures? If not, consider testing build times on x86_64 to confirm acceptable performance.
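The effect of the divisor can be illustrated with a short sketch. It assumes the divisor is applied to available memory in GiB (the walkthrough calls it a memory divisor); the function name compute_max_jobs, the lower bound of 1, and the cap at the CPU count are illustrative assumptions, not taken from build_flashinfer_jit_cache_whl.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute_max_jobs MEM_GIB CPUS [DIVISOR]
# Allows one job per DIVISOR GiB of memory, clamped to the range [1, CPUS].
compute_max_jobs() {
  local mem_gib="$1"
  local cpus="$2"
  local divisor="${3:-8}"
  local jobs=$(( mem_gib / divisor ))
  if [ "${jobs}" -lt 1 ]; then
    jobs=1
  fi
  if [ "${jobs}" -gt "${cpus}" ]; then
    jobs="${cpus}"
  fi
  echo "${jobs}"
}

# Compare the constant divisor with the previous x86_64 divisor
# on a machine with 64 GiB of memory and 32 CPUs:
compute_max_jobs 64 32 8   # prints 8
compute_max_jobs 64 32 4   # prints 16
```

Under these assumptions, moving from a divisor of 4 to 8 halves the job count on such a machine, which is the build-time concern raised above.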
scripts/task_test_blackwell_kernels.sh (2)
65-72: Approve artifact installation and verification flow.
The addition of local source installation and verification via python -m flashinfer show-config is well-structured. Running the verification in /tmp isolates side effects and ensures the installed packages work in a clean environment. The error handling is appropriate for this stage.
51-53: Clarify intent regarding exit 1 statements at lines 52 and 60.
The current code contains exit 1 at both locations (lines 52 and 60). The review comment references a prior resolution where you stated you "removed the 'exit 1' for both the cubin / jit-cache else logic," but I cannot access the prior conversation to verify this claim.
Please confirm:
- Was removing the exit 1 statements intentionally reverted?
- Is the current behavior (hard error on missing artifacts) the intended behavior?
- If the intent was to warn and continue with JIT compilation fallback, these statements need to be replaced with warnings.
Without access to the prior review thread, I cannot determine whether this is an oversight or intentional. The developer must clarify the design decision.
/bot run
Actionable comments posted: 0
🧹 Nitpick comments (2)
scripts/task_test_blackwell_kernels.sh (2)
28-42: Consider adding a warning when CUDA_VERSION is unset or unsupported.
The mapping logic (lines 34–40) silently defaults to cu130 if CUDA_VERSION is unset or doesn't match the known patterns (cu*, "12.9.0"). While this fallback is reasonable for local development, it could mask environment issues in CI. Since the script is run after CI produces artifacts, a missing or mismatched CUDA_VERSION should be visible to the operator.
This is a low-priority suggestion: if CUDA_VERSION is guaranteed to be set by your CI environment (per the PR objectives), this warning is optional.
  if [[ "${CUDA_VERSION}" == cu* ]]; then
    CUDA_STREAM="${CUDA_VERSION}"
  elif [ "${CUDA_VERSION}" = "12.9.0" ]; then
    CUDA_STREAM="cu129"
  else
+   echo "⚠️ WARNING: CUDA_VERSION '${CUDA_VERSION}' not explicitly recognized; defaulting to cu130" >&2
    CUDA_STREAM="cu130"
  fi
Also consider adding a check before line 28:
+ if [ -z "${CUDA_VERSION}" ]; then
+   echo "⚠️ WARNING: CUDA_VERSION not set; defaulting to cu130" >&2
+ fi
59-71: Error messages labeled as "ERROR:" but script continues; consider "WARNING:" for consistency.
Lines 63 and 70 print messages to stderr prefixed with "ERROR:" but do not exit. This allows the script to fall back to JIT compilation at runtime (intentional per prior review). However, the "ERROR:" label may confuse automated log parsing or monitoring tools that expect non-zero exit codes for error conditions.
Since the fallback is intentional and well-documented, consider relabeling these as warnings:
  else
-   echo "ERROR: flashinfer-cubin wheel not found in ${DIST_CUBIN_DIR}. Ensure the CI build stage produced the artifact." >&2
+   echo "⚠️ WARNING: flashinfer-cubin wheel not found in ${DIST_CUBIN_DIR}. Falling back to JIT compilation." >&2
  fi
  if [ -d "${DIST_JIT_CACHE_DIR}" ] && ls "${DIST_JIT_CACHE_DIR}"/*.whl >/dev/null 2>&1; then
    echo "Installing flashinfer-jit-cache from ${DIST_JIT_CACHE_DIR} ..."
    pip install -q "${DIST_JIT_CACHE_DIR}"/*.whl
  else
-   echo "ERROR: flashinfer-jit-cache wheel not found in ${DIST_JIT_CACHE_DIR} for ${CUDA_VERSION}. Ensure the CI build stage produced the artifact." >&2
+   echo "⚠️ WARNING: flashinfer-jit-cache wheel not found in ${DIST_JIT_CACHE_DIR}. Falling back to JIT compilation." >&2
  fi
Alternatively, if you prefer to retain "ERROR:" as a signal for CI monitoring, that is acceptable given the documented fallback behavior.
🔇 Additional comments (2)
scripts/task_test_blackwell_kernels.sh (2)
76-82: Installation and verification sequence is well-structured.
The editable install with -v (verbose) and --no-deps is appropriate for this context. The verification step (python -m flashinfer show-config in an isolated subprocess) is a good sanity check after kernel and package installation. No changes needed here.
27-83: Dry-run gate is correctly applied to all installation and verification steps.
The conditional at line 27 properly isolates all side-effects (artifact download, pip install, verification) from dry-run mode. The structure is clear and safe. No changes needed here.
/bot run
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/task_test_blackwell_kernels.sh (1)
73-85: Use WARNING: instead of ERROR: for messages that don't fail the script.
Lines 77 and 84 print "ERROR:" but the script continues with fallback JIT compilation (per the prior design discussion). The "ERROR:" prefix is misleading when the script doesn't exit. Use "WARNING:" to match the actual behavior:
- echo "ERROR: flashinfer-cubin wheel not found in ${DIST_CUBIN_DIR}. Ensure the CI build stage produced the artifact." >&2
+ echo "⚠️ WARNING: flashinfer-cubin wheel not found in ${DIST_CUBIN_DIR}. Falling back to JIT compilation." >&2
- echo "ERROR: flashinfer-jit-cache wheel not found in ${DIST_JIT_CACHE_DIR} for ${CUDA_VERSION}. Ensure the CI build stage produced the artifact." >&2
+ echo "⚠️ WARNING: flashinfer-jit-cache wheel not found in ${DIST_JIT_CACHE_DIR}. Falling back to JIT compilation." >&2
This clarifies that missing wheels are acceptable and will gracefully fall back to compilation.
🔇 Additional comments (1)
scripts/task_test_blackwell_kernels.sh (1)
89-96: LGTM; verification step is a good safeguard.
The local install and verification (running python -m flashinfer show-config in /tmp) provides a useful sanity check that the installation was successful before running tests.
if [ "$DRY_RUN" != "true" ]; then
    echo "Using CUDA version: ${CUDA_VERSION}"
    echo ""

    # Install precompiled kernels (require CI build artifacts)
    JIT_ARCH_EFFECTIVE=""
    # Map CUDA_VERSION to CUDA_STREAM for artifact lookup
    if [[ "${CUDA_VERSION}" == cu* ]]; then
        CUDA_STREAM="${CUDA_VERSION}"
    elif [ "${CUDA_VERSION}" = "12.9.0" ]; then
        CUDA_STREAM="cu129"
    else
        CUDA_STREAM="cu130"
    fi
    echo "Using CUDA stream: ${CUDA_STREAM}"
    echo ""
Add explicit CUDA_VERSION validation to avoid silent fallback.
Line 28 echoes an undefined CUDA_VERSION without checking if it's set. If CUDA_VERSION is unset in the environment, the logic at lines 34-40 silently defaults to cu130 without warning the user. This echoes the critical issue from the previous review that was flagged but not yet resolved.
Add an explicit check for unset CUDA_VERSION:
  if [ "$DRY_RUN" != "true" ]; then
+     if [ -z "${CUDA_VERSION}" ]; then
+         echo "⚠️ WARNING: CUDA_VERSION environment variable not set. Defaulting to cu130."
+     fi
      echo "Using CUDA version: ${CUDA_VERSION}"
This makes the fallback behavior transparent and helps users identify configuration issues.
📝 Committable suggestion
if [ "$DRY_RUN" != "true" ]; then
    if [ -z "${CUDA_VERSION}" ]; then
        echo "⚠️ WARNING: CUDA_VERSION environment variable not set. Defaulting to cu130."
    fi
    echo "Using CUDA version: ${CUDA_VERSION}"
    echo ""
    # Install precompiled kernels (require CI build artifacts)
    JIT_ARCH_EFFECTIVE=""
    # Map CUDA_VERSION to CUDA_STREAM for artifact lookup
    if [[ "${CUDA_VERSION}" == cu* ]]; then
        CUDA_STREAM="${CUDA_VERSION}"
    elif [ "${CUDA_VERSION}" = "12.9.0" ]; then
        CUDA_STREAM="cu129"
    else
        CUDA_STREAM="cu130"
    fi
    echo "Using CUDA stream: ${CUDA_STREAM}"
    echo ""
🤖 Prompt for AI Agents
In scripts/task_test_blackwell_kernels.sh around lines 27 to 42, validate
CUDA_VERSION before using it: if CUDA_VERSION is unset or empty, print a clear
error message stating it must be provided and exit non-zero; otherwise echo
"Using CUDA version: ${CUDA_VERSION}" and proceed with the existing mapping
logic. Ensure you use quoted checks (e.g. [ -z "${CUDA_VERSION}" ] or [[ -z
"${CUDA_VERSION}" ]]) so unset/empty values are detected, and do not silently
fall back to cu130 — if you prefer a fallback instead, print an explicit warning
showing the chosen default before continuing.
[CANCELING] Pipeline #39174363: canceled
/bot run
…e-compilation-offline
/bot run
Warning: Failed to sync latest changes. Please try again.
/bot run
Warning: Failed to sync latest changes. Please try again.
[FAILED] Pipeline #39184725: 5/20 passed
📌 Description
Download flashinfer-cubin and flashinfer-jit-cache to avoid compilation. (If a JIT kernel is not in flashinfer-jit-cache, it will still be JIT-compiled at test runtime. We could set export FLASHINFER_DISABLE_JIT=1 to avoid this, but that would "skip" many tests that use JIT kernels not found in flashinfer-jit-cache.)
🔍 Related Issues
The issue was discussed on Slack: "Ideally, we would move that compilation off-line which would reduce test time & make kernel hang detection much easier."
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
- Installed pre-commit by running pip install pre-commit (or used your preferred method).
- Installed the hooks with pre-commit install.
- Ran pre-commit run --all-files and fixed any reported issues.
🧪 Tests
- All tests are passing (unittest, etc.).
Summary by CodeRabbit
Chores
Bug Fixes