
Commit df5c2e4

kahyunnam and yzh119 authored

ci: Reduce test time by moving compilation off-line (#2089)
<!-- .github/pull_request_template.md -->

## 📌 Description

Download `flashinfer-cubin` and `flashinfer-jit-cache` to avoid compilation. (Unless the JIT kernel is not in `flashinfer-jit-cache`; then it will still JIT-compile during test runtime. We could set `export FLASHINFER_DISABLE_JIT=1` to avoid this, but then it would "skip" a lot of tests that use JIT kernels not found in `flashinfer-jit-cache`.)

## 🔍 Related Issues

Issue was discussed on Slack: "Ideally, we would move that compilation off-line, which would reduce test time & make kernel hang detection much easier."

## 🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

### ✅ Pre-commit Checks

- [x] I have installed `pre-commit` by running `pip install pre-commit` (or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files` and fixed any reported issues.

> If you are unsure about how to set up `pre-commit`, see [the pre-commit documentation](https://pre-commit.com/).

## 🧪 Tests

- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).

## Summary by CodeRabbit

* **Chores**
  * Improved runtime install flow to detect CUDA, compute an effective JIT architecture mapping, and install matching precompiled kernel artifacts plus local package sources; these steps run only outside dry-run mode and verify installation by showing configuration.
  * Simplified build parallelism calculation to a constant division by 8 (with existing safety guards retained).
* **Bug Fixes**
  * Missing precompiled kernel artifacts now cause an explicit error/abort instead of a warning.

Co-authored-by: yzh119 <[email protected]>
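The two setups weighed in the description can be sketched as shell commands. This is a hedged illustration only: the wheel paths in the comment are placeholders, not paths from this PR, while `FLASHINFER_DISABLE_JIT` is the variable the description itself names.

```shell
# Preferred setup: install the precompiled artifact wheels so tests rarely
# need to JIT-compile (paths below are placeholders, not from this PR):
#
#   pip install ./dist/flashinfer_cubin-*.whl ./dist/flashinfer_jit_cache-*.whl
#
# Rejected alternative: forbid JIT entirely; any kernel missing from
# flashinfer-jit-cache then causes its tests to be skipped rather than built.
export FLASHINFER_DISABLE_JIT=1
```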
1 parent 18004a8 commit df5c2e4

File tree: 2 files changed (+70 −1 lines)


scripts/build_flashinfer_jit_cache_whl.sh

Lines changed: 2 additions & 1 deletion
```diff
@@ -11,7 +11,8 @@ echo "=========================================="
 # MAX_JOBS = min(nproc, max(1, MemAvailable_GB/4))
 MEM_AVAILABLE_GB=$(free -g | awk '/^Mem:/ {print $7}')
 NPROC=$(nproc)
-MAX_JOBS=$(( MEM_AVAILABLE_GB / $([ "$(uname -m)" = "aarch64" ] && echo 8 || echo 4) ))
+# MAX_JOBS=$(( MEM_AVAILABLE_GB / $([ "$(uname -m)" = "aarch64" ] && echo 8 || echo 4) ))
+MAX_JOBS=$(( MEM_AVAILABLE_GB / 8 ))
 if (( MAX_JOBS < 1 )); then
     MAX_JOBS=1
 elif (( NPROC < MAX_JOBS )); then
```
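The hunk keeps the script's documented formula `MAX_JOBS = min(nproc, max(1, MemAvailable_GB/N))` but fixes the divisor at 8. A hedged standalone sketch of that clamping logic, with `compute_max_jobs` as a hypothetical helper (the real script reads memory from `free -g` and cores from `nproc`; here both are parameters):

```shell
# Hypothetical helper mirroring the MAX_JOBS computation from the diff:
# divide available memory (GB) by 8, then clamp to [1, nproc].
compute_max_jobs() {
  mem_gb=$1
  nproc_val=$2
  max_jobs=$(( mem_gb / 8 ))
  if (( max_jobs < 1 )); then
    max_jobs=1
  elif (( nproc_val < max_jobs )); then
    max_jobs=$nproc_val
  fi
  echo "$max_jobs"
}

compute_max_jobs 64 16    # memory-bound: 64/8 = 8
compute_max_jobs 256 16   # CPU-bound: clamped to nproc = 16
compute_max_jobs 4 16     # floor: never below 1
```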

scripts/task_test_blackwell_kernels.sh

Lines changed: 68 additions & 0 deletions
```diff
@@ -25,7 +25,75 @@ if [[ "$1" == "--dry-run" ]] || [[ "${DRY_RUN}" == "true" ]]; then
 fi
 
 if [ "$DRY_RUN" != "true" ]; then
+    echo "Using CUDA version: ${CUDA_VERSION}"
+    echo ""
+
+    # Install precompiled kernels (require CI build artifacts)
+    JIT_ARCH_EFFECTIVE=""
+    # Map CUDA_VERSION to CUDA_STREAM for artifact lookup
+    if [[ "${CUDA_VERSION}" == cu* ]]; then
+        CUDA_STREAM="${CUDA_VERSION}"
+    elif [ "${CUDA_VERSION}" = "12.9.0" ]; then
+        CUDA_STREAM="cu129"
+    else
+        CUDA_STREAM="cu130"
+    fi
+    echo "Using CUDA stream: ${CUDA_STREAM}"
+    echo ""
+    if [ -n "${JIT_ARCH}" ]; then
+        # 12.0a for CUDA 12.9.0, 12.0f for CUDA 13.0.0
+        if [ "${JIT_ARCH}" = "12.0" ]; then
+            if [ "${CUDA_STREAM}" = "cu129" ]; then
+                JIT_ARCH_EFFECTIVE="12.0a"
+            else
+                JIT_ARCH_EFFECTIVE="12.0f"
+            fi
+        else
+            JIT_ARCH_EFFECTIVE="${JIT_ARCH}"
+        fi
+
+        echo "Using JIT_ARCH from environment: ${JIT_ARCH_EFFECTIVE}"
+        DIST_CUBIN_DIR="../dist/${CUDA_STREAM}/${JIT_ARCH_EFFECTIVE}/cubin"
+        DIST_JIT_CACHE_DIR="../dist/${CUDA_STREAM}/${JIT_ARCH_EFFECTIVE}/jit-cache"
+
+        echo "==== Debug: listing artifact directories ===="
+        echo "Tree under ../dist:"
+        (cd .. && ls -al dist) || true
+        echo ""
+        echo "Tree under ../dist/${CUDA_STREAM}:"
+        (cd .. && ls -al "dist/${CUDA_STREAM}") || true
+        echo ""
+        echo "Contents of ${DIST_CUBIN_DIR}:"
+        ls -al "${DIST_CUBIN_DIR}" || true
+        echo ""
+        echo "Contents of ${DIST_JIT_CACHE_DIR}:"
+        ls -al "${DIST_JIT_CACHE_DIR}" || true
+        echo "============================================="
+
+        if [ -d "${DIST_CUBIN_DIR}" ] && ls "${DIST_CUBIN_DIR}"/*.whl >/dev/null 2>&1; then
+            echo "Installing flashinfer-cubin from ${DIST_CUBIN_DIR} ..."
+            pip install -q "${DIST_CUBIN_DIR}"/*.whl
+        else
+            echo "ERROR: flashinfer-cubin wheel not found in ${DIST_CUBIN_DIR}. Ensure the CI build stage produced the artifact." >&2
+        fi
+
+        if [ -d "${DIST_JIT_CACHE_DIR}" ] && ls "${DIST_JIT_CACHE_DIR}"/*.whl >/dev/null 2>&1; then
+            echo "Installing flashinfer-jit-cache from ${DIST_JIT_CACHE_DIR} ..."
+            pip install -q "${DIST_JIT_CACHE_DIR}"/*.whl
+        else
+            echo "ERROR: flashinfer-jit-cache wheel not found in ${DIST_JIT_CACHE_DIR} for ${CUDA_VERSION}. Ensure the CI build stage produced the artifact." >&2
+        fi
+        echo ""
+    fi
+
+    # Install local python sources
     pip install -e . -v --no-deps
+    echo ""
+
+    # Verify installation
+    echo "Verifying installation..."
+    (cd /tmp && python -m flashinfer show-config)
+    echo ""
 fi
 
 EXIT_CODE=0
```
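The new section performs two small mappings before locating artifacts: `CUDA_VERSION` to a `CUDA_STREAM` directory name, and `JIT_ARCH` to an effective architecture suffix. A hedged standalone sketch of both, where `map_cuda_stream` and `map_jit_arch` are hypothetical helpers mirroring the inline if/elif logic above:

```shell
# Hypothetical helper: map a CUDA version string to an artifact stream name.
map_cuda_stream() {
  case "$1" in
    cu*)    echo "$1" ;;      # already a stream name, passed through
    12.9.0) echo "cu129" ;;
    *)      echo "cu130" ;;   # everything else falls back to cu130
  esac
}

# Hypothetical helper: "12.0a" for the cu129 stream, "12.0f" otherwise
# (per the diff's comment); any other arch string is used as-is.
map_jit_arch() {
  if [ "$1" = "12.0" ]; then
    if [ "$2" = "cu129" ]; then echo "12.0a"; else echo "12.0f"; fi
  else
    echo "$1"
  fi
}

map_cuda_stream "12.9.0"      # cu129
map_jit_arch "12.0" "cu130"   # 12.0f
map_jit_arch "10.0" "cu129"   # 10.0 (passed through)
```

The resulting pair selects the artifact directories, e.g. `../dist/cu129/12.0a/cubin` and `../dist/cu129/12.0a/jit-cache`.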
