-
Notifications
You must be signed in to change notification settings - Fork 543
unittest: fix test_artifacts.py #1950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary of ChangesHello @yzh119, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical issue where the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request successfully fixes the broken test_artifacts.py unit test by updating it to align with the new API for handling artifacts, including the introduction of checksum validation. The changes correctly adapt the test to the renamed function, handle the new tuple return format, and mock the necessary checksums.txt files. My review includes a couple of suggestions to refactor the test code to reduce duplication, which will improve its long-term maintainability.
| # Add mock responses for checksums.txt files | ||
| fmha_checksums_url = safe_urljoin( | ||
| test_cubin_repository, | ||
| safe_urljoin(artifact_paths.TRTLLM_GEN_FMHA, "checksums.txt"), | ||
| ) | ||
| responses.add(responses.GET, fmha_checksums_url, body=checksums_fmha, status=200) | ||
|
|
||
| gemm_checksums_url = safe_urljoin( | ||
| test_cubin_repository, | ||
| safe_urljoin(artifact_paths.TRTLLM_GEN_GEMM, "checksums.txt"), | ||
| ) | ||
| responses.add(responses.GET, gemm_checksums_url, body=checksums_gemm, status=200) | ||
|
|
||
| bmm_checksums_url = safe_urljoin( | ||
| test_cubin_repository, | ||
| safe_urljoin(artifact_paths.TRTLLM_GEN_BMM, "checksums.txt"), | ||
| ) | ||
| responses.add(responses.GET, bmm_checksums_url, body=checksums_bmm, status=200) | ||
|
|
||
| deepgemm_checksums_url = safe_urljoin( | ||
| test_cubin_repository, safe_urljoin(artifact_paths.DEEPGEMM, "checksums.txt") | ||
| ) | ||
| responses.add( | ||
| responses.GET, deepgemm_checksums_url, body=checksums_deepgemm, status=200 | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The setup for mock responses for checksums.txt files is repetitive. You can refactor this by using a dictionary to map artifact paths to their checksum data and then loop through it to add the responses. This will make the code more concise and easier to maintain if more subdirectories are added in the future.
| # Add mock responses for checksums.txt files | |
| fmha_checksums_url = safe_urljoin( | |
| test_cubin_repository, | |
| safe_urljoin(artifact_paths.TRTLLM_GEN_FMHA, "checksums.txt"), | |
| ) | |
| responses.add(responses.GET, fmha_checksums_url, body=checksums_fmha, status=200) | |
| gemm_checksums_url = safe_urljoin( | |
| test_cubin_repository, | |
| safe_urljoin(artifact_paths.TRTLLM_GEN_GEMM, "checksums.txt"), | |
| ) | |
| responses.add(responses.GET, gemm_checksums_url, body=checksums_gemm, status=200) | |
| bmm_checksums_url = safe_urljoin( | |
| test_cubin_repository, | |
| safe_urljoin(artifact_paths.TRTLLM_GEN_BMM, "checksums.txt"), | |
| ) | |
| responses.add(responses.GET, bmm_checksums_url, body=checksums_bmm, status=200) | |
| deepgemm_checksums_url = safe_urljoin( | |
| test_cubin_repository, safe_urljoin(artifact_paths.DEEPGEMM, "checksums.txt") | |
| ) | |
| responses.add( | |
| responses.GET, deepgemm_checksums_url, body=checksums_deepgemm, status=200 | |
| ) | |
| # Add mock responses for checksums.txt files | |
| checksum_data = { | |
| artifact_paths.TRTLLM_GEN_FMHA: checksums_fmha, | |
| artifact_paths.TRTLLM_GEN_GEMM: checksums_gemm, | |
| artifact_paths.TRTLLM_GEN_BMM: checksums_bmm, | |
| artifact_paths.DEEPGEMM: checksums_deepgemm, | |
| } | |
| for path, body in checksum_data.items(): | |
| url = safe_urljoin( | |
| test_cubin_repository, safe_urljoin(path, "checksums.txt") | |
| ) | |
| responses.add(responses.GET, url, body=body, status=200) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a good suggestion
| for expected_file_name in expected_gemm_cubin_files: | ||
| expected_file_path = artifact_paths.TRTLLM_GEN_GEMM + "/" + expected_file_name | ||
| assert any(expected_file_path in url for url in cubin_files), ( | ||
| expected_file_path = safe_urljoin( | ||
| artifact_paths.TRTLLM_GEN_GEMM, expected_file_name | ||
| ) | ||
| assert any(expected_file_path in url for url in cubin_file_paths), ( | ||
| f"Expected cubin file '{expected_file_path}' not found in cubin file list" | ||
| ) | ||
|
|
||
| for expected_file_name in expected_fmha_cubin_files: | ||
| expected_file_path = artifact_paths.TRTLLM_GEN_FMHA + "/" + expected_file_name | ||
| assert any(expected_file_path in url for url in cubin_files), ( | ||
| expected_file_path = safe_urljoin( | ||
| artifact_paths.TRTLLM_GEN_FMHA, expected_file_name | ||
| ) | ||
| assert any(expected_file_path in url for url in cubin_file_paths), ( | ||
| f"Expected cubin file '{expected_file_path}' not found in cubin file list" | ||
| ) | ||
|
|
||
| for expected_file_name in expected_bmm_cubin_files: | ||
| expected_file_path = artifact_paths.TRTLLM_GEN_BMM + "/" + expected_file_name | ||
| assert any(expected_file_path in url for url in cubin_files), ( | ||
| expected_file_path = safe_urljoin( | ||
| artifact_paths.TRTLLM_GEN_BMM, expected_file_name | ||
| ) | ||
| assert any(expected_file_path in url for url in cubin_file_paths), ( | ||
| f"Expected cubin file '{expected_file_path}' not found in cubin file list" | ||
| ) | ||
|
|
||
| for expected_file_name in expected_deepgemm_cubin_files: | ||
| expected_file_path = artifact_paths.DEEPGEMM + "/" + expected_file_name | ||
| assert any(expected_file_path in url for url in cubin_files), ( | ||
| expected_file_path = safe_urljoin(artifact_paths.DEEPGEMM, expected_file_name) | ||
| assert any(expected_file_path in url for url in cubin_file_paths), ( | ||
| f"Expected cubin file '{expected_file_path}' not found in cubin file list" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion logic to check for expected cubin files is duplicated across four loops. This can be consolidated into a single loop by using a dictionary that maps artifact paths to their corresponding expected file sets. This refactoring will improve code readability and maintainability.
assertion_data = {
artifact_paths.TRTLLM_GEN_GEMM: expected_gemm_cubin_files,
artifact_paths.TRTLLM_GEN_FMHA: expected_fmha_cubin_files,
artifact_paths.TRTLLM_GEN_BMM: expected_bmm_cubin_files,
artifact_paths.DEEPGEMM: expected_deepgemm_cubin_files,
}
for path, expected_files in assertion_data.items():
for expected_file_name in expected_files:
expected_file_path = safe_urljoin(path, expected_file_name)
assert any(expected_file_path in url for url in cubin_file_paths), (
f"Expected cubin file '{expected_file_path}' not found in cubin file list"
)|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThe changes add a new test invocation to the test suite and refactor the artifacts module API. The function Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes The changes involve an API rename with straightforward but multi-faceted updates across test logic. While the scope is focused on a single test file and shell script, the refactoring requires verification of: function rename correctness, environment variable handling, URL construction via Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/test_artifacts.py (2)
208-208: Fix the comment to match the variable name.The comment says "Expected BMM cubin files" but the variable is
expected_deepgemm_cubin_files.Apply this diff:
-# Expected BMM cubin files from the mock response +# Expected DeepGEMM cubin files from the mock response expected_deepgemm_cubin_files = {
209-213: Add missing DeepGEMM cubin file to expected set.The mock response and checksums data both include
kernel.fp8_m_grouped_gemm.007404769193.cubin, but theexpected_deepgemm_cubin_filesset only contains 3 of the 4 files. Add the missing file:expected_deepgemm_cubin_files = { + "kernel.fp8_m_grouped_gemm.007404769193.cubin", "kernel.fp8_m_grouped_gemm.007d9ebdca7e.cubin", "kernel.fp8_m_grouped_gemm.02acb2ba71fd.cubin", "kernel.fp8_m_grouped_gemm.0457375eb02f.cubin", }Also fix the comment on line 208: it says "Expected BMM cubin files" but should say "Expected DeepGEMM cubin files".
🧹 Nitpick comments (3)
tests/test_artifacts.py (3)
321-345: Consider refactoring the repetitive mock setup.The setup for mock responses could be simplified using a dictionary to map artifact paths to checksum data, as suggested in previous reviews. This would improve maintainability if more subdirectories are added.
As per previous review suggestions, you could refactor like this:
# Add mock responses for checksums.txt files checksum_data = { artifact_paths.TRTLLM_GEN_FMHA: checksums_fmha, artifact_paths.TRTLLM_GEN_GEMM: checksums_gemm, artifact_paths.TRTLLM_GEN_BMM: checksums_bmm, artifact_paths.DEEPGEMM: checksums_deepgemm, } for path, body in checksum_data.items(): url = safe_urljoin( test_cubin_repository, safe_urljoin(path, "checksums.txt") ) responses.add(responses.GET, url, body=body, status=200)
353-381: Consider consolidating the assertion logic.The assertion loops are duplicated across four subdirectories. As noted in previous reviews, this could be consolidated using a dictionary that maps artifact paths to expected file sets.
As per previous review suggestions:
assertion_data = { artifact_paths.TRTLLM_GEN_GEMM: expected_gemm_cubin_files, artifact_paths.TRTLLM_GEN_FMHA: expected_fmha_cubin_files, artifact_paths.TRTLLM_GEN_BMM: expected_bmm_cubin_files, artifact_paths.DEEPGEMM: expected_deepgemm_cubin_files, } for path, expected_files in assertion_data.items(): for expected_file_name in expected_files: expected_file_path = safe_urljoin(path, expected_file_name) assert any(expected_file_path in url for url in cubin_file_paths), ( f"Expected cubin file '{expected_file_path}' not found in cubin file list" )
347-393: Add assertions for checksums.txt files.Based on the implementation in
flashinfer/artifacts.py(lines 171-174),get_subdir_file_list()yields checksums.txt files for each subdirectory. However, the test only verifies cubin files and meta info headers, missing verification of the checksums.txt entries.Add verification for checksums.txt files:
# After line 393, add: # Check that checksums.txt files are included for each subdirectory expected_checksum_files = [ safe_urljoin(artifact_paths.TRTLLM_GEN_FMHA, "checksums.txt"), safe_urljoin(artifact_paths.TRTLLM_GEN_GEMM, "checksums.txt"), safe_urljoin(artifact_paths.TRTLLM_GEN_BMM, "checksums.txt"), safe_urljoin(artifact_paths.DEEPGEMM, "checksums.txt"), ] for expected_checksum_file in expected_checksum_files: assert any(expected_checksum_file in path for path in cubin_file_paths), ( f"Expected checksums.txt file '{expected_checksum_file}' not found in cubin file list" )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
scripts/task_jit_run_tests_part4.sh(1 hunks)tests/test_artifacts.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_artifacts.py (2)
flashinfer/artifacts.py (1)
get_subdir_file_list(146-184)flashinfer/jit/cubin_loader.py (1)
safe_urljoin(38-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Deploy Docs
🔇 Additional comments (4)
scripts/task_jit_run_tests_part4.sh (1)
19-19: LGTM!The addition of the artifacts test to the test suite is appropriate and correctly formatted.
tests/test_artifacts.py (3)
1-5: LGTM!The import statement correctly reflects the renamed function
get_subdir_file_list.
282-293: LGTM!The test function correctly uses the
tmp_pathfixture to provide a temporary directory for checksum downloads, and properly configures the environment via monkeypatching.
347-350: LGTM!The code correctly handles the new API's tuple return values by unpacking them and extracting just the paths for assertion comparisons.
|
/bot run |
|
[FAILED] Pipeline #36905871: 5/17 passed |
📌 Description
This PR fixes the test suite broken by #1761 , which introduced checksum validation for downloaded artifacts.
Commit #1761 made the following breaking changes:
get_cubin_file_list() -> get_subdir_file_list()checksums.txtdownload requirement for each cubin subdirectoryThe test
test_get_cubin_file_listwas not updated accordingly, this PR updatestests/test_artifacts.pyto align with the new API:get_cubin_file_listtoget_subdir_file_list🔍 Related Issues
#1761
🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.
✅ Pre-commit Checks
pre-commitby runningpip install pre-commit(or used your preferred method).pre-commit install.pre-commit run --all-filesand fixed any reported issues.🧪 Tests
unittest, etc.).Reviewer Notes
cc @jimmyzho @hypdeb
Summary by CodeRabbit
Tests
Refactor