E2E test for the experimental compress algorithm based on https://arxiv.org/abs/2411.19146 #464
using MIP-based NAS search algorithm. Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@               Coverage Diff               @@
##        feature/compress    #464    +/-    ##
=================================================
  Coverage          73.40%   73.40%
=================================================
  Files                180      180
  Lines              18077    18077
=================================================
  Hits               13270    13270
  Misses              4807     4807
tests/gpu/torch/_compress/resources/configs/bypass/bypass_distillation_defaults.yaml
Outdated
tests/gpu/torch/_compress/resources/configs/bypass/llama-3_1-8b_bypass.yaml
Outdated
Is resources/tokenizer a toy tokenizer used for testing in place of the original Llama tokenizer?
We could instead reuse the toy models and tokenizers already used in other tests. See my comment below in the GPU test file.
Created an internal issue to address this in the next MR: issues/12
tests/experimental/torch/_compress/resources/tokenizer/truncate_tokenizer.py
Unrelated to this PR, but do we also plan to simplify the YAML files as part of the roadmap? Currently there are too many settings to configure, spread across too many YAML files. We could move most of them into one common base YAML that is hidden from users, and require users to provide only the 4-5 most important inputs to keep things simpler.
This is captured in the NVIDIA internal roadmap.
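The simplification suggested in this thread (one hidden base configuration plus a handful of user-supplied inputs) could be sketched roughly as follows. This is only an illustration of the layering idea, not the project's configuration code: the `BASE_DEFAULTS` keys, the `deep_merge` helper, and the example values are all hypothetical, and plain dictionaries stand in for parsed YAML files.

```python
from copy import deepcopy

# Hypothetical base defaults, standing in for a common base YAML hidden from users.
# None of these keys are taken from the actual config files in this PR.
BASE_DEFAULTS = {
    "model": {"name": None, "dtype": "bfloat16"},
    "dataset": {"path": None, "seq_len": 4096},
    "search": {"algorithm": "mip", "target_memory_mb": None},
}


def deep_merge(base, overrides):
    """Return a new dict where values in `overrides` take precedence over `base`.

    Nested dicts are merged recursively; any other value replaces the default.
    Neither input is modified.
    """
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# The user supplies only the few inputs that matter most (hypothetical values).
user_config = {
    "model": {"name": "toy-llama"},
    "dataset": {"path": "/tmp/data"},
    "search": {"target_memory_mb": 780},
}

config = deep_merge(BASE_DEFAULTS, user_config)
```

With this layout, defaults such as `dtype` and `seq_len` stay out of the user-facing file, while anything the user does specify overrides the base value.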
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
…ation. Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Keval Morabia <[email protected]>
…tmp_path. Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Signed-off-by: Daniel Korzekwa <[email protected]>
Looks good to merge. Thanks for addressing my comments.
What does this PR do?
Type of change: new feature
Overview: E2E test for the experimental compress algorithm based on https://arxiv.org/abs/2411.19146
Usage
See tests/gpu/torch/_compress/test_compress.py
Testing
See tests/gpu/torch/_compress/test_compress.py
Before your PR is "Ready for review"
Additional Information