TST: Refactor remaining common tests to use pytest #2491

Open · wants to merge 6 commits into main

Conversation

@BenjaminBossan (Member) commented Apr 10, 2025

Follow up to #2462 and #2478

This deals with all the remaining tests that rely, or relied, on PeftCommonTester. For each test file, I created a separate commit, which may help with reviewing.

After this is merged, I'll probably do another PR for some cleanup (e.g. no longer using self.skipTest, eliminating code duplication).
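For reference, the pytest idiom that would replace self.skipTest looks roughly like this (a sketch of the general pattern, not an actual diff from this PR; the skip condition is a placeholder):

    import sys

    import pytest

    def test_something():
        # pytest-native skip instead of unittest's self.skipTest("reason")
        if sys.platform == "win32":  # placeholder condition
            pytest.skip("not supported on this platform")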

Here are my notes on each file:

Refactor test_adaption_prompt.py

  • Did not really use PeftCommonTester, thus removed it
  • Removed skip if llama or mistral not available (this would only be necessary for very old transformers versions)
  • Parametrized tests instead of duplicating them (see the sketch after this list)
  • Use small models from Hub instead of creating new ones
  • Test coverage misses 3 more lines around loading a checkpoint, but those seem unrelated to adaption prompt and are most likely due to using Hub models instead of creating new ones
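To illustrate the parametrization together with the use of small Hub models, here is a rough sketch (the model IDs and the test body are illustrative placeholders, not the actual test code from this PR):

    import pytest
    from transformers import AutoModelForCausalLM

    from peft import AdaptionPromptConfig, get_peft_model

    # Hypothetical tiny test models; the exact IDs used in the PR may differ.
    @pytest.mark.parametrize(
        "model_id",
        [
            "hf-internal-testing/tiny-random-LlamaForCausalLM",
            "hf-internal-testing/tiny-random-MistralForCausalLM",
        ],
    )
    def test_adaption_prompt_loads(model_id):
        # One parametrized test replaces two near-identical per-architecture copies.
        base_model = AutoModelForCausalLM.from_pretrained(model_id)
        config = AdaptionPromptConfig(adapter_layers=1, adapter_len=4, task_type="CAUSAL_LM")
        model = get_peft_model(base_model, config)
        assert model is not None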

Refactor test_multitask_prompt_tuning.py

Same arguments apply as for test_adaption_prompt.py

Refactor test_feature_extraction.py

Pretty straightforward, test coverage is 100% identical.

Refactor test_stablediffusion.py

This was also pretty straightforward. After refactoring, the test coverage was 100% the same.

I noticed, however, that these tests did not cover LoKr; they only pretended to (see https://github.com/huggingface/peft/blob/37f8dc3458fefb0c1b4362da6733fcf742a1baa3/tests/test_stablediffusion.py#L113-L114):

"loha": (LoHaConfig, CONFIG_TESTING_KWARGS[1]),
"lokr": (LoHaConfig, CONFIG_TESTING_KWARGS[1]),

Thus I added LoKr to the test matrix, after which the test coverage is of course different, but that's fine. Unfortunately, the LoKr merging tests fail on CPU because the outputs change after merging. This is unrelated to the PR, so let's just skip those tests for now.
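A minimal sketch of the fix (the dict name and the kwargs are placeholders; only the change from LoHaConfig to LoKrConfig reflects the actual fix):

    from peft import LoHaConfig, LoKrConfig

    CONFIG_TESTING_KWARGS = ({"target_modules": None}, {"target_modules": None})  # placeholder values
    TEST_CONFIGS = {
        "loha": (LoHaConfig, CONFIG_TESTING_KWARGS[1]),
        "lokr": (LoKrConfig, CONFIG_TESTING_KWARGS[1]),  # was LoHaConfig, so LoKr was never actually tested
    }

    # The failing LoKr merging tests are skipped for now (pre-existing issue, unrelated to this PR), e.g.:
    # @pytest.mark.skip(reason="LoKr outputs change after merging on CPU")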

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
