-
Notifications
You must be signed in to change notification settings - Fork 2k
TST: Refactor remaining common tests to use pytest #2491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
BenjaminBossan
merged 7 commits into
huggingface:main
from
BenjaminBossan:tst-refactor-remaining-tests-pytest
May 2, 2025
Merged
TST: Refactor remaining common tests to use pytest #2491
BenjaminBossan
merged 7 commits into
huggingface:main
from
BenjaminBossan:tst-refactor-remaining-tests-pytest
May 2, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- Did not really use PeftCommonTester, thus removed it - Removed skip if llama or mistral not avaiable - Parametrized tests instead of duplicating - Use small models from Hub instead of creating new ones - Test coverage misses 3 more lines around loading checkpoint, most likely unrelated to adaption prompt but instead due to using hub models instead of creating new ones
Pretty straightforward, test coverage is 100% identical.
Same arguments apply as for test_adaption_prompt.py
This was pretty straightforward. After refactoring, the test coverage was 100% the same. I noticed, however, that these tests did not cover LoKr, they only pretended to: https://github.com/huggingface/peft/blob/37f8dc3458fefb0c1b4362da6733fcf742a1baa3/tests/test_stablediffusion.py#L113-L114 Thus I added LoKr to the test matrix, after which the test coverage if of course different, but is fine.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
For some reason, the outputs differ after merging. However, I locally verified that this is already true before this refactor, so let's just skip for now, as it is out of scope.
githubnemo
approved these changes
Apr 30, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice :)
BenjaminBossan
added a commit
to BenjaminBossan/peft
that referenced
this pull request
Jun 5, 2025
This is a follow up to huggingface#2462, huggingface#2478, and huggingface#2491. Finish the refactor from unittest-style tests to pytest-style tests to now also include the last big file to still use the old style, test_custom_models.py. This file was already mostly written with pytest in mind, so the changes were rather minimal. With this class refactored, we can finally remove ClassInstantier, which made understanding test parametrization much more difficult. There are still a few unittest.TestCase classes found throughout, but they are pretty isolated, so no big need to refactor them right now. Note that the test coverage is the same except for one difference. For some reason, I found that module level __getattr__ is no longer called when moving from unittest to pytest, such as here: https://github.com/huggingface/peft/blob/62c9cf30319ca219e1d6754626c17f510fc76441/src/peft/tuners/adalora/__init__.py#L32 This was pretty strange. I found out that these lines are invoked by usage of self.assertWarnsRegex, which is strange. The reason seems to be the unittest code here: https://github.com/python/cpython/blob/9258f3da9175134d03f2c8c7c7eed223802ad945/Lib/unittest/case.py#L296-L305 Therefore, I think it's safe to ignore this reduction in test coverage.
BenjaminBossan
added a commit
that referenced
this pull request
Jun 6, 2025
This is a follow up to #2462, #2478, and #2491. Finish the refactor from unittest-style tests to pytest-style tests to now also include the last big file to still use the old style, test_custom_models.py. This file was already mostly written with pytest in mind, so the changes were rather minimal. With this class refactored, we can finally remove ClassInstantier, which made understanding test parametrization much more difficult.
efraimdahl
pushed a commit
to efraimdahl/peft
that referenced
this pull request
Jul 12, 2025
* Refactor test_adaption_prompt.py - Did not really use PeftCommonTester, thus removed it - Removed skip if llama or mistral not avaiable - Parametrized tests instead of duplicating - Use small models from Hub instead of creating new ones - Test coverage misses 3 more lines around loading checkpoint, most likely unrelated to adaption prompt but instead due to using hub models instead of creating new ones * Refactor test_feature_extraction.py Pretty straightforward, test coverage is 100% identical. * Refactor test_multitask_prompt_tuning Same arguments apply as for test_adaption_prompt.py * Refactor test_stablediffusion.py This was pretty straightforward. After refactoring, the test coverage was 100% the same. I noticed, however, that these tests did not cover LoKr, they only pretended to: https://github.com/huggingface/peft/blob/37f8dc3458fefb0c1b4362da6733fcf742a1baa3/tests/test_stablediffusion.py#L113-L114 Thus I added LoKr to the test matrix, after which the test coverage if of course different, but is fine. * Skip LoKr merging tests when not CUDA For some reason, the outputs differ after merging. However, I locally verified that this is already true before this refactor, so let's just skip for now, as it is out of scope.
efraimdahl
pushed a commit
to efraimdahl/peft
that referenced
this pull request
Jul 12, 2025
This is a follow up to huggingface#2462, huggingface#2478, and huggingface#2491. Finish the refactor from unittest-style tests to pytest-style tests to now also include the last big file to still use the old style, test_custom_models.py. This file was already mostly written with pytest in mind, so the changes were rather minimal. With this class refactored, we can finally remove ClassInstantier, which made understanding test parametrization much more difficult.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Follow up to #2462 and #2478
This deals with all the remaining tests that rely/relied on
PeftCommonTester
. For each test file, I created a separate commit, which may help reviewing.After this is merged, I'll probably do another PR to do some clean up (e.g. no longer using
self.skipTest
, eliminating code duplication).This is what I had to say to each file:
Refactor
test_adaption_prompt.py
PeftCommonTester
, thus removed itRefactor
test_multitask_prompt_tuning.py
Same arguments apply as for
test_adaption_prompt.py
Refactor
test_feature_extraction.py
Pretty straightforward, test coverage is 100% identical.
Refactor
test_stablediffusion.py
This was also pretty straightforward. After refactoring, the test coverage was 100% the same.
I noticed, however, that these tests did not cover LoKr, they only pretended to:
peft/tests/test_stablediffusion.py
Lines 113 to 114 in 37f8dc3
Thus I added LoKr to the test matrix, after which the test coverage if of course different, but is fine. Unfortunately, LoKr merging tests fail on CPU because the outputs change after merging. This is unrelated to the PR, so let's just skip those tests for now.