TST: Refactor remaining common tests to use pytest #2491

BenjaminBossan · 2025-04-10T14:44:12Z

Follow up to #2462 and #2478

This deals with all the remaining tests that rely/relied on PeftCommonTester. For each test file, I created a separate commit, which may help reviewing.

After this is merged, I'll probably do another PR to do some clean up (e.g. no longer using self.skipTest, eliminating code duplication).

This is what I had to say to each file:

Refactor `test_adaption_prompt.py`

Did not really use PeftCommonTester, thus removed it
Removed skip if llama or mistral not available (this would only be necessary for very old transformers versions)
Parametrized tests instead of duplicating
Use small models from Hub instead of creating new ones
Test coverage misses 3 more lines around loading checkpoint, but those seem unrelated to adaption prompt but instead due to using hub models instead of creating new ones

Refactor `test_multitask_prompt_tuning.py`

Same arguments apply as for test_adaption_prompt.py

Refactor `test_feature_extraction.py`

Pretty straightforward, test coverage is 100% identical.

Refactor `test_stablediffusion.py`

This was also pretty straightforward. After refactoring, the test coverage was 100% the same.

I noticed, however, that these tests did not cover LoKr, they only pretended to:

peft/tests/test_stablediffusion.py

Lines 113 to 114 in 37f8dc3

    
           "loha": (LoHaConfig, CONFIG_TESTING_KWARGS[1]), 
        
           "lokr": (LoHaConfig, CONFIG_TESTING_KWARGS[1]),

Thus I added LoKr to the test matrix, after which the test coverage if of course different, but is fine. Unfortunately, LoKr merging tests fail on CPU because the outputs change after merging. This is unrelated to the PR, so let's just skip those tests for now.

- Did not really use PeftCommonTester, thus removed it - Removed skip if llama or mistral not avaiable - Parametrized tests instead of duplicating - Use small models from Hub instead of creating new ones - Test coverage misses 3 more lines around loading checkpoint, most likely unrelated to adaption prompt but instead due to using hub models instead of creating new ones

Pretty straightforward, test coverage is 100% identical.

Same arguments apply as for test_adaption_prompt.py

This was pretty straightforward. After refactoring, the test coverage was 100% the same. I noticed, however, that these tests did not cover LoKr, they only pretended to: https://github.com/huggingface/peft/blob/37f8dc3458fefb0c1b4362da6733fcf742a1baa3/tests/test_stablediffusion.py#L113-L114 Thus I added LoKr to the test matrix, after which the test coverage if of course different, but is fine.

HuggingFaceDocBuilderDev · 2025-04-10T14:48:12Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

For some reason, the outputs differ after merging. However, I locally verified that this is already true before this refactor, so let's just skip for now, as it is out of scope.

githubnemo

Nice :)

This is a follow up to huggingface#2462, huggingface#2478, and huggingface#2491. Finish the refactor from unittest-style tests to pytest-style tests to now also include the last big file to still use the old style, test_custom_models.py. This file was already mostly written with pytest in mind, so the changes were rather minimal. With this class refactored, we can finally remove ClassInstantier, which made understanding test parametrization much more difficult. There are still a few unittest.TestCase classes found throughout, but they are pretty isolated, so no big need to refactor them right now. Note that the test coverage is the same except for one difference. For some reason, I found that module level __getattr__ is no longer called when moving from unittest to pytest, such as here: https://github.com/huggingface/peft/blob/62c9cf30319ca219e1d6754626c17f510fc76441/src/peft/tuners/adalora/__init__.py#L32 This was pretty strange. I found out that these lines are invoked by usage of self.assertWarnsRegex, which is strange. The reason seems to be the unittest code here: https://github.com/python/cpython/blob/9258f3da9175134d03f2c8c7c7eed223802ad945/Lib/unittest/case.py#L296-L305 Therefore, I think it's safe to ignore this reduction in test coverage.

This is a follow up to #2462, #2478, and #2491. Finish the refactor from unittest-style tests to pytest-style tests to now also include the last big file to still use the old style, test_custom_models.py. This file was already mostly written with pytest in mind, so the changes were rather minimal. With this class refactored, we can finally remove ClassInstantier, which made understanding test parametrization much more difficult.

* Refactor test_adaption_prompt.py - Did not really use PeftCommonTester, thus removed it - Removed skip if llama or mistral not avaiable - Parametrized tests instead of duplicating - Use small models from Hub instead of creating new ones - Test coverage misses 3 more lines around loading checkpoint, most likely unrelated to adaption prompt but instead due to using hub models instead of creating new ones * Refactor test_feature_extraction.py Pretty straightforward, test coverage is 100% identical. * Refactor test_multitask_prompt_tuning Same arguments apply as for test_adaption_prompt.py * Refactor test_stablediffusion.py This was pretty straightforward. After refactoring, the test coverage was 100% the same. I noticed, however, that these tests did not cover LoKr, they only pretended to: https://github.com/huggingface/peft/blob/37f8dc3458fefb0c1b4362da6733fcf742a1baa3/tests/test_stablediffusion.py#L113-L114 Thus I added LoKr to the test matrix, after which the test coverage if of course different, but is fine. * Skip LoKr merging tests when not CUDA For some reason, the outputs differ after merging. However, I locally verified that this is already true before this refactor, so let's just skip for now, as it is out of scope.

This is a follow up to huggingface#2462, huggingface#2478, and huggingface#2491. Finish the refactor from unittest-style tests to pytest-style tests to now also include the last big file to still use the old style, test_custom_models.py. This file was already mostly written with pytest in mind, so the changes were rather minimal. With this class refactored, we can finally remove ClassInstantier, which made understanding test parametrization much more difficult.

BenjaminBossan added 4 commits April 10, 2025 15:51

Refactor test_feature_extraction.py

3a9f597

Pretty straightforward, test coverage is 100% identical.

Refactor test_multitask_prompt_tuning

9471108

Same arguments apply as for test_adaption_prompt.py

BenjaminBossan added 3 commits April 10, 2025 17:14

Skip LoKr merging tests when not CUDA

ae15012

For some reason, the outputs differ after merging. However, I locally verified that this is already true before this refactor, so let's just skip for now, as it is out of scope.

Merge branch 'main' into tst-refactor-remaining-tests-pytest

c8995ac

Merge branch 'main' into tst-refactor-remaining-tests-pytest

5565c81

githubnemo approved these changes Apr 30, 2025

View reviewed changes

BenjaminBossan merged commit 6c8c3c3 into huggingface:main May 2, 2025
14 checks passed

BenjaminBossan deleted the tst-refactor-remaining-tests-pytest branch May 2, 2025 09:19

BenjaminBossan mentioned this pull request Jun 5, 2025

TST: Refactor unittest to pytest style custom tests #2573

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

TST: Refactor remaining common tests to use pytest #2491

TST: Refactor remaining common tests to use pytest #2491

Uh oh!

BenjaminBossan commented Apr 10, 2025 •

edited

Loading

Uh oh!

HuggingFaceDocBuilderDev commented Apr 10, 2025

Uh oh!

githubnemo left a comment

Uh oh!

Uh oh!

Uh oh!

	"loha": (LoHaConfig, CONFIG_TESTING_KWARGS[1]),
	"lokr": (LoHaConfig, CONFIG_TESTING_KWARGS[1]),

TST: Refactor remaining common tests to use pytest #2491

TST: Refactor remaining common tests to use pytest #2491

Uh oh!

Conversation

BenjaminBossan commented Apr 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Refactor test_adaption_prompt.py

Refactor test_multitask_prompt_tuning.py

Refactor test_feature_extraction.py

Refactor test_stablediffusion.py

Uh oh!

HuggingFaceDocBuilderDev commented Apr 10, 2025

Uh oh!

githubnemo left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

BenjaminBossan commented Apr 10, 2025 •

edited

Loading

Refactor `test_adaption_prompt.py`

Refactor `test_multitask_prompt_tuning.py`

Refactor `test_feature_extraction.py`

Refactor `test_stablediffusion.py`