FIX: Multiple active adapters with auxiliary layers #2758
Conversation
Description

This PR fixes a few issues with the handling of active adapters for auxiliary modules.

1. Calling set_adapter on the base_model

When calling peft_model.set_adapter, it is not possible to activate more than one adapter, as not all PEFT methods support that. However, many PEFT methods like LoRA do, in which case users should call peft_model.base_model.set_adapter(['default', 'other']).

The issue was that the activation of auxiliary modules was only performed in PeftModel.set_adapter. This means that if users called peft_model.base_model.set_adapter (i.e. LoraModel.set_adapter etc.), the auxiliary adapters were not activated. This PR fixes that by ensuring that the auxiliary modules are activated even when adapters are activated this way. When users activate more than one adapter, additional checks are performed to ensure that they are not activating multiple auxiliary adapters on the same module.

Note that some existing PEFT code could start raising errors because of this change. However, that code is currently buggy, so IMO it is fine to raise an error.

2. Adding multiple adapters with non-overlapping auxiliary modules

Furthermore, I found an activation issue that could occur when adding multiple adapters with non-overlapping auxiliary modules. Normally, when the second/third/... adapter is added, it is not automatically activated. However, when these additional adapters target new auxiliary modules, those would be incorrectly activated (because they look like they're the first adapter). This has also been fixed.

Open question

Right now, we don't allow users to activate multiple auxiliary adapters on the same module. However, this limitation could be considered too strict:

1. For trainable tokens, as long as the indices don't overlap, there is no conflict.
2. For modules_to_save, we could theoretically determine the delta_weight as new_weight - original_weight, then add up all delta_weights.

This is not implemented in this PR to prevent it from becoming even more complex.

Note to maintainers

This PR requires the addition of self.set_auxiliary_adapters to the set_adapters method of all non-prompt learning PEFT method model classes. If this PR is merged, all open PRs that add such PEFT methods must be updated to include this call.
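For illustration, a minimal sketch of the scenario in point 1 (the model name and configs are just examples, not taken from the PR): two LoRA adapters are activated at once through the inner LoraModel, and only the first adapter carries an auxiliary modules_to_save module, so no two auxiliary adapters end up active on the same module.

```python
# Minimal sketch of point 1; model and config choices are illustrative, not from the PR.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# "default" carries an auxiliary module (modules_to_save), "other" does not,
# so activating both does not put two auxiliary adapters on the same module.
config_default = LoraConfig(target_modules=["q_proj", "v_proj"], modules_to_save=["lm_head"])
config_other = LoraConfig(target_modules=["q_proj", "v_proj"])

peft_model = get_peft_model(base, config_default)  # adds and activates "default"
peft_model.add_adapter("other", config_other)      # added but not activated

# peft_model.set_adapter only activates a single adapter; multi-adapter activation
# goes through the inner model, which previously skipped the auxiliary modules.
peft_model.base_model.set_adapter(["default", "other"])
```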
Oof, thanks.
Right now, we don't allow users to activate multiple auxiliary adapters on the same module. However, this limitation could be considered too strict:
The merging behavior you describe (summation of delta weights) can be done for overlapping trainable tokens and modules to save alike. I think there are at least three cases here:
- user just wants to load adapters and if they're overlapping, receive an error because the assumed orthogonality of the adapters is not given
- user wants to override specific parts of a pre-trained adapter and wants to load another adapter to overwrite specific parts (without merging)
- user is aware of the overlap between adapters and wants to merge the deltas in some way
Ideally we would support all cases (defaulting to the first case) with additional arguments to add_adapter, but I'm not yet sure what this means for other places that use add_adapter and how to propagate such an option through possible abstractions.
I would see these as out of scope for this PR and I don't see anything blocking a possible implementation.
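To make the third case concrete, here is a hypothetical sketch (not implemented in this PR; the function name is made up) of what summing the deltas of overlapping modules_to_save weights could look like:

```python
# Hypothetical sketch of case 3 (merging deltas); not part of this PR.
import torch

def merge_overlapping_weights(original_weight: torch.Tensor, adapter_weights: list[torch.Tensor]) -> torch.Tensor:
    # Each adapter's delta is its new_weight minus the original weight; the merged
    # result applies the sum of all deltas on top of the original weight.
    merged = original_weight.clone()
    for new_weight in adapter_weights:
        merged += new_weight - original_weight
    return merged
```

Whether plain summation is the right aggregation is exactly the open design question discussed here.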
src/peft/utils/other.py
Outdated
if isinstance(module, TrainableTokensWrapper):
    # TODO In theory, multiple active trainable tokens is fine when the indices don't overlap
    adapter_names_in_module = [n for n in adapter_name if n in module.token_adapter.trainable_tokens_delta]
elif isinstance(module, ModulesToSaveWrapper):
    adapter_names_in_module = [n for n in adapter_name if n in module.modules_to_save]
else:
    raise NotImplementedError(f"Please implement the handling for {type(module)}")

if len(adapter_names_in_module) > 1:
    raise ValueError(f"Only one adapter can be set at a time for {module}, got {len(adapter_names_in_module)}")

if not adapter_names_in_module:
    adapter_name_to_set = None
else:
    adapter_name_to_set = adapter_names_in_module[0]

return adapter_name_to_set
I wonder if we should delegate the check (and potential checking of capabilities like merging given a method - out of scope but foreseeable) to the wrapper classes themselves, e.g. AuxiliaryTrainingWrapper.check_new_adapter(). WDYT?
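A rough, self-contained sketch of that delegation idea (the class and method bodies here are illustrative stand-ins, not the actual PEFT code):

```python
# Illustrative stand-in for the suggested delegation; not the actual PEFT implementation.
from __future__ import annotations

import torch.nn as nn

class ToyAuxiliaryWrapper(nn.Module):
    """Stand-in for an AuxiliaryTrainingWrapper subclass that owns per-adapter submodules."""

    def __init__(self, adapters: dict[str, nn.Module]):
        super().__init__()
        self.adapters = nn.ModuleDict(adapters)

    def check_new_adapter(self, adapter_names: list[str]) -> str | None:
        # Real subclasses would look into trainable_tokens_delta / modules_to_save here.
        present = [name for name in adapter_names if name in self.adapters]
        if len(present) > 1:
            raise ValueError(f"Only one adapter can be set at a time for {self}, got {len(present)}")
        return present[0] if present else None
```

Each wrapper type would then enforce its own rules (and, later, advertise capabilities like merging) instead of a model-level helper special-casing every wrapper class.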
Good idea, I refactored it.
The merging behavior you describe (summation of delta weights) can be done for overlapping trainable tokens and modules to save alike. I think there are at least three cases here:
- user just wants to load adapters and if they're overlapping, receive an error because the assumed orthogonality of the adapters is not given
- user wants to override specific parts of a pre-trained adapter and wants to load another adapter to overwrite specific parts (without merging)
- user is aware of the overlap between adapters and wants to merge the deltas in some way
Ideally we would support all cases (defaulting to the first case) with additional arguments to add_adapter, but I'm not yet sure what this means for other places that use add_adapter and how to propagate such an option through possible abstractions.
Good summary, although I'd say that case 2 (override) would be surprising behavior and it would be fine not to support it. I think the expectation for a user would be that this behaves the same as for LoRA et al., i.e. the effects are aggregated (whatever that may mean exactly).
I would see these as out of scope for this PR and I don't see anything blocking a possible implementation.
So as is, it's not allowed and I think we both agree that suggestion 3. is the most useful one. Let's implement that in a future PR.
LGTM :)