PiSSA, OLoRA: Delete initial adapter after conversion instead of the active adapter #1933
@@ -323,9 +323,13 @@ def test_lora_pissa_conversion_same_output_after_loading(self, data, tmp_path):
        )

        # save the model with conversion
        peft_config_keys_before = list(peft_model.peft_config.keys())
        peft_model.save_pretrained(
            tmp_path / "pissa-model-converted", path_initial_model_for_weight_conversion=tmp_path / "init-model"
        )
        peft_config_keys_after = list(peft_model.peft_config.keys())
        assert peft_config_keys_before == peft_config_keys_after
Comment on lines +326 to +332

For my understanding: why would this hold, given that we're deleting an adapter?

To do the weight conversion, we have to temporarily load the initial adapter (i.e. the adapter from before it was trained). Before this PR, we would keep that initial adapter around but delete the actually trained adapter. This is surprising for users, who would rather keep the trained adapter around, e.g. to perform evals (see the linked issue #1860). With this PR, we instead keep the trained adapter and delete the initial adapter, which was only really needed for the conversion step. As such, I added the assert that the loaded adapters are the same before and after saving, which is also true when saving any other kind of model.
        model_converted = PeftModel.from_pretrained(deepcopy(model), tmp_path / "pissa-model-converted")
        output_converted = model_converted(data)[0]
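For illustration, here is a minimal, self-contained sketch of the user-facing flow that the assertion above pins down. It is not code from the PR: the tiny MLP, the paths, and the hyperparameters are made up, and the init_lora_weights toggle around saving the untrained initial adapter is included so that reloading that adapter during the conversion does not re-run the PiSSA decomposition (exact requirements may vary by PEFT version).

    # Minimal sketch, not code from the PR; names and paths are placeholders.
    from copy import deepcopy

    import torch
    from torch import nn

    from peft import LoraConfig, PeftModel, get_peft_model


    class MLP(nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = nn.Linear(16, 16)
            self.out = nn.Linear(16, 2)

        def forward(self, x):
            return self.out(self.linear(x))


    base = MLP()
    config = LoraConfig(init_lora_weights="pissa", target_modules=["linear"], r=8)
    peft_model = get_peft_model(deepcopy(base), config)

    # Save the untrained initial adapter; it is needed later for the conversion.
    # init_lora_weights is temporarily flipped to True so that reloading this
    # adapter during the conversion does not re-run the PiSSA decomposition.
    peft_model.peft_config["default"].init_lora_weights = True
    peft_model.save_pretrained("init-model")
    peft_model.peft_config["default"].init_lora_weights = "pissa"

    # ... training of the adapter would happen here ...

    # Save with conversion: the initial adapter is loaded temporarily, the
    # trained PiSSA weights are converted into a plain LoRA checkpoint, and
    # (after this PR) the initial adapter is deleted again instead of the
    # trained one.
    peft_model.save_pretrained(
        "pissa-model-converted",
        path_initial_model_for_weight_conversion="init-model",
    )

    # The trained "default" adapter is still the only loaded adapter, so evals
    # can run directly on the in-memory model.
    assert list(peft_model.peft_config.keys()) == ["default"]
    eval_output = peft_model(torch.randn(2, 16))

    # The converted checkpoint loads like an ordinary LoRA adapter onto a
    # fresh copy of the base model, without needing the initial weights.
    model_converted = PeftModel.from_pretrained(deepcopy(base), "pissa-model-converted")

The OLoRA test in the next hunk exercises the same pattern with init_lora_weights="olora".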
@@ -408,9 +412,13 @@ def test_olora_conversion_same_output_after_loading(self, data, tmp_path):
        )

        # save the model with conversion
        peft_config_keys_before = list(peft_model.peft_config.keys())
        peft_model.save_pretrained(
            tmp_path / "olora-model-converted", path_initial_model_for_weight_conversion=tmp_path / "init-model"
        )
        peft_config_keys_after = list(peft_model.peft_config.keys())
        assert peft_config_keys_before == peft_config_keys_after

        model_converted = PeftModel.from_pretrained(deepcopy(model), tmp_path / "olora-model-converted")
        output_converted = model_converted(data)[0]
Just for sanity: is it 100 percent okay to always delete the initial adapter, rather than letting the user choose which adapter to keep?
Yes, there is pretty much no reason why users would want to load the initial adapter and delete the trained one. I pondered keeping both, but users could be surprised by the extra memory required, which could, for instance, cause evals that are run after training and saving the model to hit OOM.
The new behavior is documented in the release notes: https://github.com/huggingface/peft/releases/tag/untagged-f3a302cd5b4f9d08a46b
(Release planned later today)