Description
I’m trying to fine-tune Phi 3.5 Vision using transformers. However, I’m running into an issue trying to save the model during or after training. See below for a minimal reproducible example.
My example below seems to be essentially what's happening in the official "cookbook" example: https://github.com/microsoft/Phi-3CookBook/blob/main/code/04.Finetuning/vision_finetuning/finetune_hf_trainer_docvqa.py#L482-L485.
However, I also see from this other example (https://github.com/microsoft/Phi-3CookBook/blob/6566572c38d53f384801a09dabdd26ad4f7bf76a/code/04.Finetuning/Phi-3-vision-Trainingscript.py#L256) that safe_serialization=False is used. Is that strictly required? The example from finetune_hf_trainer_docvqa.py doesn't seem to use it, and it's not clear to me how that works successfully.
Does anyone have any pointers? This issue has been reported in a few other locations, but I haven't come across any solutions - see below.
- Saving Phi 3 vision fails due to tensor sharing huggingface/transformers#32354
- https://discuss.huggingface.co/t/using-trainer-to-save-a-bartforsequenceclassification-model/81606
- https://discuss.huggingface.co/t/runtimeerror-when-saving-phi-3-5-vision-due-to-shared-tensors/116457/1 (My own post on the HF forums earlier today)
The error suggests “saving using safe_serialization=False”…but I’m not sure what the implications of that are.
Minimal Reproducible Example
from transformers import AutoModelForCausalLM
model_id = "microsoft/Phi-3.5-vision-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="cuda", trust_remote_code=True, torch_dtype="auto"
)
model.save_pretrained("out")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/AWSBedrockScienceModelDistillationTraining/.venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 2958, in save_pretrained
    raise RuntimeError(
RuntimeError: The weights trying to be saved contained shared tensors [{'model.embed_tokens.weight', 'model.vision_embed_tokens.wte.weight'}] that are mismatching the transformers base configuration. Try saving using `safe_serialization=False` or remove this tensor sharing.
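To make the error message more concrete, here is a minimal sketch of how shared tensors in a state dict can be spotted: it groups parameter names by the memory address their tensor reports via `data_ptr()`. The helper name `find_shared_tensors` and the toy module in `demo()` are hypothetical, written only for illustration; this is not the check transformers itself runs, and it is a simplification that only catches tensors starting at the same address.

```python
from collections import defaultdict


def find_shared_tensors(state_dict):
    """Group parameter names whose tensors start at the same memory address.

    Accepts any mapping of name -> object exposing `data_ptr()`,
    which torch.Tensor does. Simplification: overlapping views that start
    at different offsets are not detected.
    """
    names_by_ptr = defaultdict(list)
    for name, tensor in state_dict.items():
        names_by_ptr[tensor.data_ptr()].append(name)
    # Keep only addresses that more than one name points at.
    return [sorted(names) for names in names_by_ptr.values() if len(names) > 1]


def demo():
    """Run the helper on a hypothetical toy module (not the Phi model)."""
    import torch.nn as nn  # imported here so the helper above has no hard dependency

    class Toy(nn.Module):
        def __init__(self):
            super().__init__()
            self.embed_tokens = nn.Embedding(10, 4)
            self.wte = nn.Embedding(10, 4)
            # Tie the two embeddings to a single weight tensor.
            self.wte.weight = self.embed_tokens.weight

    return find_shared_tensors(Toy().state_dict())
```

Calling `find_shared_tensors(model.state_dict())` on the loaded model should, under these assumptions, report the same pair of names as the error above. As for the workaround the error suggests, `model.save_pretrained("out", safe_serialization=False)` writes pickle-based PyTorch checkpoint files instead of `.safetensors` files; that format can store tied weights, but the resulting files should only be loaded from sources you trust.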