Thank you for this amazing project!
As I am preparing to run the QwenVL classification task, I have some questions about the parameters used in finetune_cls.sh:
- What global batch size was used with the default learning rate?
- Why is head_lr different from learning_rate? Is this an empirically chosen setting?
- Why do you freeze the LLM while keeping the vision tower and merger module trainable? This is the opposite of the usual SFT setup for MLLMs, where typically only the LLM part is finetuned. (See the sketch after this list for how I currently understand the setup.)
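
To make the questions concrete, here is a minimal PyTorch-style sketch of how I understand the setup: the LLM frozen, the vision tower and merger trainable, and the classification head given its own (larger) learning rate. The attribute names `model.llm`, `model.vision_tower`, `model.merger`, and the `cls_head` prefix are placeholders of mine, not the repository's actual identifiers. (For the first question, I assume global batch size = per-device batch size × gradient-accumulation steps × number of GPUs.)

```python
import torch

def build_optimizer(model, learning_rate=1e-5, head_lr=1e-4):
    # Freeze the LLM backbone; leave the vision tower and merger trainable.
    # (Attribute names here are hypothetical stand-ins for the real modules.)
    for p in model.llm.parameters():
        p.requires_grad = False
    for module in (model.vision_tower, model.merger):
        for p in module.parameters():
            p.requires_grad = True

    # Split the trainable parameters into two groups so the classification
    # head can use a larger learning rate (head_lr) than the rest.
    backbone_params = [
        p for n, p in model.named_parameters()
        if p.requires_grad and not n.startswith("cls_head")
    ]
    head_params = [
        p for n, p in model.named_parameters()
        if p.requires_grad and n.startswith("cls_head")
    ]
    return torch.optim.AdamW([
        {"params": backbone_params, "lr": learning_rate},
        {"params": head_params, "lr": head_lr},
    ])
```

If this matches what finetune_cls.sh actually does, I would like to understand the reasoning behind freezing the LLM here, since for classification I would have expected the language side to need adaptation as well.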