feat: Adds embed_tokens, lm_head as trainable for vocab expansion in peft and enables tying of adapters #625
base: main
Conversation
Signed-off-by: romit <romit@ibm.com>
Thanks for making a pull request! 😃
Signed-off-by: romit <romit@ibm.com>
pyproject.toml (Outdated)
| "tqdm>=4.66.2,<5.0", | ||
| "trl>=0.19.1,<0.20.0", | ||
| "peft @ git+https://github.com/huggingface/peft.git@293aea5df6db240856a77f89955d1a89ce38b50d", | ||
| "peft @ git+https://github.com/romitjain/peft.git@8388aa869473a60589a01e6950ea0583d3612783", |
This is a temporary path to my fork so that the tests can be validated. Once my PR in PEFT is merged, we can point back to the Hugging Face peft repository.
Signed-off-by: romit <romit@ibm.com>
Signed-off-by: romit <romit@ibm.com>
I have added copy.deepcopy for PEFT_LORA_ARGS because the trainer was updating it in place, which was affecting other tests.
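For reference, a minimal sketch of the pattern described above; the import path of the shared fixture and the test name are assumptions, not the repo's actual layout:

```python
import copy

from tests.artifacts.testdata import PEFT_LORA_ARGS  # assumed location of the shared fixture


def test_lora_tuning_with_tied_embeddings():
    # Deep-copy the shared args so the trainer's in-place mutations
    # cannot leak into other tests that reuse PEFT_LORA_ARGS.
    lora_args = copy.deepcopy(PEFT_LORA_ARGS)
    lora_args.target_modules = ["embed_tokens", "lm_head"]
    ...
```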
| "pad_token_id": 0, | ||
| "rms_norm_eps": 1e-06, | ||
| "tie_word_embeddings": false, | ||
| "tie_word_embeddings": true, |
This was all that was required to use this model in my tests, and it does not break any other tests (AFAIK).
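As a quick sanity check, this is one way to confirm the tied configuration behaves as expected; the artifact path below is a placeholder, not the repo's actual test model path:

```python
from transformers import AutoModelForCausalLM

# Placeholder path: substitute the tiny test model whose config was changed above.
model = AutoModelForCausalLM.from_pretrained("tests/artifacts/tiny-llama")

# With tie_word_embeddings=true, the input embeddings and lm_head share one weight tensor.
assert model.get_input_embeddings().weight is model.get_output_embeddings().weight
```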
Signed-off-by: romit <romit@ibm.com>
Description of the change
This PR does the following:
1. Adds `embed_tokens` and `lm_head` as trainable layers (needed for vocab expansion).
2. If the tied layers are added via `modules_to_save`, enable the flag to ensure weight tying between the adapters of the tied layers.
3. If the tied layers are added via `target_modules`, enable the flag to ensure weight tying between the adapters of the tied layers.

(See the configuration sketch below for how these fit together.)
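A minimal configuration sketch of what the three changes enable together; the `ensure_weight_tying` flag name and the model path are assumptions based on the linked PEFT fork, not confirmed API:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder model: any causal LM configured with tie_word_embeddings=True.
model = AutoModelForCausalLM.from_pretrained("path/to/tied-embedding-model")

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    # (1) Make the tied layers fully trainable, e.g. after expanding the vocabulary.
    modules_to_save=["embed_tokens", "lm_head"],
    # (2)/(3) Assumed flag from the linked PEFT changes: keep the adapters of the
    # tied layers tied to each other as well.
    ensure_weight_tying=True,
)

peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()
```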
Related issue number
Closes https://github.ibm.com/ai-foundation/watson-fm-stack-tracker/issues/1673
How to verify the PR
For 1
For 2 and 3
Was the PR tested