
Conversation

@romitjain (Contributor) commented Oct 30, 2025

Description of the change

  1. If the embedding layer has been resized, this PR adds embed_tokens and lm_head as trainable layers.
  2. If any of the tied layers are added to modules_to_save, the flag that ensures weight tying between the adapters of the tied layers is enabled.
  3. If any of the tied layers are added to target_modules, the flag that ensures weight tying between the adapters of the tied layers is enabled (see the sketch after this list).
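
As a rough, user-facing sketch of the behaviour described above (not the code in this PR): the model name below is a placeholder, and the weight-tying flag name is an assumption, left commented out until the corresponding PEFT change is available.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiny-causal-lm"  # placeholder, not a model referenced by this PR
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# (1) Adding special tokens resizes the embedding layer; embed_tokens and
#     lm_head then need to be trainable, which this PR wires up.
tokenizer.add_special_tokens({"pad_token": "<pad>"})
model.resize_token_embeddings(len(tokenizer))

# (2)/(3) Tied layers listed explicitly; the flag that keeps the adapters of
#     the tied layers in sync is assumed to be called ensure_weight_tying here.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    modules_to_save=["embed_tokens", "lm_head"],
    # ensure_weight_tying=True,  # assumed flag name; uncomment once released in PEFT
)
peft_model = get_peft_model(model, lora_config)
```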

Related issue number

Closes https://github.ibm.com/ai-foundation/watson-fm-stack-tracker/issues/1673

How to verify the PR

For 1

 python -m pytest tests/test_sft_trainer.py::test_run_causallm_lora_add_special_tokens

For 2 and 3

 python -m pytest tests/test_sft_trainer.py -k "tied_weights"

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass.

Signed-off-by: romit <romit@ibm.com>
@github-actions

Thanks for making a pull request! 😃
One of the maintainers will review and advise on the next steps.

@github-actions github-actions bot added the feat label Oct 30, 2025
Signed-off-by: romit <romit@ibm.com>
@romitjain romitjain marked this pull request as ready for review November 4, 2025 12:21
pyproject.toml Outdated
"tqdm>=4.66.2,<5.0",
"trl>=0.19.1,<0.20.0",
"peft @ git+https://github.com/huggingface/peft.git@293aea5df6db240856a77f89955d1a89ce38b50d",
"peft @ git+https://github.com/romitjain/peft.git@8388aa869473a60589a01e6950ea0583d3612783",
@romitjain (Contributor, Author) commented:

This is a temporary path to my fork so that the tests can be validated. Once my PR in PEFT is merged, we can point back to the upstream Hugging Face PEFT.

Signed-off-by: romit <romit@ibm.com>
Signed-off-by: romit <romit@ibm.com>
@romitjain (Contributor, Author) commented:

I have added copy.deepcopy for PEFT_LORA_ARGS, as the trainer was updating it in place, which affected other tests.
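
A minimal sketch of the pattern, with an assumed, simplified shape for PEFT_LORA_ARGS (the real object is defined in the test module):

```python
import copy

# Assumed, simplified stand-in for the shared config used across the tests.
PEFT_LORA_ARGS = {"r": 8, "lora_alpha": 16, "target_modules": ["q_proj", "v_proj"]}

def run_one_test_case():
    # Deep-copy so in-place updates by the trainer stay local to this test.
    lora_args = copy.deepcopy(PEFT_LORA_ARGS)
    lora_args["target_modules"].append("embed_tokens")
    # The shared module-level config is left untouched.
    assert "embed_tokens" not in PEFT_LORA_ARGS["target_modules"]

run_one_test_case()
```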

"pad_token_id": 0,
"rms_norm_eps": 1e-06,
"tie_word_embeddings": false,
"tie_word_embeddings": true,
@romitjain (Contributor, Author) commented:

This was the only change required to use this model in my tests. It does not break any other test (AFAIK).
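
For context, a small sketch of the invariant the tied-weights tests can rely on once tie_word_embeddings is true; the model path is a placeholder for the tiny test artifact:

```python
from transformers import AutoModelForCausalLM

# Placeholder path; the actual tiny test model lives in the repo's test artifacts.
model = AutoModelForCausalLM.from_pretrained("path/to/tiny-tied-model")

# With tie_word_embeddings=true the output head shares its weight tensor with
# the input embeddings, so adapters added to either must stay in sync.
assert model.get_output_embeddings().weight is model.get_input_embeddings().weight
```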

Signed-off-by: romit <romit@ibm.com>
@dushyantbehl dushyantbehl added the "on hold" label (This PR is on hold and will not be merged right away) Nov 7, 2025