
Improve LoRA Compatibility by Renaming value_proj and query_proj in PaliGemmaVitAttention #2104

@b05505027


Hi,
I was working on a project to fine-tune the PaliGemma model and noticed that the attention layers inside paligemma.backbone.vit_encoder do not get LoRA enabled when calling:

paligemma.backbone.enable_lora(rank=LORA_RANK)

(where LORA_RANK is an integer value representing the LoRA rank).

I found that the PaliGemmaVitAttention class names its query and value layers query_proj and value_proj. However, the enable_lora function defined in the Backbone class only enables LoRA on layers named value_dense, query_dense, query, or value.
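To illustrate the mismatch, here is a minimal sketch of an exact-name check. It is a simplified stand-in, not the library's actual code: the helper `lora_matched` and the example layer lists are hypothetical, and only the four target names and the two `_proj` names come from this issue.

```python
# Simplified stand-in for an exact-name check; not the library's real code.
# The four target names are the ones listed in this issue.
LORA_TARGET_NAMES = ("value_dense", "query_dense", "query", "value")


def lora_matched(layer_names, target_names=LORA_TARGET_NAMES):
    """Return the layer names that an exact-name comparison would select."""
    return [name for name in layer_names if name in target_names]


# Names used by PaliGemmaVitAttention, as reported in this issue.
vit_attention_layers = ["query_proj", "value_proj"]
# Hypothetical attention block that uses two of the target names.
other_attention_layers = ["query", "value"]

print(lora_matched(vit_attention_layers))    # []
print(lora_matched(other_attention_layers))  # ['query', 'value']
```

Because the comparison is on the whole name, `query_proj` and `value_proj` are skipped even though they play the same role as `query` and `value`.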

To ensure consistency and enable LoRA properly for these layers, I propose renaming:

  • value_proj → value_dense
  • query_proj → query_dense

in PaliGemmaVitAttention's query and value layers. With these names, the enable_lora function would apply LoRA to the ViT encoder layers without requiring any changes to the Backbone class.
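The effect of the proposed rename can be sketched as follows. This is only an illustration under assumptions: `apply_renames` is a hypothetical helper, `key_proj` is a made-up extra layer name included to show that unrelated layers stay untouched, and the target names are the ones listed earlier in this issue.

```python
# Rename map proposed in this issue.
PROPOSED_RENAMES = {"value_proj": "value_dense", "query_proj": "query_dense"}
# Target names that enable_lora is reported to look for.
LORA_TARGET_NAMES = {"value_dense", "query_dense", "query", "value"}


def apply_renames(layer_names, renames=PROPOSED_RENAMES):
    """Return the layer names with the proposed renames applied."""
    return [renames.get(name, name) for name in layer_names]


# "key_proj" is hypothetical; it is not renamed and is not a LoRA target.
before = ["query_proj", "key_proj", "value_proj"]
after = apply_renames(before)

matched_before = [name for name in before if name in LORA_TARGET_NAMES]
matched_after = [name for name in after if name in LORA_TARGET_NAMES]

print(matched_before)  # []
print(matched_after)   # ['query_dense', 'value_dense']
```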

If there are alternative naming preferences that better fit the project’s conventions, I’m happy to adjust accordingly. Thanks!

Labels: Gemma (Gemma model specific issues)