
[feat] LoRA #149

Closed
@jlamypoirier

Description

🎯 Goal (What & Why)

Add LoRA (Low-Rank Adaptation) support to Fast-LLM for flexible and memory-efficient fine-tuning.

Motivations:

  • Memory-efficient fine-tuning: only small low-rank adapter matrices are trained, so gradient and optimizer-state memory shrink dramatically compared to full fine-tuning.
  • Flexibility: the base model stays frozen, and lightweight task-specific adapters can be trained and swapped independently.
  • Ecosystem alignment: a configuration modeled on PEFT's LoraConfig keeps Fast-LLM familiar to Hugging Face users.
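
For reference, LoRA (Hu et al., 2021) freezes a pretrained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ and learns a low-rank update, so the forward pass becomes:

$$
h = W_0 x + \frac{\alpha}{r} B A x, \qquad A \in \mathbb{R}^{r \times k},\; B \in \mathbb{R}^{d \times r},\; r \ll \min(d, k)
$$

Only $A$ and $B$ are trained, cutting trainable parameters per matrix from $dk$ to $r(d + k)$; for $d = k = 4096$ and $r = 8$ that is about 0.4% of the original count, which is where the memory savings come from.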

🚀 Execution Plan

Step 1: What is the smallest working version?

  1. Minimal Integration: Add optional LoRA layers to Wq and Wv of each transformer layer in Fast-LLM.
  2. Configuration Design: Implement a minimal LoraConfig similar to PEFT's LoraConfig, focusing only on the essential parameters:
    • r (int): LoRA attention dimension (the rank).
    • lora_alpha (int): the alpha parameter for LoRA scaling; the low-rank update is scaled by lora_alpha / r.
  3. MVP Approach: Keep the implementation simple; see the sketch after this list for one possible shape of the MVP.
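
To make the MVP concrete, here is a minimal sketch of what such a wrapper could look like. The names LoraConfig and LoRALinear are illustrative, not Fast-LLM's actual classes, and a real integration would go through Fast-LLM's own layer and config machinery:

```python
# Minimal LoRA sketch; class and field names are illustrative,
# not Fast-LLM's actual API.
import math
from dataclasses import dataclass

import torch
import torch.nn as nn


@dataclass
class LoraConfig:
    r: int = 8            # LoRA rank
    lora_alpha: int = 16  # scaling numerator; the update is scaled by lora_alpha / r


class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update."""

    def __init__(self, base: nn.Linear, config: LoraConfig):
        super().__init__()
        self.base = base.requires_grad_(False)  # freeze the pretrained weights
        # A is initialized like a regular linear weight; B starts at zero,
        # so the adapter is a no-op before training begins.
        self.lora_a = nn.Parameter(torch.empty(config.r, base.in_features))
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, config.r))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = config.lora_alpha / config.r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # h = W0 x + (lora_alpha / r) * B A x
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)
```

Wiring this in would then amount to replacing the Wq and Wv projections in each transformer layer with the wrapper and training only the adapter parameters.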

Step 2: What additional optimizations are possible (later, out-of-scope for now)?

  1. Loading HF LoRA Models: Convert LoRA adapter weights from the HF Hub into Fast-LLM LoRA weights (rough conversion sketch after this list).
  2. Advanced Configurations: Introduce more advanced LoRA options from PEFT's LoraConfig, e.g. target_modules to define which weights get LoRA adapters.
  3. Performance Optimization: Improve speed and memory efficiency. We shouldn't over-invest here, because LoRA is fast and memory-efficient by design already.
  4. Support for Complex Architectures: Extend LoRA to support token-switching (Phi-4) and MoEs, supplementing Fast-LLM's existing MoE approach.
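
For item 1, a conversion would mostly be a key-renaming pass over a PEFT adapter checkpoint. The PEFT key pattern below is the real one; the Fast-LLM target names are placeholders, since the actual mapping depends on Fast-LLM's module tree:

```python
# Sketch of a PEFT -> Fast-LLM LoRA weight conversion (out of scope for this PR).
from safetensors.torch import load_file

# PEFT stores the adapter as adapter_model.safetensors with keys like:
#   base_model.model.model.layers.<i>.self_attn.q_proj.lora_A.weight
#   base_model.model.model.layers.<i>.self_attn.q_proj.lora_B.weight
peft_state = load_file("adapter_model.safetensors")

converted = {}
for key, tensor in peft_state.items():
    if ".lora_A." not in key and ".lora_B." not in key:
        continue  # skip anything that is not a LoRA factor
    # Hypothetical Fast-LLM naming; the real target names would come from
    # Fast-LLM's checkpoint format.
    converted[key.removeprefix("base_model.model.")] = tensor
```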

📌 Acceptance Criteria (Must-Haves for Completion)

  • LoRA layers must be functional and tested in Fast-LLM.
  • The implementation must include clear documentation explaining the minimal viable setup and configurations.
  • The PR must include a tutorial for LoRA-based fine-tuning.
  • The PR must provide a performance/impact summary demonstrating memory savings and fine-tuning flexibility.
  • No refactors unless directly necessary for feature completion.

🛠️ Project Management

  • Assign the issue to the Fast-LLM project.
  • Set the Estimate field (in days) in the GitHub project.
  • Use the Size field to categorize the PR size (Small/Medium/Large).
  • Assign an owner when opening the issue.

Metadata

Labels: enhancement (New feature or request)