Option to vary configuration parameters across layers #155

@jlamypoirier

🎯 Goal (What & Why)

We have several use-cases for varying parameters across layers (#147, #153) and will likely have many more in the future.

The best and simplest way to implement this would be a per-layer override mechanism based on #154, for example:

transformer:
  [...]
  window_size: 8192
  overrides:
    - layers: 0:24:2
      config:
        window_size: null

🚀 Execution Plan

This is relatively simple to do once we have an override mechanism (#154).

Step 1: What is the smallest working version?

(Describe the simplest way to implement this feature with minimal effort.)

Step 2: What additional optimizations are possible (but optional)?

(List potential refinements that can be added in later PRs if needed.)

📌 Acceptance Criteria (Must-Haves for Completion)

  • The feature must be functional and tested.
  • The implementation must be documented in practical terms.
  • The PR must include a performance/impact summary.
  • No refactors unless directly necessary for feature completion.

🛠️ Project Management

  • Assign the project to the Fast-LLM project.
  • Set the Estimate field (in days) in the GitHub project.
  • Use the Size field to categorize the PR size (Small/Medium/Large).
  • Assign an owner when opening the issue.

Metadata

Assignees

Labels

enhancement (New feature or request)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests