Description
🎯 Goal (What & Why)
We have several use cases for Fast-LLM configuration where we would like to define a default configuration, then override specific parameters in specific cases:
- Dataset-dependent sampling config ([feat] Option to configure sampling independently for each datasets #131): Configure dataset sampling globally, then optionally update specific parameters (seed, shuffling, etc.) for specific phases. A hacky solution was implemented in Improve dataset sampling #138 but is not viable in the long term.
- Staged training ([feat] Staged training #151): Have a default training configuration, then override specific parameters in stages.
- Variable layer configuration (#xx): Use a different configuration for specific layers.
The current configuration mechanism allows for configuration overrides in code, but there is no equivalent for YAML configuration. Some options:
- Define the override as a config class. Already possible, but can only support which means excessively verbose and error-prone duplicate configurations.
- Use free-form dicts (`dict[Any, Any]`). Would work already, but config validation would mean applying the override and checking that it works.
- Define a `ConfigurationOverride[ConfigClass]` generic that performs basic validation for defined parameters. In its simplest form this would be a minimalistic class built on top of the free-form dict defined above.
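To make the last option concrete, here is a minimal sketch of what such a generic could look like. It is not the Fast-LLM implementation: it assumes config classes are plain dataclasses, and `SamplingConfig`, the `apply` method and the constructor signature are hypothetical names chosen for illustration. Validation is limited to checking that every overridden key is a field of the target class.

```python
from dataclasses import dataclass, fields, is_dataclass, replace
from typing import Any, Generic, TypeVar

ConfigT = TypeVar("ConfigT")


@dataclass
class SamplingConfig:
    # Hypothetical stand-in for a real config class.
    seed: int = 0
    shuffle: bool = True


class ConfigurationOverride(Generic[ConfigT]):
    """A free-form dict of overrides, checked against a config class."""

    def __init__(self, config_class: type[ConfigT], overrides: dict[str, Any]):
        if not is_dataclass(config_class):
            raise TypeError(f"{config_class!r} is not a dataclass")
        # Basic validation: every overridden key must be a known field.
        known = {field.name for field in fields(config_class)}
        unknown = set(overrides) - known
        if unknown:
            raise ValueError(
                f"Unknown fields for {config_class.__name__}: {sorted(unknown)}"
            )
        self._config_class = config_class
        self._overrides = dict(overrides)

    def apply(self, base: ConfigT) -> ConfigT:
        """Return a copy of `base` with the overrides applied."""
        if not isinstance(base, self._config_class):
            raise TypeError(f"Expected an instance of {self._config_class.__name__}")
        return replace(base, **self._overrides)


default = SamplingConfig(seed=1, shuffle=True)
# Override only the seed, e.g. for one specific phase.
phase_override = ConfigurationOverride(SamplingConfig, {"seed": 42})
phase_config = phase_override.apply(default)
```

With this shape, a misspelled key fails when the override is constructed rather than when it is applied, and the default configuration is left unmodified. Type checking of the overridden values is deliberately left out of the sketch.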
🚀 Execution Plan
This doesn't really need work by itself; it will come up as a sub-issue when we need it (#131, #151, #xx).
Step 1: What is the smallest working version?
(Describe the simplest way to implement this feature with minimal effort.)
Step 2: What additional optimizations are possible (but optional)?
(List potential refinements that can be added in later PRs if needed.)
📌 Acceptance Criteria (Must-Haves for Completion)
- The feature must be functional and tested.
- The implementation must be documented in practical terms.
- The PR must include a performance/impact summary.
- No refactors unless directly necessary for feature completion.
🛠️ Project Management
- Assign the project to the Fast-LLM project.
- Set the `Estimate` field (in days) in the GitHub project.
- Use the `Size` field to categorize the PR size (Small/Medium/Large).
- Assign an owner when opening the issue.