Thank you for sharing your research work. I have a question about the supervised fine-tuning step, which, according to the paper, is used to initialize the base model before running SimPO. The SFT configuration file is provided at `training_configs/llama-3-8b-base-sft.yaml`, but could you also share the SFT training script itself?
In issue #27, a comment asks how `HuggingFaceH4/ultrachat_200k` is processed for SFT, and I would like to know this as well. Since the samples in `HuggingFaceH4/ultrachat_200k` are multi-turn dialogues, I am specifically curious which tokens receive labels during SFT: is the loss computed over the entire concatenated conversation, or are the user turns masked out?
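
To make the question concrete, here is a minimal sketch of the two label schemes I can imagine for the multi-turn samples. Everything in it is my own assumption rather than something taken from this repo: the `<|role|>` turn template, the `gpt2` tokenizer used as a stand-in, and the helper name `build_sft_labels` are all hypothetical.

```python
# Sketch of two possible label schemes for multi-turn SFT; the turn template,
# the stand-in gpt2 tokenizer, and the helper name are my assumptions, not
# the repo's actual preprocessing.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer


def build_sft_labels(messages, mask_user_turns=True):
    """Flatten a multi-turn dialogue into input_ids and per-token labels.

    If mask_user_turns is True, tokens from non-assistant turns get the
    label -100 so cross-entropy ignores them; otherwise the loss covers
    the whole concatenated conversation.
    """
    input_ids, labels = [], []
    for msg in messages:
        # Render each turn with a simple placeholder template so we know
        # exactly which token positions belong to which speaker.
        text = f"<|{msg['role']}|>\n{msg['content']}\n"
        turn_ids = tokenizer(text, add_special_tokens=False)["input_ids"]
        input_ids.extend(turn_ids)
        if mask_user_turns and msg["role"] != "assistant":
            labels.extend([-100] * len(turn_ids))
        else:
            labels.extend(turn_ids)
    return {"input_ids": input_ids, "labels": labels}


# ultrachat_200k rows carry a "messages" list of {"role", "content"} dicts.
example = [
    {"role": "user", "content": "What is SimPO?"},
    {"role": "assistant", "content": "A reference-free preference optimization method."},
]
batch = build_sft_labels(example)
print(sum(label == -100 for label in batch["labels"]), "masked tokens")
```

My guess is that an alignment-handbook-style recipe would take the loss over the full packed sequence (the `mask_user_turns=False` case) rather than mask the user turns, but I could not confirm this from the config alone.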