@naufalso
After fine-tuning the LLaMA model with the existing training code, I noticed that the model never outputs the EOS token, so generation does not stop until `max_new_tokens` is reached.

While debugging, I found that `tokenizer.eos_token`, `tokenizer.bos_token`, and `tokenizer.unk_token` are all `''` (empty string).

Since `''` is not equal to `None`, the training code never adds the custom tokens. I suggest fixing this with the changes in this pull request.

I have tested that, after training with the modified code, the model outputs the EOS token correctly.
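
To illustrate the difference between checking for `None` and checking for an empty value, here is a minimal sketch. It is not the actual diff in this pull request: the helper `missing_special_tokens`, the `DEFAULT_TOKENS` strings, and the `SimpleNamespace` stand-in for the tokenizer are all assumptions made for illustration.

```python
from types import SimpleNamespace

# Hypothetical defaults; the real strings depend on the training code in use.
DEFAULT_TOKENS = {"eos_token": "</s>", "bos_token": "<s>", "unk_token": "<unk>"}


def missing_special_tokens(tokenizer):
    """Return the special tokens that are unset, treating '' the same as None."""
    missing = {}
    for name, default in DEFAULT_TOKENS.items():
        # `not value` is True for both None and '', whereas `value is None`
        # is False for '' and would skip the token.
        if not getattr(tokenizer, name, None):
            missing[name] = default
    return missing


# Stand-in for a tokenizer whose special tokens are empty strings.
fake_tokenizer = SimpleNamespace(eos_token="", bos_token="", unk_token="")
special_tokens = missing_special_tokens(fake_tokenizer)
```

With a real Hugging Face tokenizer, the resulting dictionary would typically be passed to `tokenizer.add_special_tokens(...)`, followed by `model.resize_token_embeddings(len(tokenizer))` if new tokens were added.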
