Changed EOS token from Qwen2.5?
Hello, I tried fine-tuning the model earlier, but the resulting model keeps generating text until it hits the length limit.
After some investigation, I noticed that, according to the config, you seem to have changed the original Qwen2.5 model's EOS token from "<|im_end|>" to "<|endoftext|>". Is this correct and intended, or some kind of mistake?
Hi, thanks for your question!
Both our model and the original Qwen2.5 base model use <|endoftext|> as the EOS token; you can verify this at tokenizer_config_#L200.
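If it helps, you can also confirm it from code. A minimal sketch, assuming a Hugging Face-style checkpoint (the repo id below is a placeholder):

```python
# Quick check of the tokenizer's EOS token; "your-org/your-model" is a
# placeholder for the actual checkpoint id.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("your-org/your-model")
print(tok.eos_token, tok.eos_token_id)  # should print <|endoftext|> and its id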
As for the endless generation: since this is a base model, we don't recommend using it directly for instruction-following tasks.
If your fine-tuned model runs into the same problem, could you share more details about how you're handling tokenization and decoding? Ensuring that the model properly recognizes the EOS token during training and generation might help resolve the issue.
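For example, one common cause of endless output is not passing the EOS token id at generation time. A minimal sketch with placeholder names, not your exact setup:

```python
# Generation that stops at EOS; "your-org/your-finetuned-model" is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "your-org/your-finetuned-model"
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)

inputs = tok("Question: What is 2 + 2?\nAnswer:", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=128,
    eos_token_id=tok.eos_token_id,  # stop as soon as EOS is generated
    pad_token_id=tok.eos_token_id,  # silence the missing-pad-token warning
)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```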
Sorry, it seems that was my misunderstanding.
The problem occurred when finetuning with the SFTTrainer, even though I appended the EOS token to every example.
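For reference, my setup looked roughly like this (the base-model id and the toy dataset are placeholders, and the exact SFTTrainer arguments depend on the trl version):

```python
# Rough sketch of the SFT setup described above, not the exact training script.
from datasets import Dataset
from transformers import AutoTokenizer
from trl import SFTConfig, SFTTrainer

base = "your-org/your-base-model"  # placeholder
tok = AutoTokenizer.from_pretrained(base)

# Append the EOS token to every training example so the model can learn to stop.
train_ds = Dataset.from_list([
    {"text": "Question: What is 2 + 2?\nAnswer: 4" + tok.eos_token},
])

trainer = SFTTrainer(
    model=base,
    train_dataset=train_ds,  # SFTTrainer reads the "text" column by default
    args=SFTConfig(output_dir="sft-out", max_steps=10),
)
trainer.train()
```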
I will try the instruct version for now. Thanks a lot!