
Conversation

@patrik-bartak (Contributor) commented Apr 25, 2025

#390

Right now the RL and SFT datasets use the tokenizer's chat template to format the prompt. Some models do not have a chat template (for example, deepseek-ai/deepseek-coder-1.3b-base), and sometimes it is useful to override the template anyway. This PR adds an option to override the template when the tokenizer is loaded.

For example, setting the template to "{{ bos_token }}{{ messages }}" and assigning "string" to the dataset prompt key will make prompt_with_chat_template equal to <bos>string.

By default the config value is null, so the model's default template is used.
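
For illustration, here is a minimal sketch of this behavior using the Hugging Face API directly (this is not verl's actual loading code, and the literal BOS text depends on the tokenizer):

```python
from transformers import AutoTokenizer

# deepseek-coder-1.3b-base ships without a chat template; assigning one
# here mimics what the new config option does when the tokenizer is loaded.
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-1.3b-base")
tokenizer.chat_template = "{{ bos_token }}{{ messages }}"

# With the dataset prompt key set to the plain string "string", rendering
# yields "<bos>string". Note that passing a raw string as the conversation
# only works because this template treats messages as an opaque value;
# standard templates expect a list of {"role": ..., "content": ...} dicts.
prompt = tokenizer.apply_chat_template("string", tokenize=False)
print(prompt)
```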

I have added the required code to make this work with the RL dataset, SFT dataset, and main_ppo / main_ppo_split. If there are other parts of the code that should be updated to make this option work, let me know.

More info here: https://huggingface.co/docs/transformers/main/en/chat_templating

@vermouth1992 (Collaborator) commented

Could you help modify one test to use a custom chat template? Maybe pass the chat template from the tokenizer just for testing?

@patrik-bartak (Contributor, Author) commented Apr 30, 2025

@vermouth1992
I don't really see a test that I could modify. This PR just lets you override the chat_template using a key in the config; it is then used automatically whenever the code calls apply_chat_template.

Maybe the function hf_tokenizer could print a warning/info message saying whether the default template is being used or whether it is being overridden? When I first used verl, it was not clear to me why the prompt differed from what I set up in data_preprocess; only later did I find out the chat template was being applied.
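
Roughly something like this sketch (hf_tokenizer is verl's helper, but the signature and log messages here are only illustrative, not the actual implementation):

```python
import logging

from transformers import AutoTokenizer

logger = logging.getLogger(__name__)


def hf_tokenizer(name_or_path, chat_template=None, **kwargs):
    # Hypothetical sketch: load the tokenizer, then report which chat
    # template will apply so users are not surprised by the formatted prompt.
    tokenizer = AutoTokenizer.from_pretrained(name_or_path, **kwargs)
    if chat_template is not None:
        logger.info("Overriding the tokenizer's chat template from the config.")
        tokenizer.chat_template = chat_template
    elif tokenizer.chat_template is not None:
        logger.info("Using the model's default chat template.")
    else:
        logger.warning(
            "Tokenizer has no chat template; apply_chat_template will fail "
            "unless one is provided."
        )
    return tokenizer
```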

