
@xingyaoww
Contributor

Add RoPE Scaling Feature to SFT Trainer

Description

This PR adds support for RoPE (Rotary Position Embedding) scaling in the SFT trainer. RoPE scaling is a technique that allows models to handle longer context lengths than they were originally trained on by scaling the position embeddings.
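
For context, here is a minimal sketch of what "linear" RoPE scaling does (illustration only, not the trainer's code): position indices are divided by the scaling factor before the rotary angles are computed, so the angles stay within the range the model saw during training.

import torch

def rope_angles(seq_len: int, head_dim: int, base: float = 10000.0,
                scaling_factor: float = 1.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    # "Linear" scaling: divide positions by the factor, e.g. factor=2.0 maps
    # token position 8000 back to 4000, inside the original training range.
    positions = torch.arange(seq_len).float() / scaling_factor
    return torch.outer(positions, inv_freq)  # angles used for the cos/sin rotation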

Changes

  • Added RoPE scaling configuration support in the FSDP SFT trainer
  • Implemented model config override mechanism for RoPE scaling parameters (see the sketch after this list)
  • Added appropriate logging for RoPE scaling configuration
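
Roughly, the override path looks like the sketch below. This is a simplified illustration that assumes the overrides are applied to the Hugging Face model config before the model is built; load_model_with_overrides is a made-up helper name, not the trainer's actual function.

from transformers import AutoConfig, AutoModelForCausalLM

def load_model_with_overrides(model_path: str, override_config_kwargs: dict):
    # Load the base HF config, apply the overrides (e.g. rope_scaling),
    # then build the model from the patched config.
    config = AutoConfig.from_pretrained(model_path)
    for key, value in override_config_kwargs.items():
        setattr(config, key, value)
    return AutoModelForCausalLM.from_pretrained(model_path, config=config)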

Usage

To use this feature, add a rope_scaling configuration in your model config:

model:
  rope_scaling:
    type: "linear"  # or "dynamic"
    factor: 2.0     # scaling factor

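For reference, the YAML above ends up as a plain rope_scaling dict on the Hugging Face config. A quick way to check (the model name here is just an example, and the exact keys stored on the config may vary with the transformers version):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    rope_scaling={"type": "linear", "factor": 2.0},
)
print(cfg.rope_scaling)  # e.g. {'type': 'linear', 'factor': 2.0}
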
Testing

Tested with various models that support RoPE scaling, including Llama and Qwen models.

override_config_kwargs = {}
if 'rope_scaling' in self.config.model and self.config.model.rope_scaling is not None:
    override_config_kwargs['rope_scaling'] = dict(self.config.model.rope_scaling)
    print(f'rope_scaling setted. rope_scaling={override_config_kwargs["rope_scaling"]}')
Collaborator

setted -> set

@eric-haibin-lin
Collaborator

@openhands-agent could you merge with main and fix potential pre-commit errors?

@xingyaoww
Contributor Author

@eric-haibin-lin I think you need to "@OpenHands" for OpenHands Cloud :)
