
Commit 10fd9ba

[feat] add validation shuffle (verl-project#1886)
### Checklist Before Starting

- [x] Search for similar PR(s).

### What does this PR do?

When multiple validation sets differ significantly in difficulty, and therefore in generated response length, the order in which the validation sets are processed can have a substantial impact on validation speed. This PR adds an option to shuffle the validation data.

### High-Level Design

Add a `validation_shuffle` flag to the data config and pass it to the validation dataloader.

### Usage Example

```yaml
data:
  validation_shuffle: True
```

### Test

Validation speed increased by over 10%.

### Checklist Before Submitting

- [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting).
- [ ] Add `[BREAKING]` to the PR title if it breaks any API.
- [ ] Update the documentation about your changes in the [docs](https://github.com/volcengine/verl/tree/main/docs).
- [ ] New CI unit test(s) are added to cover the code path.
- [ ] Rely on existing unit tests on CI that cover the code path.
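The speed effect described above can be illustrated with a toy model (all names and numbers below are hypothetical, not from verl): suppose validation examples are split into contiguous shards across parallel workers, and each worker's wall time is the sum of its generation lengths. Without shuffling, one worker can end up with all the long generations while the others sit idle; shuffling mixes easy and hard examples across shards:

```python
import random

def wall_time(lengths, n_workers):
    # Split examples into contiguous per-worker shards; the wall-clock time
    # is that of the slowest worker, since workers run in parallel.
    chunk = len(lengths) // n_workers
    shards = [lengths[i * chunk:(i + 1) * chunk] for i in range(n_workers)]
    return max(sum(s) for s in shards)

# An easy set (short generations) concatenated with a hard set (long ones).
lengths = [1] * 64 + [10] * 64

random.seed(0)
shuffled = lengths[:]
random.shuffle(shuffled)

# Unshuffled, one worker gets the entire hard set; shuffled, the load is
# roughly balanced and the slowest worker finishes much sooner.
print(wall_time(lengths, n_workers=2), wall_time(shuffled, n_workers=2))
```

This is only a sketch of the load-imbalance intuition; the actual speedup depends on batching and serving details.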
1 parent 4858ae4 commit 10fd9ba

File tree

2 files changed: +2 −1 lines changed


verl/trainer/config/ppo_trainer.yaml

Lines changed: 1 addition & 0 deletions
```diff
@@ -13,6 +13,7 @@ data:
   return_raw_chat: False
   return_full_prompt: False
   shuffle: True
+  validation_shuffle: False
   filter_overlong_prompts: False # for large-scale datasets, filtering overlong prompts can be time-consuming. You can set filter_overlong_prompts_workers to use multiprocessing to speed this up.
   filter_overlong_prompts_workers: 1
   truncation: error
```

verl/trainer/ppo/ray_trainer.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -550,7 +550,7 @@ def _create_dataloader(self, train_dataset, val_dataset, collate_fn, train_sampl
             dataset=self.val_dataset,
             batch_size=val_batch_size,
             num_workers=self.config.data.get("dataloader_num_workers", 8),
-            shuffle=False,
+            shuffle=self.config.data.get("validation_shuffle", True),
             drop_last=False,
             collate_fn=collate_fn,
         )
```
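A minimal standalone sketch of the pattern this diff uses: the dataloader reads an optional key from the data config with a `.get(...)` default, so older configs that predate the flag keep working. Here a plain dict stands in for verl's config object, and `load_validation` is a hypothetical helper, not verl code. Note that the in-code default is `True` while the shipped YAML sets `validation_shuffle: False`.

```python
import random

def load_validation(data_cfg: dict, samples: list) -> list:
    # Mirror the diff's pattern: read an optional config key with a default,
    # so configs written before the flag existed still work unchanged.
    items = list(samples)
    if data_cfg.get("validation_shuffle", True):
        random.shuffle(items)
    return items

# Setting the flag to False preserves the original validation order;
# omitting the key falls back to the in-code default (shuffle on).
ordered = load_validation({"validation_shuffle": False}, [1, 2, 3, 4])
```

Keeping the `.get` default aligned with the YAML default avoids surprises when the key is absent from a user's config.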

0 commit comments
