Merged
47 commits
fd7a569
[FEAT] Support async multi-turn rollout with simulation feedback
kinza99 May 21, 2025
1c13235
[DOC] Update sglang multi-turn rollout doc
kinza99 May 22, 2025
6c2cf1b
[Update] update user interaction design
kinza99 May 28, 2025
c0176c1
[Update] add testing and fix bugs
kinza99 May 29, 2025
d560cf3
[Fix] fix some problems
kinza99 May 30, 2025
5fbdd11
Fix unit-test and separate examples from previous tool
SwordFaith May 30, 2025
cb4baa7
Fix megatron workers and formatting
SwordFaith May 30, 2025
ed070cc
[Update] merge the latest main version
kinza99 Jun 4, 2025
fbfdcd0
Add training script
SwordFaith Jun 4, 2025
4ea2f1a
Fix assertion
SwordFaith Jun 4, 2025
4b18b69
Fix max_turns
SwordFaith Jun 4, 2025
878b1aa
Fix init interaction missing
SwordFaith Jun 4, 2025
8023dcb
Fix interface
SwordFaith Jun 4, 2025
dc3157e
Lower gpu mem foot print
SwordFaith Jun 4, 2025
3104159
Merge remote-tracking branch 'upstream/main' into multi_turns_with_fe…
SwordFaith Jun 4, 2025
cc31550
Fix init interaction missing issue
SwordFaith Jun 4, 2025
ae9217a
[Fix] fix problem with exceeding max_new_tokens
kinza99 Jun 5, 2025
cbddd2e
Update training data path
SwordFaith Jun 5, 2025
0c4338f
Fix prompt in preprocess interaction
SwordFaith Jun 6, 2025
eec48a9
Add 0.5b train script
SwordFaith Jun 7, 2025
564c832
Fix gsm8k reward in multi-turn scene
SwordFaith Jun 7, 2025
b75af4e
Try fix race condition in sampling params update
SwordFaith Jun 9, 2025
ca47f66
Fix sglang rollout sampling params
SwordFaith Jun 9, 2025
64a8ad7
Merge remote-tracking branch 'upstream/main' into multi_turns_with_fe…
SwordFaith Jun 10, 2025
34378ce
Fix bug and redundant error
SwordFaith Jun 10, 2025
ddb4881
Remove format config
SwordFaith Jun 10, 2025
9592979
Merge remote-tracking branch 'upstream/main' into multi_turns_with_fe…
SwordFaith Jun 17, 2025
92a47ba
Fix interaction config default value bug
SwordFaith Jun 17, 2025
8f45437
Fix default value judge logic
SwordFaith Jun 17, 2025
98adc05
Fix arg issue
SwordFaith Jun 17, 2025
2d43631
Fix format error
SwordFaith Jun 17, 2025
9a86042
Fix other format errors
SwordFaith Jun 17, 2025
5cb7a60
Clean training scripts
SwordFaith Jun 17, 2025
7e83b72
Merge remote-tracking branch 'upstream/main' into multi_turns_with_fe…
SwordFaith Jun 18, 2025
7f426f3
Merge branch 'main' into duhe/multi_turns_with_feedback
SwordFaith Jun 18, 2025
13a9615
Fix sf tool test
SwordFaith Jun 18, 2025
3c4a351
Fix ci error
SwordFaith Jun 18, 2025
ffc9366
Fix aglang tests
SwordFaith Jun 18, 2025
c0a035c
Merge upstream/main into multi_turns_with_feedback
SwordFaith Jun 20, 2025
21110df
Add test and doc for interaction
SwordFaith Jun 20, 2025
fd11f7b
Try fix mcp tool test
SwordFaith Jun 20, 2025
a1a021f
Fix sglang mcp tools test
SwordFaith Jun 20, 2025
df5de70
Fix pre-commit run issue
SwordFaith Jun 20, 2025
2abda2e
Fix doc and test
SwordFaith Jun 20, 2025
432eeff
Merge branch 'main' into duhe/multi_turns_with_feedback
zhaochenyang20 Jun 20, 2025
c5fe07a
Try fix ci issues
SwordFaith Jun 21, 2025
df42678
Fix chat completion arg bug
SwordFaith Jun 21, 2025
Update training data path
SwordFaith committed Jun 5, 2025
commit cbddd2ef80894a21138a0110423b364b82eddbad
@@ -21,7 +21,7 @@ ulimit -n 65535
PROJECT_DIR="$(pwd)"
CONFIG_PATH="$PROJECT_DIR/examples/sglang_multiturn/config"
TRAIN_BATCH_SIZE=${TRAIN_BATCH_SIZE:-512}
-MICRO_BATCH_SIZE=${MICRO_BATCH_SIZE:-16}
+MICRO_BATCH_SIZE=${MICRO_BATCH_SIZE:-8}
OFFLOAD=${OFFLOAD:-True}
HOME=/user/longxiang1

@@ -65,8 +65,8 @@ python3 -m verl.trainer.main_ppo \
trainer.nnodes=1 \
trainer.save_freq=-1 \
trainer.test_freq=20 \
-data.train_files=$HOME/data/gsm8k_verl_sgl_multi_turn_preprocessed_v2/train.parquet \
-data.val_files=$HOME/data/gsm8k_verl_sgl_multi_turn_preprocessed_v2/test.parquet \
+data.train_files=$HOME/data/gsm8k_verl_sgl_multi_turn_w_interaction/train.parquet \
+data.val_files=$HOME/data/gsm8k_verl_sgl_multi_turn_w_interaction/test.parquet \
actor_rollout_ref.rollout.multi_turn.interaction_config_path="$PROJECT_DIR/examples/sglang_multiturn/config/interaction_config/gsm8k_interaction_config.yaml" \
trainer.total_epochs=15 $@
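
The script sets its batch sizes with the `${VAR:-default}` expansion, so a value exported by the caller takes precedence over the default in the file. The sketch below isolates that behaviour. `resolve_batch_sizes` is a hypothetical helper written for illustration and is not part of this PR; the defaults are the ones shown in the diff.

```shell
#!/usr/bin/env bash
# Illustration only: resolve_batch_sizes is a hypothetical helper,
# not a function from the training script.
resolve_batch_sizes() {
    # Use the caller's value if set and non-empty, else the default.
    local train=${TRAIN_BATCH_SIZE:-512}
    local micro=${MICRO_BATCH_SIZE:-8}   # default after this commit
    echo "train=$train micro=$micro"
}

# No overrides: the defaults apply.
( unset TRAIN_BATCH_SIZE MICRO_BATCH_SIZE; resolve_batch_sizes )

# Override a single variable for one invocation.
( unset TRAIN_BATCH_SIZE; MICRO_BATCH_SIZE=4 resolve_batch_sizes )
```

The same prefix form (`MICRO_BATCH_SIZE=4` before the command) should work when launching the training script itself, and the trailing `$@` forwards any extra command-line arguments to `verl.trainer.main_ppo`.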