1 parent a3c8b3c commit 0adca22
EPO/LLaMA-Factory/README.md
@@ -16,7 +16,7 @@ pip install -e.
Take Sotopia as an example, the process of constructing training data for RL is as follows:
```bash
1. Collect strategy and dialogue data from sotopia_pi
-2. cd utils
+2. cd data/utils
2. python prm.py
3. python preprocessing.py
```