docs: Show how to start RL from an existing SFT LoRA adapter #325

philippnormann · 2025-08-08T09:08:21Z

Summary

Add docs showing how to initialize RL from an existing SFT LoRA by passing the adapter directory as the base_model when constructing art.TrainableModel. Includes a minimal example and concise motivation.

Changes

docs/fundamentals/art-client.mdx: add “Initializing from an existing SFT LoRA” section with TrainableModel example and “Why this?” (warm-start, small-model stability).
docs/getting-started/faq.mdx: add FAQ entry with short TrainableModel snippet.

Motivation

Many users fine-tune with SFT (e.g., with Unsloth or PEFT) and then want to continue with RL. Pointing base_model on TrainableModel at the adapter directory is the simplest path, and the warm start improves early training, especially for small models.
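For readers skimming this thread, the approach above can be sketched roughly as follows. This is not the snippet added to the docs. The helper `sft_adapter_kwargs`, the model name in the demo, and the `name`/`project` values are hypothetical, and the check assumes a PEFT-style export that contains an `adapter_config.json` with a `base_model_name_or_path` field.

```python
import json
import tempfile
from pathlib import Path


def sft_adapter_kwargs(adapter_dir, name, project):
    """Check that adapter_dir looks like a PEFT LoRA export, then build kwargs.

    Hypothetical helper for illustration; it is not part of ART.
    """
    path = Path(adapter_dir)
    config_file = path / "adapter_config.json"
    if not config_file.is_file():
        raise FileNotFoundError(f"{config_file} not found; is this a LoRA adapter directory?")
    # PEFT records the model the adapter was trained on in its config file.
    config = json.loads(config_file.read_text())
    if not config.get("base_model_name_or_path"):
        raise ValueError("adapter_config.json does not name its base model")
    # Per this PR, the adapter directory itself is what goes into base_model.
    return {"name": name, "project": project, "base_model": str(path)}


if __name__ == "__main__":
    # Stand-in adapter directory so the sketch runs without a real checkpoint.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "adapter_config.json").write_text(
            json.dumps({"base_model_name_or_path": "Qwen/Qwen2.5-7B-Instruct"})
        )
        kwargs = sft_adapter_kwargs(tmp, name="rl-from-sft", project="demo")
        print(kwargs["base_model"])
        # With ART installed, the intended call would be:
        #   model = art.TrainableModel(**kwargs)
```

Validating the directory before constructing the model is only a convenience: a mistyped path then fails immediately instead of surfacing later during model loading.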

bradhilton · 2025-08-08T12:34:59Z

LGTM, any feedback @arcticfly?

arcticfly · 2025-08-11T07:27:44Z

Looks good! Thanks :)

docs: Show how to start RL from an existing SFT LoRA adapter

fd8d16b

bradhilton requested a review from arcticfly August 8, 2025 12:34

arcticfly merged commit 75c761a into OpenPipe:main Aug 11, 2025
1 check passed

JonesAndrew mentioned this pull request Aug 12, 2025

Release v0.4.6 #331

Merged
