feat(gepa): GEPA/GEPA-Flow Pareto optimizers + docs alignment by monotykamary · Pull Request #341 · ax-llm/ax

monotykamary · 2025-09-05T18:41:29Z

What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)
- Feature + Docs: Implement and document GEPA (single-module) and GEPA-Flow (multi-module) Pareto optimizers; migrate docs to the new compile-based multi-objective flow.
What is the current behavior? (You can also link to an open issue here)
- No GEPA optimizer existed previously; only MiPROv2 was available.
- Tracking: Resolves #316 (Add GEPA Optimizer)
What is the new behavior (if this is a feature change)?
- Adds GEPA and GEPA-Flow optimizers using compile for multi-objective optimization (Pareto sampling, per-instance selection, hypervolume reporting).
- Introduces options (e.g., maxMetricCalls, tieEpsilon, paretoMetricKey/paretoScalarize) and strict acceptance aligned with the GEPA paper and repo.
- Updates docs/OPTIMIZE.md to provide GEPA/GEPA-Flow examples, result interpretation, and performance notes.
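
To illustrate the budget option, here is a minimal sketch of how a metric-call budget such as `maxMetricCalls` could be validated and enforced. The helpers `assertBudget` and `withBudget` are hypothetical names for illustration and are not part of ax. Only the option name and the throw-on-missing-or-non-positive behavior come from this PR's commit notes.

```typescript
// Hypothetical sketch, not the ax implementation.
type MetricFn<T> = (example: T) => number;

// Throws when the budget is absent or non-positive, mirroring the behavior
// this PR describes for options.maxMetricCalls.
function assertBudget(maxMetricCalls?: number): number {
  if (
    maxMetricCalls === undefined ||
    !Number.isFinite(maxMetricCalls) ||
    maxMetricCalls <= 0
  ) {
    throw new Error("maxMetricCalls must be a positive number");
  }
  return maxMetricCalls;
}

// Wraps a metric so that each evaluation consumes one unit of budget.
function withBudget<T>(metric: MetricFn<T>, maxMetricCalls?: number) {
  const limit = assertBudget(maxMetricCalls);
  let used = 0;
  return {
    remaining: (): number => limit - used,
    // Returns undefined once the budget is spent so a loop can stop cleanly.
    call: (example: T): number | undefined => {
      if (used >= limit) return undefined;
      used += 1;
      return metric(example);
    },
  };
}
```

An optimization loop written this way can check `remaining()` before each round and stop when it reaches zero.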

Other information:

References:
- Paper: https://arxiv.org/pdf/2507.19457
- Reference code: https://github.com/gepa-ai/gepa

Example runs (for reproducibility):

GEPA example (src/examples/gepa.ts)

$ npm run tsx src/examples/gepa.ts

> @ax-llm/ax-monorepo@14.0.20 tsx
> node --env-file=.env --import=tsx src/examples/gepa.ts

🚀 Running GEPA Pareto optimization (accuracy + brevity)...

● Optimization Started
──────────────────────────────────────────────────
  Optimizer: GEPA
  Examples: 11 training, 6 validation
  Config: {"numTrials":20,"minibatch":true}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

● Round 1/20
  Score: 16.200 (best: 16.200)
  Config: instructionLen=484.00, parent=0.00, totalRounds=20.00

● Round 2/20
  Score: 16.200 (best: 16.200)
  Config: instructionLen=437.00, parent=0.00, totalRounds=20.00


✅ Pareto optimization complete
Front size: 1
Hypervolume (2D): 0.19999999999999998

Top Pareto points:
  #1: accuracy=0.500, brevity=0.400, config={"candidate":0}

🎯 Chosen compromise score (0.7*acc + 0.3*brev): 0.470
Chosen configuration: {"candidate":0}
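
For readers checking the numbers in this log, the sketch below shows one way the 2D hypervolume and the compromise score could be computed. It assumes both objectives are maximized and that hypervolume is measured against the reference point (0, 0); that assumption is consistent with the logged values but is not confirmed by the log itself. The names `hypervolume2D` and `weightedScore` are illustrative only.

```typescript
interface Point {
  accuracy: number;
  brevity: number;
}

// Area dominated by the front, measured from the assumed reference point (0, 0).
function hypervolume2D(front: Point[]): number {
  // Sweep from highest accuracy to lowest, adding only the brevity band
  // that earlier (wider) rectangles have not already covered.
  const sorted = [...front].sort((a, b) => b.accuracy - a.accuracy);
  let area = 0;
  let coveredBrevity = 0;
  for (const p of sorted) {
    if (p.brevity > coveredBrevity) {
      area += p.accuracy * (p.brevity - coveredBrevity);
      coveredBrevity = p.brevity;
    }
  }
  return area;
}

// Weighted-sum scalarization used to pick one compromise point from the front.
function weightedScore(p: Point, wAccuracy: number, wBrevity: number): number {
  return wAccuracy * p.accuracy + wBrevity * p.brevity;
}
```

For the single logged point, the area is 0.5 × 0.4 = 0.2, which agrees with the reported hypervolume up to floating-point rounding, and the weighted score is 0.7 × 0.5 + 0.3 × 0.4 = 0.47.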

GEPA-Flow example (src/examples/gepa-flow.ts)

$ npm run tsx src/examples/gepa-flow.ts

> @ax-llm/ax-monorepo@14.0.20 tsx
> node --env-file=.env --import=tsx src/examples/gepa-flow.ts

[AxFlow] new AxFlow() is deprecated. Use flow() factory instead.
🚀 Running GEPA-Flow Pareto optimization (accuracy + brevity)...

● Optimization Started
──────────────────────────────────────────────────
  Optimizer: GEPA-Flow
  Examples: 11 training, 6 validation
  Config: {"numTrials":16,"minibatch":true}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

● Round 1/16
  Score: 6.000 (best: 6.000)
  Config: modules=2.00, totalRounds=16.00

● Round 2/16
  Score: 6.850 (best: 6.850)
  Config: modules=2.00, totalRounds=16.00

● Round 3/16
  Score: 5.700 (best: 5.700)
  Config: modules=2.00, totalRounds=16.00

● Round 4/16
  Score: 6.300 (best: 6.300)
  Config: modules=2.00, totalRounds=16.00

● Round 5/16
  Score: 5.850 (best: 5.850)
  Config: modules=2.00, totalRounds=16.00

● Round 6/16
  Score: 6.500 (best: 6.500)
  Config: modules=2.00, totalRounds=16.00


✅ Pareto optimization complete
Front size: 2
Hypervolume (2D): 0.3

Top Pareto points:
  #1: accuracy=0.667, brevity=0.450, config={"candidate":0}
  #2: accuracy=0.667, brevity=0.450, config={"candidate":1}
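
The two logged candidates have identical scores. Under a strict-dominance rule neither dominates the other, so both stay on the front, which is one plausible reading of "Front size: 2". The sketch below shows that rule with an optional tie tolerance. How ax applies `tieEpsilon` internally is not shown in this PR page, so the `eps` handling here is an assumption, and `dominates` and `paretoFront` are illustrative names.

```typescript
type Scores = Record<string, number>;

// a dominates b when a is no worse on every objective (within eps)
// and better by more than eps on at least one.
function dominates(a: Scores, b: Scores, eps = 0): boolean {
  let strictlyBetter = false;
  for (const key of Object.keys(a)) {
    const diff = (a[key] ?? 0) - (b[key] ?? 0);
    if (diff < -eps) return false; // a is worse on this objective
    if (diff > eps) strictlyBetter = true;
  }
  return strictlyBetter;
}

// Keeps every candidate that no other candidate dominates.
function paretoFront<T extends { scores: Scores }>(candidates: T[], eps = 0): T[] {
  return candidates.filter(
    (c) => !candidates.some((other) => other !== c && dominates(other.scores, c.scores, eps)),
  );
}
```

Passing the two logged points plus a weaker third point returns only the two tied candidates.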


dosco · 2025-09-07T01:33:38Z

sorry for the delay, was traveling, will take a look later today.

…#341)

* feat(optimizer): introduce GEPA reflective evolution with Pareto sampling

  Adds a sample-efficient optimizer with reflective mutation, Pareto-based candidate selection, optional crossover, and multi-objective support. Includes typed options and examples, and improves progress logging of total rounds.

* feat(optimizer): add GEPA-Flow for multi-module reflective evolution

  Implements a Flow-aware GEPA variant that selects modules round-robin and supports system-aware merge across candidates. Adds AxFlow helpers to expose/set node instructions and exports the optimizer.

* feat(optimizer): make GEPA and GEPA-Flow Pareto-only to align with the paper and simplify the API; refactor examples to remove basic variants and promote Pareto examples

* refactor(optimizer): remove legacy single-objective compile and rename GEPA-Pareto to GEPA in logs/metrics

* docs(optimize): clarify that GEPA/GEPA-Flow use compile for Pareto; MiPRO continues to use compilePareto

* chore(gepa-flow): use flow() factory and add OptimizationStart logging; align labels to GEPA-Flow

* feat(gepa,gepa-flow): adopt per-instance Pareto selection (Alg. 2)

  Implement paper’s Algorithm 2 candidate selection:
  - Track per-instance scalar scores on validation/Pareto set (S matrix)
  - Sample parent from non-dominated set weighted by per-instance wins
  - Compute scalar as mean of multi-objective metrics per instance
  - GEPA-Flow also samples second parent for merge via Algorithm 2
  - Persist S for each accepted candidate to drive subsequent sampling

  Rationale: align behavior with the GEPA paper and improve exploration vs. archive crowding-distance selection; sets the stage for optional scalar acceptance gating in a follow-up.

* refactor(optimizer): centralize Pareto helpers for GEPA/GEPA-Flow

  Extract shared multi-objective utilities to paretoUtils and replace inline duplicates in gepa.ts and gepaFlow.ts. No functional changes; simplifies maintenance and sets up for paper parity.

* feat(gepa,gepa-flow): add Merge strategy and guards; parametrize Pareto size

  GEPA: enable periodic instruction merge with cap and progress reporting. GEPA-Flow: add merge caps, ancestry/desirability guards per Appendix F, and make Pareto set size configurable via args. Keeps acceptance via minibatch Pareto dominance.

* feat(gepa,gepa-flow): align with GEPA paper (splits, μf, σ-accept, guards)

  Introduce explicit D_feedback/D_pareto splits to control rollout budget; plumb evaluator textual feedback μ_f into reflection; default to σ-based minibatch acceptance with configurable epsilon; add scalarizer/metric-key for per-instance S and Pareto selection; implement system-aware merge guards (ancestor/outperforms, desirability, tried merges).

* feat(gepa,gepa-flow): source-parity merges, acceptance, and adapter path

  - Schedule merges via mergesDue/lastIterFoundNewProgram and skip reflective on merge attempts
  - Dominator-based pair + ancestor selection with desirability filter and duplicate-merge guard
  - Targeted subsample for merge acceptance (new_sum ≥ max(parent sums)); full eval on accept
  - Stricter minibatch acceptance; when adapter provided, also require minibatch sum(child) > sum(parent)
  - Parent selection via per-instance fronts; honor maxMetricCalls; preserve fallback behavior without adapter

  This aligns both GEPA and GEPA-Flow with the reference engine while keeping public API stable.

* feat(gepa,gepa-flow): deterministic selection + strict acceptance for source parity

  Seed RNG across selection/minibatching, enforce maxMetricCalls budget, and add Flow merge guards/de-dup for stable improvements; re-export adapter types and update examples.

* chore(cspell,gepa,gepa-flow): add GEPA terms; resolve lint warnings

  Add domain terms (GEPA, Traj, etc.) to cspell and ignore dist to keep spelling checks green. Rename unused vars and drop unused imports in GEPA optimizers to satisfy lint without behavior changes.

* feat(gepa): source-parity single-module merges, guards, and safer defaults

  Align GEPA merges with source: replace LLM merge with parent-pick, add ancestor/desirability guards and de-dup, use seeded sampling, schedule merges after accepted improvements; default merges off and skipPerfectScore on to match reference behavior.

* feat(gepa,gepa-flow,optimizer): epsilon ties, optional budget, aligned defaults

  Bring GEPA in line with source parity by tolerating score ties, removing the hard budget requirement, and defaulting skip‑perfect in flow to match single‑module; Pareto frontier now respects epsilon.

* feat(gepa,gepa-flow): require maxMetricCalls for strict parity

  Enforce a positive `options.maxMetricCalls` in GEPA/GEPA-Flow compile loops to match the source implementation and avoid unbounded optimization runs.

  BREAKING CHANGE: compile now throws if `options.maxMetricCalls` is absent or non-positive.

* fix(gepa): only skip reflective after an evaluated merge attempt

  Align single-module merge gating with the reference engine so reflective mutation is skipped only when a merge is actually attempted, improving behavioral parity and avoiding lost reflective iterations when no valid merge pair exists.

* docs(optimize): migrate multi-objective docs to GEPA/GEPA-Flow using compile (remove compilePareto)

---------

Co-authored-by: Spacy <832235+dosco@users.noreply.github.com>
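
The per-instance selection described in the Alg. 2 commit can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in gepa.ts: `S[k][i]` is taken to be the scalar score of candidate `k` on validation instance `i`, a "win" means matching the best score on an instance within `eps`, and the paper's extra step of pruning dominated candidates is omitted. The names `mulberry32`, `countWins`, and `sampleParent` are illustrative.

```typescript
// Small deterministic PRNG (mulberry32-style) so sampling is reproducible per seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// For each instance, every candidate that reaches the best score (within eps) gets one win.
function countWins(S: number[][], eps = 0): number[] {
  const wins = S.map(() => 0);
  const numInstances = S[0]?.length ?? 0;
  for (let i = 0; i < numInstances; i++) {
    const best = Math.max(...S.map((row) => row[i]));
    S.forEach((row, k) => {
      if (row[i] >= best - eps) wins[k] += 1;
    });
  }
  return wins;
}

// Samples a parent index with probability proportional to its win count.
// Candidates with zero wins are never chosen.
function sampleParent(S: number[][], rand: () => number, eps = 0): number {
  const wins = countWins(S, eps);
  const total = wins.reduce((acc, w) => acc + w, 0);
  if (total === 0) return 0; // no instances scored yet: fall back to the first candidate
  let r = rand() * total;
  for (let k = 0; k < wins.length; k++) {
    r -= wins[k];
    if (r < 0) return k;
  }
  return wins.length - 1;
}
```

Weighting by wins favors candidates that are best on many instances while still giving a chance to a candidate that is best on only a few, which is the exploration benefit the commit message points to.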

monotykamary added 18 commits September 5, 2025 19:28

feat(optimizer): add GEPA-Flow for multi-module reflective evolution

8d014b8

Implements a Flow-aware GEPA variant that selects modules round-robin and supports system-aware merge across candidates. Adds AxFlow helpers to expose/set node instructions and exports the optimizer.

feat(optimizer): make GEPA and GEPA-Flow Pareto-only to align with th…

ab3f030

…e paper and simplify the API; refactor examples to remove basic variants and promote Pareto examples

refactor(optimizer): remove legacy single-objective compile and renam…

59c5e95

…e GEPA-Pareto to GEPA in logs/metrics

docs(optimize): clarify that GEPA/GEPA-Flow use compile for Pareto; M…

8882fd4

…iPRO continues to use compilePareto

chore(gepa-flow): use flow() factory and add OptimizationStart loggin…

59bc8dd

…g; align labels to GEPA-Flow

refactor(optimizer): centralize Pareto helpers for GEPA/GEPA-Flow

9139a4c

Extract shared multi-objective utilities to paretoUtils and replace inline duplicates in gepa.ts and gepaFlow.ts. No functional changes; simplifies maintenance and sets up for paper parity.

feat(gepa,gepa-flow): deterministic selection + strict acceptance for…

4dc7f0f

… source parity Seed RNG across selection/minibatching, enforce maxMetricCalls budget, and add Flow merge guards/de-dup for stable improvements; re-export adapter types and update examples.
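
The seeded minibatching and strict acceptance mentioned in this commit might look like the following in outline. It is a sketch under assumptions: the generator `lcg` and the helpers `sampleMinibatch`, `acceptChild`, and `acceptMerge` are hypothetical names, and the acceptance rules are taken only from the commit notes in this PR (child sum strictly greater than parent sum; merged sum at least the larger parent sum).

```typescript
// Simple seeded linear congruential generator returning values in [0, 1).
function lcg(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the injected random source;
// returns the first `size` shuffled indices as the minibatch.
function sampleMinibatch(n: number, size: number, rand: () => number): number[] {
  const idx = Array.from({ length: n }, (_, i) => i);
  for (let i = n - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [idx[i], idx[j]] = [idx[j], idx[i]];
  }
  return idx.slice(0, Math.min(size, n));
}

const sum = (xs: number[]): number => xs.reduce((acc, x) => acc + x, 0);

// Strict acceptance: the child's summed minibatch score must exceed the parent's by more than eps.
function acceptChild(parentScores: number[], childScores: number[], eps = 0): boolean {
  return sum(childScores) > sum(parentScores) + eps;
}

// Merge acceptance: the merged candidate must at least match the better parent.
function acceptMerge(parentA: number[], parentB: number[], merged: number[]): boolean {
  return sum(merged) >= Math.max(sum(parentA), sum(parentB));
}
```

Injecting the random source rather than calling `Math.random()` directly is what makes two runs with the same seed pick the same minibatches.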

chore(cspell,gepa,gepa-flow): add GEPA terms; resolve lint warnings

41035b9

Add domain terms (GEPA, Traj, etc.) to cspell and ignore dist to keep spelling checks green. Rename unused vars and drop unused imports in GEPA optimizers to satisfy lint without behavior changes.

docs(optimize): migrate multi-objective docs to GEPA/GEPA-Flow using …

3f59df3

…compile (remove compilePareto)

monotykamary mentioned this pull request Sep 6, 2025

Add GEPA Optimizer #316

Closed

Merge branch 'main' into feat/gepa-optimizer

fc4b0ef

dosco merged commit f61c18a into ax-llm:main Sep 10, 2025
1 check passed

dosco merged 19 commits into ax-llm:main from monotykamary:feat/gepa-optimizer
