feat(gepa): GEPA/GEPA-Flow Pareto optimizers + docs alignment #341

Merged
dosco merged 19 commits into ax-llm:main from monotykamary:feat/gepa-optimizer
Sep 10, 2025

@monotykamary monotykamary commented Sep 5, 2025

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)

    • Feature + Docs: Implement and document GEPA (single-module) and GEPA-Flow (multi-module) Pareto optimizers; migrate docs to the new compile-based multi-objective flow.
  • What is the current behavior? (You can also link to an open issue here)

  • What is the new behavior (if this is a feature change)?

    • Adds GEPA and GEPA-Flow optimizers using compile for multi-objective optimization (Pareto sampling, per-instance selection, hypervolume reporting).
    • Introduces options (e.g., maxMetricCalls, tieEpsilon, paretoMetricKey/paretoScalarize) and strict acceptance aligned with the GEPA paper and repo.
    • Updates docs/OPTIMIZE.md to provide GEPA/GEPA-Flow examples, result interpretation, and performance notes.
  • Other information:

    • References:

    • Example runs (for reproducibility):

      GEPA example (src/examples/gepa.ts)

      $ npm run tsx src/examples/gepa.ts
      
      > @ax-llm/ax-monorepo@14.0.20 tsx
      > node --env-file=.env --import=tsx src/examples/gepa.ts
      
      🚀 Running GEPA Pareto optimization (accuracy + brevity)...
      
      ● Optimization Started
      ──────────────────────────────────────────────────
        Optimizer: GEPA
        Examples: 11 training, 6 validation
        Config: {"numTrials":20,"minibatch":true}
      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
      
      ● Round 1/20
        Score: 16.200 (best: 16.200)
        Config: instructionLen=484.00, parent=0.00, totalRounds=20.00
      
      ● Round 2/20
        Score: 16.200 (best: 16.200)
        Config: instructionLen=437.00, parent=0.00, totalRounds=20.00
      
      
      ✅ Pareto optimization complete
      Front size: 1
      Hypervolume (2D): 0.19999999999999998
      
      Top Pareto points:
        #1: accuracy=0.500, brevity=0.400, config={"candidate":0}
      
      🎯 Chosen compromise score (0.7*acc + 0.3*brev): 0.470
      Chosen configuration: {"candidate":0}
      

      GEPA-Flow example (src/examples/gepa-flow.ts)

      $ npm run tsx src/examples/gepa-flow.ts
      
      > @ax-llm/ax-monorepo@14.0.20 tsx
      > node --env-file=.env --import=tsx src/examples/gepa-flow.ts
      
      [AxFlow] new AxFlow() is deprecated. Use flow() factory instead.
      🚀 Running GEPA-Flow Pareto optimization (accuracy + brevity)...
      
      ● Optimization Started
      ──────────────────────────────────────────────────
        Optimizer: GEPA-Flow
        Examples: 11 training, 6 validation
        Config: {"numTrials":16,"minibatch":true}
      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
      
      ● Round 1/16
        Score: 6.000 (best: 6.000)
        Config: modules=2.00, totalRounds=16.00
      
      ● Round 2/16
        Score: 6.850 (best: 6.850)
        Config: modules=2.00, totalRounds=16.00
      
      ● Round 3/16
        Score: 5.700 (best: 5.700)
        Config: modules=2.00, totalRounds=16.00
      
      ● Round 4/16
        Score: 6.300 (best: 6.300)
        Config: modules=2.00, totalRounds=16.00
      
      ● Round 5/16
        Score: 5.850 (best: 5.850)
        Config: modules=2.00, totalRounds=16.00
      
      ● Round 6/16
        Score: 6.500 (best: 6.500)
        Config: modules=2.00, totalRounds=16.00
      
      
      ✅ Pareto optimization complete
      Front size: 2
      Hypervolume (2D): 0.3
      
      Top Pareto points:
        #1: accuracy=0.667, brevity=0.450, config={"candidate":0}
        #2: accuracy=0.667, brevity=0.450, config={"candidate":1}
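
For readers interpreting the numbers in the runs above, here is a minimal sketch of how a 2D hypervolume and a weighted compromise score can be computed. It is illustrative only: the `Point2D` type, the function names, and the (0, 0) reference point are assumptions, not the helpers shipped in this PR.

```typescript
// Hypothetical shape for a point on the front; not the library's actual type.
type Point2D = { accuracy: number; brevity: number };

// Keep only points that no other point dominates (both objectives are maximized).
function paretoFront(points: Point2D[]): Point2D[] {
  return points.filter(
    (p) =>
      !points.some(
        (q) =>
          q.accuracy >= p.accuracy &&
          q.brevity >= p.brevity &&
          (q.accuracy > p.accuracy || q.brevity > p.brevity)
      )
  );
}

// Area dominated by the front, measured from an assumed reference point (0, 0).
function hypervolume2D(points: Point2D[]): number {
  // Sweep from the most accurate point to the least accurate one.
  const front = paretoFront(points).sort((a, b) => b.accuracy - a.accuracy);
  let area = 0;
  let prevBrevity = 0;
  for (const p of front) {
    if (p.brevity > prevBrevity) {
      // Add the horizontal strip between the previous and current brevity levels.
      area += p.accuracy * (p.brevity - prevBrevity);
      prevBrevity = p.brevity;
    }
  }
  return area;
}

// Pick the point with the highest weighted sum (weights mirror the 0.7/0.3 in the log).
function pickCompromise(points: Point2D[], wAcc = 0.7, wBrev = 0.3) {
  let best: { point: Point2D; score: number } | undefined;
  for (const point of points) {
    const score = wAcc * point.accuracy + wBrev * point.brevity;
    if (!best || score > best.score) best = { point, score };
  }
  return best;
}

const gepaFront: Point2D[] = [{ accuracy: 0.5, brevity: 0.4 }];
console.log(hypervolume2D(gepaFront));
console.log(pickCompromise(gepaFront)?.score.toFixed(3));
```

Under these assumptions the single-point GEPA front gives an area of about 0.2 and a compromise score of 0.470, in line with the log. The logged `0.19999999999999998` suggests the real implementation orders its floating-point operations differently.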
      

@monotykamary monotykamary mentioned this pull request Sep 6, 2025
dosco commented Sep 7, 2025

sorry for the delay, was traveling, will take a look later today.

@dosco dosco merged commit f61c18a into ax-llm:main Sep 10, 2025
1 check passed
joshvfleming pushed a commit to joshvfleming/ax that referenced this pull request Oct 14, 2025
…#341)

* feat(optimizer): introduce GEPA reflective evolution with Pareto sampling

Adds a sample-efficient optimizer with reflective mutation, Pareto-based candidate selection, optional crossover, and multi-objective support. Includes typed options and examples, and improves progress logging of total rounds.

* feat(optimizer): add GEPA-Flow for multi-module reflective evolution

Implements a Flow-aware GEPA variant that selects modules round-robin and supports system-aware merge across candidates. Adds AxFlow helpers to expose/set node instructions and exports the optimizer.

* feat(optimizer): make GEPA and GEPA-Flow Pareto-only to align with the paper and simplify the API; refactor examples to remove basic variants and promote Pareto examples

* refactor(optimizer): remove legacy single-objective compile and rename GEPA-Pareto to GEPA in logs/metrics

* docs(optimize): clarify that GEPA/GEPA-Flow use compile for Pareto; MiPRO continues to use compilePareto

* chore(gepa-flow): use flow() factory and add OptimizationStart logging; align labels to GEPA-Flow

* feat(gepa,gepa-flow): adopt per-instance Pareto selection (Alg. 2)

Implement paper’s Algorithm 2 candidate selection:
- Track per-instance scalar scores on validation/Pareto set (S matrix)
- Sample parent from non-dominated set weighted by per-instance wins
- Compute scalar as mean of multi-objective metrics per instance
- GEPA-Flow also samples second parent for merge via Algorithm 2
- Persist S for each accepted candidate to drive subsequent sampling

Rationale: align behavior with the GEPA paper and improve exploration vs. archive crowding-distance selection; sets the stage for optional scalar acceptance gating in a follow-up.
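
A simplified sketch of the per-instance selection described in this commit, for illustration. The `S` layout, the function names, and the injectable `rand` are assumptions. It only counts per-instance wins and samples in proportion to them; the paper's additional pruning of dominated candidates is left out.

```typescript
// Scalarize one instance's multi-objective scores as their mean.
function scalarize(scores: Record<string, number>): number {
  const values = Object.values(scores);
  return values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;
}

// S[k][i] = scalar score of candidate k on validation instance i (hypothetical layout).
// A candidate "wins" an instance when it achieves the best score on it (ties all win).
function countWins(S: number[][]): number[] {
  const wins = S.map(() => 0);
  const numInstances = S[0]?.length ?? 0;
  for (let i = 0; i < numInstances; i++) {
    const column = S.map((row) => row[i] ?? -Infinity);
    const best = Math.max(...column);
    column.forEach((score, k) => {
      if (score === best) wins[k] = (wins[k] ?? 0) + 1;
    });
  }
  return wins;
}

// Sample a parent index with probability proportional to its win count.
// Candidates that win no instance are never selected.
function sampleParent(S: number[][], rand: () => number = Math.random): number {
  const wins = countWins(S);
  const total = wins.reduce((a, b) => a + b, 0);
  if (total === 0) throw new Error("no candidates to sample from");
  let r = rand() * total;
  for (let k = 0; k < wins.length; k++) {
    r -= wins[k] ?? 0;
    if (r < 0) return k;
  }
  return wins.length - 1;
}

// Three candidates scored on three validation instances.
const S = [
  [1.0, 0.0, 0.5],
  [0.0, 1.0, 0.5],
  [0.0, 0.0, 0.0],
];
console.log(countWins(S), sampleParent(S));
```

Weighting by wins keeps candidates that are best on only a few instances in play, which is the exploration benefit the rationale above refers to.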

* refactor(optimizer): centralize Pareto helpers for GEPA/GEPA-Flow

Extract shared multi-objective utilities to paretoUtils and replace inline duplicates in gepa.ts and gepaFlow.ts. No functional changes; simplifies maintenance and sets up for paper parity.

* feat(gepa,gepa-flow): add Merge strategy and guards; parametrize Pareto size

GEPA: enable periodic instruction merge with cap and progress reporting. GEPA-Flow: add merge caps, ancestry/desirability guards per Appendix F, and make Pareto set size configurable via args. Keeps acceptance via minibatch Pareto dominance.

* feat(gepa,gepa-flow): align with GEPA paper (splits, μf, σ-accept, guards)

Introduce explicit D_feedback/D_pareto splits to control rollout budget; plumb evaluator textual feedback μ_f into reflection; default to σ-based minibatch acceptance with configurable epsilon; add scalarizer/metric-key for per-instance S and Pareto selection; implement system-aware merge guards (ancestor/outperforms, desirability, tried merges).

* feat(gepa,gepa-flow): source-parity merges, acceptance, and adapter path

- Schedule merges via mergesDue/lastIterFoundNewProgram and skip reflective on merge attempts
- Dominator-based pair + ancestor selection with desirability filter and duplicate-merge guard
- Targeted subsample for merge acceptance (new_sum ≥ max(parent sums)); full eval on accept
- Stricter minibatch acceptance; when adapter provided, also require minibatch sum(child) > sum(parent)
- Parent selection via per-instance fronts; honor maxMetricCalls; preserve fallback behavior without adapter

This aligns both GEPA and GEPA-Flow with the reference engine while keeping public API stable.
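
The two acceptance rules named in this commit can be sketched as plain comparisons over per-example scores. This is a hedged illustration: the function names are hypothetical, and the real code also runs the Pareto-dominance check and a full evaluation after acceptance.

```typescript
// Sum of per-example scalar scores on a minibatch or subsample.
const sumScores = (xs: number[]): number => xs.reduce((a, b) => a + b, 0);

// Reflective mutation (adapter path): the child must strictly beat its parent
// on the same minibatch.
function acceptReflective(parentScores: number[], childScores: number[]): boolean {
  return sumScores(childScores) > sumScores(parentScores);
}

// Merge: the merged candidate must do at least as well as the better parent
// on the targeted subsample (new_sum >= max(parent sums)).
function acceptMerge(parentA: number[], parentB: number[], merged: number[]): boolean {
  return sumScores(merged) >= Math.max(sumScores(parentA), sumScores(parentB));
}

console.log(acceptReflective([1, 0, 1], [1, 1, 1])); // strict improvement
console.log(acceptMerge([1, 0, 1], [0, 1, 1], [1, 0, 1])); // a tie with the best parent passes
```

Note the asymmetry: a tie rejects a reflective child but accepts a merge.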

* feat(gepa,gepa-flow): deterministic selection + strict acceptance for source parity

Seed RNG across selection/minibatching, enforce maxMetricCalls budget, and add Flow merge guards/de-dup for stable improvements; re-export adapter types and update examples.
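
To illustrate what seeding buys here, a small sketch of a seeded generator driving minibatch sampling. The commit does not say which PRNG is used; mulberry32 and the helper names below are stand-ins chosen for brevity.

```typescript
// mulberry32: a small, fast 32-bit PRNG (a stand-in, not necessarily what the PR uses).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw `size` distinct items using a partial Fisher-Yates shuffle on a copy.
function sampleMinibatch<T>(items: T[], size: number, rand: () => number): T[] {
  const pool = items.slice();
  const n = Math.min(size, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(rand() * (pool.length - i));
    const tmp = pool[i] as T;
    pool[i] = pool[j] as T;
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

const examples = ["ex0", "ex1", "ex2", "ex3", "ex4", "ex5"];
// The same seed yields the same minibatch on every run.
console.log(sampleMinibatch(examples, 3, mulberry32(42)));
console.log(sampleMinibatch(examples, 3, mulberry32(42)));
```

Passing the generator in explicitly, rather than calling `Math.random`, is what makes selection and minibatching reproducible across runs.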

* chore(cspell,gepa,gepa-flow): add GEPA terms; resolve lint warnings

Add domain terms (GEPA, Traj, etc.) to cspell and ignore dist to keep spelling checks green. Rename unused vars and drop unused imports in GEPA optimizers to satisfy lint without behavior changes.

* feat(gepa): source-parity single-module merges, guards, and safer defaults

Align GEPA merges with source: replace LLM merge with parent-pick, add ancestor/desirability guards and de-dup, use seeded sampling, schedule merges after accepted improvements; default merges off and skipPerfectScore on to match reference behavior.

* feat(gepa,gepa-flow,optimizer): epsilon ties, optional budget, aligned defaults

Bring GEPA in line with source parity by tolerating score ties, removing the hard budget requirement, and defaulting skip‑perfect in flow to match single‑module; Pareto frontier now respects epsilon.
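
One way an epsilon-tolerant frontier can work, as a sketch. The `Scores` type and the exact placement of the tolerance are assumptions; the PR's `tieEpsilon` handling may differ in detail.

```typescript
// Hypothetical per-candidate objective scores, all maximized.
type Scores = Record<string, number>;

// a dominates b if a is no worse than b (within eps) on every objective
// and better by more than eps on at least one.
function dominates(a: Scores, b: Scores, eps = 0): boolean {
  let strictlyBetter = false;
  for (const key of Object.keys(a)) {
    const av = a[key] ?? 0;
    const bv = b[key] ?? 0;
    if (av < bv - eps) return false; // a is worse beyond the tolerance
    if (av > bv + eps) strictlyBetter = true;
  }
  return strictlyBetter;
}

// Keep every point that no other point dominates under the given tolerance.
function frontier(points: Scores[], eps = 0): Scores[] {
  return points.filter((p) => !points.some((q) => q !== p && dominates(q, p, eps)));
}

const candidates: Scores[] = [
  { accuracy: 0.667, brevity: 0.45 },
  { accuracy: 0.667, brevity: 0.45 },
];
// Tied candidates do not dominate each other, so both remain on the front.
console.log(frontier(candidates).length);
```

This matches the GEPA-Flow run in the PR description, where two candidates with identical scores both appear on a front of size 2.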

* feat(gepa,gepa-flow): require maxMetricCalls for strict parity

Enforce a positive `options.maxMetricCalls` in GEPA/GEPA-Flow compile loops to match the source implementation and avoid unbounded optimization runs.

BREAKING CHANGE: compile now throws if `options.maxMetricCalls` is absent or non-positive.
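
A sketch of the kind of guard this breaking change implies. The interface, function names, and error message are hypothetical; only the rule itself (throw when `maxMetricCalls` is absent or non-positive) comes from the commit.

```typescript
// Hypothetical subset of the compile options.
interface BudgetOptions {
  maxMetricCalls?: number;
}

// Validate the budget up front so the optimization loop can never run unbounded.
function requireMetricBudget(options?: BudgetOptions): number {
  const budget = options?.maxMetricCalls;
  if (typeof budget !== "number" || !(budget > 0)) {
    throw new Error("maxMetricCalls must be a positive number");
  }
  return budget;
}

// Minimal counter a loop could consult before each metric evaluation.
function createBudget(max: number) {
  let used = 0;
  return {
    tryConsume(calls = 1): boolean {
      if (used + calls > max) return false;
      used += calls;
      return true;
    },
    remaining: () => max - used,
  };
}

const budget = createBudget(requireMetricBudget({ maxMetricCalls: 200 }));
console.log(budget.tryConsume(6), budget.remaining());
```

Callers upgrading across this change need to pass `maxMetricCalls` explicitly in the options given to `compile`.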

* fix(gepa): only skip reflective after an evaluated merge attempt

Align single-module merge gating with the reference engine so reflective mutation is skipped only when a merge is actually attempted, improving behavioral parity and avoiding lost reflective iterations when no valid merge pair exists.

* docs(optimize): migrate multi-objective docs to GEPA/GEPA-Flow using compile (remove compilePareto)

---------

Co-authored-by: Spacy <832235+dosco@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Add GEPA Optimizer