fix: retry transient errors before falling through combo chain #337

Open
East-rayyy wants to merge 1 commit into decolua:master from East-rayyy:fix/retry-transient-before-combo-fallthrough

@East-rayyy East-rayyy commented Mar 18, 2026

Summary

  • When all API keys for a provider are temporarily locked (e.g., 503 "No capacity"), wait for the short cooldown and retry the same model before falling through to the next combo model
  • Adds extractRetryDelayMs() helper to read Retry-After header and retryAfter body field
  • Configurable limits: max 10s delay (COMBO_RETRY_MAX_DELAY_MS), max 2 retry attempts per model (COMBO_RETRY_MAX_ATTEMPTS)
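
A minimal sketch of what a helper like extractRetryDelayMs() could look like. This is not the code from the PR. It assumes `headers` exposes a Fetch-style `get()` method, that the header takes precedence over the body, and that a numeric `retryAfter` body field is expressed in seconds. The `now` parameter is added here only to make the HTTP-date branch deterministic.

```javascript
// Hypothetical sketch; the PR's real extractRetryDelayMs() may differ.
// Assumption: a numeric `retryAfter` in the JSON body is given in seconds.
function extractRetryDelayMs(headers, body, now = Date.now()) {
  const header =
    headers && typeof headers.get === "function"
      ? headers.get("retry-after")
      : null;

  if (header != null && String(header).trim() !== "") {
    const text = String(header).trim();
    // Retry-After may be a non-negative integer number of seconds...
    if (/^\d+$/.test(text)) return Number(text) * 1000;
    // ...or an HTTP-date, converted here to a delay relative to `now`.
    const when = Date.parse(text);
    if (!Number.isNaN(when)) return Math.max(0, when - now);
  }

  const fromBody = body && body.retryAfter;
  if (typeof fromBody === "number" && Number.isFinite(fromBody) && fromBody >= 0) {
    return Math.round(fromBody * 1000);
  }

  return null; // no usable retry hint
}
```

Returning `null` when neither source yields a value lets the caller distinguish "no retry info" from a zero-millisecond delay.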

Fixes #335

Problem

A combo like antigravity/opus → github/opus would fall through to GitHub immediately when both Antigravity keys got transient 503s with 1-2 second cooldowns. Waiting 1-2 seconds would have let the request succeed on Antigravity; instead the request hit GitHub (which may have no active credentials), killing the client.

How it works

Inside handleComboChat, after a model fails with a fallbackable error:

  1. Extract retry delay from Retry-After header or retryAfter in the JSON body
  2. If delay ≤ COMBO_RETRY_MAX_DELAY_MS (10s) and attempts < COMBO_RETRY_MAX_ATTEMPTS (2): wait, then retry the same model
  3. If delay is too long, retries exhausted, or no retry info: fall through to next model (existing behavior)
  4. Permanent errors (401, 403, etc.) are never retried — shouldFallback check runs first
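
The decision in steps 2 and 3 can be sketched as a small pure function. The constant names and values come from this description; the function name `nextStep` and its return values are hypothetical. The sketch assumes the caller has already run the shouldFallback check, so permanent errors never reach it.

```javascript
const COMBO_RETRY_MAX_DELAY_MS = 10000; // 10s cap, per the PR description
const COMBO_RETRY_MAX_ATTEMPTS = 2;     // retries per model, per the PR description

// Hypothetical helper: decides between retrying the same model and
// falling through to the next one. `delayMs` is null when the failed
// response carried no retry hint.
function nextStep(delayMs, retriesUsed) {
  if (delayMs == null) return "fallthrough";                         // no retry info
  if (delayMs > COMBO_RETRY_MAX_DELAY_MS) return "fallthrough";      // delay too long
  if (retriesUsed >= COMBO_RETRY_MAX_ATTEMPTS) return "fallthrough"; // retries exhausted
  return "retry";
}
```

For example, `nextStep(1500, 0)` yields `"retry"`, while `nextStep(30000, 0)` and `nextStep(1500, 2)` both yield `"fallthrough"`.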

Change

1 file changed: open-sse/services/combo.js

The for loop over models now wraps each attempt in a while loop that handles retries. All existing behavior is preserved when no Retry-After is present.
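
A rough sketch of the for/while shape described above, not the actual code in combo.js. The names `runCombo` and `attempt`, the result shape `{ ok, permanent, retryDelayMs }`, and the injectable `sleep` are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the control flow only.
// `attempt(model)` is an assumed callback that resolves to
// { ok, permanent, retryDelayMs } for a single try of a single model.
async function runCombo(models, attempt, options = {}) {
  const {
    maxDelayMs = 10000, // mirrors COMBO_RETRY_MAX_DELAY_MS
    maxRetries = 2,     // mirrors COMBO_RETRY_MAX_ATTEMPTS
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (const model of models) {
    let retriesUsed = 0;
    while (true) {
      const result = await attempt(model);
      if (result.ok) return { model, result };

      const delay = result.retryDelayMs;
      const canRetry =
        !result.permanent &&   // e.g. 401/403: never retried
        delay != null &&       // no retry info: fall through
        delay <= maxDelayMs && // long cooldown: fall through
        retriesUsed < maxRetries;

      if (!canRetry) break;    // move on to the next model in the chain
      retriesUsed += 1;
      await sleep(delay);      // wait out the cooldown, then retry the same model
    }
  }
  return { model: null, result: null }; // every model in the chain failed
}
```

Because `canRetry` is false whenever `retryDelayMs` is missing, a chain whose responses carry no retry hint behaves like a plain loop over the models, which matches the "existing behavior is preserved" claim.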

Test plan

  • Combo with a provider that returns 503 with short Retry-After → should retry and succeed
  • Combo with a provider that returns 503 with long Retry-After (>10s) → should fall through immediately
  • Combo with a provider that returns 401 → should not retry, fall through immediately
  • Combo where retries are exhausted → should fall through to next model
  • Combo with no Retry-After header → existing behavior preserved (immediate fallthrough)
  • Normal (non-combo) requests → unaffected

When all API keys for a provider are temporarily locked (e.g., 503
"No capacity available"), the combo handler now waits for the short
cooldown to expire and retries the same model before falling through
to the next model in the chain.

This prevents unnecessary combo fallthrough to providers that may not
work, when simply waiting 1-2 seconds would have resolved the issue.

Behavior:
- Checks Retry-After header and retryAfter body field from failed responses
- Only retries if the delay is ≤10 seconds (COMBO_RETRY_MAX_DELAY_MS)
- Maximum 2 retry attempts per model (COMBO_RETRY_MAX_ATTEMPTS)
- Permanent errors (401, 403, etc.) are not retried
- Falls through to next model if retries are exhausted or delay is too long

Fixes decolua#335

Development

Successfully merging this pull request may close these issues.

Combo falls through to next model on transient 503 instead of retrying
