fix(server): retry cloud requests on HTTP 500#33718

Merged
mschile merged 13 commits into develop from mschile/amazing-hodgkin-06f561
May 6, 2026

@mschile mschile commented Apr 30, 2026

  • Closes n/a

Additional details

isRetryableError in packages/server/lib/cloud/network/is_retryable_error.ts was treating HTTP 500 responses as non-retryable, while the sibling isRetryableCloudError in packages/server/lib/cloud/api/cloud_request.ts already retries on 500 (with the comment "retry on 500 - according to system test, this is expected behavior"). The legacy retry logic in packages/server/lib/cloud/api/index.ts also retries the entire 5xx range.

This change adds 500 to the retryable status list in isRetryableError so that the fetch-based cloud request paths (protocol artifact upload, studio bundle/session, cy-prompt bundle/session) retry transient cloud 500s, matching the behavior of the axios-based path.

The unit test was updated to move 500 from the non-retryable list to the retryable list.


Note

Medium Risk
Changes cloud request retry behavior to treat HTTP 500 as retryable for idempotent methods, which could affect Cloud traffic patterns and error reporting. Scope is limited to retry classification and a few call sites plus test/snapshot updates.

Overview
Improves Cypress Cloud resiliency by retrying transient HTTP 500s for idempotent requests. isRetryableError now accepts an optional HTTP method and will retry 500 only for idempotent methods (while keeping existing retry behavior for 408/429/502/503/504).

Cloud API callers using asyncRetry (putProtocolArtifact, Studio and cy-prompt bundle/session requests) now pass their HTTP method into isRetryableError so uploads/downloads get the new behavior. Tests and system-test snapshots were updated to assert/reflect 3 retry attempts and the new aggregated error messaging when an upload keeps returning 500.

Reviewed by Cursor Bugbot for commit d2e1c63. Bugbot is set up for automated code reviews on this repo.

Steps to test

  1. Run yarn workspace @packages/server test-unit -- test/unit/cloud/network/is_retryable_error_spec.ts and confirm all assertions pass, including 500 now appearing in the retryable list.

How has the user experience changed?

Transient HTTP 500 responses from Cypress Cloud during protocol artifact uploads, studio bundle fetches/sessions, and cy-prompt bundle fetches/sessions are now retried (3 attempts, linear 500ms backoff) instead of failing on the first occurrence.
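
The retry policy described above (3 attempts, linear 500ms backoff) can be sketched roughly as follows. This is not the `asyncRetry` helper from `packages/server`; `retryWithLinearBackoff`, `linearDelayMs`, and the injectable `sleep` option are hypothetical names, and reading "linear" as a delay of 500ms multiplied by the attempt number is an assumption.

```typescript
type Sleep = (ms: number) => Promise<void>

// Default sleep; callers (and tests) can inject a fake to avoid real waiting.
const realSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Assumption: "linear" means the delay grows by baseDelayMs per failed attempt.
function linearDelayMs (attempt: number, baseDelayMs: number): number {
  return attempt * baseDelayMs
}

interface RetryOptions {
  maxAttempts: number
  baseDelayMs: number
  shouldRetry: (error: unknown) => boolean
  sleep?: Sleep
}

async function retryWithLinearBackoff<T> (fn: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep = options.sleep ?? realSleep
  const errors: unknown[] = []

  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await fn()
    } catch (error) {
      errors.push(error)
      // Stop on the last attempt or when the error is not retryable.
      if (attempt === options.maxAttempts || !options.shouldRetry(error)) break
      await sleep(linearDelayMs(attempt, options.baseDelayMs))
    }
  }

  // A single failure is rethrown as-is; repeated failures are summarized.
  if (errors.length === 1) throw errors[0]
  throw new Error(`${errors.length} attempts failed: ${errors.map(String).join('; ')}`)
}
```

With `maxAttempts: 3` and `baseDelayMs: 500`, a request that keeps failing with a retryable error is attempted three times in this sketch, waiting 500ms and then 1000ms between attempts.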

PR Tasks

Add 500 to the list of retryable HTTP status codes in `isRetryableError`
to align with the sibling `isRetryableCloudError` in `cloud_request.ts`,
which already treats 500 as a transient/retryable response.
@mschile mschile requested a review from ryanthemanuel April 30, 2026 22:36
@mschile mschile requested a review from cacieprins April 30, 2026 22:36
@mschile mschile self-assigned this Apr 30, 2026
mschile added 3 commits April 30, 2026 16:40
Capture-protocol upload now retries 3 times on HTTP 500 (matching the
503 path) since 500 is now part of the retryable status list. The
test description, assertion, and snapshot are updated accordingly.
…dgkin-06f561

# Conflicts:
#	cli/CHANGELOG.md
@mschile mschile marked this pull request as draft May 1, 2026 02:58
@mschile mschile marked this pull request as ready for review May 5, 2026 15:43
Comment thread packages/server/lib/cloud/network/is_retryable_error.ts Outdated
mschile added 4 commits May 5, 2026 10:55
Per review feedback, only retry on 500 for the Test Replay artifact
upload (which is idempotent). Other cloud requests using the shared
`isRetryableError` predicate may be non-idempotent — retrying after a
500 risks duplicating partially-applied work.

The 500 retry is now applied as a local override at the upload call
site in `put_protocol_artifact.ts`. The shared `isRetryableError` is
restored to its previous status list.
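
The local override described in this commit message could look roughly like the sketch below: the shared predicate keeps its status list, and only the upload call site widens it with 500. All names here (`statusOf`, `hasStatusIn`, `sharedIsRetryable`, `withExtraStatuses`, `isRetryableForUpload`) are hypothetical and are not taken from `put_protocol_artifact.ts`; errors are assumed to expose a numeric `status` property.

```typescript
type RetryPredicate = (error: unknown) => boolean

// Assumption: HTTP failures carry a numeric `status` property.
function statusOf (error: unknown): number | undefined {
  if (typeof error !== 'object' || error === null) return undefined
  const status = (error as { status?: unknown }).status
  return typeof status === 'number' ? status : undefined
}

function hasStatusIn (error: unknown, statuses: readonly number[]): boolean {
  const status = statusOf(error)
  return status !== undefined && statuses.includes(status)
}

// Stand-in for the shared predicate, limited to the statuses named in the PR.
const sharedIsRetryable: RetryPredicate = (error) => hasStatusIn(error, [408, 429, 502, 503, 504])

// Builds a call-site predicate that also retries on the given extra statuses.
function withExtraStatuses (base: RetryPredicate, extra: readonly number[]): RetryPredicate {
  return (error) => base(error) || hasStatusIn(error, extra)
}

// Only the idempotent artifact upload opts in to retrying on 500.
const isRetryableForUpload = withExtraStatuses(sharedIsRetryable, [500])
```

Composing the override at the call site leaves every other user of the shared predicate unaffected, which is the property this commit was after.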
isRetryableError now accepts an optional HTTP method. For idempotent
methods (GET/HEAD/PUT/DELETE/OPTIONS), HTTP 500 is also retryable
since replaying the request is safe. For non-idempotent methods
(POST/PATCH) — or when the method is unknown — 500 is excluded to
avoid duplicating partially-applied work.

Updated each cloud API call site to pass its method, so:

- Test Replay artifact upload (PUT)
- Studio bundle fetch (GET)
- cy.prompt bundle fetch (GET)

now retry on 500 in addition to the always-retryable statuses, while:

- Studio session creation (POST)
- cy.prompt session creation (POST)

continue to fail fast on 500 as before.
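
The method-aware rule described in this commit message can be sketched as follows. This is an illustration rather than the code in `is_retryable_error.ts`: `isRetryableSketch` and `statusOf` are hypothetical names, only errors exposing a numeric `status` are modelled, and any other error kinds the real predicate may handle are left out.

```typescript
// Statuses that stay retryable regardless of method, per the PR description.
const ALWAYS_RETRYABLE: readonly number[] = [408, 429, 502, 503, 504]

// Methods the commit message lists as idempotent.
const IDEMPOTENT_METHODS: readonly string[] = ['GET', 'HEAD', 'PUT', 'DELETE', 'OPTIONS']

// Assumption: HTTP failures carry a numeric `status` property.
function statusOf (error: unknown): number | undefined {
  if (typeof error !== 'object' || error === null) return undefined
  const status = (error as { status?: unknown }).status
  return typeof status === 'number' ? status : undefined
}

function isRetryableSketch (error: unknown, method?: string): boolean {
  const status = statusOf(error)
  if (status === undefined) return false
  if (ALWAYS_RETRYABLE.includes(status)) return true

  // 500 is retried only when the method is known and idempotent.
  const idempotent = method !== undefined && IDEMPOTENT_METHODS.includes(method.toUpperCase())
  return status === 500 && idempotent
}
```

Under this sketch, `isRetryableSketch({ status: 500 }, 'PUT')` is true (the Test Replay upload case), while the same error with `'POST'` or with no method is false, so session creation still fails fast on 500.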
Comment thread cli/CHANGELOG.md Outdated
@mschile mschile requested a review from cacieprins May 5, 2026 17:14
@mschile mschile merged commit c06792f into develop May 6, 2026
76 checks passed
@mschile mschile deleted the mschile/amazing-hodgkin-06f561 branch May 6, 2026 14:48