Merged
fix: also exempt rest_protocol_mismatch_error on hedge_failed retry path
Canary soak after the previous commit showed 5/33 honest-error events
still triggering circuit breaks. Cause: shouldCircuitBreak has two code
paths — one with the structured heuristicResult, one with only the
wrapped lastErr string. The hedge_failed retry path goes through the
second; the heuristicResult is dropped during error propagation, and
the lastErr fallback only matches archival/over-serviced patterns by
substring. The new rest_protocol_mismatch_error reason had no entry in
that substring list, so it fell through to circuit-break.

Add "rest_protocol_mismatch_error" to capabilityLimitationSubstrings.
Belt-and-braces with the previous MatchedPattern tag: structured-result
path catches it via the matched pattern, hedge_failed path catches it
via the substring. Substring is the literal reason name and includes
the "_error" suffix, so it cannot match the gaming variant
"rest_protocol_mismatch" (which still must circuit-break).
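
The suffix argument can be checked with a small sketch. The helper `containsAny` and the shortened error strings are hypothetical; only the two reason names come from this commit. It also shows why the shorter name would be unsafe as a pattern: it is a prefix of the longer one, so it would match both variants.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	honestReason = "rest_protocol_mismatch_error" // honest error, exempt from circuit-break
	gamingReason = "rest_protocol_mismatch"       // gaming variant, must still circuit-break
)

// containsAny reports whether errStr contains at least one of the patterns.
// It is an illustrative helper, not the repository's function.
func containsAny(errStr string, patterns []string) bool {
	for _, p := range patterns {
		if strings.Contains(errStr, p) {
			return true
		}
	}
	return false
}

func main() {
	honest := "retry: hedge_failed: heuristic detected rest_protocol_mismatch_error (method=): backend returned error response"
	gaming := "retry: hedge_failed: heuristic detected rest_protocol_mismatch (method=): supplier returned canned result"

	// Matching on the full reason name, suffix included.
	withSuffix := []string{honestReason}
	fmt.Println(containsAny(honest, withSuffix)) // true
	fmt.Println(containsAny(gaming, withSuffix)) // false

	// Matching on the shorter name would also exempt the gaming variant.
	tooShort := []string{gamingReason}
	fmt.Println(containsAny(honest, tooShort)) // true
	fmt.Println(containsAny(gaming, tooShort)) // true
}
```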

Strike-decay change is meanwhile working as intended — zero new
cooldowns observed in a 30-min soak window post-deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
oten91 and claude committed May 5, 2026
commit 99a93ab4ded1caa24e6555fde02e854b55b561f3
34 changes: 34 additions & 0 deletions qos/heuristic/analyzer_test.go
@@ -1558,3 +1558,37 @@ func TestExtractResponseID(t *testing.T) {
		})
	}
}

// TestErrorContainsArchivalPattern_RestProtocolMismatchError verifies the
// fallback path used when a heuristic-detected REST/JSON-RPC capability
// mismatch surfaces through the hedge_failed retry layer (heuristicResult
// is lost; only the wrapped error string remains). The substring check
// must catch the honest-error reason but NOT the gaming-variant reason.
func TestErrorContainsArchivalPattern_RestProtocolMismatchError(t *testing.T) {
	tests := []struct {
		name     string
		errStr   string
		expected bool
	}{
		{
			name:     "honest error wrapped through hedge_failed path is matched",
			errStr:   `retry: hedge_failed: relay: ... heuristic detected rest_protocol_mismatch_error (method=): backend returned error response`,
			expected: true,
		},
		{
			name:     "gaming variant must NOT match (no _error suffix)",
			errStr:   `retry: hedge_failed: ... heuristic detected rest_protocol_mismatch (method=): supplier returned canned result`,
			expected: false,
		},
		{
			name:     "unrelated error must not match",
			errStr:   `relay: timeout sending request to endpoint`,
			expected: false,
		},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			assert.Equal(t, tt.expected, ErrorContainsArchivalPattern(tt.errStr))
		})
	}
}
7 changes: 7 additions & 0 deletions qos/heuristic/indicators.go
@@ -358,6 +358,13 @@ var capabilityLimitationSubstrings = []string{
	// Capability limitation (e.g., Tron lite fullnodes)
	"lite fullnode",
	"api is not supported",
	// rest_protocol_mismatch_error: heuristic-detected honest JSON-RPC error
	// returned to a REST-shaped request (supplier's backend doesn't speak REST).
	// The structured AnalysisResult is lost when this surfaces through the
	// hedge_failed retry path; we match on the embedded reason string instead.
	// Distinct from "rest_protocol_mismatch" (canned-response gaming); the
	// suffix prevents the substring check from matching the gaming variant.
	"rest_protocol_mismatch_error",
}

// ErrorContainsArchivalPattern checks if an error string contains any archival-related