Skip to content

fix(assemblyai_asr): route unknown messages to on_event not on_error#2188

Merged
plutoless merged 5 commits into
mainfrom
fix/assemblyai-asr-unknown-message-handling
Jun 16, 2026
Merged

fix(assemblyai_asr): route unknown messages to on_event not on_error#2188
plutoless merged 5 commits into
mainfrom
fix/assemblyai-asr-unknown-message-handling

Conversation

@plutoless

Copy link
Copy Markdown
Contributor

Problem

The AssemblyAI ASR extension crashed on normal streaming traffic with:

E _handle_message@recognition.py:130 [asr] [AssemblyAI] Error processing message: 1 validation error for ModuleError
message
  Input should be a valid string [type=string_type, input_value={'type': 'SpeechStarted',..., 'confidence': 0.38146}, input_type=dict]

Root cause

AssemblyAI's v3 streaming API sends informational messages such as SpeechStarted. In recognition.py:_handle_message only Begin/Turn/Termination are handled; everything else fell into the else branch which called self.callback.on_error(message_data) — passing the raw dict. In extension.py:on_error that dict was fed into ModuleError(message=...), whose message field is typed str, so Pydantic raised the string_type validation error (caught and logged at recognition.py:130).

Two defects:

  • SemanticSpeechStarted is not an error; routing it to the error path would also emit a spurious ASR error downstream.
  • Type — a dict was passed where on_error contractually requires a str.

Fix

  • recognition.py — unrecognized messages are now routed to on_event (accepts a dict, logs only) instead of on_error.
  • extension.py — fixed a missing f prefix on the on_event log line (now on the active path) so the payload interpolates.
  • Added tests/test_handle_message.py asserting SpeechStarted and other unknown message types route to on_event, not on_error.

Verification

The local environment lacks pytest and the runtime deps (websockets, ten_runtime, ten_ai_base), so the project harness could not run here. The fix was verified by loading the real recognition.py with those deps stubbed and exercising _handle_message:

  • Before: on_error called with a dict (reproduces the crash trigger).
  • After: on_error not called; on_event called with the SpeechStarted dict.

The added unit tests run in CI where the deps are available.

Unrecognized AssemblyAI v3 streaming messages (e.g. SpeechStarted)
were passed as a dict to on_error, which feeds ModuleError(message=str)
and raises a Pydantic validation error. Treat them as informational
vendor events via on_event instead, and fix the on_event log f-string.
@plutoless plutoless requested a review from halajohn as a code owner June 13, 2026 03:47
AssemblyAI v3 reports errors via WebSocket close codes (no in-band error
message type), but _message_handler passed the raw exception object and an
int code to on_error, which feeds ModuleError(message=str)/
ModuleErrorVendorInfo(code=str) and raises a Pydantic validation error.
Stringify both, matching sibling extensions, and let CancelledError
propagate instead of referencing an undefined name.
@diyuyi-agora

Copy link
Copy Markdown
Contributor

The lint needs to be fixed first

@diyuyi-agora

Copy link
Copy Markdown
Contributor

tests/test_handle_message.py::test_connection_closed_passes_string_args_to_on_error failed.

@github-actions

Copy link
Copy Markdown

Review — fix(assemblyai_asr): route unknown messages to on_event not on_error

Thanks for the clear writeup. The diagnosis is solid and the production-code changes are all correct. The core routing fix is the right call, and the supporting fixes that came along with it are genuinely good catches.

Production code — looks correct

  • recognition.py else branch (on_erroron_event) — Correct. SpeechStarted and similar v3 info messages are not errors, and on_event takes the dict; this stops both the spurious downstream ASR error and the Pydantic string_type crash. Also future-proof for any new vendor message types.
  • on_error(e, code)on_error(str(e), str(code)) on the ConnectionClosed path — Correct. on_error's contract is (error_msg: str, error_code: Optional[str]), and downstream it feeds ModuleError(message=str) / ModuleErrorVendorInfo(code=str). The if code != 0 guard still prevents firing on an unparsed code.
  • Removing the except asyncio.CancelledError: block — Good catch. That block called on_error(e) with an unbound e (the name was never assigned in that scope), so it would have raised NameError on every cancellation. Since CancelledError derives from BaseException, it correctly bypasses except Exception and propagates after the finally runs on_close(). The inline comment explaining this is helpful.
  • Final except Exceptionon_error(str(e)) — Correct, consistent with the rest of the file where every other on_error call already passes a string.
  • extension.py f-string fix — Correct; the log line was previously emitting the literal {message_data} instead of interpolating it.

Potential issue — the new ConnectionClosed test may fail in CI

In test_connection_closed_passes_string_args_to_on_error:

```python
ConnectionClosed("connection closed: 4001 (some reason)")
```

The project pins websockets~=14.0 (requirements.txt). In websockets 14.x, ConnectionClosed.__init__ takes two required positional argsrcvd: Close | None and sent: Close | None (plus optional rcvd_then_sent). A single string arg would raise TypeError: missing required positional argument 'sent' at construction time, before _message_handler ever runs. And ConnectionClosed.__str__ reads self.rcvd.code / .reason, so a bare string also wouldn't stringify to "...4001..." the way the assertion expects.

I could not run this in the review environment (no websockets/pytest/ten_runtime available, and the sandbox blocked installing them), so please treat this as "verify against CI" rather than a confirmed failure. If it does fail, constructing a real close frame fixes it:

```python
from websockets.frames import Close
ConnectionClosed(Close(4001, "some reason"), None)
```

That also makes the regex-based code extraction (re.search(r"(\d{3,4})", str(e))"4001") exercise a realistic str(e). Since the PR notes mention the unit tests weren't run locally, worth confirming the CI run for this PR executed this specific test green.

Minor notes

  • The two routing tests (test_speech_started_*, test_unknown_message_*) are well-targeted and should pass — they bypass the runtime cleanly via _FakeTenEnv/_RecordingCallback, and the import path matches the existing mock.py convention.
  • Optional: _handle_message's Begin/Turn branches aren't covered. Not required for this fix, but a happy-path Turn → on_result assertion would round out the routing table cheaply while you're already in here.

Test coverage

Good — the two info-message tests directly lock in the regression this PR fixes. The connection-error test is valuable but needs the construction fix above to reliably run under the pinned websockets version.

Net: production changes look good to merge; please just confirm the ConnectionClosed test passes in CI (or apply the Close(...) construction).

@plutoless plutoless merged commit e063aac into main Jun 16, 2026
34 checks passed
@plutoless plutoless deleted the fix/assemblyai-asr-unknown-message-handling branch June 16, 2026 02:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants