Implement uniform exceptions classes across model providers #1823
plpxsk wants to merge 14 commits into dottxt-ai:main
OpenAI and Anthropic exception handling is (nearly?) identical since their SDKs are built by stainless.com, whereas for MistralAI it is speakeasy.com. Google has its own style. See details to get a sense of what these look like, as extracted by some LLMs.
Details
OpenAI (
Finally, for some background research, future reference, etc, view one plan that helped towards an implementation:
Details
name: Provider Exception Mapping
Provider Exception Mapping Plan
Implementation note: two patterns in use
Both patterns are compatible: status-code inspection helpers can call
The call site pattern for flat-dict providers is:
```python
except Exception as e:
    raise normalize_provider_exception(e, "openai", _get_exception_map()) from e
```
Implementation note: retryable flag
Four subclasses should carry a
This lets callers branch on
1. Extend
| Provider exception | Outlines class |
|---|---|
| openai.AuthenticationError (401) | AuthenticationError |
| openai.PermissionDeniedError (403) | PermissionDeniedError |
| openai.RateLimitError (429) | RateLimitError |
| openai.InternalServerError (5xx) | ServerError |
| openai.APITimeoutError | APITimeoutError |
| openai.APIConnectionError | APIConnectionError |
| openai.BadRequestError (400), NotFoundError, ConflictError, UnprocessableEntityError | BadRequestError |
| openai.APIResponseValidationError | ProviderResponseError |
| openai.LengthFinishReasonError, ContentFilterFinishReasonError | GenerationError |
Anthropic ([outlines/models/anthropic.py](outlines/models/anthropic.py))
Uses anthropic.* SDK exceptions (near-identical structure to OpenAI SDK).
| Provider exception | Outlines class |
|---|---|
| anthropic.AuthenticationError (401) | AuthenticationError |
| anthropic.PermissionDeniedError (403) | PermissionDeniedError |
| anthropic.RateLimitError (429) | RateLimitError |
| anthropic.InternalServerError, ServiceUnavailableError (503), DeadlineExceededError (504), OverloadedError (529) | ServerError |
| anthropic.APITimeoutError | APITimeoutError |
| anthropic.APIConnectionError | APIConnectionError |
| anthropic.BadRequestError, NotFoundError, ConflictError, RequestTooLargeError (413), UnprocessableEntityError | BadRequestError |
| anthropic.APIResponseValidationError | ProviderResponseError |
Mistral ([outlines/models/mistral.py](outlines/models/mistral.py))
Uses httpx for network errors and mistralai.client.models.* for application errors (newer SDK path only). The existing schema-check string matching is left in place inside its own except block; normalize_provider_exception handles the remainder.
| Provider exception | Outlines class |
|---|---|
| httpx.TimeoutException | APITimeoutError |
| httpx.ConnectError | APIConnectionError |
| mistralai.client.models.HTTPValidationError | BadRequestError |
| mistralai.client.models.ResponseValidationError | ProviderResponseError |
Dottxt ([outlines/models/dottxt.py](outlines/models/dottxt.py))
The dottxt SDK documents two named exception classes and uses urllib3 under the hood (not httpx). It also has backoff built-in, so rate-limit retries are handled inside the SDK before an exception ever surfaces.
| Provider exception | Outlines class |
|---|---|
| dottxt.models.JsonSchemaCreationForbiddenError (403) | PermissionDeniedError |
| dottxt.models.ConflictingNameForbiddenError (403) | PermissionDeniedError |
| urllib3.exceptions.ConnectTimeoutError, ReadTimeoutError | APITimeoutError |
| urllib3.exceptions.MaxRetryError, NewConnectionError | APIConnectionError |
Everything else falls through to the generic APIError.
⚠️ Needs verification before implementation: confirm the exact module path for JsonSchemaCreationForbiddenError and ConflictingNameForbiddenError by inspecting the installed dottxt package.
Status-code inspection providers
Gemini ([outlines/models/gemini.py](outlines/models/gemini.py))
google.genai.errors exposes broad 4xx/5xx classes without per-status subclasses. The ClientError mapping inspects exc.code to differentiate auth/rate-limit from general bad requests. Note that exc.code may be 0 or None for websocket-style failures; the helper must guard against that.
| Provider exception | Outlines class |
|---|---|
| google.genai.errors.ClientError (code 401) | AuthenticationError |
| google.genai.errors.ClientError (code 403) | PermissionDeniedError |
| google.genai.errors.ClientError (code 429) | RateLimitError |
| google.genai.errors.ClientError (other 4xx) | BadRequestError |
| google.genai.errors.ServerError (5xx) | ServerError |
Because Gemini requires status-code inspection, the mapping for ClientError needs a small helper rather than a flat dict. One approach: a thin wrapper that checks exc.code (guarding for None/0) before calling normalize_provider_exception.
Ollama ([outlines/models/ollama.py](outlines/models/ollama.py))
The ollama SDK exposes exactly two public exception classes:
- ollama.ResponseError: all HTTP-level errors; has a .status_code int attribute
- ollama.RequestError: connection-level failure (server unreachable)
Because there is only one HTTP error class, a flat dict cannot distinguish rate limits from auth errors. Ollama needs the same status-code inspection pattern as Gemini — a small _translate_ollama_exception() helper instead of a plain dict:
```python
def _translate_ollama_exception(exc):
    import ollama

    # Connection-level failure: the server could not be reached.
    if isinstance(exc, ollama.RequestError):
        return APIConnectionError(provider="ollama", original_exception=exc)
    # HTTP-level failure: branch on the status code.
    if isinstance(exc, ollama.ResponseError):
        code = exc.status_code
        if code == 401:
            return AuthenticationError(provider="ollama", original_exception=exc)
        if code == 403:
            return PermissionDeniedError(provider="ollama", original_exception=exc)
        if code == 429:
            return RateLimitError(provider="ollama", original_exception=exc)
        if code >= 500:
            return ServerError(provider="ollama", original_exception=exc)
        if 400 <= code < 500:
            return BadRequestError(provider="ollama", original_exception=exc)
    # Anything else falls through to the generic class.
    return APIError(provider="ollama", original_exception=exc)
```
The call site replaces raise APIError(...) from e with raise _translate_ollama_exception(e) from e.
TGI ([outlines/models/tgi.py](outlines/models/tgi.py))
TGI uses huggingface_hub's InferenceClient, which has a rich exception hierarchy in huggingface_hub.errors. Most classes map directly via a flat dict (checked first, most specific wins); the base HfHubHTTPError acts as a fallback that still needs status-code inspection.
| Provider exception | Outlines class |
|---|---|
| huggingface_hub.errors.InferenceTimeoutError | APITimeoutError |
| huggingface_hub.errors.OverloadedError | ServerError |
| huggingface_hub.errors.ValidationError | BadRequestError |
| huggingface_hub.errors.GenerationError, IncompleteGenerationError | GenerationError |
| huggingface_hub.errors.BadRequestError | BadRequestError |
| huggingface_hub.errors.GatedRepoError | PermissionDeniedError |
| huggingface_hub.errors.RepositoryNotFoundError | BadRequestError |
HfHubHTTPError (base, not caught by any of the above) still carries a .response object with .status_code. A fallback helper inspects it:
```python
def _translate_tgi_exception(exc):
    import huggingface_hub.errors as hf_errors

    # Try the flat dict first via normalize_provider_exception.
    result = normalize_provider_exception(exc, "tgi", _get_exception_map())
    # A specific type was already returned, so we are done.
    if type(result) is not APIError:
        return result
    # Fallback: status-code inspection on the HfHubHTTPError base class.
    if isinstance(exc, hf_errors.HfHubHTTPError):
        code = getattr(getattr(exc, "response", None), "status_code", None)
        if code == 401:
            return AuthenticationError(provider="tgi", original_exception=exc)
        if code == 403:
            return PermissionDeniedError(provider="tgi", original_exception=exc)
        if code == 429:
            return RateLimitError(provider="tgi", original_exception=exc)
        if code and code >= 500:
            return ServerError(provider="tgi", original_exception=exc)
    return result
```
Reuses OpenAI mapping
vLLM and SGLang ([outlines/models/vllm.py](outlines/models/vllm.py), [outlines/models/sglang.py](outlines/models/sglang.py))
Both use the openai-compatible client — they can reuse the OpenAI mapping dict directly (just change provider=).
No mapping needed
Local model providers (Transformers, MLX-LM, LlamaCpp, vLLM offline)
These do not make HTTP calls, so HTTP-status mappings don't apply. Keep the existing except Exception → APIError catch-all; no mapping dict needed.
Files changed
- [outlines/exceptions.py](outlines/exceptions.py): add 9 classes + normalize_provider_exception()
- [outlines/models/openai.py](outlines/models/openai.py)
- [outlines/models/anthropic.py](outlines/models/anthropic.py)
- [outlines/models/mistral.py](outlines/models/mistral.py)
- [outlines/models/gemini.py](outlines/models/gemini.py)
- [outlines/models/vllm.py](outlines/models/vllm.py)
- [outlines/models/sglang.py](outlines/models/sglang.py)
- [outlines/models/ollama.py](outlines/models/ollama.py)
- [outlines/models/tgi.py](outlines/models/tgi.py)
- [outlines/models/dottxt.py](outlines/models/dottxt.py)
7202a14 to 3c18516
It looks mostly good, thanks a lot for the PR! My main concern is whether it makes sense to have local model exceptions go through normalize_provider_exception if there's no mapping for them.
For the 3 specific questions you asked:
- I prefer flatter
- Let's not handle that ourselves
- Your naming is good imo
```python
if 400 <= code < 500:
    return BadRequestError(provider=provider, original_exception=exc)

return APIError(provider=provider, original_exception=exc)
```
If there's no status code, I think we should re-raise the initial exception without the APIError wrapper as some exceptions could be unrelated to the api calls (programming errors for instance).
Yep. Non-SDK exceptions should pass through somehow. Will implement a guard or similar
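One possible shape for such a guard is sketched below. Everything here is illustrative: `FakeSDKError` is a hypothetical stand-in for an SDK exception that may carry a status code, `translate_or_none` and `guarded_call` are made-up names, and the Outlines classes are simplified stand-ins. The translator returns None for anything it does not recognise, so the call site can re-raise the original exception untouched.

```python
class APIError(Exception):
    """Simplified stand-in for the Outlines base class."""

    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception}")
        self.provider = provider
        self.original_exception = original_exception


class RateLimitError(APIError):
    pass


class FakeSDKError(Exception):
    """Hypothetical stand-in for an SDK exception with an optional status code."""

    def __init__(self, status_code=None):
        super().__init__(f"status {status_code}")
        self.status_code = status_code


def translate_or_none(exc, provider):
    # Not an SDK exception (e.g. a programming error): do not wrap it.
    if not isinstance(exc, FakeSDKError):
        return None
    # SDK exception without a status code: also pass it through unchanged.
    if exc.status_code is None:
        return None
    if exc.status_code == 429:
        return RateLimitError(provider=provider, original_exception=exc)
    return APIError(provider=provider, original_exception=exc)


def guarded_call(fn, provider):
    try:
        return fn()
    except Exception as e:
        translated = translate_or_none(e, provider)
        if translated is None:
            raise  # bare raise keeps the original exception and traceback
        raise translated from e
```

With this shape, a `ZeroDivisionError` raised inside `fn` reaches the caller as a `ZeroDivisionError`, while a 429 from the SDK surfaces as `RateLimitError`.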
```python
if 400 <= code < 500:
    return BadRequestError(provider=provider, original_exception=exc)

return APIError(provider=provider, original_exception=exc)
```
I think we shouldn't wrap all exceptions in APIError as there can be non model related errors and also it does not make too much sense for local models
Yep, same as other comment. Will implement fix
outlines/models/anthropic.py (outdated)
```python
except Exception as e:
    raise normalize_provider_exception(e, PROVIDER) from e

for chunk in stream:
```
I think we need to wrap that in a try/except block too, as exceptions can be raised during the iteration as well as during the initial connection. This comment applies to other models as well
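A minimal sketch of what wrapping the iteration could look like follows. The names `stream_with_translation` and `translate` are hypothetical, `translate` is only a placeholder for normalize_provider_exception, and APIError is a simplified stand-in. The loop calls `next()` inside the try so that only errors raised by the provider stream are translated, not errors raised by the consumer of the generator.

```python
class APIError(Exception):
    """Simplified stand-in for the Outlines base class."""

    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception}")
        self.provider = provider
        self.original_exception = original_exception


def translate(exc, provider):
    # Placeholder for normalize_provider_exception; always returns APIError here.
    return APIError(provider=provider, original_exception=exc)


def stream_with_translation(open_stream, provider):
    # Errors while opening the connection are translated...
    try:
        stream = open_stream()
    except Exception as e:
        raise translate(e, provider) from e

    # ...and so are errors raised later, while pulling chunks from the stream.
    iterator = iter(stream)
    while True:
        try:
            chunk = next(iterator)
        except StopIteration:
            return
        except Exception as e:
            raise translate(e, provider) from e
        yield chunk
```

Chunks received before the failure are still delivered; the translated exception surfaces on the `next()` call that hit the error.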
outlines/models/transformers.py (outdated)
```python
    **inputs,
    **inference_kwargs,
)
try:
```
If we're not mapping local model errors, I don't think it makes sense to add this wrapping
agree, will remove and refactor per related suggestion
outlines/models/anthropic.py (outdated)
```python
from functools import singledispatchmethod
from typing import TYPE_CHECKING, Any, Iterator, Optional, Union

from outlines.exceptions import APIError, normalize_provider_exception
```
The APIError class is not used here. Applies to other models as well
outlines/exceptions.py (outdated)
```python
if provider == "gemini":
    import httpx
    import aiohttp
```
I don't think that's a dependency of the gemini sdk, so that import could raise an exception for some users that do not have the package installed
Yea, aiohttp is (now) an optional dep, will remove. Ref: https://github.com/googleapis/python-genai/releases/tag/v1.65.0
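If an optional package's exceptions ever need to be recognised without making the package a hard requirement, one general approach is to guard the import. The helper below is a hypothetical sketch, not something from the PR: `optional_exception_types` is an illustrative name, and it simply returns an empty tuple when the module is not installed.

```python
import importlib


def optional_exception_types(module_name, *attr_names):
    """Return the named exception classes, or () if the module is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Package not installed: nothing to match against.
        return ()
    return tuple(
        getattr(module, name) for name in attr_names if hasattr(module, name)
    )


# Works whether or not aiohttp is installed; isinstance(exc, ()) is simply False.
AIOHTTP_ERRORS = optional_exception_types("aiohttp", "ClientError")


def is_aiohttp_error(exc):
    return isinstance(exc, AIOHTTP_ERRORS)
```

Since `isinstance` accepts an empty tuple, callers do not need a separate branch for the "not installed" case.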
Thanks so much for the engagement on the PR. Let me review, and push more commits in the coming days.
I pushed updates to finalize. The new schema and refusal handling is probably a breaking API change, and could be documented in changelog etc as needed.
810f9ca to 394b315
Linting is failing and test coverage is missing!
394b315 to 3de42f0
Implements #1658
You can review how some providers implement these, in a comment below.
The updated docs show a summary of the implementation.
If interested, also below, I include one plan that helped with research and implementation.
Overall, I tried to be comprehensive, but without creating too many classes.
Questions:
For example, GenerationError is a result of a successful API call, not a failure.
normalize_provider_exception()? Alternatives: translate_*, convert_*, *_external_exception, *_client_exception, *_sdk_exception, etc.