Implement uniform exceptions classes across model providers#1823

Open
plpxsk wants to merge 14 commits into dottxt-ai:main from plpxsk:uniform-exceptions

@plpxsk plpxsk commented Mar 4, 2026

Implements #1658

You can review how some providers implement these, in a comment below.

The updated docs show a summary of the implementation.
If interested, also below, I include one plan that helped with research and implementation.

Overall, I tried to be comprehensive, but without creating too many classes.

Questions:

  • Do we prefer a flatter hierarchy, or more nesting, such as by Category in the summary table?
    For example, GenerationError is a result of a successful API call, not a failure.
  • Do we want to handle retries ourselves? That could get complicated, so I prefer letting consumers choose.
  • How do we like the key API names, especially normalize_provider_exception()? Alternatives: translate_*, convert_*, *_external_exception, *_client_exception, *_sdk_exception, etc.

Author

plpxsk commented Mar 4, 2026

OpenAI and Anthropic exception handling is (nearly?) identical, since both SDKs are built by stainless.com, whereas the MistralAI SDK is built by speakeasy.com. Google has its own style.

See details to get a sense of what these look like, as extracted by some LLMs.

Details

OpenAI (openai-python)

Exception
└── OpenAIError
    ├── APIError
    │   ├── APIResponseValidationError
    │   ├── APIStatusError
    │   │   ├── BadRequestError            (HTTP 400)
    │   │   ├── AuthenticationError        (HTTP 401)
    │   │   ├── PermissionDeniedError      (HTTP 403)
    │   │   ├── NotFoundError              (HTTP 404)
    │   │   ├── ConflictError              (HTTP 409)
    │   │   ├── UnprocessableEntityError   (HTTP 422)
    │   │   ├── RateLimitError             (HTTP 429)
    │   │   └── InternalServerError        (HTTP 5xx)
    │   └── APIConnectionError
    │       └── APITimeoutError
    ├── LengthFinishReasonError
    └── ContentFilterFinishReasonError

ValueError
└── InvalidWebhookSignatureError

Anthropic (anthropic-sdk-python)

Exception
└── AnthropicError
    └── APIError
        ├── APIResponseValidationError
        ├── APIStatusError
        │   ├── BadRequestError            (HTTP 400)
        │   ├── AuthenticationError        (HTTP 401)
        │   ├── PermissionDeniedError      (HTTP 403)
        │   ├── NotFoundError              (HTTP 404)
        │   ├── ConflictError              (HTTP 409)
        │   ├── RequestTooLargeError       (HTTP 413)
        │   ├── UnprocessableEntityError   (HTTP 422)
        │   ├── RateLimitError             (HTTP 429)
        │   ├── InternalServerError        (HTTP 5xx)
        │   ├── ServiceUnavailableError    (HTTP 503)
        │   ├── DeadlineExceededError      (HTTP 504)
        │   └── OverloadedError            (HTTP 529)
        └── APIConnectionError
            └── APITimeoutError

Mistral (client-python README)

The README documents these exception classes (not a source .py file, so only what's explicitly documented is listed):

Exception
└── MistralError                        (base for all HTTP error responses)
    ├── HTTPValidationError             (HTTP 422 — Validation Error)
    └── ResponseValidationError         (type mismatch between response and Pydantic model;
                                         exposes the Pydantic error via .cause attribute)

httpx.RequestError                      (network-level, NOT a subclass of MistralError)
├── httpx.ConnectError
└── httpx.TimeoutException

Note: The README only documents these classes. The README does not expose additional HTTP-status-specific subclasses (beyond HTTPValidationError). MistralError carries .message, .status_code, .headers, .body, .raw_response, and optional .data fields.


Google GenAI (python-genai)

Exception
└── APIError                            (general API errors; base for HTTP errors)
    ├── ClientError                     (HTTP 4xx range)
    └── ServerError                     (HTTP 5xx range)

ValueError
├── UnknownFunctionCallArgumentError    (function call arg cannot be converted to param annotation)
├── UnsupportedFunctionError            (function is not supported)
├── FunctionInvocationError             (function cannot be invoked with given arguments)
└── UnknownApiResponseError             (response cannot be parsed as JSON)

Additionally, ExperimentalWarning is re-exported from _common (it's a Warning, not an Exception).

Author

plpxsk commented Mar 4, 2026

Finally, for some background research, future reference, etc, view one plan that helped towards an implementation:

Details

name: Provider Exception Mapping
overview: Extend the exception hierarchy in exceptions.py with 9 specific subclasses, add a normalize_provider_exception() utility (Option B), and update each provider's except block to use per-provider mapping dicts.
todos:

  • id: exceptions-hierarchy
    content: Add 9 subclasses + normalize_provider_exception() to outlines/exceptions.py
    status: pending
  • id: retryable-flags
    content: Add retryable = True class attribute to RateLimitError, ServerError, APITimeoutError, APIConnectionError in outlines/exceptions.py
    status: pending
  • id: openai-mapping
    content: Add _get_exception_map() and update except blocks in openai.py
    status: pending
  • id: anthropic-mapping
    content: Add _get_exception_map() and update except blocks in anthropic.py
    status: pending
  • id: mistral-mapping
    content: Add _get_exception_map() and update except blocks in mistral.py
    status: pending
  • id: gemini-mapping
    content: Add Gemini exception mapping with status-code inspection in gemini.py
    status: pending
  • id: vllm-sglang-mapping
    content: Reuse OpenAI mapping in vllm.py and sglang.py
    status: pending
  • id: ollama-mapping
    content: Add status-code-based exception mapping in ollama.py
    status: pending
  • id: tgi-mapping
    content: Add _get_exception_map() with flat dict + HfHubHTTPError status fallback in tgi.py
    status: pending
  • id: dottxt-mapping
    content: Add _get_exception_map() for named dottxt errors + urllib3 network errors in dottxt.py
    status: pending
    isProject: false

Provider Exception Mapping Plan

Implementation note: two patterns in use

| Pattern | Providers |
| --- | --- |
| Flat dict + normalize_provider_exception() | OpenAI, Anthropic, Mistral, TGI (specific subclasses), Dottxt |
| Status-code inspection helper | Gemini, Ollama, TGI (HfHubHTTPError base fallback) |

Both patterns are compatible — status-code inspection helpers can call normalize_provider_exception() internally for the flat-dict portion, so there is no duplication.

The call site pattern for flat-dict providers is:

except Exception as e:
    raise normalize_provider_exception(e, "openai", _get_exception_map()) from e

Implementation note: retryable flag

Four subclasses should carry a retryable = True class attribute; all others default to False (inherited from APIError):

| Class | retryable |
| --- | --- |
| RateLimitError | True |
| ServerError | True |
| APITimeoutError | True |
| APIConnectionError | True (transient network blips; callers should cap retries) |
| All others | False |

This lets callers branch on exc.retryable without hardcoding a list of types.
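
As a rough illustration of that branching, here is a minimal caller-side sketch. It uses stand-in classes rather than the real `outlines.exceptions` module, and `call_with_retries` is a hypothetical helper that a consumer might write; it is not something this plan adds to Outlines.

```python
import time


class APIError(Exception):
    """Stand-in for the Outlines base class; not retryable by default."""

    retryable = False

    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception}")
        self.provider = provider
        self.original_exception = original_exception


class RateLimitError(APIError):
    retryable = True  # 429: worth retrying after a pause


class AuthenticationError(APIError):
    pass  # 401: inherits retryable = False


def call_with_retries(fn, max_attempts=3, base_delay=0.0):
    """Hypothetical caller-side helper, not part of Outlines."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except APIError as exc:
            # Give up on non-retryable errors or once attempts run out.
            if not exc.retryable or attempt == max_attempts:
                raise
            # Exponential backoff between attempts.
            time.sleep(base_delay * 2 ** (attempt - 1))
```

The helper never names a concrete exception type: adding another retryable subclass later would not require touching it.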

1. Extend outlines/exceptions.py

Add 9 subclasses under APIError, plus the normalize_provider_exception() utility:

# --- Client errors (4xx) ---
class AuthenticationError(APIError): pass      # 401
class PermissionDeniedError(APIError): pass    # 403
class RateLimitError(APIError): pass           # 429
class BadRequestError(APIError): pass          # 400, 404, 409, 413, 422

# --- Server errors (5xx) ---
class ServerError(APIError): pass              # 5xx, 529 (Anthropic overloaded)

# --- Network/transport errors (no HTTP status) ---
class APITimeoutError(APIError): pass          # network/request timeout
class APIConnectionError(APIError): pass       # network unreachable

# --- Response/generation errors (provider-level, not HTTP) ---
class ProviderResponseError(APIError): pass    # malformed/unparseable response
class GenerationError(APIError): pass          # content filter, length stop

Add normalize_provider_exception():

def normalize_provider_exception(exc, provider, exception_map):
    for provider_exc_cls, outlines_exc_cls in exception_map.items():
        if isinstance(exc, provider_exc_cls):
            return outlines_exc_cls(provider=provider, original_exception=exc)
    return APIError(provider=provider, original_exception=exc)
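
One property of this loop is worth spelling out: because it returns on the first `isinstance` match, the insertion order of the mapping dict matters whenever one SDK exception subclasses another (as APITimeoutError does under APIConnectionError in the OpenAI tree above). The sketch below demonstrates this with stand-in classes; `SdkConnectionError` and `SdkTimeoutError` are hypothetical names, not real SDK types.

```python
class APIError(Exception):
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception!r}")
        self.provider = provider
        self.original_exception = original_exception


class APIConnectionError(APIError):
    pass


class APITimeoutError(APIError):
    pass


def normalize_provider_exception(exc, provider, exception_map):
    # Same logic as in the plan: first isinstance match wins.
    for provider_exc_cls, outlines_exc_cls in exception_map.items():
        if isinstance(exc, provider_exc_cls):
            return outlines_exc_cls(provider=provider, original_exception=exc)
    return APIError(provider=provider, original_exception=exc)


# Hypothetical SDK classes: the timeout error subclasses the connection error.
class SdkConnectionError(Exception):
    pass


class SdkTimeoutError(SdkConnectionError):
    pass


# Most specific class first, so a timeout is not swallowed by its parent.
GOOD_MAP = {
    SdkTimeoutError: APITimeoutError,
    SdkConnectionError: APIConnectionError,
}

# Parent first: every timeout would be reported as a plain connection error.
BAD_MAP = {
    SdkConnectionError: APIConnectionError,
    SdkTimeoutError: APITimeoutError,
}
```

With `GOOD_MAP`, an `SdkTimeoutError` normalizes to `APITimeoutError`; with `BAD_MAP`, the same exception comes back as `APIConnectionError`.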

Update __all__ to include all 11 public exception names (the 2 existing base classes plus the 9 new subclasses):

__all__ = [
    "OutlinesError",
    "APIError",
    "AuthenticationError",
    "PermissionDeniedError",
    "RateLimitError",
    "BadRequestError",
    "ServerError",
    "APITimeoutError",
    "APIConnectionError",
    "ProviderResponseError",
    "GenerationError",
]

2. Per-provider mapping dicts

Each provider module gets a _get_exception_map() function using a lazy import (matching the existing pattern).

Flat dict providers

OpenAI ([outlines/models/openai.py](outlines/models/openai.py))

Uses openai.* SDK exceptions. The existing inner except openai.BadRequestError for schema errors is kept as-is inside the inner try; normalize_provider_exception handles the outer catch-all.

| Provider exception | Outlines class |
| --- | --- |
| openai.AuthenticationError (401) | AuthenticationError |
| openai.PermissionDeniedError (403) | PermissionDeniedError |
| openai.RateLimitError (429) | RateLimitError |
| openai.InternalServerError (5xx) | ServerError |
| openai.APITimeoutError | APITimeoutError |
| openai.APIConnectionError | APIConnectionError |
| openai.BadRequestError (400), NotFoundError, ConflictError, UnprocessableEntityError | BadRequestError |
| openai.APIResponseValidationError | ProviderResponseError |
| openai.LengthFinishReasonError, ContentFilterFinishReasonError | GenerationError |

Anthropic ([outlines/models/anthropic.py](outlines/models/anthropic.py))

Uses anthropic.* SDK exceptions (near-identical structure to OpenAI SDK).

| Provider exception | Outlines class |
| --- | --- |
| anthropic.AuthenticationError (401) | AuthenticationError |
| anthropic.PermissionDeniedError (403) | PermissionDeniedError |
| anthropic.RateLimitError (429) | RateLimitError |
| anthropic.InternalServerError, ServiceUnavailableError (503), DeadlineExceededError (504), OverloadedError (529) | ServerError |
| anthropic.APITimeoutError | APITimeoutError |
| anthropic.APIConnectionError | APIConnectionError |
| anthropic.BadRequestError, NotFoundError, ConflictError, RequestTooLargeError (413), UnprocessableEntityError | BadRequestError |
| anthropic.APIResponseValidationError | ProviderResponseError |

Mistral ([outlines/models/mistral.py](outlines/models/mistral.py))

Uses httpx for network errors and mistralai.client.models.* for application errors (newer SDK path only). The existing schema-check string matching is left in place inside its own except block; normalize_provider_exception handles the remainder.

| Provider exception | Outlines class |
| --- | --- |
| httpx.TimeoutException | APITimeoutError |
| httpx.ConnectError | APIConnectionError |
| mistralai.client.models.HTTPValidationError | BadRequestError |
| mistralai.client.models.ResponseValidationError | ProviderResponseError |

Dottxt ([outlines/models/dottxt.py](outlines/models/dottxt.py))

The dottxt SDK documents two named exception classes and uses urllib3 under the hood (not httpx). It also has backoff built-in, so rate-limit retries are handled inside the SDK before an exception ever surfaces.

| Provider exception | Outlines class |
| --- | --- |
| dottxt.models.JsonSchemaCreationForbiddenError (403) | PermissionDeniedError |
| dottxt.models.ConflictingNameForbiddenError (403) | PermissionDeniedError |
| urllib3.exceptions.ConnectTimeoutError, ReadTimeoutError | APITimeoutError |
| urllib3.exceptions.MaxRetryError, NewConnectionError | APIConnectionError |

Everything else falls through to the generic APIError.

⚠️ Needs verification before implementation: confirm the exact module path for JsonSchemaCreationForbiddenError and ConflictingNameForbiddenError by inspecting the installed dottxt package.

Status-code inspection providers

Gemini ([outlines/models/gemini.py](outlines/models/gemini.py))

google.genai.errors exposes broad 4xx/5xx classes without per-status subclasses. The ClientError mapping inspects exc.code to differentiate auth/rate-limit from general bad requests. Note that exc.code may be 0 or None for websocket-style failures; the helper must guard against that.

| Provider exception | Outlines class |
| --- | --- |
| google.genai.errors.ClientError (code 401) | AuthenticationError |
| google.genai.errors.ClientError (code 403) | PermissionDeniedError |
| google.genai.errors.ClientError (code 429) | RateLimitError |
| google.genai.errors.ClientError (other 4xx) | BadRequestError |
| google.genai.errors.ServerError (5xx) | ServerError |

Because Gemini requires status-code inspection, the mapping for ClientError needs a small helper rather than a flat dict. One approach: a thin wrapper that checks exc.code (guarding for None/0) before calling normalize_provider_exception.
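
A minimal sketch of such a helper follows. To stay runnable without the Gemini SDK it uses `FakeClientError` and `FakeServerError` as stand-ins for `google.genai.errors.ClientError` and `ServerError`, assuming only that the real classes expose a `.code` attribute as described above. The function name `translate_gemini_exception` and the choice to fall back to the generic APIError when the code is missing are assumptions, not decisions made in this plan.

```python
class APIError(Exception):
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception!r}")
        self.provider = provider
        self.original_exception = original_exception


class AuthenticationError(APIError): pass
class PermissionDeniedError(APIError): pass
class RateLimitError(APIError): pass
class BadRequestError(APIError): pass
class ServerError(APIError): pass


# Stand-ins for the SDK classes; only the .code attribute is modelled.
class FakeClientError(Exception):
    def __init__(self, code=None):
        super().__init__(f"client error {code}")
        self.code = code


class FakeServerError(Exception):
    def __init__(self, code=None):
        super().__init__(f"server error {code}")
        self.code = code


# Status codes that get a dedicated class; other 4xx codes are bad requests.
_CLIENT_CODE_MAP = {
    401: AuthenticationError,
    403: PermissionDeniedError,
    429: RateLimitError,
}


def translate_gemini_exception(exc, provider="gemini"):
    if isinstance(exc, FakeServerError):
        return ServerError(provider=provider, original_exception=exc)
    if isinstance(exc, FakeClientError):
        code = getattr(exc, "code", None)
        if not code:
            # Guard: code is None or 0, so the status cannot be classified.
            return APIError(provider=provider, original_exception=exc)
        cls = _CLIENT_CODE_MAP.get(code, BadRequestError)
        return cls(provider=provider, original_exception=exc)
    return APIError(provider=provider, original_exception=exc)
```

The guard runs before the dict lookup so that a missing code is never misreported as a bad request.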

Ollama ([outlines/models/ollama.py](outlines/models/ollama.py))

The ollama SDK exposes exactly two public exception classes:

  • ollama.ResponseError — all HTTP-level errors; has a .status_code int attribute
  • ollama.RequestError — connection-level failure (server unreachable)

Because there is only one HTTP error class, a flat dict cannot distinguish rate limits from auth errors. Ollama needs the same status-code inspection pattern as Gemini — a small _translate_ollama_exception() helper instead of a plain dict:

def _translate_ollama_exception(exc):
    import ollama
    if isinstance(exc, ollama.RequestError):
        return APIConnectionError(provider="ollama", original_exception=exc)
    if isinstance(exc, ollama.ResponseError):
        code = exc.status_code
        if code == 401:
            return AuthenticationError(provider="ollama", original_exception=exc)
        if code == 403:
            return PermissionDeniedError(provider="ollama", original_exception=exc)
        if code == 429:
            return RateLimitError(provider="ollama", original_exception=exc)
        if code >= 500:
            return ServerError(provider="ollama", original_exception=exc)
        if 400 <= code < 500:
            return BadRequestError(provider="ollama", original_exception=exc)
    return APIError(provider="ollama", original_exception=exc)

The call site replaces raise APIError(...) from e with raise _translate_ollama_exception(e) from e.

TGI ([outlines/models/tgi.py](outlines/models/tgi.py))

TGI uses huggingface_hub's InferenceClient, which has a rich exception hierarchy in huggingface_hub.errors. Most classes map directly via a flat dict (checked first, most specific wins); the base HfHubHTTPError acts as a fallback that still needs status-code inspection.

| Provider exception | Outlines class |
| --- | --- |
| huggingface_hub.errors.InferenceTimeoutError | APITimeoutError |
| huggingface_hub.errors.OverloadedError | ServerError |
| huggingface_hub.errors.ValidationError | BadRequestError |
| huggingface_hub.errors.GenerationError, IncompleteGenerationError | GenerationError |
| huggingface_hub.errors.BadRequestError | BadRequestError |
| huggingface_hub.errors.GatedRepoError | PermissionDeniedError |
| huggingface_hub.errors.RepositoryNotFoundError | BadRequestError |

HfHubHTTPError (base, not caught by any of the above) still carries a .response object with .status_code. A fallback helper inspects it:

def _translate_tgi_exception(exc):
    import huggingface_hub.errors as hf_errors
    # Try the flat dict first via normalize_provider_exception
    result = normalize_provider_exception(exc, "tgi", _get_exception_map())
    # A specific subclass means the flat dict matched, so we are done
    if type(result) is not APIError:
        return result
    # Fallback: status-code inspection on HfHubHTTPError base
    if isinstance(exc, hf_errors.HfHubHTTPError):
        code = getattr(getattr(exc, "response", None), "status_code", None)
        if code == 401:
            return AuthenticationError(provider="tgi", original_exception=exc)
        if code == 403:
            return PermissionDeniedError(provider="tgi", original_exception=exc)
        if code == 429:
            return RateLimitError(provider="tgi", original_exception=exc)
        if code and code >= 500:
            return ServerError(provider="tgi", original_exception=exc)
    return result

Reuses OpenAI mapping

vLLM and SGLang ([outlines/models/vllm.py](outlines/models/vllm.py), [outlines/models/sglang.py](outlines/models/sglang.py))

Both use the openai-compatible client — they can reuse the OpenAI mapping dict directly (just change provider=).

No mapping needed

Local model providers (Transformers, MLX-LM, LlamaCpp, vLLM offline)

These do not make HTTP calls, so HTTP-status mappings don't apply. Keep the existing except Exception → APIError catch-all; no mapping dict needed.

Files changed

  • [outlines/exceptions.py](outlines/exceptions.py) — add 9 classes + normalize_provider_exception()
  • [outlines/models/openai.py](outlines/models/openai.py)
  • [outlines/models/anthropic.py](outlines/models/anthropic.py)
  • [outlines/models/mistral.py](outlines/models/mistral.py)
  • [outlines/models/gemini.py](outlines/models/gemini.py)
  • [outlines/models/vllm.py](outlines/models/vllm.py)
  • [outlines/models/sglang.py](outlines/models/sglang.py)
  • [outlines/models/ollama.py](outlines/models/ollama.py)
  • [outlines/models/tgi.py](outlines/models/tgi.py)
  • [outlines/models/dottxt.py](outlines/models/dottxt.py)

@RobinPicard force-pushed the uniform-exceptions branch 2 times, most recently from 7202a14 to 3c18516 (March 10, 2026 07:49)
Contributor

@RobinPicard RobinPicard left a comment

It looks mostly good, thanks a lot for the PR! My main concern is whether it makes sense to have local model exceptions go through normalize_provider_exception if there's no mapping for them.

For the 3 specific questions you asked:

  • I prefer flatter
  • Let's not handle that ourselves
  • Your naming is good imo

    if 400 <= code < 500:
        return BadRequestError(provider=provider, original_exception=exc)

    return APIError(provider=provider, original_exception=exc)
Contributor

If there's no status code, I think we should re-raise the initial exception without the APIError wrapper as some exceptions could be unrelated to the api calls (programming errors for instance).

Author

Yep. Non-SDK exceptions should pass through somehow. Will implement a guard or similar
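
One possible shape for such a guard, sketched with stand-in classes: a helper that raises the mapped Outlines exception when the SDK type is known and otherwise re-raises the original exception untouched. The name `raise_normalized` and the `SdkAuthError` class are hypothetical; this is not necessarily what the PR ends up implementing.

```python
class APIError(Exception):
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception!r}")
        self.provider = provider
        self.original_exception = original_exception


class AuthenticationError(APIError):
    pass


class SdkAuthError(Exception):
    """Hypothetical stand-in for a provider SDK exception."""


def raise_normalized(exc, provider, exception_map):
    """Raise the mapped Outlines error, or re-raise exc unchanged."""
    for sdk_cls, outlines_cls in exception_map.items():
        if isinstance(exc, sdk_cls):
            raise outlines_cls(provider=provider, original_exception=exc) from exc
    # Not an SDK error (e.g. a programming error): pass it through as-is.
    raise exc


EXCEPTION_MAP = {SdkAuthError: AuthenticationError}


def call_provider(fn):
    """Example call site: every exception is routed through the helper."""
    try:
        return fn()
    except Exception as e:
        raise_normalized(e, "demo", EXCEPTION_MAP)
```

With this shape, a `TypeError` raised by a bug in the calling code surfaces as a `TypeError`, while an SDK authentication failure surfaces as `AuthenticationError` with the SDK exception attached as `__cause__`.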

    if 400 <= code < 500:
        return BadRequestError(provider=provider, original_exception=exc)

    return APIError(provider=provider, original_exception=exc)
Contributor

I think we shouldn't wrap all exceptions in APIError, as there can be non-model-related errors, and it also does not make much sense for local models.

Author

Yep, same as other comment. Will implement fix

except Exception as e:
    raise normalize_provider_exception(e, PROVIDER) from e

for chunk in stream:
Contributor

I think we need to wrap that in a try/except block too, as exceptions can be raised during iteration as well as during the initial connection. This comment applies to other models as well.
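
A sketch of what wrapping the iteration could look like, using stand-in classes and a fake stream that fails partway through. The `normalized_stream` name is hypothetical, and the three-argument normalize_provider_exception() follows the signature from the plan earlier in this thread rather than the two-argument call in the diff.

```python
class APIError(Exception):
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception!r}")
        self.provider = provider
        self.original_exception = original_exception


class APIConnectionError(APIError):
    pass


def normalize_provider_exception(exc, provider, exception_map):
    for provider_exc_cls, outlines_exc_cls in exception_map.items():
        if isinstance(exc, provider_exc_cls):
            return outlines_exc_cls(provider=provider, original_exception=exc)
    return APIError(provider=provider, original_exception=exc)


# The built-in ConnectionError stands in for an SDK network error here.
EXCEPTION_MAP = {ConnectionError: APIConnectionError}


def normalized_stream(stream, provider, exception_map):
    """Yield chunks, translating errors raised while advancing the stream."""
    iterator = iter(stream)
    while True:
        try:
            chunk = next(iterator)
        except StopIteration:
            return  # stream finished normally
        except Exception as e:
            raise normalize_provider_exception(e, provider, exception_map) from e
        # Yield outside the try so consumer-side errors are not translated.
        yield chunk


def fake_stream():
    """Simulated provider stream that drops the connection mid-way."""
    yield "a"
    yield "b"
    raise ConnectionResetError("connection dropped")
```

Calling `next()` explicitly keeps the `try` scoped to the provider's iterator, so only failures from the stream itself are normalized.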

    **inputs,
    **inference_kwargs,
)
try:
Contributor

If we're not mapping local model errors, I don't think it makes sense to add this wrapping

Author

agree, will remove and refactor per related suggestion

from functools import singledispatchmethod
from typing import TYPE_CHECKING, Any, Iterator, Optional, Union

from outlines.exceptions import APIError, normalize_provider_exception
Contributor

The APIError class is not used here. Applies to other models as well


if provider == "gemini":
    import httpx
    import aiohttp
Contributor

I don't think that's a dependency of the gemini sdk, so that import could raise an exception for users who do not have the package installed.

Author

Yea, aiohttp is (now) an optional dep, will remove. Ref: https://github.com/googleapis/python-genai/releases/tag/v1.65.0

Author

plpxsk commented Mar 12, 2026

Thanks so much for the engagement on the PR. Let me review, and push more commits in the coming days.

Author

plpxsk commented Mar 19, 2026

I pushed updates to finalize.

The new schema and refusal handling is probably a breaking API change, and could be documented in the changelog etc. as needed.

@RobinPicard
Contributor

Linting is failing and test coverage is missing!


2 participants