Implement uniform exceptions classes across model providers by plpxsk · Pull Request #1823 · dottxt-ai/outlines

plpxsk · 2026-03-04T17:12:39Z

Implements #1658

You can review how some providers implement these, in a comment below.

The updated docs show a summary of the implementation.
If interested, also below, I include one plan that helped with research and implementation.

Overall, I tried to be comprehensive, but without creating too many classes.

Questions:

Do we prefer a flatter hierarchy, or more nesting, such as by Category in the summary table? For example, GenerationError is the result of a successful API call, not a failure.
Do we want to handle retries ourselves? It could be complicated, so I prefer letting consumers choose.
How do we like the key API names, especially normalize_provider_exception()? Alternatives: translate_*, convert_*, *_external_exception, *_client_exception, *_sdk_exception, etc.

plpxsk · 2026-03-04T17:13:54Z

OpenAI and Anthropic exception handling is (nearly?) identical since their SDKs are built by stainless.com, whereas MistralAI's is built by speakeasy.com. Google has its own style.

See details to get a sense of what these look like, as extracted by some LLMs.

Details

OpenAI (`openai-python`)

Exception
└── OpenAIError
    ├── APIError
    │   ├── APIResponseValidationError
    │   ├── APIStatusError
    │   │   ├── BadRequestError            (HTTP 400)
    │   │   ├── AuthenticationError        (HTTP 401)
    │   │   ├── PermissionDeniedError      (HTTP 403)
    │   │   ├── NotFoundError              (HTTP 404)
    │   │   ├── ConflictError              (HTTP 409)
    │   │   ├── UnprocessableEntityError   (HTTP 422)
    │   │   ├── RateLimitError             (HTTP 429)
    │   │   └── InternalServerError        (HTTP 5xx)
    │   └── APIConnectionError
    │       └── APITimeoutError
    ├── LengthFinishReasonError
    └── ContentFilterFinishReasonError

ValueError
└── InvalidWebhookSignatureError

Anthropic (`anthropic-sdk-python`)

Exception
└── AnthropicError
    └── APIError
        ├── APIResponseValidationError
        ├── APIStatusError
        │   ├── BadRequestError            (HTTP 400)
        │   ├── AuthenticationError        (HTTP 401)
        │   ├── PermissionDeniedError      (HTTP 403)
        │   ├── NotFoundError              (HTTP 404)
        │   ├── ConflictError              (HTTP 409)
        │   ├── RequestTooLargeError       (HTTP 413)
        │   ├── UnprocessableEntityError   (HTTP 422)
        │   ├── RateLimitError             (HTTP 429)
        │   ├── InternalServerError        (HTTP 5xx)
        │   ├── ServiceUnavailableError    (HTTP 503)
        │   ├── DeadlineExceededError      (HTTP 504)
        │   └── OverloadedError            (HTTP 529)
        └── APIConnectionError
            └── APITimeoutError

Mistral (`client-python` README)

The README documents these exception classes (not a source .py file, so only what's explicitly documented is listed):

Exception
└── MistralError                        (base for all HTTP error responses)
    ├── HTTPValidationError             (HTTP 422 — Validation Error)
    └── ResponseValidationError         (type mismatch between response and Pydantic model;
                                         exposes the Pydantic error via .cause attribute)

httpx.RequestError                      (network-level, NOT a subclass of MistralError)
├── httpx.ConnectError
└── httpx.TimeoutException

Note: The README only documents these classes. The README does not expose additional HTTP-status-specific subclasses (beyond HTTPValidationError). MistralError carries .message, .status_code, .headers, .body, .raw_response, and optional .data fields.

Google GenAI (`python-genai`)

Exception
└── APIError                            (general API errors; base for HTTP errors)
    ├── ClientError                     (HTTP 4xx range)
    └── ServerError                     (HTTP 5xx range)

ValueError
├── UnknownFunctionCallArgumentError    (function call arg cannot be converted to param annotation)
├── UnsupportedFunctionError            (function is not supported)
├── FunctionInvocationError             (function cannot be invoked with given arguments)
└── UnknownApiResponseError             (response cannot be parsed as JSON)

Additionally, ExperimentalWarning is re-exported from _common (it's a Warning, not an Exception).

plpxsk · 2026-03-04T17:16:09Z

Finally, for some background research, future reference, etc, view one plan that helped towards an implementation:

Details

name: Provider Exception Mapping
overview: Extend the exception hierarchy in exceptions.py with 9 specific subclasses, add a normalize_provider_exception() utility (Option B), and update each provider's except block to use per-provider mapping dicts.
todos:
  - id: exceptions-hierarchy
    content: Add 9 subclasses + normalize_provider_exception() to outlines/exceptions.py
    status: pending
  - id: retryable-flags
    content: Add retryable = True class attribute to RateLimitError, ServerError, APITimeoutError, APIConnectionError in outlines/exceptions.py
    status: pending
  - id: openai-mapping
    content: Add _get_exception_map() and update except blocks in openai.py
    status: pending
  - id: anthropic-mapping
    content: Add _get_exception_map() and update except blocks in anthropic.py
    status: pending
  - id: mistral-mapping
    content: Add _get_exception_map() and update except blocks in mistral.py
    status: pending
  - id: gemini-mapping
    content: Add Gemini exception mapping with status-code inspection in gemini.py
    status: pending
  - id: vllm-sglang-mapping
    content: Reuse OpenAI mapping in vllm.py and sglang.py
    status: pending
  - id: ollama-mapping
    content: Add status-code-based exception mapping in ollama.py
    status: pending
  - id: tgi-mapping
    content: Add _get_exception_map() with flat dict + HfHubHTTPError status fallback in tgi.py
    status: pending
  - id: dottxt-mapping
    content: Add _get_exception_map() for named dottxt errors + urllib3 network errors in dottxt.py
    status: pending
isProject: false

Provider Exception Mapping Plan

Implementation note: two patterns in use

Pattern	Providers
Flat dict + `normalize_provider_exception()`	OpenAI, Anthropic, Mistral, TGI (specific subclasses), Dottxt
Status-code inspection helper	Gemini, Ollama, TGI (HfHubHTTPError base fallback)

Both patterns are compatible — status-code inspection helpers can call normalize_provider_exception() internally for the flat-dict portion, so there is no duplication.

The call site pattern for flat-dict providers is:

except Exception as e:
    raise normalize_provider_exception(e, "openai", _get_exception_map()) from e

Implementation note: retryable flag

Four subclasses should carry a retryable = True class attribute; all others default to False (inherited from APIError):

Class	retryable
`RateLimitError`	`True`
`ServerError`	`True`
`APITimeoutError`	`True`
`APIConnectionError`	`True` (transient network blips; callers should cap retries)
All others	`False`

This lets callers branch on exc.retryable without hardcoding a list of types.
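As a rough illustration of what branching on exc.retryable could look like on the consumer side, here is a minimal sketch. The classes are local stand-ins that only mimic the planned retryable attribute (they are not imported from outlines), and call_with_retries, max_attempts, and base_delay are hypothetical names chosen for this example.

```python
import time


# Stand-ins for the planned hierarchy; only the `retryable` flag is modelled.
class APIError(Exception):
    retryable = False


class RateLimitError(APIError):
    retryable = True


class BadRequestError(APIError):
    pass  # inherits retryable = False


def call_with_retries(fn, max_attempts=3, base_delay=0.5):
    """Call fn(), retrying only errors flagged retryable, with a capped attempt count."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except APIError as exc:
            # Give up on non-retryable errors, or once the cap is reached.
            if not exc.retryable or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
```

The helper never inspects concrete exception types, so a new retryable subclass would be picked up without changing caller code.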

1. Extend `outlines/exceptions.py`

Add 9 subclasses under APIError, plus the normalize_provider_exception() utility:

# --- Client errors (4xx) ---
class AuthenticationError(APIError): pass      # 401
class PermissionDeniedError(APIError): pass    # 403
class RateLimitError(APIError): pass           # 429
class BadRequestError(APIError): pass          # 400, 404, 409, 413, 422

# --- Server errors (5xx) ---
class ServerError(APIError): pass              # 5xx, 529 (Anthropic overloaded)

# --- Network/transport errors (no HTTP status) ---
class APITimeoutError(APIError): pass          # network/request timeout
class APIConnectionError(APIError): pass       # network unreachable

# --- Response/generation errors (provider-level, not HTTP) ---
class ProviderResponseError(APIError): pass    # malformed/unparseable response
class GenerationError(APIError): pass          # content filter, length stop

Add normalize_provider_exception():

def normalize_provider_exception(exc, provider, exception_map):
    for provider_exc_cls, outlines_exc_cls in exception_map.items():
        if isinstance(exc, provider_exc_cls):
            return outlines_exc_cls(provider=provider, original_exception=exc)
    return APIError(provider=provider, original_exception=exc)
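Because the loop returns on the first isinstance match, the order of entries in the mapping dict matters whenever one SDK exception subclasses another (as APITimeoutError does under APIConnectionError in the OpenAI tree above). The sketch below illustrates this with stand-in classes: SdkConnectionError and SdkTimeoutError are hypothetical, and the APIError constructor is an assumption modelled on the provider= and original_exception= keywords used in the plan.

```python
class APIError(Exception):
    # Assumed constructor shape, based on the keyword arguments used in the plan.
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception}")
        self.provider = provider
        self.original_exception = original_exception


class APIConnectionError(APIError):
    pass


class APITimeoutError(APIError):
    pass


# Hypothetical SDK exceptions: the timeout error subclasses the connection error.
class SdkConnectionError(Exception):
    pass


class SdkTimeoutError(SdkConnectionError):
    pass


def normalize_provider_exception(exc, provider, exception_map):
    # First isinstance match wins, in dict insertion order.
    for provider_exc_cls, outlines_exc_cls in exception_map.items():
        if isinstance(exc, provider_exc_cls):
            return outlines_exc_cls(provider=provider, original_exception=exc)
    return APIError(provider=provider, original_exception=exc)


# Subclass listed first: a timeout maps to APITimeoutError.
SPECIFIC_FIRST = {
    SdkTimeoutError: APITimeoutError,
    SdkConnectionError: APIConnectionError,
}

# Parent listed first: the timeout is swallowed by the broader entry.
PARENT_FIRST = {
    SdkConnectionError: APIConnectionError,
    SdkTimeoutError: APITimeoutError,
}
```

With PARENT_FIRST, a timeout would be reported as a plain connection error, so mapping dicts are best written from most specific to least specific.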

Update __all__ to include all 11 public exception class names:

__all__ = [
    "OutlinesError",
    "APIError",
    "AuthenticationError",
    "PermissionDeniedError",
    "RateLimitError",
    "BadRequestError",
    "ServerError",
    "APITimeoutError",
    "APIConnectionError",
    "ProviderResponseError",
    "GenerationError",
]

2. Per-provider mapping dicts

Each provider module gets a _get_exception_map() function using a lazy import (matching the existing pattern).

Flat dict providers

OpenAI (`outlines/models/openai.py`)

Uses openai.* SDK exceptions. The existing inner except openai.BadRequestError for schema errors is kept as-is inside the inner try; normalize_provider_exception handles the outer catch-all.

Provider exception	Outlines class
`openai.AuthenticationError` (401)	`AuthenticationError`
`openai.PermissionDeniedError` (403)	`PermissionDeniedError`
`openai.RateLimitError` (429)	`RateLimitError`
`openai.InternalServerError` (5xx)	`ServerError`
`openai.APITimeoutError`	`APITimeoutError`
`openai.APIConnectionError`	`APIConnectionError`
`openai.BadRequestError` (400), `NotFoundError`, `ConflictError`, `UnprocessableEntityError`	`BadRequestError`
`openai.APIResponseValidationError`	`ProviderResponseError`
`openai.LengthFinishReasonError`, `ContentFilterFinishReasonError`	`GenerationError`

Anthropic (`outlines/models/anthropic.py`)

Uses anthropic.* SDK exceptions (near-identical structure to OpenAI SDK).

Provider exception	Outlines class
`anthropic.AuthenticationError` (401)	`AuthenticationError`
`anthropic.PermissionDeniedError` (403)	`PermissionDeniedError`
`anthropic.RateLimitError` (429)	`RateLimitError`
`anthropic.InternalServerError`, `ServiceUnavailableError` (503), `DeadlineExceededError` (504), `OverloadedError` (529)	`ServerError`
`anthropic.APITimeoutError`	`APITimeoutError`
`anthropic.APIConnectionError`	`APIConnectionError`
`anthropic.BadRequestError`, `NotFoundError`, `ConflictError`, `RequestTooLargeError` (413), `UnprocessableEntityError`	`BadRequestError`
`anthropic.APIResponseValidationError`	`ProviderResponseError`

Mistral (`outlines/models/mistral.py`)

Uses httpx for network errors and mistralai.client.models.* for application errors (newer SDK path only). The existing schema-check string matching is left in place inside its own except block; normalize_provider_exception handles the remainder.

Provider exception	Outlines class
`httpx.TimeoutException`	`APITimeoutError`
`httpx.ConnectError`	`APIConnectionError`
`mistralai.client.models.HTTPValidationError`	`BadRequestError`
`mistralai.client.models.ResponseValidationError`	`ProviderResponseError`

Dottxt (`outlines/models/dottxt.py`)

The dottxt SDK documents two named exception classes and uses urllib3 under the hood (not httpx). It also has backoff built-in, so rate-limit retries are handled inside the SDK before an exception ever surfaces.

Provider exception	Outlines class
`dottxt.models.JsonSchemaCreationForbiddenError` (403)	`PermissionDeniedError`
`dottxt.models.ConflictingNameForbiddenError` (403)	`PermissionDeniedError`
`urllib3.exceptions.ConnectTimeoutError`, `ReadTimeoutError`	`APITimeoutError`
`urllib3.exceptions.MaxRetryError`, `NewConnectionError`	`APIConnectionError`

Everything else falls through to the generic APIError.

⚠️ Needs verification before implementation: confirm the exact module path for JsonSchemaCreationForbiddenError and ConflictingNameForbiddenError by inspecting the installed dottxt package.

Status-code inspection providers

Gemini (`outlines/models/gemini.py`)

google.genai.errors exposes broad 4xx/5xx classes without per-status subclasses. The ClientError mapping inspects exc.code to differentiate auth/rate-limit from general bad requests. Note that exc.code may be 0 or None for websocket-style failures; the helper must guard against that.

Provider exception	Outlines class
`google.genai.errors.ClientError` (code 401)	`AuthenticationError`
`google.genai.errors.ClientError` (code 403)	`PermissionDeniedError`
`google.genai.errors.ClientError` (code 429)	`RateLimitError`
`google.genai.errors.ClientError` (other 4xx)	`BadRequestError`
`google.genai.errors.ServerError` (5xx)	`ServerError`

Because Gemini requires status-code inspection, the mapping for ClientError needs a small helper rather than a flat dict. One approach: a thin wrapper that checks exc.code (guarding for None/0) before calling normalize_provider_exception.
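A minimal sketch of such a helper follows. To stay runnable without the Google SDK installed, the SDK classes are passed in as parameters and the demo uses stand-ins (FakeClientError, FakeServerError); the helper name translate_gemini_exception and the APIError constructor are assumptions for illustration, not the PR's actual code.

```python
class APIError(Exception):
    # Assumed constructor shape, based on the keyword arguments used in the plan.
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception}")
        self.provider = provider
        self.original_exception = original_exception


class AuthenticationError(APIError): pass
class PermissionDeniedError(APIError): pass
class RateLimitError(APIError): pass
class BadRequestError(APIError): pass
class ServerError(APIError): pass


# Stand-ins for google.genai.errors.ClientError / ServerError (only .code is modelled).
class FakeClientError(Exception):
    def __init__(self, code):
        super().__init__(f"client error {code}")
        self.code = code


class FakeServerError(Exception):
    def __init__(self, code):
        super().__init__(f"server error {code}")
        self.code = code


_CLIENT_CODE_MAP = {
    401: AuthenticationError,
    403: PermissionDeniedError,
    429: RateLimitError,
}


def translate_gemini_exception(exc, client_error_cls, server_error_cls):
    """Map a Gemini-style exception to an Outlines-style class by status code."""
    if isinstance(exc, server_error_cls):
        return ServerError(provider="gemini", original_exception=exc)
    if isinstance(exc, client_error_cls):
        code = getattr(exc, "code", None)
        # Guard: code may be None or 0 for websocket-style failures.
        if isinstance(code, int) and 400 <= code < 500:
            target = _CLIENT_CODE_MAP.get(code, BadRequestError)
            return target(provider="gemini", original_exception=exc)
    # Unknown type, or no usable status code: fall back to the generic class.
    return APIError(provider="gemini", original_exception=exc)
```

Passing the SDK classes as arguments keeps the lazy-import concern at the call site, which would supply the real ClientError and ServerError.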

Ollama (`outlines/models/ollama.py`)

The ollama SDK exposes exactly two public exception classes:

ollama.ResponseError — all HTTP-level errors; has a .status_code int attribute
ollama.RequestError — connection-level failure (server unreachable)

Because there is only one HTTP error class, a flat dict cannot distinguish rate limits from auth errors. Ollama needs the same status-code inspection pattern as Gemini — a small _translate_ollama_exception() helper instead of a plain dict:

def _translate_ollama_exception(exc):
    import ollama
    if isinstance(exc, ollama.RequestError):
        return APIConnectionError(provider="ollama", original_exception=exc)
    if isinstance(exc, ollama.ResponseError):
        code = exc.status_code
        if code == 401:
            return AuthenticationError(provider="ollama", original_exception=exc)
        if code == 403:
            return PermissionDeniedError(provider="ollama", original_exception=exc)
        if code == 429:
            return RateLimitError(provider="ollama", original_exception=exc)
        if code >= 500:
            return ServerError(provider="ollama", original_exception=exc)
        if 400 <= code < 500:
            return BadRequestError(provider="ollama", original_exception=exc)
    return APIError(provider="ollama", original_exception=exc)

The call site replaces raise APIError(...) from e with raise _translate_ollama_exception(e) from e.

TGI (`outlines/models/tgi.py`)

TGI uses huggingface_hub's InferenceClient, which has a rich exception hierarchy in huggingface_hub.errors. Most classes map directly via a flat dict (checked first, most specific wins); the base HfHubHTTPError acts as a fallback that still needs status-code inspection.

Provider exception	Outlines class
`huggingface_hub.errors.InferenceTimeoutError`	`APITimeoutError`
`huggingface_hub.errors.OverloadedError`	`ServerError`
`huggingface_hub.errors.ValidationError`	`BadRequestError`
`huggingface_hub.errors.GenerationError`, `IncompleteGenerationError`	`GenerationError`
`huggingface_hub.errors.BadRequestError`	`BadRequestError`
`huggingface_hub.errors.GatedRepoError`	`PermissionDeniedError`
`huggingface_hub.errors.RepositoryNotFoundError`	`BadRequestError`

HfHubHTTPError (base, not caught by any of the above) still carries a .response object with .status_code. A fallback helper inspects it:

def _translate_tgi_exception(exc):
    import huggingface_hub.errors as hf_errors
    # Try flat dict first via normalize_provider_exception
    result = normalize_provider_exception(exc, "tgi", _get_exception_map())
    # normalize_provider_exception already returned a specific type — done
    if type(result) is not APIError:
        return result
    # Fallback: status-code inspection on HfHubHTTPError base
    if isinstance(exc, hf_errors.HfHubHTTPError):
        code = getattr(getattr(exc, "response", None), "status_code", None)
        if code == 401:
            return AuthenticationError(provider="tgi", original_exception=exc)
        if code == 403:
            return PermissionDeniedError(provider="tgi", original_exception=exc)
        if code == 429:
            return RateLimitError(provider="tgi", original_exception=exc)
        if code and code >= 500:
            return ServerError(provider="tgi", original_exception=exc)
    return result

Reuses OpenAI mapping

vLLM and SGLang (`outlines/models/vllm.py`, `outlines/models/sglang.py`)

Both use the openai-compatible client — they can reuse the OpenAI mapping dict directly (just change provider=).

No mapping needed

Local model providers (Transformers, MLX-LM, LlamaCpp, vLLM offline)

These do not make HTTP calls, so HTTP-status mappings don't apply. Keep the existing except Exception → APIError catch-all; no mapping dict needed.

Files changed

`outlines/exceptions.py`: add 9 classes + normalize_provider_exception()
`outlines/models/openai.py`
`outlines/models/anthropic.py`
`outlines/models/mistral.py`
`outlines/models/gemini.py`
`outlines/models/vllm.py`
`outlines/models/sglang.py`
`outlines/models/ollama.py`
`outlines/models/tgi.py`
`outlines/models/dottxt.py`

RobinPicard

It looks mostly good, thanks a lot for the PR! My main concern is whether it makes sense to have local model exceptions go through normalize_provider_exception if there's no mapping for them.

For the 3 specific questions you asked:

I prefer flatter
Let's not handle that ourselves
Your naming is good imo

RobinPicard · 2026-03-12T11:05:01Z

outlines/exceptions.py

+        if 400 <= code < 500:
+            return BadRequestError(provider=provider, original_exception=exc)
+
+    return APIError(provider=provider, original_exception=exc)


If there's no status code, I think we should re-raise the initial exception without the APIError wrapper as some exceptions could be unrelated to the api calls (programming errors for instance).

Yep. Non-SDK exceptions should pass through somehow. Will implement a guard or similar
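One possible shape for such a guard, sketched here under assumptions, is to catch only the SDK's base exception class at the call site so that anything else (a TypeError from a programming mistake, for instance) propagates untouched. SdkError is a hypothetical stand-in for a provider SDK base class, and generate and normalize are illustrative names; this is not necessarily how the PR implements it.

```python
class APIError(Exception):
    # Assumed constructor shape, based on the keyword arguments used in the plan.
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception}")
        self.provider = provider
        self.original_exception = original_exception


class SdkError(Exception):
    """Hypothetical base class of a provider SDK's exceptions."""


def normalize(exc, provider):
    # Simplified stand-in for normalize_provider_exception().
    return APIError(provider=provider, original_exception=exc)


def generate(call):
    """Run a provider call, wrapping SDK errors only."""
    try:
        return call()
    except SdkError as e:
        # Only SDK errors are normalized; other exceptions are never caught here.
        raise normalize(e, "demo") from e
```

Narrowing the except clause, rather than filtering inside the normalizer, means non-SDK exceptions keep their original type and traceback with no extra wrapping.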

RobinPicard · 2026-03-12T11:12:02Z

outlines/exceptions.py

+        if 400 <= code < 500:
+            return BadRequestError(provider=provider, original_exception=exc)
+
+    return APIError(provider=provider, original_exception=exc)


I think we shouldn't wrap all exceptions in APIError as there can be non model related errors and also it does not make too much sense for local models

Yep, same as other comment. Will implement fix

RobinPicard · 2026-03-12T11:14:55Z

outlines/models/anthropic.py

+        except Exception as e:
+            raise normalize_provider_exception(e, PROVIDER) from e

        for chunk in stream:


I think we need to wrap that in a try/except block too, as exceptions can be raised during iteration as well as during the initial connection. This comment applies to other models as well.
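To make the point concrete, here is a minimal sketch of a streaming generator whose try/except covers both opening the stream and iterating over it. SdkError, stream_chunks, and open_stream are hypothetical names, and the APIError constructor is an assumption; the real provider code in the PR may be structured differently.

```python
class APIError(Exception):
    # Assumed constructor shape, based on the keyword arguments used in the plan.
    def __init__(self, provider, original_exception):
        super().__init__(f"[{provider}] {original_exception}")
        self.provider = provider
        self.original_exception = original_exception


class SdkError(Exception):
    """Hypothetical base class of a provider SDK's exceptions."""


def stream_chunks(open_stream, provider="demo"):
    """Yield chunks from a provider stream, normalizing SDK errors at any stage."""
    try:
        # Errors can occur while establishing the stream...
        stream = open_stream()
        # ...and also later, while chunks are being pulled from it.
        for chunk in stream:
            yield chunk
    except SdkError as e:
        raise APIError(provider=provider, original_exception=e) from e
```

Chunks already yielded before a mid-stream failure still reach the consumer; only the failure itself is translated.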

RobinPicard · 2026-03-12T11:47:10Z

outlines/models/transformers.py

-            **inputs,
-            **inference_kwargs,
-        )
+        try:


If we're not mapping local model errors, I don't think it makes sense to add this wrapping

agree, will remove and refactor per related suggestion

RobinPicard · 2026-03-12T11:47:47Z

outlines/models/anthropic.py

 from functools import singledispatchmethod
 from typing import TYPE_CHECKING, Any, Iterator, Optional, Union

+from outlines.exceptions import APIError, normalize_provider_exception


The APIError class is not used here. Applies to other models as well

RobinPicard · 2026-03-12T11:49:15Z

outlines/exceptions.py

+
+    if provider == "gemini":
+        import httpx
+        import aiohttp


I don't think that's a dependency of the gemini sdk, so that import could raise an exception for some users who do not have the package installed

Yea, aiohttp is (now) an optional dep, will remove. Ref: https://github.com/googleapis/python-genai/releases/tag/v1.65.0

plpxsk · 2026-03-12T16:16:27Z

Thanks so much for the engagement on the PR. Let me review, and push more commits in the coming days.

plpxsk · 2026-03-19T18:20:18Z

I pushed updates to finalize.

The new schema and refusal handling is probably a breaking API change, and could be documented in changelog etc as needed.

RobinPicard · 2026-03-26T14:23:03Z

Linting is failing and test coverage is missing!

RobinPicard force-pushed the uniform-exceptions branch 2 times, most recently from 7202a14 to 3c18516 on March 10, 2026 07:49

RobinPicard requested changes Mar 12, 2026

RobinPicard force-pushed the uniform-exceptions branch from 810f9ca to 394b315 on March 24, 2026 17:39

plpxsk added 14 commits March 26, 2026 16:55

Implement uniform exceptions classes across providers

e081ecb

Guard provider exception handlers against non-SDK errors

f0f7be4

Remove unused APIError imports

0d51124

Remove bad dependency

bf6ef20

Let local runtimes propagate native errors

baa24b7

Add comments that local runtimes bypass exceptions normalization

a1caa9e

Implement exceptions for chunk streams also

6c1dcb6

Update docs

2bb60de

Consolidate tests

4f915bf

Use new BadRequestError for schema error

e9ac356

Use new GenerationError for refusals

aba41f0

Improve docs

159880c

Reconcile tests

f5deb34

Fix: add LM Studio to local models list

3de42f0

RobinPicard force-pushed the uniform-exceptions branch from 394b315 to 3de42f0 on March 26, 2026 15:55
