Add Mistral OCR (/v1/ocr) support to symfony/ai-mistral-platform

## Summary

The `mistral-ocr-latest` model uses a dedicated `/v1/ocr` endpoint that is structurally different from chat completions. The current `symfony/ai-mistral-platform` bridge only wraps `/v1/chat/completions` and `/v1/embeddings`, so applications that need document OCR must bypass the agent abstraction entirely and call the Mistral API via raw `HttpClient`.

## The gap

The Mistral OCR endpoint accepts a `document` payload (a URL or a base64-encoded image/PDF) and returns structured output: a `pages[]` array with markdown text, layout blocks, and embedded images. It is not a chat call, so it cannot be routed through `AgentInterface::call()` or a `MessageBag`. `symfony/ai-mistral-platform` currently has no `Platform`, `Model`, or `Response` class that covers it.

Example request shape (`POST https://api.mistral.ai/v1/ocr`):

```json
{
  "model": "mistral-ocr-latest",
  "document": {
    "type": "image_url",
    "image_url": "https://..."
  },
  "include_image_base64": true,
  "document_annotation_format": { ... }
}
```
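To make the request shape concrete, here is a minimal PHP sketch that builds this body from either a URL or raw file contents. `buildOcrPayload()` is a hypothetical helper and not part of `symfony/ai`. It assumes that non-image documents use a `document_url` document type and that binary input can be sent inline as a base64 data URI; both are my reading of the Mistral docs and should be verified before relying on them.

```php
<?php

declare(strict_types=1);

/**
 * Builds the JSON body for POST /v1/ocr (hypothetical helper, not part of symfony/ai).
 *
 * @param string      $source   An http(s) URL, or the raw binary contents of a file
 * @param string|null $mimeType Required for binary input, e.g. "application/pdf" or "image/png"
 *
 * @return array<string, mixed>
 */
function buildOcrPayload(string $source, ?string $mimeType = null, bool $includeImages = false): array
{
    $isUrl = 1 === preg_match('#^https?://#i', $source);

    if (!$isUrl && null === $mimeType) {
        throw new InvalidArgumentException('A MIME type is required for binary input.');
    }

    // Assumption: binary input is sent inline as a base64 data URI.
    $location = $isUrl ? $source : sprintf('data:%s;base64,%s', $mimeType, base64_encode($source));

    // Images use the "image_url" type from the example above;
    // anything else is assumed to use "document_url".
    $isImage = null !== $mimeType
        ? str_starts_with($mimeType, 'image/')
        : 1 === preg_match('#\.(png|jpe?g|webp|gif)(\?.*)?$#i', $source);
    $type = $isImage ? 'image_url' : 'document_url';

    return [
        'model' => 'mistral-ocr-latest',
        'document' => ['type' => $type, $type => $location],
        'include_image_base64' => $includeImages,
    ];
}
```

Passing the result to `json_encode()` yields a body of the shape shown above. `document_annotation_format` is left out because its schema is application-specific.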

Example response shape:

```json
{
  "pages": [
    {
      "index": 0,
      "markdown": "...",
      "dimensions": { "width": 2480, "height": 3508 },
      "images": [ { "id": "img-0", "image_base64": "...", ... } ]
    }
  ],
  "document_annotation": "{ \"blocks\": [...] }"
}
```
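As a rough illustration of what a typed response could look like, the sketch below maps the `pages[]` array onto a small value object. `OcrPage` and `parseOcrPages()` are hypothetical names, and only the fields shown in the example response (`index`, `markdown`, `dimensions`, and the image `id`s) are read; a real implementation would likely expose more.

```php
<?php

declare(strict_types=1);

/** One OCR page (hypothetical value object, not an existing symfony/ai class). */
final class OcrPage
{
    /** @param list<string> $imageIds */
    public function __construct(
        public readonly int $index,
        public readonly string $markdown,
        public readonly ?int $width,
        public readonly ?int $height,
        public readonly array $imageIds,
    ) {
    }
}

/**
 * Decodes a raw /v1/ocr response body into OcrPage objects.
 *
 * @return list<OcrPage>
 */
function parseOcrPages(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($data) || !isset($data['pages']) || !is_array($data['pages'])) {
        throw new UnexpectedValueException('OCR response has no "pages" array.');
    }

    $pages = [];
    foreach ($data['pages'] as $page) {
        // Collect only the image ids; base64 payloads are ignored in this sketch.
        $imageIds = array_map(
            static fn (array $image): string => (string) ($image['id'] ?? ''),
            $page['images'] ?? [],
        );

        $pages[] = new OcrPage(
            (int) ($page['index'] ?? 0),
            (string) ($page['markdown'] ?? ''),
            isset($page['dimensions']['width']) ? (int) $page['dimensions']['width'] : null,
            isset($page['dimensions']['height']) ? (int) $page['dimensions']['height'] : null,
            array_values($imageIds),
        );
    }

    return $pages;
}
```

Missing `dimensions` map to `null` rather than `0`, so callers can tell an absent value from a real one.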

## Proposed addition

A `MistralOcrPlatform` (or an `OcrModel` within the existing Mistral platform) that:

- Accepts a document URL or binary payload
- POSTs to `https://api.mistral.ai/v1/ocr`
- Returns a typed `OcrResponse` with `pages`, `layout_blocks`, and `image_blocks`
- Integrates with the existing `MISTRAL_API_KEY` environment variable wiring

This would allow applications to use Mistral OCR through the same DI-configured service layer as all other AI tasks, rather than managing a raw HTTP client alongside the agent abstraction.
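To give the discussion something concrete, here is a rough sketch of the service described above. Everything in it is hypothetical: the class name `MistralOcrClient`, the `extractMarkdown()` method, and the `$transport` closure, which stands in for Symfony's HTTP client so the example stays self-contained. In the actual bridge, the API key would come from the existing `MISTRAL_API_KEY` wiring and the return type would be the proposed `OcrResponse` rather than a plain list of strings.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical OCR client; the name and the $transport signature are illustrative only.
 */
final class MistralOcrClient
{
    private const ENDPOINT = 'https://api.mistral.ai/v1/ocr';

    /**
     * @param \Closure(string, array<string, string>, string): string $transport
     *        Receives URL, headers, and JSON body; returns the raw response body.
     */
    public function __construct(
        private readonly string $apiKey,
        private readonly \Closure $transport,
    ) {
    }

    /**
     * @return list<string> Markdown of each page, in response order
     */
    public function extractMarkdown(string $imageUrl): array
    {
        $body = json_encode([
            'model' => 'mistral-ocr-latest',
            'document' => ['type' => 'image_url', 'image_url' => $imageUrl],
        ], JSON_THROW_ON_ERROR);

        // Delegate the POST to the injected transport.
        $raw = ($this->transport)(self::ENDPOINT, [
            'Authorization' => 'Bearer '.$this->apiKey,
            'Content-Type' => 'application/json',
        ], $body);

        $data = json_decode($raw, true, 512, JSON_THROW_ON_ERROR);

        return array_map(
            static fn (array $page): string => (string) ($page['markdown'] ?? ''),
            $data['pages'] ?? [],
        );
    }
}
```

Injecting the transport keeps the class testable without network access: a test can pass a closure that returns a canned JSON body and assert on the URL and headers it received.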

## Context

Mistral positions OCR as a first-class document intelligence product (see [docs.mistral.ai/api/endpoint/ocr](https://docs.mistral.ai/api/endpoint/ocr)). It supports multi-page PDFs, layout detection, bounding boxes, and per-page annotation, which goes well beyond what vision models produce via chat completions. Since `symfony/ai-mistral-platform` already exists, OCR support is a natural addition to the package.

Happy to contribute a PR if the direction is accepted.
