OCR Microservice

A lightweight OCR microservice built with FastAPI and Tesseract (pytesseract).
It accepts base64-encoded images in a JSON body and returns the extracted text along with word-level bounding boxes and confidence scores.

Docker Image

greenygh0st/ocr-micro

Features

GET /health → liveness check (returns Tesseract version)
GET /ready → readiness check (service ready & concurrency info)
POST /ocr → perform OCR on a base64-encoded image
Input validation (payload size, language, OEM/PSM modes)
Resource limits & concurrency guard
Dockerized with HEALTHCHECK
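
The concurrency guard and per-call timeout listed above can be built in several ways. The sketch below shows one common asyncio pattern: a slot counter that rejects extra work instead of queueing it, combined with `asyncio.wait_for`. The names `ConcurrencyGuard` and `Busy` are hypothetical, and this is an illustration of the idea, not necessarily how `app.py` implements it.

```python
import asyncio


class Busy(Exception):
    """Raised when every OCR slot is already taken."""


class ConcurrencyGuard:
    """Admit at most `limit` jobs at once and reject the rest (hypothetical helper)."""

    def __init__(self, limit: int, timeout: float) -> None:
        self._limit = limit      # would mirror MAX_CONCURRENCY
        self._timeout = timeout  # would mirror OCR_TIMEOUT_SEC
        self._active = 0

    @property
    def active(self) -> int:
        """Number of jobs currently holding a slot."""
        return self._active

    async def run(self, job):
        """Run `job` (a zero-argument callable returning a coroutine) in a free slot."""
        # No await sits between the check and the increment, so on a single
        # event loop the two steps cannot interleave with another request.
        if self._active >= self._limit:
            raise Busy("all OCR slots in use")
        self._active += 1
        try:
            # Raises asyncio.TimeoutError if the job exceeds the time budget.
            return await asyncio.wait_for(job(), timeout=self._timeout)
        finally:
            self._active -= 1
```

A handler built on this could translate `Busy` into an HTTP 429 or 503 response. Note that a blocking Tesseract call would still need to run off the event loop (for example in a worker thread), and a timeout only stops waiting for such a thread; it does not stop the thread itself.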

Requirements

Docker (recommended)
Or: Python 3.11+, Tesseract OCR installed locally

Quick Start

Build & Run

# Build
docker build -t ocr-micro .

# Run
docker run --rm -p 5001:5001 ocr-micro

Health

curl http://localhost:5001/health

Expected response:

{"ok": true, "tesseract_version": "5.4.0"}
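
When the container is started from a script, it can help to wait for the health endpoint before sending OCR requests. The following is a minimal sketch using only the standard library. The helper names (`fetch_body`, `is_healthy`, `wait_until_healthy`) and the retry settings are assumptions for illustration; only the URL and the `ok` field come from the example above.

```python
import json
import time
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:5001/health"  # port taken from the docker run example


def fetch_body(url: str, timeout: float = 2.0) -> str:
    """GET a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def is_healthy(body: str) -> bool:
    """True when the body is a JSON object with "ok": true."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("ok") is True


def wait_until_healthy(
    url: str = HEALTH_URL,
    attempts: int = 10,
    delay_sec: float = 0.5,
    fetch: Callable[[str], str] = fetch_body,
) -> bool:
    """Poll the health endpoint until it reports ok or the attempts run out."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetch(url)):
                return True
        except OSError:
            # Connection refused or timed out: the container may still be starting.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_sec)
    return False
```

The `fetch` parameter exists so the polling logic can be exercised without a running container.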

OCR Request

curl -X POST http://localhost:5001/ocr \
  -H "Content-Type: application/json" \
  -d '{"image_b64":"<BASE64_STRING>","lang":"eng","psm":6}'
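
The same request can be sent from Python. This sketch uses only the standard library and the field names from the curl command (`image_b64`, `lang`, `psm`); the function names and the default timeout are assumptions, not part of the service.

```python
import base64
import json
import urllib.request

OCR_URL = "http://localhost:5001/ocr"  # port taken from the docker run example


def build_ocr_payload(image_bytes: bytes, lang: str = "eng", psm: int = 6) -> dict:
    """Build the JSON body using the field names from the curl example."""
    return {
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "lang": lang,
        "psm": psm,
    }


def post_ocr(payload: dict, url: str = OCR_URL, timeout: float = 30.0) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, a call such as `post_ocr(build_ocr_payload(open("scan.png", "rb").read()))` would send a local file, where `scan.png` is a placeholder name.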

Example response:

{
  "text": "Hello World\n",
  "words": [
    {"text": "Hello", "conf": 95.3, "left": 34, "top": 20, "width": 60, "height": 18},
    {"text": "World", "conf": 91.1, "left": 100, "top": 20, "width": 70, "height": 18}
  ],
  "lang": "eng",
  "duration_ms": 124
}
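
A client will often want to drop low-confidence words or turn the `left`/`top`/`width`/`height` fields into corner coordinates. The sketch below works on the `words` list from the example response. It assumes the values are pixel coordinates with the origin at the top-left of the image, which is how Tesseract usually reports them but is not stated here; the helper names are hypothetical.

```python
from typing import Optional

Box = tuple[int, int, int, int]  # (left, top, right, bottom)


def confident_words(words: list[dict], min_conf: float) -> list[dict]:
    """Keep only words whose confidence is at least min_conf."""
    return [w for w in words if w["conf"] >= min_conf]


def to_corners(word: dict) -> Box:
    """Convert left/top/width/height into (left, top, right, bottom)."""
    return (
        word["left"],
        word["top"],
        word["left"] + word["width"],
        word["top"] + word["height"],
    )


def union_box(words: list[dict]) -> Optional[Box]:
    """Smallest box enclosing every word, or None for an empty list."""
    if not words:
        return None
    boxes = [to_corners(w) for w in words]
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


# The "words" list from the example response above.
EXAMPLE_WORDS = [
    {"text": "Hello", "conf": 95.3, "left": 34, "top": 20, "width": 60, "height": 18},
    {"text": "World", "conf": 91.1, "left": 100, "top": 20, "width": 70, "height": 18},
]
```

For the example data, `union_box(EXAMPLE_WORDS)` gives `(34, 20, 170, 38)`, and a threshold of 93 keeps only "Hello".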

Configuration

Environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| `MAX_B64_BYTES` | `10000000` | Max size of base64 payload (bytes) |
| `MAX_IMAGE_PIXELS` | `40000000` | Max image pixels (defense-in-depth) |
| `OCR_TIMEOUT_SEC` | `15` | Max seconds per OCR call |
| `MAX_CONCURRENCY` | `4` | Max simultaneous OCR requests |
| `ALLOWED_LANGS` | `eng` | Allowed languages (`eng+spa+deu` etc.) |
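
To make the settings concrete, here is a sketch of how a service could read these variables and apply the payload and language checks. It is an assumption-laden illustration, not the code in `app.py`: the `Settings` class, `load_settings`, `validate_request`, and the error strings are hypothetical, and it assumes `ALLOWED_LANGS` is a `+`-separated list as the `eng+spa+deu` example suggests.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Defaults copied from the configuration table."""
    max_b64_bytes: int = 10_000_000
    max_image_pixels: int = 40_000_000
    ocr_timeout_sec: float = 15.0
    max_concurrency: int = 4
    allowed_langs: frozenset = frozenset({"eng"})


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, falling back to the defaults."""
    d = Settings()
    # Assumption: languages are separated by "+", e.g. "eng+spa+deu".
    langs = frozenset(p for p in env.get("ALLOWED_LANGS", "").split("+") if p)
    return Settings(
        max_b64_bytes=int(env.get("MAX_B64_BYTES", d.max_b64_bytes)),
        max_image_pixels=int(env.get("MAX_IMAGE_PIXELS", d.max_image_pixels)),
        ocr_timeout_sec=float(env.get("OCR_TIMEOUT_SEC", d.ocr_timeout_sec)),
        max_concurrency=int(env.get("MAX_CONCURRENCY", d.max_concurrency)),
        allowed_langs=langs or d.allowed_langs,
    )


def validate_request(image_b64: str, lang: str, settings: Settings) -> Optional[str]:
    """Return an error message, or None when the request passes both checks."""
    # Base64 text is ASCII, so its character count equals its size in bytes.
    if len(image_b64) > settings.max_b64_bytes:
        return "payload too large"
    requested = [p for p in lang.split("+") if p]
    if not requested or any(p not in settings.allowed_langs for p in requested):
        return "language not allowed"
    return None
```

Checking the base64 length before decoding keeps an oversized payload from being expanded in memory; the pixel limit would apply later, once the image header has been read.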

Development

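Install dependencies and run locally (uvicorn listens on port 8000 by default, not the 5001 used in the Docker examples):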

pip install -r requirements.txt
uvicorn app:app --reload

Security Notes

Runs as a non-root user in Docker
Input validation rejects oversized payloads and images (see `MAX_B64_BYTES` and `MAX_IMAGE_PIXELS`)
Use behind a firewall or service mesh (no built-in auth)

License

MIT
