This project provides a FastAPI server for generating text embeddings using the Jina Embeddings V3 model.
I just had to create this to offload the embedding generation from my notebook (fans go brrr) to a server with a GPU and couldn't find a simple solution that met my needs. This server is designed to be easy to deploy and use with no extra frills.
Features:

- HTTP API for generating text embeddings
- Support for task-specific embeddings (retrieval, classification, etc.)
- GPU acceleration support via CUDA
- Detailed performance timing metrics
- Interactive demo page
Out of scope:
- No authentication or rate limiting. If you need these features, put a reverse proxy such as Nginx or Traefik in front.
- No caching - embeddings are always regenerated. If that matters for your workload, a simple client-side workaround is sketched below.
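Since the server itself never caches, one low-effort option is to memoize on the client. A minimal sketch, assuming the server from the Docker examples is reachable on `http://localhost` and the `requests` package is installed (the `embed_cached` helper is illustrative, not part of this project):

```python
from functools import lru_cache

import requests

@lru_cache(maxsize=4096)
def embed_cached(text: str, task: str | None = None) -> tuple[float, ...]:
    """Fetch an embedding once per (text, task) pair and reuse it afterwards."""
    payload = {"text": text}
    if task is not None:
        payload["task"] = task
    # Assumed base URL; adjust if you run the server on another host/port.
    resp = requests.post("http://localhost/v1/embed", json=payload, timeout=60)
    resp.raise_for_status()
    # Return a tuple so the result is hashable and safe to cache.
    return tuple(resp.json()["embedding"])
```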
Requirements:

- Python 3.12+
- uv package manager (recommended)
- NVIDIA drivers for GPU support (optional)
Installation:

```bash
# For CPU-only
uv sync --extra cpu

# For NVIDIA GPU support (CUDA 12.6+)
uv sync --extra cuda
```
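After installing the `cuda` extra, you can check that PyTorch (which the Jina model runs on) actually sees your GPU. This is a generic PyTorch check, not a command provided by this project:

```bash
uv run python -c "import torch; print(torch.cuda.is_available())"
```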
Running the server:

```bash
uvicorn jina_server.serve:app --host 0.0.0.0 --port 8000
```

Visit http://localhost:8000 to access the demo web UI.
Docker:

```bash
# CPU image
docker build -t jina-server .
docker run -p 80:80 jina-server

# NVIDIA GPU image
docker build -f Dockerfile.nvidia -t jina-server-gpu .
docker run --gpus all -p 80:80 jina-server-gpu

# Or via Docker Compose
docker-compose -f docker-compose.cpu.yml up
docker-compose -f docker-compose.nvidia.yml up
```
Endpoint: `POST /v1/embed`

Request format:

```json
{
  "text": "Your text to embed",
  "task": "retrieval.query"
}
```

The `task` field selects a fine-tuned task-specific embedding and is optional; it may be omitted.
Valid task options:

- `retrieval.query`
- `retrieval.passage`
- `separation`
- `classification`
- `text-matching`
Example request:

```bash
curl -X POST "http://localhost/v1/embed" \
  -H "Content-Type: application/json" \
  -d '{"text": "Sample text for embedding"}'
```
Response example:

```json
{
  "embedding": [0.9876, ...],
  "model_name": "jinaai/jina-embeddings-v3",
  "queue_duration_seconds": 0.002541,
  "embedding_duration_seconds": 0.041536
}
```
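For programmatic use, a minimal Python client sketch might look like this. It assumes the server from the Docker examples is reachable on `http://localhost` and that the `requests` package is installed; the `embed` and `cosine_similarity` helpers are illustrative, not part of this project:

```python
import math

import requests

BASE_URL = "http://localhost"  # adjust if running via uvicorn (e.g. http://localhost:8000)

def embed(text: str, task: str | None = None) -> list[float]:
    """Request an embedding from the /v1/embed endpoint."""
    payload = {"text": text}
    if task is not None:
        payload["task"] = task
    resp = requests.post(f"{BASE_URL}/v1/embed", json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["embedding"]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# The asymmetric retrieval tasks embed queries and passages differently,
# so use retrieval.query for the question and retrieval.passage for the document.
query = embed("What is the capital of France?", task="retrieval.query")
passage = embed("Paris is the capital and largest city of France.", task="retrieval.passage")
print(f"similarity: {cosine_similarity(query, passage):.4f}")
```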