A lightweight, containerized embedding API that runs locally using sentence-transformers. Designed to power semantic search and retrieval-augmented generation (RAG) pipelines without relying on OpenAI or external APIs.
Docker Image: greenygh0st/mini-embed
- Runs locally via Flask + Docker
- Uses the nomic-ai/nomic-embed-text-v1 model (768-dim embeddings)
- Accepts text over HTTP and returns vector embeddings
- CPU-friendly — no GPU required
- Built-in request logging to STDOUT
Requirements:
- Docker
- Python (for local testing without Docker)
docker build -t mini-embed .
docker run -d -p 5000:5000 --name mini-embed mini-embed
curl -X POST http://localhost:5000/embed \
-H "Content-Type: application/json" \
-d '{"text": "How do I reset my password?"}'
Request Body:
{
"text": "Your input text goes here"
}
Response:
{
"embedding": [0.123, -0.456, ...]
}
- Semantic search with PGVector
- Local RAG pipelines
- Embedding indexing for internal documents
- Offline language understanding
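For the semantic search use case, the core operation is ranking stored embeddings by similarity to a query embedding. The sketch below shows that step in plain Python with cosine similarity. It is an in-memory stand-in for illustration only: with PGVector the ranking would normally happen inside the database, and the function names `cosine_similarity` and `rank_documents` are hypothetical, not part of this project.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In practice `query_vec` and the values in `doc_vecs` would be the 768-dim `embedding` arrays returned by `/embed`; the toy vectors used for testing are much shorter but follow the same logic.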
By default, the server listens on:
Host: 0.0.0.0
Port: 5000
Edit embed_server.py to change the model or port.
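Note that 0.0.0.0 is a bind address, so a client on the same machine would typically connect to localhost on the published port. A minimal Python client sketch follows, using only the standard library and assuming the `/embed` request and response shapes documented above. The helper names (`embed_url`, `build_request`, `parse_embedding`, `embed`) are hypothetical and not part of this project.

```python
import json
import urllib.request
from typing import List


def embed_url(host: str = "localhost", port: int = 5000) -> str:
    """Build the /embed URL; pass a different port if you changed it."""
    return f"http://{host}:{port}/embed"


def build_request(text: str, url: str) -> urllib.request.Request:
    """Create the JSON POST request, mirroring the curl example."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embedding(body: bytes) -> List[float]:
    """Extract the vector from a {"embedding": [...]} response body."""
    data = json.loads(body)
    return [float(x) for x in data["embedding"]]


def embed(text: str, host: str = "localhost", port: int = 5000,
          timeout: float = 30.0) -> List[float]:
    """Send text to a running mini-embed server and return its embedding."""
    request = build_request(text, embed_url(host, port))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embedding(response.read())
```

With the container running, `embed("How do I reset my password?")` should return a list of floats. If the port in embed_server.py or the `-p` mapping is changed, pass the matching `port` value.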
pip install flask sentence-transformers
python embed_server.py
MIT or similar. Use freely, modify as needed.
- sentence-transformers
- OpenAI Embeddings — but free and local