Wyoming protocol server for faster-whisper speech-to-text, running on an NVIDIA GPU via CUDA.
Forked from rhasspy/wyoming-faster-whisper. For CPU or Home Assistant add-on use, see the upstream repo.
Requirements:

- NVIDIA GPU with a CUDA 12.6-compatible driver
- `nvidia-container-toolkit` installed on the host
Edit `docker-compose.yml` to set your data volume path and preferred model, then run:

```sh
docker compose up
```
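The repository's compose file is not reproduced here, but GPU access in Docker Compose is typically granted via a `deploy.resources.reservations.devices` block. A sketch of that shape (service name, host path, and model choice are placeholders, not taken from the repo):

```yaml
services:
  wyoming-faster-whisper:
    image: wyoming-faster-whisper:cuda
    ports:
      - "10300:10300"
    volumes:
      # Host path is a placeholder; models persist here across restarts.
      - /path/to/local/data:/data
    command: --device cuda --model tiny-int8 --language en
    deploy:
      resources:
        reservations:
          devices:
            # Requires nvidia-container-toolkit on the host.
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```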
```sh
docker build -t wyoming-faster-whisper:cuda .
```

```sh
docker run -it --gpus all \
  -p 10300:10300 \
  -v /path/to/local/data:/data \
  wyoming-faster-whisper:cuda \
  --device cuda \
  --model tiny-int8 \
  --language en
```

Key options passed after the image name:
| Option | Description |
|---|---|
| `--device cuda` | Use GPU for inference (required) |
| `--model` | Model name or HuggingFace ID (e.g. `tiny-int8`, `small-int8`, `Systran/faster-distil-whisper-small.en`) |
| `--language` | Language code (e.g. `en`), or omit for auto-detect |
| `--beam-size` | Beam size for decoding (default: 1) |
| `--data-dir` | Directory where models are stored (default: `/data`) |
Models are downloaded to `/data` on first run, so mount a persistent volume there.
The Wyoming server listens on TCP port 10300.
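Clients talk to port 10300 using the Wyoming protocol, which frames each event as a single JSON header line optionally followed by a binary payload (e.g. raw PCM audio). The sketch below illustrates that framing shape only; the exact field names and event types are assumptions here, so consult the `wyoming` Python package for the canonical format:

```python
import json

def encode_event(event_type: str, data: dict, payload: bytes = b"") -> bytes:
    """Frame a Wyoming-style event: one JSON header line, then an optional payload.

    Field names ("type", "data", "payload_length") follow the general shape of
    the protocol but are assumptions for illustration.
    """
    header = {"type": event_type, "data": data, "payload_length": len(payload)}
    return json.dumps(header).encode("utf-8") + b"\n" + payload

def decode_event(raw: bytes) -> tuple[dict, bytes]:
    """Split a framed event back into its header dict and payload bytes."""
    line, _, rest = raw.partition(b"\n")
    header = json.loads(line)
    return header, rest[: header.get("payload_length", 0)]

# Example: a transcribe request followed by one 16 kHz mono 16-bit PCM chunk.
request = encode_event("transcribe", {"language": "en"})
chunk = encode_event(
    "audio-chunk",
    {"rate": 16000, "width": 2, "channels": 1},
    payload=b"\x00\x00" * 160,  # 10 ms of silence at 16 kHz, 16-bit mono
)
```

In practice you would write these frames to a TCP socket connected to the server and read transcript events back; the `wyoming` package provides ready-made client classes for this.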