ML-powered manga translator, written in Rust.
Koharu introduces a new workflow for manga translation, using ML to automate the process. It combines object detection, OCR, inpainting, and LLMs into a seamless translation experience.
Under the hood, Koharu uses candle and llama.cpp for high-performance inference and Tauri for the GUI. All components are written in Rust, ensuring safety and speed.
> [!NOTE]
> Koharu runs its vision models and local LLMs on your machine by default. If you choose a remote LLM provider, Koharu sends only the text being translated, and only to the provider you configured. Koharu itself does not collect user data.
> [!NOTE]
> For help and support, join the Discord server.
- Automatic detection of text regions, speech bubbles, and cleanup masks
- OCR for manga dialogue, captions, and other page text
- Inpainting to remove source lettering from the page
- Translation with local or remote LLM backends
- Vertical CJK layout and text rendering, with automatically contrasting black/white outlines by default
- Layered PSD export with editable text
- Local HTTP API and MCP server for automation
If you just want to get started, see Install Koharu and Translate Your First Page.
- Ctrl + Mouse Wheel: Zoom in/out
- Ctrl + Drag: Pan the canvas
- Del: Delete selected text block
Koharu can export the current page either as a rendered image or as a layered Photoshop PSD. PSD export keeps helper layers and writes translated text as editable text layers, which makes manual cleanup much easier when the automatic pass gets you most of the way there.
For export behavior, PSD contents, and file naming, see Export Pages and Manage Projects.
Koharu includes a built-in MCP server for agent workflows. By default it listens on a random local port, but you can pin it with --port.
```sh
# macOS / Linux
koharu --port 9999

# Windows
koharu.exe --port 9999
```

Then point your client at http://localhost:9999/mcp.
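To verify the server is reachable, you can send the standard JSON-RPC `initialize` request from MCP's streamable HTTP transport yourself. A minimal sketch using the `reqwest` and `serde_json` crates (the crate choices are illustrative, not part of Koharu):

```rust
use serde_json::json;

// Minimal MCP reachability probe over streamable HTTP (a sketch, not
// Koharu's code). Requires the `reqwest` (blocking + json) and
// `serde_json` crates.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();
    // Standard JSON-RPC `initialize` request from the MCP specification.
    let body = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": { "name": "probe", "version": "0.1.0" }
        }
    });
    let resp = client
        .post("http://localhost:9999/mcp")
        // Streamable HTTP servers may answer with JSON or an SSE stream.
        .header("Accept", "application/json, text/event-stream")
        .json(&body)
        .send()?;
    println!("status: {}", resp.status());
    println!("{}", resp.text()?);
    Ok(())
}
```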
For local setup and the available tools, see Run GUI, Headless, and MCP Modes, Configure MCP Clients, and MCP Tools Reference.
Koharu can also run without opening the desktop window.
```sh
# macOS / Linux
koharu --port 4000 --headless

# Windows
koharu.exe --port 4000 --headless
```

You can then open the Web UI at http://localhost:4000.
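A quick liveness check from Rust, assuming only that the Web UI is served at the root URL (a sketch using the `reqwest` crate):

```rust
// Check that a headless Koharu instance is up by fetching the Web UI root.
// A sketch; assumes the `reqwest` crate with the "blocking" feature.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let resp = reqwest::blocking::get("http://localhost:4000")?;
    println!("Koharu headless is up: HTTP {}", resp.status());
    Ok(())
}
```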
For runtime modes, ports, and local endpoints, see Run GUI, Headless, and MCP Modes.
Settings > Runtime controls the shared local data path plus HTTP connect timeout, read timeout, and retry count used by downloads and provider requests.
These values are read once at startup, so applying changes saves the config and restarts the app.
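As a rough illustration of how those three settings typically interact in an HTTP client (a sketch with the `reqwest` crate and made-up numbers, not Koharu's actual implementation):

```rust
use std::time::Duration;

// Sketch of how connect timeout, read timeout, and retry count combine.
// Crate choice (reqwest) and the values are illustrative, not Koharu's code.
fn fetch_with_retries(url: &str, retries: u32) -> Result<String, reqwest::Error> {
    let client = reqwest::blocking::Client::builder()
        .connect_timeout(Duration::from_secs(10)) // give up quickly if unreachable
        .timeout(Duration::from_secs(60))         // deadline covering the read
        .build()?;
    let mut last_err = None;
    for _attempt in 0..=retries { // one initial try plus `retries` retries
        match client.get(url).send().and_then(|resp| resp.text()) {
            Ok(body) => return Ok(body),
            Err(err) => last_err = Some(err),
        }
    }
    Err(last_err.expect("loop ran at least once"))
}
```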
Koharu supports CUDA, Metal, and Vulkan. CPU fallback is always available when the accelerated path is unavailable or not worth the setup cost on your system.
On Windows, Koharu ships with CUDA support so it can use NVIDIA GPUs for the full local pipeline.
Koharu bundles CUDA Toolkit 13.1. The required DLLs are extracted to the application data directory on first run.
> [!NOTE]
> Make sure you have current NVIDIA drivers installed. You can update them through the NVIDIA App.
Koharu supports NVIDIA GPUs with compute capability 7.5 or higher.
If you want to confirm GPU support, see CUDA GPU Compute Capability and the cuDNN Support Matrix.
Koharu supports Metal on Apple Silicon Macs. No extra runtime setup is required beyond a normal app install.
Koharu also supports Vulkan on Windows and Linux. This backend is currently used primarily for OCR and local LLM inference.
Detection and inpainting still depend on CUDA or Metal, so Vulkan is useful but not a full replacement for the main accelerated path. AMD and Intel GPUs benefit from it, but the best all-around experience remains NVIDIA on Windows or Apple Silicon on macOS.
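Conceptually, backend selection is a cascade that the `--cpu` flag short-circuits. A minimal sketch using candle's device API (Koharu's real selection logic, including the Vulkan path, may differ):

```rust
use candle_core::Device;

// Conceptual backend cascade: CUDA, then Metal, then CPU. A sketch only;
// not Koharu's actual code.
fn pick_device(force_cpu: bool) -> Device {
    if force_cpu {
        return Device::Cpu; // what `--cpu` effectively requests
    }
    if let Ok(cuda) = Device::new_cuda(0) {
        return cuda; // NVIDIA path on Windows/Linux
    }
    if let Ok(metal) = Device::new_metal(0) {
        return metal; // Apple Silicon path on macOS
    }
    Device::Cpu // always-available fallback
}
```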
You can always force Koharu to use CPU for inference:
```sh
# macOS / Linux
koharu --cpu

# Windows
koharu.exe --cpu
```

For backend selection, fallback behavior, and model runtime support, see Acceleration and Runtime.
Koharu uses a staged stack of vision and language models instead of trying to solve the entire page with a single network.
Koharu uses multiple pretrained models, each tuned for a specific part of the page pipeline:
- comic-text-bubble-detector for joint text block and speech bubble detection
- comic-text-detector for text segmentation masks
- PaddleOCR-VL-1.5 for OCR text recognition
- aot-inpainting for default inpainting
- YuzuMarker.FontDetection for font and color detection
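These stages chain into a single pass over the page. The sketch below shows the rough shape of that flow; every name and type in it is hypothetical, not Koharu's actual API:

```rust
// Illustrative shape of the staged page pipeline. All names and types
// here are hypothetical stand-ins, not Koharu's internals.
type Image = Vec<u8>;                     // stand-in for a decoded page bitmap
type Regions = Vec<(u32, u32, u32, u32)>; // stand-in for detected boxes

fn detect(_page: &Image) -> Regions { Vec::new() }
fn segment(_page: &Image, _regions: &Regions) -> Image { Vec::new() }
fn ocr(_page: &Image, _regions: &Regions) -> Vec<String> { Vec::new() }
fn inpaint(page: &Image, _mask: &Image) -> Image { page.clone() }
fn translate(lines: &[String]) -> Vec<String> { lines.to_vec() }
fn render(clean: Image, _regions: &Regions, _text: &[String]) -> Image { clean }

fn translate_page(page: &Image) -> Image {
    let regions = detect(page);         // comic-text-bubble-detector
    let mask = segment(page, &regions); // comic-text-detector
    let source = ocr(page, &regions);   // PaddleOCR-VL-1.5
    let clean = inpaint(page, &mask);   // aot-inpainting
    let target = translate(&source);    // local or remote LLM
    render(clean, &regions, &target)    // typeset the translation
}
```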
Optional built-in alternatives available in Settings > Engines include:
- PP-DocLayoutV3 as an alternative detector and layout-analysis engine
- speech-bubble-segmentation as a dedicated speech bubble detector
- Manga OCR and MIT 48px OCR as alternative OCR engines
- lama-manga as an alternative inpainter
Koharu downloads the required models automatically on first use.
Some models are consumed directly from their upstream Hugging Face repos; where Koharu needs a converted bundle, Rust-friendly safetensors conversions are hosted on Hugging Face instead.
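The download mechanism is the standard Hugging Face Hub flow. A minimal sketch with the `hf-hub` crate, using placeholder repo and file names rather than Koharu's real model IDs:

```rust
use hf_hub::api::sync::Api;

// Download (and cache) a model file from the Hugging Face Hub, the same
// general mechanism Koharu relies on. Repo and file names are placeholders.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api = Api::new()?;
    let repo = api.model("some-org/some-detector".to_string());
    // Cached under the local HF cache directory; returns the on-disk path.
    let weights = repo.get("model.safetensors")?;
    println!("model cached at {}", weights.display());
    Ok(())
}
```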
For a closer look at the pipeline, see Models and Providers and the Technical Deep Dive.
Koharu supports both local and remote LLM backends. When possible, it also tries to preselect sensible defaults based on your system locale.
Koharu supports quantized GGUF models through llama.cpp. These models run on your machine and are downloaded on demand when you select them in Settings.
If you want general-purpose local models first, the built-in picker includes:
- Gemma 4 instruct: gemma4-e2b-it, gemma4-e4b-it, gemma4-26b-a4b-it, gemma4-31b-it
- Qwen 3.5: qwen3.5-0.8b, qwen3.5-2b, qwen3.5-4b, qwen3.5-9b, qwen3.5-27b, qwen3.5-35b-a3b
If you want uncensored / NSFW-capable local models, the built-in picker also includes:
- Gemma 4 uncensored: gemma4-e2b-uncensored, gemma4-e4b-uncensored
- Qwen 3.5 uncensored: qwen3.5-2b-uncensored, qwen3.5-4b-uncensored, qwen3.5-9b-uncensored, qwen3.5-27b-uncensored, qwen3.5-35b-a3b-uncensored
If you want fine-tuned translation models, built-in options include:
For translating to English:
- vntl-llama3-8b-v2: around 8.5 GB in Q8_0, best when translation quality matters more than speed or memory use
- lfm2.5-1.2b-instruct: a smaller multilingual instruct model that is easier to run on CPUs or low-memory GPUs
- sugoi-14b-ultra and sugoi-32b-ultra: larger translation-oriented options when you have more VRAM or RAM available
For translating to Chinese:
- sakura-galtransl-7b-v3.7: around 6.3 GB, a good balance of quality and speed on 8 GB GPUs
- sakura-1.5b-qwen2.5-v1.0: lighter and faster, useful on mid-range GPUs or CPU-only setups
For broader language coverage:
- hunyuan-mt-7b: around 6.3 GB, with broad multilingual translation coverage
LLMs are downloaded on demand when you pick a model in Settings. If you are constrained by memory, start with a smaller model. If you have the VRAM or RAM budget, the 7B and 8B models generally produce better translations.
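The quoted sizes follow from parameter count times quantization width. For example, Q8_0 stores a block of 32 8-bit weights plus one f16 scale, roughly 1.06 bytes per weight, which is where the ~8.5 GB figure for an 8B model comes from:

```rust
// Back-of-the-envelope GGUF size: parameter count × bytes per weight.
// Q8_0 packs 32 8-bit weights plus one f16 scale per block:
// 34 bytes / 32 weights ≈ 1.0625 bytes per weight.
fn gguf_size_gb(params: f64, bytes_per_weight: f64) -> f64 {
    params * bytes_per_weight / 1e9
}

fn main() {
    // An 8B model at Q8_0 lands near the ~8.5 GB quoted above.
    println!("8B @ Q8_0 ≈ {:.1} GB", gguf_size_gb(8.0e9, 34.0 / 32.0));
    // Halving the bit width (Q4_0: 18 bytes per 32 weights) roughly halves it.
    println!("8B @ Q4_0 ≈ {:.1} GB", gguf_size_gb(8.0e9, 18.0 / 32.0));
}
```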
Koharu can also translate through remote or self-hosted API providers instead of a downloaded local model. Supported remote providers:
- OpenAI
- Gemini
- Claude
- DeepSeek
- OpenAI Compatible, including LM Studio, OpenRouter, or any endpoint that exposes the OpenAI-style `/v1/models` and `/v1/chat/completions` APIs
Current built-in remote model defaults:
- OpenAI: `gpt-5-mini` (GPT-5 mini)
- Gemini: `gemini-3.1-flash-lite-preview` (Gemini 3.1 Flash-Lite Preview)
- Claude: `claude-haiku-4-5` (Claude Haiku 4.5)
- DeepSeek: `deepseek-chat` (DeepSeek-V3.2-Chat)
- OpenAI Compatible: models are discovered from the configured endpoint
Remote providers are configured in Settings > API Keys. OpenAI-compatible providers also need a custom base URL. API keys are optional for local servers such as LM Studio, but are usually required for hosted services such as OpenRouter.
Use a remote provider if you do not want to download local models, if you want to reduce local VRAM or RAM use, or if you already have a hosted model endpoint. Keep in mind that the OCR text selected for translation is sent to the provider you configured.
For LM Studio, OpenRouter, and other OpenAI-style endpoints, see Use OpenAI-Compatible APIs. For provider configuration, see Settings Reference.
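Before pointing Koharu at an OpenAI-compatible endpoint, it can be worth probing the two APIs above directly. A sketch using `reqwest` and `serde_json`; the base URL, key, and model name are placeholders (LM Studio listens on port 1234 by default):

```rust
use serde_json::json;

// Probe an OpenAI-compatible endpoint (a sketch; base URL, key, and model
// name are placeholders, not Koharu defaults).
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let base = "http://localhost:1234";
    let client = reqwest::blocking::Client::new();

    // 1. Model discovery, as done for OpenAI Compatible providers.
    let models = client.get(format!("{base}/v1/models")).send()?.text()?;
    println!("models: {models}");

    // 2. A one-shot chat completion.
    let body = json!({
        "model": "some-local-model",
        "messages": [{ "role": "user", "content": "Translate: こんにちは" }]
    });
    let reply = client
        .post(format!("{base}/v1/chat/completions"))
        // Hosted services usually require a key; local servers often don't.
        .header("Authorization", "Bearer sk-example")
        .json(&body)
        .send()?
        .text()?;
    println!("{reply}");
    Ok(())
}
```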
You can download the latest release of Koharu from the releases page.
We provide prebuilt binaries for Windows, macOS, and Linux. For the standard install flow, see Install Koharu. If something goes wrong, see Troubleshooting.
To build Koharu from source, follow the steps below.
```sh
bun install
bun run build
```

If you want more direct control over the Tauri build:

```sh
bun tauri build --release --no-bundle
```

The built binaries are written to `target/release`.
For platform-specific build notes, see Build From Source. For the local development workflow, see Contributing.
If Koharu is useful in your workflow, consider sponsoring the project.
Koharu is licensed under the GNU General Public License v3.0.
