
Koharu

Documentation

ML-powered manga translator, written in Rust.

Koharu introduces a new workflow for manga translation, using ML to automate the process. It combines object detection, OCR, inpainting, and LLMs to create a seamless translation experience.

Under the hood, Koharu uses candle and llama.cpp for high-performance inference, and Tauri for the GUI. All components are written in Rust, ensuring safety and speed.

Note

Koharu runs its vision models and LLMs locally on your machine by default. If you choose a remote LLM provider, Koharu sends only the translation text to the provider you configured. Koharu itself does not collect user data.


screenshot

Note

For help and support, join the Discord server.

Features

  • Automatic detection of text regions, speech bubbles, and cleanup masks
  • OCR for manga dialogue, captions, and other page text
  • Inpainting to remove source lettering from the page
  • Translation with local or remote LLM backends
  • Vertical CJK text layout and rendering, with automatically contrasting black/white outlines by default
  • Layered PSD export with editable text
  • Local HTTP API and MCP server for automation

If you just want to get started, see Install Koharu and Translate Your First Page.

Usage

Hot keys

  • Ctrl + Mouse Wheel: Zoom in/out
  • Ctrl + Drag: Pan the canvas
  • Del: Delete selected text block

Export

Koharu can export the current page either as a rendered image or as a layered Photoshop PSD. PSD export keeps helper layers and writes translated text as editable text layers, which makes manual cleanup much easier when the automatic pass gets you most of the way there.

For export behavior, PSD contents, and file naming, see Export Pages and Manage Projects.

MCP Server

Koharu includes a built-in MCP server for agent workflows. By default it listens on a random local port, but you can pin it with --port.

# macOS / Linux
koharu --port 9999
# Windows
koharu.exe --port 9999

Then point your client at http://localhost:9999/mcp.
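Many MCP clients take a small JSON entry pointing at that URL. A hypothetical example — the `koharu` key name and the config file's location depend on your client, so treat this only as the general shape (see Configure MCP Clients for real setups):

```json
{
  "mcpServers": {
    "koharu": {
      "url": "http://localhost:9999/mcp"
    }
  }
}
```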

For local setup and the available tools, see Run GUI, Headless, and MCP Modes, Configure MCP Clients, and MCP Tools Reference.

Headless Mode

Koharu can also run without opening the desktop window.

# macOS / Linux
koharu --port 4000 --headless
# Windows
koharu.exe --port 4000 --headless

You can then open the Web UI at http://localhost:4000.
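Headless mode pairs naturally with scripts. A minimal sketch — not part of Koharu, and the URL is whatever port you chose above — that waits for the server to come up before driving it:

```python
import time
import urllib.error
import urllib.request

# Poll a headless Koharu instance (started with `koharu --port 4000
# --headless`) until its Web UI answers, or give up after `attempts`
# tries. Generic helper, not something Koharu ships.
def wait_for(url: str, attempts: int = 10) -> bool:
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=1):
                return True
        except (urllib.error.URLError, OSError):
            time.sleep(1)
    return False

# wait_for("http://localhost:4000", attempts=30)
```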

For runtime modes, ports, and local endpoints, see Run GUI, Headless, and MCP Modes.

Runtime settings

Settings > Runtime controls the shared local data path plus HTTP connect timeout, read timeout, and retry count used by downloads and provider requests.

Those values are loaded at startup, so applying changes saves the config and restarts the app.
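As a rough illustration of how these three settings typically interact — a generic sketch, not Koharu's actual code — the connect and read timeouts bound each individual attempt, and the retry count bounds how many failed attempts are retried:

```python
import time

# Generic retry policy: `fetch` is any callable whose connect/read
# timeouts are enforced internally; `retries` is how many extra
# attempts are allowed after the first failure.
def fetch_with_retries(fetch, retries=3, backoff=1.0):
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch()
        except TimeoutError as e:  # connect or read timed out
            last_error = e
            if attempt < retries:
                time.sleep(backoff * (attempt + 1))  # linear backoff
    raise last_error

# Example: a flaky "download" that succeeds on the third attempt.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("read timed out")
    return b"model bytes"

print(fetch_with_retries(flaky, retries=3, backoff=0).decode())  # prints "model bytes"
```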

GPU acceleration

Koharu supports CUDA, Metal, and Vulkan. CPU fallback is always available when the accelerated path is unavailable or not worth the setup cost on your system.

CUDA (NVIDIA GPUs on Windows)

On Windows, Koharu ships with CUDA support so it can use NVIDIA GPUs for the full local pipeline.

Koharu bundles CUDA Toolkit 13.1. The required DLLs are extracted to the application data directory on first run.

Note

Make sure you have current NVIDIA drivers installed; you can update them through the NVIDIA App.

Supported NVIDIA GPUs

Koharu supports NVIDIA GPUs with compute capability 7.5 or higher.

If you want to confirm GPU support, see CUDA GPU Compute Capability and the cuDNN Support Matrix.

Metal (Apple Silicon on macOS)

Koharu supports Metal on Apple Silicon Macs. No extra runtime setup is required beyond a normal app install.

Vulkan (Windows and Linux)

Koharu also supports Vulkan on Windows and Linux. This backend is currently used primarily for OCR and local LLM inference.

Detection and inpainting still depend on CUDA or Metal, so Vulkan complements rather than replaces the main accelerated path. AMD and Intel GPUs can benefit from it, but the best all-around experience remains NVIDIA on Windows or Apple Silicon on macOS.

CPU fallback

You can always force Koharu to use CPU for inference:

# macOS / Linux
koharu --cpu
# Windows
koharu.exe --cpu

For backend selection, fallback behavior, and model runtime support, see Acceleration and Runtime.

ML Models

Koharu uses a staged stack of vision and language models instead of trying to solve the entire page with a single network.

Computer Vision Models

Koharu uses multiple pretrained models, each tuned for a specific part of the page pipeline:

Optional built-in alternatives available in Settings > Engines include:

Koharu downloads the required models automatically on first use.

Some models are consumed directly from their upstream Hugging Face repos; where Koharu needs a converted bundle, it uses Rust-friendly safetensors conversions that are also hosted on Hugging Face.

For a closer look at the pipeline, see Models and Providers and the Technical Deep Dive.

Large Language Models

Koharu supports both local and remote LLM backends. When possible, it also tries to preselect sensible defaults based on your system locale.

Local LLMs

Koharu supports quantized GGUF models through llama.cpp. These models run on your machine and are downloaded on demand when you select them in Settings.

If you want general-purpose local models first, the built-in picker includes:

If you want uncensored / NSFW-capable local models, the built-in picker also includes:

If you want fine-tuned translation models, built-in options include:

For translating to English:

For translating to Chinese:

For broader language coverage:

  • hunyuan-mt-7b: around 6.3 GB, with broad multilingual translation coverage

LLMs are downloaded on demand when you pick a model in Settings. If you are constrained by memory, start with a smaller model. If you have the VRAM or RAM budget, the 7B and 8B models generally produce better translations.

Remote LLMs

Koharu can also translate through remote or self-hosted API providers instead of a downloaded local model. Supported remote providers:

  • OpenAI
  • Gemini
  • Claude
  • DeepSeek
  • OpenAI Compatible, including LM Studio, OpenRouter, or any endpoint that exposes the OpenAI-style /v1/models and /v1/chat/completions APIs

Current built-in remote model defaults:

  • OpenAI: gpt-5-mini (GPT-5 mini)
  • Gemini: gemini-3.1-flash-lite-preview (Gemini 3.1 Flash-Lite Preview)
  • Claude: claude-haiku-4-5 (Claude Haiku 4.5)
  • DeepSeek: deepseek-chat (DeepSeek-V3.2-Chat)
  • OpenAI Compatible: models are discovered from the configured endpoint

Remote providers are configured in Settings > API Keys. OpenAI-compatible providers also need a custom base URL. API keys are optional for local servers such as LM Studio, but are usually required for hosted services such as OpenRouter.
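As a sketch of what "OpenAI-style" means here: any server that answers the two routes named above will work. This is a generic illustration, not Koharu's code — the base URL and model id are placeholders for whatever your endpoint exposes:

```python
import json
from urllib.request import Request

# Shape of the call an "OpenAI Compatible" provider must accept.
# base_url and the model id are placeholders for your endpoint;
# hosted services (e.g. OpenRouter) additionally require an
# Authorization header ("Bearer <api-key>").
base_url = "http://localhost:1234/v1"  # your endpoint's base URL

def chat_completion_request(model: str, text: str,
                            target_lang: str = "English") -> Request:
    body = {
        "model": model,
        "messages": [
            {"role": "system",
             "content": f"Translate this manga dialogue into {target_lang}."},
            {"role": "user", "content": text},
        ],
    }
    return Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Available models would be discovered from GET {base_url}/models
# in the same style.
req = chat_completion_request("your-model-id", "こんにちは")
```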

Use a remote provider if you do not want to download local models, if you want to reduce local VRAM or RAM use, or if you already have a hosted model endpoint. Keep in mind that the OCR text selected for translation is sent to the provider you configured.

For LM Studio, OpenRouter, and other OpenAI-style endpoints, see Use OpenAI-Compatible APIs. For provider configuration, see Settings Reference.

Installation

You can download the latest release of Koharu from the releases page.

We provide prebuilt binaries for Windows, macOS, and Linux. For the standard install flow, see Install Koharu. If something goes wrong, see Troubleshooting.

Development

To build Koharu from source, follow the steps below.

Prerequisites

  • Rust 1.92 or later
  • Bun 1.0 or later

Install dependencies

bun install

Build

bun run build

If you want more direct control over the Tauri build:

bun tauri build --release --no-bundle

The built binaries are written to target/release.

For platform-specific build notes, see Build From Source. For the local development workflow, see Contributing.

Sponsorship

If Koharu is useful in your workflow, consider sponsoring the project.

Contributors

License

Koharu is licensed under the GNU General Public License v3.0.
