
mayocream / koharu

ML-powered manga translator, written in Rust.

View on GitHub
944 stars
48 forks
46 issues
Rust · TypeScript · Python

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing mayocream/koharu in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are only loaded when you start an analysis.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/mayocream/koharu)

Repository Overview (README excerpt)


Koharu (日本語 | 简体中文)

ML-powered manga translator, written in **Rust**. Koharu introduces a new workflow for manga translation, using ML to automate the process. It combines object detection, OCR, inpainting, and LLMs to create a seamless translation experience.

Under the hood, Koharu uses candle for high-performance inference and Tauri for the GUI. All components are written in Rust, ensuring safety and speed.

> [!NOTE]
> Koharu runs its vision models and local LLMs **locally** on your machine by default. If you choose a remote LLM provider, Koharu sends translation text only to the provider you configured. Koharu itself does not collect user data.

> [!NOTE]
> For help and support, please join our Discord server.

Features

- Automatic speech bubble detection and segmentation
- OCR for manga text recognition
- Inpainting to remove original text from images
- LLM-powered translation
- Vertical text layout for CJK languages
- Export to layered PSD with editable text
- MCP server for AI agents

Usage

Hot keys

- Ctrl + Mouse Wheel: zoom in/out
- Ctrl + Drag: pan the canvas
- Del: delete the selected text block

Export

Koharu can export the current page as a rendered image or as a layered Photoshop PSD. PSD export preserves helper layers and writes translated text as editable text layers for further cleanup in Photoshop.

MCP Server

Koharu has a built-in MCP server that can be used to integrate with AI agents. By default, the MCP server listens on a random port, but you can specify the port with a command-line flag. Enter the server's address in your AI agent's MCP server URL field.

Headless Mode

Koharu can run in headless mode from the command line, after which the Koharu Web UI is reachable in your browser.

File association

On Windows, Koharu automatically associates supported files, so you can open them by double-clicking. The files can also be opened as pictures to view thumbnails of the contained images.
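The detection → OCR → inpainting → translation workflow described in the features above can be sketched as a simple staged pipeline. This is an illustrative Python sketch with hypothetical names (`TextBlock`, `translate_page`, and the injected stage functions are not Koharu's actual API):

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """A detected speech bubble with its recognized and translated text."""
    bbox: tuple[int, int, int, int]  # (x, y, width, height) — hypothetical layout
    source_text: str = ""
    translated_text: str = ""


def translate_page(image, detect, ocr, inpaint, translate):
    """Run the detection -> OCR -> inpainting -> translation pipeline.

    The stage functions are injected so any backend can be plugged in;
    none of these names come from Koharu's real code.
    """
    # 1. Detect speech bubbles and wrap each bounding box in a TextBlock.
    blocks = [TextBlock(bbox=b) for b in detect(image)]
    # 2. Recognize the original text inside each bubble.
    for block in blocks:
        block.source_text = ocr(image, block.bbox)
    # 3. Remove the original text from the page via inpainting.
    cleaned = inpaint(image, [b.bbox for b in blocks])
    # 4. Translate each recognized text block with the configured LLM.
    for block in blocks:
        block.translated_text = translate(block.source_text)
    return cleaned, blocks
```

The rendered output (or PSD export) would then lay the translated blocks back onto the cleaned image.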
GPU acceleration

CUDA and Metal are supported for GPU acceleration, significantly improving performance on supported hardware.

CUDA

Koharu is built with CUDA support, allowing it to leverage the power of NVIDIA GPUs for faster processing. Koharu bundles CUDA toolkit 13.1 and cuDNN 9.19; the dynamic libraries are automatically extracted to the application data directory on first run.

> [!NOTE]
> Please ensure that your system has the latest NVIDIA drivers installed. You can download the latest drivers via the NVIDIA App.

Supported NVIDIA GPUs

Koharu supports NVIDIA GPUs with compute capability 7.5 or higher. Make sure your GPU is supported by checking the CUDA GPU Compute Capability list and the cuDNN Support Matrix.

Metal

Koharu supports Metal for GPU acceleration on macOS with Apple Silicon (M1, M2, etc.). This allows Koharu to run efficiently on a wide range of Apple devices.

CPU fallback

You can always force Koharu to use the CPU for inference.

ML Models

Koharu relies on a mix of computer vision and natural language processing models to perform its tasks.

Computer Vision Models

Koharu uses several pre-trained models for different tasks:

- PP-DocLayoutV3 for text detection and layout analysis
- comic-text-detector for text segmentation
- PaddleOCR-VL-1.5 for OCR text recognition
- lama-manga for inpainting
- YuzuMarker.FontDetection for font and color detection

The models are downloaded automatically the first time you run Koharu. We convert the original models to the safetensors format for better performance and compatibility with Rust; the converted models are hosted on Hugging Face.

Large Language Models

Koharu supports both local and remote LLM backends, and preselects a model based on your system locale when possible.

Local LLMs

Koharu supports various quantized LLMs in GGUF format via candle. These models run on your machine and are downloaded on demand when you select them in Settings.
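The acceleration options above boil down to a simple fallback order: prefer CUDA, then Metal, then CPU, with the CPU override always available. A minimal sketch of that decision (the function name and signature are illustrative, not part of Koharu):

```python
def select_device(cuda_available: bool, metal_available: bool,
                  force_cpu: bool = False) -> str:
    """Pick an inference backend, mirroring the fallback order described above.

    `force_cpu` corresponds to the CPU-fallback override; this helper is a
    hypothetical illustration, not Koharu's actual device-selection code.
    """
    if force_cpu:
        return "cpu"
    if cuda_available:   # NVIDIA GPU with compute capability 7.5+
        return "cuda"
    if metal_available:  # Apple Silicon on macOS
        return "metal"
    return "cpu"         # universal fallback
```

Which backend ends up selected also shapes which local LLM is practical, as the memory guidance below suggests.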
Supported models and suggested usage:

For translating to English:

- vntl-llama3-8b-v2: ~8.5 GB Q8_0 weights; suggests >=10 GB VRAM or plenty of system RAM for CPU inference. Best when accuracy matters most.
- lfm2-350m-enjp-mt: ultra-light (≈350M, Q8_0); runs comfortably on CPUs and low-memory GPUs. Ideal for quick previews or low-spec machines, at the cost of quality.

For translating to Chinese:

- sakura-galtransl-7b-v3.7: ~6.3 GB; fits on 8 GB VRAM. Good balance of quality and speed.
- sakura-1.5b-qwen2.5-v1.0: lightweight (≈1.5B, Q5KS); fits on mid-range GPUs (4–6 GB VRAM) or CPU-only setups with moderate RAM. Faster than the 7B/8B models while keeping Qwen-style tokenizer behavior.

For other languages, you may use:

- hunyuan-7b-mt-v1.0: ~6.3 GB; fits on 8 GB VRAM. Decent multi-language translation quality.

LLMs are downloaded automatically on demand when you select a model in Settings. If you are memory-bound, choose the smallest model that meets your quality needs; prefer the 7B/8B variants when you have sufficient VRAM/RAM for better translations.

Remote LLMs

Koharu can also translate through remote or self-hosted API providers instead of a downloaded local model.

Supported remote providers:

- OpenAI
- Gemini
- Claude
- DeepSeek
- OpenAI Compatible, including tools and services such as LM Studio, OpenRouter, or any endpoint that exposes OpenAI-style APIs

Remote providers are configured in **Settings > API Keys**. For OpenAI Compatible, you also set a custom base URL. API keys are optional for local servers like LM Studio, but typically required for hosted services like OpenRouter.

Use remote providers when you want to avoid local model downloads, reduce local VRAM/RAM usage, or connect Koharu to a hosted model. Keep in mind that OCR text selected for translation is sent to the configured provider.

Installation

You can download the latest release of Koharu from the releases page. We provide pre-built binaries for Windows, macOS, and Linux.
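For the OpenAI Compatible provider described above, requests follow the standard OpenAI chat-completions shape, with only the base URL swapped for your server. A hedged sketch of assembling such a request (the endpoint path and JSON fields follow the public OpenAI API convention; the helper itself and its prompt wording are illustrative, not Koharu code):

```python
def build_chat_request(base_url: str, model: str, source_text: str,
                       target_lang: str = "English") -> tuple[str, dict]:
    """Build the URL and JSON body for an OpenAI-style chat completion.

    Works against any OpenAI-compatible server (LM Studio, OpenRouter, a
    self-hosted endpoint, ...). The system prompt here is a placeholder,
    not Koharu's actual translation prompt.
    """
    # Normalize the configured base URL and append the standard endpoint path.
    url = base_url.rstrip("/") + "/v1/chat/completions"
    body = {
        "model": model,
        "messages": [
            {"role": "system",
             "content": f"Translate the following manga text into {target_lang}."},
            {"role": "user", "content": source_text},
        ],
    }
    return url, body
```

An HTTP client would POST `body` to `url`, adding an `Authorization: Bearer <key>` header when the provider requires an API key.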
For other platforms, you may need to build from source; see the Development section below.

Development

To build Kohar…