moyangzhan/mango-finder

Mango Finder

🥭 Awake your data

Download

What is Mango Finder?

Mango Finder (formerly MangoDesk) is a local-first desktop app for searching your local documents with natural language.

It helps you find information based on what you remember, not file names or folder structures.

[Screenshot: search]

📌 Use Cases

Mango Finder is especially useful when you have a large number of local documents and want to retrieve information using natural language.

Typical Use Cases

  • 📁 Personal Document Libraries

    • Years of accumulated notes, PDFs, Word files, Markdown files, etc.
    • Example: “that note about how Rust handles memory ownership”
  • 📂 SVN / Git Repositories

    • Search through design docs, READMEs, technical proposals, and historical solutions
    • Example: “the solution we used for the permission system refactor”
  • 🏢 Team or Company Knowledge Base

    • Internal documents, project docs, meeting notes, onboarding materials
    • Example: "Q4 budget planning and team feedback from last year"
  • 📚 Research and Study Materials

    • Papers, experiment records, literature notes
    • Example: “recent breakthroughs in large language model efficiency”
  • ⚖️ Legal and Financial Documents

    • Contracts, policy documents, reports
    • Example: “clauses regarding data privacy and user consent”

✨ Features

  • 💭 Search by meaning

    • Find documents by describing what you remember, even if you don’t recall exact titles or locations
  • 📝 Exact Keyword Match

    • Instantly locate files using precise terms from file paths or content, ideal for finding specific phrases or technical strings
  • 🔍 Find Similar Files

    • Find visually similar images using perceptual hashing, semantically similar documents, or audio files with matching content
    • One click to discover related files based on visual, semantic, or audio fingerprint similarity
  • 🌍 Multilingual & Cross-language Search

    • Search across 100+ languages seamlessly. Find English documents using Chinese queries, or vice versa, with zero configuration required
  • 🔒 Private by default

    • All data stays on your device, ensuring your privacy
  • 🖥️ Self-Hosted Model Support

    • Integration with Ollama and vLLM for image analysis using local vision models (e.g., LLaVA)
    • Keep your data completely private by running vision models on your own hardware
  • ⚡ Fast and efficient

    • Instant search results with an optimized indexing system
  • 👀 Real-time file & directory watching

    • Automatically detects file and folder changes (add / modify / delete) and keeps the index and search results up to date
  • 📂 Works with your existing local files

    • No need to reorganize folders or rename files; Mango Finder works with what you already have
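
The "Find Similar Files" feature mentions perceptual hashing for images. As a rough illustration of the idea, the sketch below implements an average hash (aHash), one of the simplest perceptual hashes, over an 8x8 grayscale thumbnail. This README does not say which hashing algorithm Mango Finder actually uses, so the algorithm choice and the function names `average_hash` and `hamming_distance` are assumptions made for the example.

```rust
/// Computes a 64-bit average hash from an 8x8 grayscale thumbnail.
/// Bit i is set when pixel i is at least as bright as the mean brightness.
/// (Hypothetical helper; not taken from the Mango Finder codebase.)
fn average_hash(pixels: &[u8; 64]) -> u64 {
    let sum: u32 = pixels.iter().map(|&p| u32::from(p)).sum();
    let mean = sum / 64;
    pixels.iter().enumerate().fold(0u64, |hash, (i, &p)| {
        if u32::from(p) >= mean {
            hash | (1u64 << i)
        } else {
            hash
        }
    })
}

/// Counts the bits that differ between two hashes.
/// A smaller distance means the two images look more alike.
fn hamming_distance(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}
```

Two images whose hashes differ in only a few bits would be treated as visually similar. The hash compares each pixel to the mean, so a uniform brightness shift tends to leave it unchanged.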

🏗️ Architecture

Indexing

[Architecture diagram: indexing]

Supports three processing modes: Local (fully offline), Self-Hosted (Ollama/vLLM), and Cloud (remote AI services).
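
One way to picture the three modes is as an enum whose default is the fully offline mode. The type below is a hypothetical sketch, not Mango Finder's actual configuration type; `ProcessingMode` and `uses_third_party_service` are names invented for the example.

```rust
/// Hypothetical model of the three processing modes named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ProcessingMode {
    /// Fully offline: only the bundled local models are used.
    #[default]
    Local,
    /// Models served by the user's own Ollama or vLLM instance.
    SelfHosted,
    /// Remote AI services, which the user has to enable explicitly.
    Cloud,
}

impl ProcessingMode {
    /// True when file content may be sent to a third-party service.
    fn uses_third_party_service(self) -> bool {
        matches!(self, ProcessingMode::Cloud)
    }
}
```

Making `Local` the `Default` variant mirrors the "private by default" behavior described in this README: data reaches a remote service only when `Cloud` is selected on purpose.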

Search

[Architecture diagram: search]
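
Search by meaning is commonly built on embedding vectors: the query and each document are turned into vectors, and documents are ranked by how closely their vectors point in the same direction as the query's. The sketch below shows that ranking step with cosine similarity. It assumes the embeddings are already computed (for example by a model such as the `embedding.onnx` file listed under the model files). The similarity metric and the function names are assumptions for illustration, not a description of Mango Finder's internals.

```rust
/// Cosine similarity between two vectors.
/// Returns 0.0 when the lengths differ or either vector has zero length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns document indices ordered from most to least similar to the query.
/// (Hypothetical helper; a real index would avoid scanning every document.)
fn rank_by_similarity(query: &[f32], docs: &[Vec<f32>]) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .map(|(i, d)| (i, cosine_similarity(query, d)))
        .collect();
    // Sort in descending order of similarity.
    scored.sort_by(|x, y| y.1.total_cmp(&x.1));
    scored.into_iter().map(|(i, _)| i).collect()
}
```

Because the comparison happens in vector space rather than on words, a query in one language can match a document in another when the embedding model maps both to nearby vectors.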

🛠️ Technology Stack

  • Frontend
    • WebView (Tauri)
    • PNPM
    • Node.js
  • Backend
    • Rust
    • Tauri Core

🚀 Setting Up

1. Frontend

Node

node v20+ required

It is recommended to use nvm to manage multiple node versions.

PNPM

pnpm v9+ required

If you haven't installed pnpm, you can install it with the following command:

npm install pnpm -g

Install dependencies

pnpm i

2. Backend (Rust)

rust v1.94.0+ required

Install tools: https://www.rust-lang.org/tools/install

3. Tauri

Install Tauri Prerequisites: https://tauri.app/start/prerequisites/

4. Whisper.cpp Dependencies

The audio transcription feature uses whisper.cpp. Different operating systems require different dependencies.

Windows

Compiling on Windows requires CMake and LLVM/Clang 18. LLVM 19, 20, and 22 have compatibility issues, so use LLVM 18.

  1. Install CMake 4.3

    Download from cmake-4.3.0

  2. Download and Install LLVM 18

    • Visit LLVM 18.1.8 Release
    • Download LLVM-18.1.8-win64.exe
    • Check "Add LLVM to the system PATH for all users" during installation
  3. Verify installation

    cmake --version
    clang --version

    The clang version should show 18.1.8

  4. Set environment variables (permanent)

    • Press Win + R, type sysdm.cpl, press Enter
    • Click "Advanced" tab → "Environment Variables"
    • Under "User variables", click "New" and add:
    Variable name    Value
    CXXFLAGS         /utf-8
    CFLAGS           /utf-8
    • Click OK and restart your terminal for changes to take effect
  5. Build the project (first time only)

    Open "x64 Native Tools Command Prompt for VS 2022" (search from Start Menu), then build:

    cd your-project-path\src-tauri
    cargo build

    ⚠️ Important Notes:

    • The /utf-8 flag is required to resolve encoding issues
    • If a previous build failed, run cargo clean -p whisper-rs-sys to clear the cache first
    • After whisper has compiled successfully, subsequent builds can use pnpm tauri dev directly in any terminal
    • VSCode's rust-analyzer plugin auto-checks code on startup. Without the MSVC environment, the whisper-rs-sys build fails and shows as red in the target/debug/build directory. If you have already built successfully in "x64 Native Tools Command Prompt for VS 2022", you can ignore this error
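
The version check in step 3 can also be scripted. The sketch below parses the text printed by `clang --version` and accepts only major version 18, matching the requirement stated above. It assumes the output contains a phrase of the form `clang version 18.1.8`; the function names are hypothetical and are not part of the project.

```rust
/// Extracts the major version number from `clang --version` output.
/// Assumes the output contains text like "clang version 18.1.8".
fn clang_major_version(output: &str) -> Option<u32> {
    let rest = output.split("clang version ").nth(1)?;
    rest.split('.').next()?.trim().parse().ok()
}

/// True only for LLVM/Clang 18, the version this README asks for.
fn is_supported_clang(output: &str) -> bool {
    clang_major_version(output) == Some(18)
}
```

A build script could capture the command output with `std::process::Command` and pass it to `is_supported_clang` before starting the whisper-rs-sys build.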

macOS

macOS usually has Clang built-in. If you encounter issues, install Xcode Command Line Tools:

xcode-select --install

Linux

Most Linux distributions require C/C++ build tools:

Ubuntu/Debian:

sudo apt update
sudo apt install build-essential cmake

Fedora/RHEL:

sudo dnf install gcc-c++ make cmake

Arch Linux:

sudo pacman -S base-devel cmake

5. Download Model Files

Download the required model files from one of the following sources:

  1. GitHub Release: model.zip - Contains all required files
  2. Hugging Face: moyangzhan/mango-finder - Manually download the following files:
    • *.onnx model files
    • *_tokenizer.json tokenizer files
    • whisper-small-q8_0.bin

After downloading, extract the files to the src-tauri/assets/model directory.

Required Files:

  • embedding.onnx
  • embedding_tokenizer.json
  • vision.onnx
  • vision_tokenizer.json
  • whisper-small-q8_0.bin
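
A missing model file is easy to overlook after a manual download. The sketch below checks a directory against the list of required files above and reports what is absent. The file names come from that list; the function name `missing_model_files` is hypothetical, and the check only looks at file presence, not file integrity.

```rust
use std::path::Path;

/// File names taken from the "Required Files" list above.
const REQUIRED_MODEL_FILES: [&str; 5] = [
    "embedding.onnx",
    "embedding_tokenizer.json",
    "vision.onnx",
    "vision_tokenizer.json",
    "whisper-small-q8_0.bin",
];

/// Returns the required files that are not present in `model_dir`,
/// in the same order as the list above.
fn missing_model_files(model_dir: &Path) -> Vec<&'static str> {
    REQUIRED_MODEL_FILES
        .iter()
        .copied()
        .filter(|name| !model_dir.join(name).is_file())
        .collect()
}
```

Calling `missing_model_files(Path::new("src-tauri/assets/model"))` from the repository root returns an empty vector when all five files are in place.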

🚀 Getting Started

▶️ Development Run

A Tauri app has at least two processes:

  • the Core Process (backend)
  • the WebView process (frontend)

Both backend and frontend start with a single command:

pnpm tauri dev

📦 Building

pnpm tauri build

After building, the executable file is usually located in src-tauri/target/release/.

Windows: src-tauri/target/release/bundle/msi/Mango Finder_0.1.0_x64_en-US.msi

โ“ FAQ

Q: How does Mango Finder ensure data privacy?

A: Mango Finder follows a local-first architecture to ensure data privacy:

Local Data Processing

  • All document indexing and search operations are performed locally on your device
  • No data is transmitted to external servers during normal operation

Exception Cases

  • Remote models may be used only when processing images or audio files, and only if they are enabled
  • These remote models are disabled by default and must be manually enabled by the user

Data storage

  • All user data remains on the local device by default

Architecture Details

As shown in the architecture diagrams above, the entire processing pipeline is designed to keep data on the local device.

Q: Why are so many models used in the code?

A: The codebase includes multiple models serving different purposes:

1. Active Local Models (Enabled by Default)

  • src-tauri/assets/model/*
  • These models run locally on users' computers for basic document processing
  • Prioritized for privacy and performance

2. Remote Models (Optional)

  • gpt-5-mini and gpt-4o-mini-transcribe
  • Designed for image and audio parsing
  • Disabled by default, can be enabled if needed
  • Note: We plan to replace these with local alternatives when available
  • Kept as optional features for self-hosting scenarios

3. Reserved Models (Future Features)

  • qwen-turbo, deepseek-chat, and deepseek-reasoner
  • Prepared for upcoming features like:
    • Knowledge graph generation
    • Advanced document analysis
  • Also serves as a foundation for developers who want to customize with these models
  • Maintains flexibility for future feature expansion

📝 License

See the LICENSE file for details.

🤝 Contributing

Contributions of all kinds are welcome, including but not limited to:

  • 🐛 Reporting bugs
  • 💡 Suggesting new features or improvements
  • 📖 Improving documentation
  • 🔧 Submitting code (pull requests)

To submit a pull request, follow these steps:

  1. Fork this repository
  2. Create a new branch (git checkout -b feature/xxx)
  3. Ensure pnpm tauri dev runs successfully locally
  4. Commit changes (git commit -m 'feat: xxx')
  5. Push the branch (git push origin feature/xxx)
  6. Submit a Pull Request

⭐ Support the Project

If you find Mango Finder helpful, you can support it by:

  • Starring the repository on GitHub
  • Recommending it to others
  • Sharing your experience