Hi llama.cpp team,
Greetings from the MiniCPM-V team! As presented in our recent Nature Communications paper, "Efficient GPT-4V level multimodal large language model for deployment on edge devices", our mission has always been to empower the open-source community with highly efficient, edge-deployable models.
In this pull request, I'm contributing support for MiniCPM-V 4.0, a multimodal model designed specifically for phone-sized devices.
As part of our effort to make this model broadly accessible, we plan to open-source the following three components:
Adaptation of MiniCPM-V 4.0 to llama.cpp
Apple NPU acceleration integrated into llama.cpp — to take full advantage of Apple's on-device hardware on macOS, iPadOS, and iOS.
A reference app demo built on top of the above adaptations — demonstrating how to deploy and run the multimodal model seamlessly on Apple devices. This app was recently showcased at the WAIC conference.
With these contributions, we hope to enable the community to run fast, efficient multimodal models across Mac/iPad/iPhone devices, and to customize or extend the codebase as needed.
This initial PR includes only the model integration and introduces minimal changes, so I hope it can be reviewed and merged quickly. The NPU acceleration PR will follow shortly; since it may involve more complex discussion around API design and integration, I would really appreciate the llama.cpp community's support and feedback during that process.
Below is a GIF recording of our demo running entirely on an iPhone in airplane mode, showcasing fully on-device deployment in action.
Looking forward to your review and collaboration!
Best regards,
MiniCPM-V team