
Pull requests: ggml-org/llama.cpp


Pull requests list

CANN: implement the SSM_CONV operator
Labels: Ascend NPU, ggml, testing
#17737 opened Dec 3, 2025 by 0Marble
CANN: In the ROPE operator, yarn_ramp uses cache
Labels: Ascend NPU, ggml
#17725 opened Dec 3, 2025 by TianHao324
build: for GGML_BACKEND_DL, ggml need not depend on backend
Labels: ggml
#17709 opened Dec 3, 2025 by jeffbolznv
common: Deepseek V3.2 tool call parser
Labels: testing
#17707 opened Dec 3, 2025 by hksdpc255
CANN: Support fusion operator that supports mul and add
Labels: Ascend NPU, ggml, testing
#17706 opened Dec 3, 2025 by TianHao324 (Draft)
model : add ASR support for LFM2-Audio-1.5B
Labels: examples, ggml, Nvidia GPU, python, testing
#17694 opened Dec 2, 2025 by tdakhran (Draft)
Add Support for Microsoft Phi-3.5 Vision Instruct Models
Labels: Apple Metal, Ascend NPU, build, devops, documentation, examples, ggml, Nvidia GPU, python, script, server, testing, Vulkan
#17687 opened Dec 2, 2025 by z-manoj (Draft)
ggml: added missing cast sections in memcpy
Labels: ggml, vibe-coded
#17651 opened Dec 1, 2025 by GermanAizek
sgemm: reuse loaded vector in AVX dot product calculation
Labels: ggml, vibe-coded
#17648 opened Dec 1, 2025 by GermanAizek
vec: optimize AVX2/FMA sum-of-squares with loop unrolling and FMA
Labels: ggml, vibe-coded
#17642 opened Dec 1, 2025 by GermanAizek
ggml-quants: use _mm256_testz_si256 for mask checks in AVX2
Labels: ggml, vibe-coded
#17641 opened Dec 1, 2025 by GermanAizek
ggml-alloc: optimize free block shifting with memmove
Labels: ggml, vibe-coded
#17640 opened Dec 1, 2025 by GermanAizek
llama-router, the C++ "llama-swap" for llama.cpp
Labels: examples, need feedback, server, testing
#17629 opened Nov 30, 2025 by ServeurpersoCom (Draft)
common : add minimalist multi-thread progress bar
#17602 opened Nov 29, 2025 by angt
Feature/kimi linear support
Labels: ggml, model, Nvidia GPU, python
#17592 opened Nov 29, 2025 by cacaview
Improve Qwen3-Next Speed
Labels: model
#17585 opened Nov 29, 2025 by lovedheart (Draft)
Add PagedAttention support (experimental, CUDA only)
Labels: examples, ggml, model, Nvidia GPU, server
#17579 opened Nov 28, 2025 by ericcurtin (Draft)
mtmd: support dots.ocr
Labels: examples, python
#17575 opened Nov 28, 2025 by ngxson (Draft)
New llama-run
Labels: examples, python, script, server
#17554 opened Nov 27, 2025 by ericcurtin
llama.cpp with sentencepiece
Labels: testing
#17529 opened Nov 26, 2025 by awenzel67
ggml-cpu: BMI2 is only available on amd64
Labels: ggml
#17528 opened Nov 26, 2025 by candrews
CANN: add operator fusion support for ADD+RMS_NORM operations
Labels: Ascend NPU, documentation, ggml, testing
#17512 opened Nov 26, 2025 by noemotiovon
llama: remove init f16 tables
#17511 opened Nov 26, 2025 by tosone