Pull requests: ggml-org/llama.cpp
Pull requests list
CANN: implement the SSM_CONV operator
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#17737
opened Dec 3, 2025 by
0Marble
CANN: In the ROPE operator, yarn_ramp uses cache
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#17725
opened Dec 3, 2025 by
TianHao324
build: for GGML_BACKEND_DL, ggml need not depend on backend
ggml
changes relating to the ggml tensor library for machine learning
#17709
opened Dec 3, 2025 by
jeffbolznv
common: Deepseek V3.2 tool call parser
testing
Everything test related
#17707
opened Dec 3, 2025 by
hksdpc255
CANN: Support fusion operator that supports mul and add
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#17706
opened Dec 3, 2025 by
TianHao324
•
Draft
model : add ASR support for LFM2-Audio-1.5B
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
testing
Everything test related
Add Support for Microsoft Phi-3.5 Vision Instruct Models
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
Ascend NPU
issues specific to Ascend NPUs
build
Compilation issues
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
script
Script related
server
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
ggml: added missing cast sections in memcpy
ggml
changes relating to the ggml tensor library for machine learning
vibe-coded
Created with heavy use of LLM assistants, requires human verification
#17651
opened Dec 1, 2025 by
GermanAizek
sgemm: reuse loaded vector in AVX dot product calculation
ggml
changes relating to the ggml tensor library for machine learning
vibe-coded
Created with heavy use of LLM assistants, requires human verification
#17648
opened Dec 1, 2025 by
GermanAizek
vec: optimize AVX2/FMA sum-of-squares with loop unrolling and FMA
ggml
changes relating to the ggml tensor library for machine learning
vibe-coded
Created with heavy use of LLM assistants, requires human verification
#17642
opened Dec 1, 2025 by
GermanAizek
ggml-quants: use _mm256_testz_si256 for mask checks in AVX2
ggml
changes relating to the ggml tensor library for machine learning
vibe-coded
Created with heavy use of LLM assistants, requires human verification
#17641
opened Dec 1, 2025 by
GermanAizek
ggml-alloc: optimize free block shifting with memmove
ggml
changes relating to the ggml tensor library for machine learning
vibe-coded
Created with heavy use of LLM assistants, requires human verification
#17640
opened Dec 1, 2025 by
GermanAizek
llama-router, the C++ "llama-swap" for llama.cpp
examples
need feedback
Testing and feedback with results are needed
server
testing
Everything test related
#17629
opened Nov 30, 2025 by
ServeurpersoCom
•
Draft
model : Fix marker placement for LFM2-VL in single turn llama-mtmd-cli
examples
#17616
opened Nov 30, 2025 by
tdakhran
Feature/kimi linear support
ggml
changes relating to the ggml tensor library for machine learning
model
Model specific
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
#17592
opened Nov 29, 2025 by
cacaview
Add PagedAttention support (experimental, CUDA only)
examples
ggml
changes relating to the ggml tensor library for machine learning
model
Model specific
Nvidia GPU
Issues specific to Nvidia GPUs
server
#17579
opened Nov 28, 2025 by
ericcurtin
•
Draft
Fix unreadable user markdown colors and truncate long texts in deletion dialogs
examples
server
#17555
opened Nov 27, 2025 by
ServeurpersoCom
New llama-run
examples
python
python script changes
script
Script related
server
#17554
opened Nov 27, 2025 by
ericcurtin
llama.cpp with sentencepiece
testing
Everything test related
#17529
opened Nov 26, 2025 by
awenzel67
ggml-cpu: BMI2 is only available on amd64
ggml
changes relating to the ggml tensor library for machine learning
#17528
opened Nov 26, 2025 by
candrews
CANN: add operator fusion support for ADD+RMS_NORM operations
Ascend NPU
issues specific to Ascend NPUs
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#17512
opened Nov 26, 2025 by
noemotiovon