Tags: fish23/llama.cpp
metal : use autoreleasepool to avoid memory leaks (ggml-org#5437)

There appears to be a known memory leak when using `MTLCommandBuffer`. It is suggested in [1, 2] to use `@autoreleasepool`.

[1] https://developer.apple.com/forums/thread/662721
[2] https://forums.developer.apple.com/forums/thread/120931

This change-set wraps `ggml_metal_graph_compute` in an `@autoreleasepool` block.

This commit addresses ggml-org#5436
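The commit above describes wrapping the Metal graph compute call in an `@autoreleasepool` block so that autoreleased objects such as `MTLCommandBuffer` instances are drained after each call instead of accumulating in a long-running process. A minimal Objective-C sketch of that pattern follows; the function name, queue parameter, and comments are illustrative, not the actual `ggml-metal.m` code:

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical stand-in for ggml_metal_graph_compute: the whole per-graph
// workload runs inside an @autoreleasepool so that autoreleased Metal
// objects are released when the pool is drained at the end of each call.
static void graph_compute_sketch(id<MTLCommandQueue> queue) {
    @autoreleasepool {
        // commandBuffer returns an autoreleased object; without an
        // enclosing pool being drained, these buffers can accumulate
        // across repeated compute calls.
        id<MTLCommandBuffer> cmd_buf = [queue commandBuffer];

        // ... encode compute work into cmd_buf here ...

        [cmd_buf commit];
        [cmd_buf waitUntilCompleted];
    } // pool drained here; per-call Metal objects are released
}
```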
server : fix prompt caching for repeated prompts (ggml-org#5420)
llama : do not cap thread count when MoE on CPU (ggml-org#5419)
* Not capping thread count when MoE inference is running on CPU
* Whitespace

Fix Vulkan crash on APUs with very little device memory (ggml-org#5424)
* Fix Vulkan crash on APUs with very little device memory
* Fix debug output function names

Fix f16_sycl cpy call from Arc (ggml-org#5411)
* fix f16_sycl cpy call
* rm old logic
* add fp16 build CI
* use macro
* format fix