Tags · psugihara/llama.cpp

b2360

llama-bench : add embeddings option (ggml-org#5924)

* llama-bench : add embeddings option

* llama-bench : do not hard code embd default value

---------

Co-authored-by: slaren <[email protected]>

Mar 7, 2024
6cdabe6

b2167

cmake : fix VULKAN and ROCm builds (ggml-org#5525)

* cmake : fix VULKAN and ROCm builds

* cmake : fix (cont)

* vulkan : fix compile warnings

ggml-ci

* cmake : fix

ggml-ci

* cmake : minor

ggml-ci

Feb 16, 2024
5bf2b94

b2042

add --no-mmap in llama-bench (ggml-org#5257)

* add --no-mmap, show sycl backend

* fix conflict

* fix code format, change print for --no-mmap

* ren no_mmap to mmap, show mmap when not default value in printer

* update guide for mmap

* mv position to reduce model reload

Feb 1, 2024
128dcbd

b1963

nix-shell: use addToSearchPath

thx to @SomeoneSerge for the suggestion!

Jan 24, 2024
c9b316c

b1794

common : fix the short form of `--grp-attn-w`, not `-gat` (ggml-org#4825)

See https://github.com/ggerganov/llama.cpp/blob/master/common/common.cpp#L230C53-L230C57

Jan 8, 2024
1fc2f26

b1767

server : send token probs for "stream == false" (ggml-org#4714)

Jan 4, 2024
012cf34

b1761

train : fix typo in overlapping-samples help msg (ggml-org#4758)

This commit fixes a typo in the help message for the
--overlapping-samples option.

Signed-off-by: Daniel Bevenius <[email protected]>

Jan 3, 2024
cb1e281

b1759

cuda : simplify expression

Co-authored-by: slaren <[email protected]>

Jan 3, 2024
7bed7eb

b1709

gpt2 : Add gpt2 architecture integration (ggml-org#4555)

Dec 28, 2023
ea5497d

b1708

llama : add AWQ for llama, llama2, mpt, and mistral models (ggml-org#4593)

* update: awq support llama-7b model

* update: change order

* update: benchmark results for llama2-7b

* update: mistral 7b v1 benchmark

* update: support 4 models

* fix: Readme

* update: ready for PR

* update: readme

* fix: readme

* update: change order import

* black

* format code

* update: work for both mpt and awqmpt

* update: readme

* Rename to llm_build_ffn_mpt_awq

* Formatted other files

* Fixed params count

* fix: remove code

* update: more detail for mpt

* fix: readme

* fix: readme

* update: change folder architecture

* fix: common.cpp

* fix: readme

* fix: remove ggml_repeat

* update: cicd

* update: cicd

* update: remove use_awq arg

* update: readme

* llama : adapt plamo to new ffn

ggml-ci

---------

Co-authored-by: Trần Đức Nam <[email protected]>
Co-authored-by: Le Hoang Anh <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>

Dec 27, 2023
f679349
