Tags: edude03/llama.cpp
common: fix warning (ggml-org#8036)
* common: fix warning
* Update common/common.cpp
Co-authored-by: slaren <[email protected]>
[SYCL] Fix Windows build and inference (ggml-org#8003)
* add SYCL preset
* fix debug link error; fix Windows crash
* update README
CUDA: stream-k decomposition for MMQ (ggml-org#8018)
* fix undefined memory reads for small matrices
metal : fix `ggml_metal_supports_op` for BF16 (ggml-org#8021)
The Metal backend does not currently support BF16, but `ggml_metal_supports_op` was returning true for such ops, leading to a crash with models converted with `--leave-output-tensor`. This commit checks whether any of the first few source types are BF16 and returns false if so.
[SYCL] refactor (ggml-org#6408)
* separate lower-precision GEMM from the main files
* fix hardcoded workgroup size
tokenizer : BPE fixes (ggml-org#7530)
* Random test: add_bos_token, add_eos_token
* Random test: add BPE models for testing
* Custom regex split fails with codepoint 0
* Fix falcon punctuation regex
* Refactor llm_tokenizer_bpe: move code to constructor
* Move 'add_special_bos/eos' logic to llm_tokenizer_bpe
* Move tokenizer flags to vocab structure
* Default values for special_add_bos/eos
* Build vocab.special_tokens_cache using vocab token types
* Generalize 'jina-v2' per-token attributes
* Fix unicode whitespaces (deepseek-coder, deepseek-llm)
* Skip missing byte tokens (falcon)
* Better unicode data generation
* Replace char32_t with uint32_t
Only use FIM middle token if it exists (ggml-org#7648)