Commit 77a7340
authored
ggml : add new Q4_2 quantization (ARM only) (ggml-org#1046)
* ggml : Q4_2 ARM
* ggml : add ggml_is_quantized()
* llama : update llama_type_name() with Q4_2 entry
* ggml : speed-up q4_2
- 4 threads: ~100ms -> ~90ms
- 8 threads: ~55ms -> ~50ms
* ggml : optimize q4_2 using vmlaq_n_f32 + vmulq_n_f321 parent 50a8a2a commit 77a7340
File tree
5 files changed
+287
-11
lines changed- examples/quantize
5 files changed
+287
-11
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
| 17 | + | |
17 | 18 | | |
18 | 19 | | |
19 | 20 | | |
| |||
0 commit comments