
Releases: tc-mb/llama.cpp

b5835

07 Jul 07:55
6491d6e
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (#14485)

Commit taken from remyoudompheng's PR https://github.com/ggml-org/llama.cpp/pull/12260

Co-authored-by: Rémy Oudompheng <[email protected]>

b5819

03 Jul 12:41
7b63a71
Fix conditional enabling following arch checks for ggml-sycl (#14504)

Signed-off-by: nscipione <[email protected]>

b5787

01 Jul 07:54
0a5a3b5
Add Conv2d for CPU (#14388)

* Conv2D: Add CPU version

* Half decent

* Tiled approach for F32

* Remove file

* Fix tests

* Support F16 operations

* Add assert about size

* Review: further formatting fixes, add assert and use CPU version of fp32->fp16

b5780

30 Jun 08:35
caf5681
server : support jinja extra template kwargs (Qwen3 enable_thinking f…

b5607

09 Jun 04:41
056eb74
CANN: Enable labeler for Ascend NPU (#13914)

b5165

22 Apr 08:38
1d735c0
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (#12871)

* ggml : add SSE 4.2 variant for CPUs without AVX

* ggml : add x64 base ABI variant
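The idea behind shipping an SSE 4.2 variant alongside an x64 base variant is runtime dispatch: build a baseline binary and select the SSE 4.2 code path only when the CPU reports support. A minimal sketch of that selection, assuming a GCC/Clang toolchain (`__builtin_cpu_supports` is a real GCC/Clang builtin; the `pick_variant`/`Variant` names are hypothetical, not ggml symbols):

```cpp
#include <cassert>

// Hypothetical variant identifiers for illustration.
enum Variant { VARIANT_X64_BASE = 0, VARIANT_SSE42 = 1 };

// Select the SSE 4.2 code path at runtime when the CPU supports it,
// falling back to the x64 base ABI variant otherwise.
Variant pick_variant() {
#if defined(__x86_64__)
    if (__builtin_cpu_supports("sse4.2")) {
        return VARIANT_SSE42;
    }
#endif
    return VARIANT_X64_BASE;
}
```

The actual ggml backend scoring logic differs, but the shape is the same: probe CPU features once, then route to the best compiled variant.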

b5145

17 Apr 09:04
12b1750
opencl: fix incorrect local_size index in profiling log (#12868)

b5129

14 Apr 08:25
sync : ggml

ggml-ci

b4974

27 Mar 09:28
sync : ggml

ggml-ci

b4909

18 Mar 08:43
fd123cf
Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentat…