
Releases: tc-mb/llama.cpp

b5835

07 Jul 07:55
6491d6e
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (#14485)

Commit taken from remyoudompheng's PR https://github.com/ggml-org/llama.cpp/pull/12260

Co-authored-by: Rémy Oudompheng <[email protected]>

b5819

03 Jul 12:41
7b63a71
Fix conditional enabling following arch checks for ggml-sycl (#14504)

Signed-off-by: nscipione <[email protected]>

b5787

01 Jul 07:54
0a5a3b5
Add Conv2d for CPU (#14388)

* Conv2D: Add CPU version

* Half decent

* Tiled approach for F32

* Remove file

* Fix tests

* Support F16 operations

* Add assert about size

* Review: further formatting fixes, add assert and use CPU version of fp32->fp16

b5780

30 Jun 08:35
caf5681
server : support jinja extra template kwargs (Qwen3 enable_thinking f…

b5607

09 Jun 04:41
056eb74
CANN: Enable labeler for Ascend NPU (#13914)

b5165

22 Apr 08:38
1d735c0
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (#12871)

* ggml : add SSE 4.2 variant for CPUs without AVX

* ggml : add x64 base ABI variant
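The idea behind shipping an SSE 4.2 variant alongside an x64 base variant is runtime dispatch: build a baseline binary and select the SSE 4.2 code path only when the CPU reports support. A minimal sketch of that selection, assuming a GCC/Clang toolchain (`__builtin_cpu_supports` is a real GCC/Clang builtin; the `pick_variant`/`Variant` names are hypothetical, not ggml symbols):

```cpp
#include <cassert>

// Hypothetical variant identifiers for illustration.
enum Variant { VARIANT_X64_BASE = 0, VARIANT_SSE42 = 1 };

// Select the SSE 4.2 code path at runtime when the CPU supports it,
// falling back to the x64 base ABI variant otherwise.
Variant pick_variant() {
#if defined(__x86_64__)
    if (__builtin_cpu_supports("sse4.2")) {
        return VARIANT_SSE42;
    }
#endif
    return VARIANT_X64_BASE;
}
```

The actual ggml backend scoring logic differs, but the shape is the same: probe CPU features once, then route to the best compiled variant.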

b5145

17 Apr 09:04
12b1750
opencl: fix incorrect local_size index in profiling log (#12868)

b5129

14 Apr 08:25
sync : ggml

ggml-ci

b4974

27 Mar 09:28
sync : ggml

ggml-ci

b4909

18 Mar 08:43
fd123cf
Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentat…