Releases · tc-mb/llama.cpp
b5835
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (#14485)
Commit taken from remyoudompheng's PR https://github.com/ggml-org/llama.cpp/pull/12260
Co-authored-by: Rémy Oudompheng <[email protected]>
b5819
Fix conditional enabling following arch checks for ggml-sycl (#14504)
Signed-off-by: nscipione <[email protected]>
b5787
Add Conv2d for CPU (#14388)
* Conv2D: Add CPU version
* Half decent
* Tiled approach for F32
* remove file
* Fix tests
* Support F16 operations
* add assert about size
* Review: further formatting fixes, add assert and use CPU version of fp32->fp16
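For context on what a direct CPU convolution looks like, here is a minimal, self-contained F32 sketch (illustrative only; the ggml kernel added in #14388 is tiled and also supports F16):

```cpp
#include <vector>
#include <cstdio>

// src: h x w, ker: kh x kw, dst: (h-kh+1) x (w-kw+1) ("valid" padding)
static void conv2d_f32(const std::vector<float> &src, int w, int h,
                       const std::vector<float> &ker, int kw, int kh,
                       std::vector<float> &dst) {
    const int ow = w - kw + 1, oh = h - kh + 1;
    dst.assign((size_t) ow * oh, 0.0f);
    for (int oy = 0; oy < oh; ++oy) {
        for (int ox = 0; ox < ow; ++ox) {
            float acc = 0.0f;
            // accumulate the kernel window over the input patch
            for (int ky = 0; ky < kh; ++ky) {
                for (int kx = 0; kx < kw; ++kx) {
                    acc += src[(size_t)(oy + ky) * w + (ox + kx)] *
                           ker[(size_t) ky * kw + kx];
                }
            }
            dst[(size_t) oy * ow + ox] = acc;
        }
    }
}

int main() {
    std::vector<float> src = {1, 2, 3, 4, 5, 6, 7, 8, 9}; // 3x3 input
    std::vector<float> ker = {1, 0, 0, 1};                // 2x2 kernel
    std::vector<float> dst;
    conv2d_f32(src, 3, 3, ker, 2, 2, dst);
    for (float v : dst) std::printf("%g ", v);            // prints: 6 8 12 14
    std::printf("\n");
    return 0;
}
```

A tiled variant, as the commit log describes for F32, would walk the output in cache-sized blocks instead of one element at a time so that input and kernel data stay resident in cache.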
b5780
server : support jinja extra template kwargs (Qwen3 enable_thinking f…
b5607
CANN: Enable labeler for Ascend NPU (#13914)
b5165
ggml : add SSE 4.2 and x64 base variant for CPUs without AVX (#12871)
* ggml : add SSE 4.2 variant for CPUs without AVX
* ggml : add x64 base ABI variant
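As a rough sketch of how a loader can choose among such variants at runtime, the following probes CPUID feature bits and falls back from AVX to SSE 4.2 to the plain x64 baseline. This is an illustration, not llama.cpp's actual backend-selection code, and full AVX detection would additionally check OS XSAVE support via XGETBV:

```cpp
#include <cstdio>
#if defined(_MSC_VER)
#include <intrin.h>
#else
#include <cpuid.h>
#endif

static void cpuid(int leaf, int out[4]) {
#if defined(_MSC_VER)
    __cpuid(out, leaf);
#else
    __cpuid(leaf, out[0], out[1], out[2], out[3]);
#endif
}

int main() {
    int r[4] = {0};
    cpuid(1, r);                           // leaf 1: feature flags
    const bool sse42 = r[2] & (1 << 20);   // ECX bit 20 = SSE4.2
    const bool avx   = r[2] & (1 << 28);   // ECX bit 28 = AVX
    if (avx) {
        std::printf("load AVX variant\n");
    } else if (sse42) {
        std::printf("load SSE 4.2 variant\n");   // variant added in #12871
    } else {
        std::printf("load x64 base variant\n");  // plain x64 ABI fallback
    }
    return 0;
}
```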
b5145
opencl: fix incorrect local_size index in profiling log (#12868)
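The bug class behind this fix is indexing a work-size array with the wrong dimension when printing profiling output. A hedged sketch of correct per-dimension logging (plain C++ standing in for the OpenCL profiling path; `log_ndrange` is a hypothetical helper, not the repository's function):

```cpp
#include <cstdio>
#include <cstddef>

// Log an ND-range: both arrays must be indexed with the SAME dimension i.
// The bug fixed in #12868 was of the kind where local_size was read with
// a wrong (e.g. fixed) index for every dimension.
static void log_ndrange(const size_t *global, const size_t *local, int work_dim) {
    for (int i = 0; i < work_dim; ++i) {
        std::printf("dim %d: global_size=%zu local_size=%zu\n",
                    i, global[i], local[i]);
    }
}

int main() {
    const size_t global[3] = {1024, 64, 1};
    const size_t local [3] = {  32,  8, 1};
    log_ndrange(global, local, 3);
    return 0;
}
```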
b5129
sync : ggml
ggml-ci
b4974
sync : ggml
ggml-ci
b4909
Vulkan: Default to 1GB allocations instead of 4GB to avoid fragmentat…
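The idea behind the smaller default is that several 1 GiB allocations can be placed into fragmented free device memory where a single 4 GiB block cannot. A minimal sketch of chunking one logical buffer into capped allocations (illustrative only; `MAX_CHUNK` and `plan_chunks` are hypothetical names, not ggml-vulkan's API):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

constexpr uint64_t MAX_CHUNK = 1ull << 30; // cap each allocation at 1 GiB

// Split a requested buffer size into chunk sizes, each of which would
// become one device allocation (e.g. one vkAllocateMemory call).
static std::vector<uint64_t> plan_chunks(uint64_t total) {
    std::vector<uint64_t> sizes;
    while (total > 0) {
        const uint64_t n = total < MAX_CHUNK ? total : MAX_CHUNK;
        sizes.push_back(n);
        total -= n;
    }
    return sizes;
}

int main() {
    // Plan a 5 GiB logical buffer: five 1 GiB chunks instead of one block.
    for (uint64_t s : plan_chunks(5ull << 30)) {
        std::printf("allocate %llu bytes\n", (unsigned long long) s);
    }
    return 0;
}
```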