Pull requests: ggml-org/llama.cpp
test-backend-ops: print failed tests at the end
labels: testing (Everything test related)
#16785 opened Oct 26, 2025 by am17an
webui: auto-refresh /props on inference start to resync model metadata
labels: examples, server
#16784 opened Oct 26, 2025 by ServeurpersoCom
[model] add support for qwen3vl series
labels: examples, ggml (changes relating to the ggml tensor library for machine learning), Nvidia GPU (Issues specific to Nvidia GPUs), python (python script changes)
#16780 opened Oct 26, 2025 by JJJYmmm
Add basic support for MXFP6_MOE quantization
labels: examples, ggml, Nvidia GPU, python
ggml: add s390x cpu-feats
labels: devops (improvements to build systems and github actions), ggml
#16774 opened Oct 26, 2025 by taronaeo
Adding CUDA release for Ubuntu
labels: devops
#16773 opened Oct 26, 2025 by bannazz
vulkan: Fuse rope+set_rows
labels: ggml, testing, Vulkan (Issues specific to the Vulkan backend)
#16769 opened Oct 25, 2025 by jeffbolznv
llama: fix leaked buffers for mmap + split files
#16765 opened Oct 25, 2025 by JohannesGaessler
model : add LightOnOCR-1B model
labels: examples, python
#16764 opened Oct 24, 2025 by ngxson
Add LFM2 tool handling
labels: testing
#16763 opened Oct 24, 2025 by ykhrustalev
webui: add HTML/JS preview support to MarkdownContent with sandboxed iframe
labels: examples, server
#16757 opened Oct 24, 2025 by ServeurpersoCom
qwen3-coder tool call parser
labels: testing
#16755 opened Oct 24, 2025 by marceldev89
rpc: use XXHash64 instead of FNV-1a for hashing tensors
labels: ggml
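For context on the entry above: FNV-1a, the hash being replaced, is a tiny xor-then-multiply byte hash. A minimal 64-bit FNV-1a sketch in Python (illustrative only; llama.cpp's RPC code is C++ and is not reproduced here, and the constants below are the standard published FNV-1a parameters, not taken from this PR):

```python
# Minimal 64-bit FNV-1a, the hash the PR above replaces with XXHash64.
# Illustrative sketch only; not the llama.cpp implementation.

FNV_OFFSET_BASIS = 0xcbf29ce484222325  # standard 64-bit FNV offset basis
FNV_PRIME = 0x100000001b3              # standard 64-bit FNV prime

def fnv1a_64(data: bytes) -> int:
    """Hash a byte string with FNV-1a: xor in each byte, then multiply."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFFFFFFFFFF  # truncate to 64 bits
    return h

# Published FNV-1a test vectors:
assert fnv1a_64(b"") == 0xcbf29ce484222325   # empty input yields the basis
assert fnv1a_64(b"a") == 0xaf63dc4c8601ec8c
```

Because it processes one byte at a time with a serial multiply, FNV-1a is slow on large buffers such as tensor data, which is the usual motivation for switching to a block-oriented hash like XXHash64.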
cann: improve device ID handling and aclnnArange checks
labels: Ascend NPU (issues specific to Ascend NPUs), ggml
#16752 opened Oct 24, 2025 by noemotiovon
llama : disable pipeline parallelism if compute buffer allocation fails
#16748 opened Oct 23, 2025 by slaren
llama: consistent ctx <-> buf order for KV cache
labels: ggml
#16746 opened Oct 23, 2025 by JohannesGaessler
ggml: fix cuda kernel launch configuration for k_compute_batched_ptrs to support large batch
labels: ggml, Nvidia GPU, testing
#16744 opened Oct 23, 2025 by leejet
get_rows & dequantize function implementation for repacked weights of type q6_K (q6_Kx8)
labels: ggml
#16743 opened Oct 23, 2025 by swetha097
ggml-cpu: arm64: q4_K repack gemm and gemv implementations
labels: ggml
#16739 opened Oct 23, 2025 by Alcpz
sycl: add REPEAT_BACK operation support
labels: ggml, SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language)
#16734 opened Oct 23, 2025 by shani-f
llama-server : Reduce log level of a debug message leaking prompt contents
labels: examples, server
#16727 opened Oct 22, 2025 by l-austenfeld
CUDA: support for weight clamp in top-k norm
labels: ggml, Nvidia GPU, testing
#16702 opened Oct 21, 2025 by am17an