Pull requests: ggml-org/llama.cpp
common/gemma4 : fix multi-turn tool call parsing with literal <|tool_call> in content
testing
Everything test related
#22367
opened Apr 25, 2026 by
arkavo-com
Contributor
Draft
Convert argv from UTF-16 on Windows for non-ASCII -p prompts
examples
#22366
opened Apr 25, 2026 by
duzenko
[Tensor Parallel] Fix recurrent state serialization for partial reads and writes
ggml
changes relating to the ggml tensor library for machine learning
#22362
opened Apr 25, 2026 by
gaugarg-nv
Contributor
Add DeepSeek V4 GGUF conversion
python
python script changes
#22359
opened Apr 25, 2026 by
nisparks
Contributor
convert : support input_scale for fp8 modelopt
python
python script changes
#22356
opened Apr 25, 2026 by
CISC
Member
ggml : revert to -lm linking instead of find_library
ggml
changes relating to the ggml tensor library for machine learning
#22355
opened Apr 25, 2026 by
angt
Member
rpc: add ipv6 support
examples
ggml
changes relating to the ggml tensor library for machine learning
#22350
opened Apr 25, 2026 by
alphaonex86
fix: read n_embd before vocab_only early return for mmproj init
#22348
opened Apr 25, 2026 by
ChenYFan
ggml-cpu: optimize avx2 q6_k
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
#22345
opened Apr 25, 2026 by
netrunnereve
Collaborator
ggml-webgpu: fast matrix-vector multiplication for i-quants
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#22344
opened Apr 25, 2026 by
SharmaRithik
chat: preserve media markers for typed-content templates
#22342
opened Apr 25, 2026 by
AlexonOliveiraRH
2 tasks done
ggml: implement gguf_init_from_buffer
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#22341
opened Apr 24, 2026 by
giladgd
Contributor
common: fix missing exports in llama-common
#22340
opened Apr 24, 2026 by
max-krasnyansky
Member
cpu : re-enable fast gelu_quick_f16
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
#22339
opened Apr 24, 2026 by
CISC
Member
server: respect per-request enable_thinking toggle via extra_body
examples
server
#22336
opened Apr 24, 2026 by
pju-hoge
opencl: refactor Adreno q4_0
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
ggml-webgpu: silence subgroup_uniformity diagnostic in mul_mat_vec
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#22332
opened Apr 24, 2026 by
SharmaRithik
ggml-cpu: optimize q8 quantization on x86 SIMD
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#22331
opened Apr 24, 2026 by
bitRAKE
Contributor
CUDA: better coalesce data-access for contiguous concat
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#22330
opened Apr 24, 2026 by
ORippler
Collaborator
llama-bench: fix numerical instability in stdev() calculation
examples
#22329
opened Apr 24, 2026 by
abhinavuser
Update server README.md to clarify cache directory choice behavior
examples
server
#22328
opened Apr 24, 2026 by
de1eted-user
Update README.md to add AI Playground to the UI list
#22326
opened Apr 24, 2026 by
qiacheng
common : re-arm reasoning budget after DONE on new <think>
testing
Everything test related
wontfix
This will not be worked on
#22323
opened Apr 24, 2026 by
BruceJillis