Pull requests: ggml-org/llama.cpp
Pull requests list
server: support input_image in function_call_output (#20663)
examples
server
#22575
opened May 1, 2026 by
Empressia
hexagon: enable non-contiguous row tensor support for unary ops
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
#22574
opened May 1, 2026 by
aparmp-quic
Contributor
llama-quant : fix --tensor-type when default qtype is overriden
#22572
opened May 1, 2026 by
ddh0
Contributor
Swap out F16 for BF16 in Q8_1 activations to avoid overflowing values
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#22571
opened May 1, 2026 by
bartowski1182
Contributor
Draft
[Draft] feat: implement paged KV cache and attention
examples
ggml
changes relating to the ggml tensor library for machine learning
model
Model specific
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
ggml-cpu: fix msvc c2440 cast error for m512bh and m256bh in sgemm.cpp
ggml
changes relating to the ggml tensor library for machine learning
#22568
opened Apr 30, 2026 by
nanodan52
devops: SYCL: upgraded the default compute-runtime version
devops
improvements to build systems and github actions
#22567
opened Apr 30, 2026 by
WizardlyBump17
Contributor
fix: consistent memory breakdown for models loaded with no_alloc
testing
Everything test related
#22566
opened Apr 30, 2026 by
giladgd
Contributor
cmake: fix MATH_LIBRARY check on Windows MSVC
ggml
changes relating to the ggml tensor library for machine learning
#22564
opened Apr 30, 2026 by
ServeurpersoCom
Contributor
chat: preserve media markers for typed-content templates
#22563
opened Apr 30, 2026 by
AlexonOliveiraRH
server : avoid checkpoint data host copies
examples
server
#22558
opened Apr 30, 2026 by
ggerganov
Member
ggml-virtgpu: fix transitive dependency in headers
ggml
changes relating to the ggml tensor library for machine learning
#22557
opened Apr 30, 2026 by
Juste-Leo2
kleidiai : update to v1.24.0 and use release archive
ggml
changes relating to the ggml tensor library for machine learning
#22549
opened Apr 30, 2026 by
chaxu01
Collaborator
spec : allow for multiple spec types (chains of speculators)
examples
server
#22546
opened Apr 30, 2026 by
petersid2022
Contributor
docs : update speculative decoding parameters after refactor (#22397)
documentation
Improvements or additions to documentation
fix: CUDA device PCI bus ID de-dupe OOMing (ignoring other 3 gpus entirely)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#22533
opened Apr 29, 2026 by
lucyknada
[Model] Support MiniCPM-V 4.6
documentation
Improvements or additions to documentation
examples
python
python script changes
#22529
opened Apr 29, 2026 by
tc-mb
Contributor
sycl: Add optional USM system allocations
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#22526
opened Apr 29, 2026 by
ifdu
ggml-cpu: optimize ggml_gemm_q4_K_8x8_q8_K interleaving/staging for AVX-512 (and AVX2)
ggml
changes relating to the ggml tensor library for machine learning
#22525
opened Apr 29, 2026 by
HyeongiJeon
Programmatic Dependent Launch (PDL) for more performance on newer NVIDIA GPUs (Hopper+)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs