Add Kimi-K2.5 support #19170
base: master
Conversation
Yep, seems something is not quite right yet.
```python
    **cfg["media_proc_cfg"],
}
# merge configs
self.preprocessor_config = {**self.preprocessor_config, **cfg}
```
self.preprocessor_config is empty at this point, so the merge isn't really necessary, but I'll allow it for a consistent look.
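For anyone skimming the diff, here is a minimal standalone sketch of what the flattening amounts to; the file handling and variable names are illustrative assumptions, not the PR's actual conversion code.

```python
import json

# Sketch only: Kimi-K2.5 nests the usual preprocessor fields under
# "media_proc_cfg", so they get hoisted to the top level before the rest
# of the converter reads them. Names here are illustrative assumptions.
with open("preprocessor_config.json") as f:
    raw = json.load(f)

cfg = {
    **{k: v for k, v in raw.items() if k != "media_proc_cfg"},
    **raw.get("media_proc_cfg", {}),
}

# self.preprocessor_config is empty at this point, so this merge is
# effectively an assignment, but it mirrors how other configs are handled.
preprocessor_config: dict = {}
preprocessor_config = {**preprocessor_config, **cfg}
print(sorted(preprocessor_config.keys()))
```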
- Add new kimi-k2.5 keys to mtmd convert
- Update V_MMPROJ tensor mapping for new mm_projector.proj keys
- Update V_MM_INP_NORM for new mm_projector.pre_norm key
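Roughly, the mapping changes above boil down to pointing the new HF projector keys at the converter's existing mmproj tensor names. The dict and helper below are an illustrative sketch only; the GGUF-side names and the exact shape of the new keys are assumptions, not copied from gguf-py's tensor_mapping.py.

```python
# Illustrative sketch only: the GGUF-side names and the exact HF key shapes
# are assumptions, not the actual gguf-py/gguf/tensor_mapping.py entries.
KIMI_K25_MMPROJ_MAP = {
    "mm_projector.pre_norm": "mm.input_norm",  # -> V_MM_INP_NORM
    "mm_projector.proj":     "mm.model.fc",    # -> V_MMPROJ (name assumed)
}

def map_projector_key(hf_name: str) -> str | None:
    """Return the GGUF tensor name for a Kimi-K2.5 projector key, if known."""
    base, _, suffix = hf_name.rpartition(".")  # split off ".weight" / ".bias"
    mapped = KIMI_K25_MMPROJ_MAP.get(base)
    return f"{mapped}.{suffix}" if mapped else None

print(map_projector_key("mm_projector.pre_norm.weight"))  # -> mm.input_norm.weight
print(map_projector_key("mm_projector.proj.weight"))      # -> mm.model.fc.weight
```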
Vision is working now for images; I've uploaded MMPROJ files to my repo. @ngxson I left comments about the places that confused me the most. I think the rest of the changes are pretty sane.
Great work AesSedai! I just downloaded the BF16 for mmproj. Is there any reason to get anything higher than Q8_0? And what about ctk/ctv: is there any good reason to run them in f16 instead of something lower, since the model is INT4?
@segmond Thanks. For the MMPROJ, some cards are more or less compatible with the different versions; the BF16s don't work very well on my 3090s IIRC. The Q8_0 should be fine to use quality-wise. Regarding ctk/ctv, you do not want to quantize the cache on this model at all. The model weight quantization is separate from the cache quantization, and MLA / GQA already applies some pretty severe compression to the cache, so quantizing it further will only degrade it more. Besides, the context is very lightweight anyway: something like 165k context in FP16 is ballpark 10GB or so.
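To put a rough number on that ballpark, here is a back-of-the-envelope sketch. The MLA hyperparameters are assumptions taken from the DeepSeek-V3-style architecture this model follows, not values read from the actual config.json, so treat the result as an estimate.

```python
# Back-of-the-envelope MLA KV-cache estimate. All hyperparameters below are
# assumptions (DeepSeek-V3-style defaults), not read from the real config.
n_ctx        = 165_000  # tokens of context
n_layers     = 61
kv_lora_rank = 512      # compressed KV latent per token per layer
rope_dim     = 64       # decoupled RoPE part stored alongside it
bytes_fp16   = 2

per_token_bytes = (kv_lora_rank + rope_dim) * n_layers * bytes_fp16
total_gib = n_ctx * per_token_bytes / 1024**3
print(f"~{total_gib:.1f} GiB")  # ~10.8 GiB, in line with the ~10GB ballpark above
```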





Adding support for https://huggingface.co/moonshotai/Kimi-K2.5
Since this model includes compressed-tensors (INT4 for the conditional experts), I moved the `dequant_model` to the `prepare_tensors` call at @compilade's suggestion. The model conversion fails otherwise because the `quantization_config` is nested under the `text_config` in the config.json.

Additionally, this model adds some new keys for the vision tower, prefixed as `vt_`, and the preprocessor_config.json has the expected fields nested in the `media_proc_cfg` key.

This PR does not include the "hacked" Q4_0 changes by @jukofyork, referred to in this comment.
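To make the nested-config point concrete, the snippet below shows why a naive top-level lookup misses it. This is a sketch of the config.json layout rather than the converter's actual code, and the `quant_method` field is an assumption about the compressed-tensors format.

```python
import json

# Sketch only: in Kimi-K2.5's config.json the compressed-tensors
# quantization_config sits under text_config rather than at the top level,
# so a top-level lookup returns None and the INT4 experts would never be
# dequantized during conversion.
with open("config.json") as f:
    hparams = json.load(f)

quant_cfg = hparams.get("quantization_config")
if quant_cfg is None:
    quant_cfg = hparams.get("text_config", {}).get("quantization_config")

# "quant_method" is assumed to be present (typical for HF quantization configs)
print(quant_cfg.get("quant_method") if quant_cfg else "no quantization_config found")
```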
I have added a first pass at vision support, heavily aided by LLM assistance. I entirely expect @ngxson to tear it to shreds or call me a dummy and show me an easier way to add that vision support :)