clip.cpp / gguf-py: Support for Qwen2.5 VL - WIP / REVIEW NEEDED (#11483) #12119
+366
−59
Fellas, I gotta say - I grossly underestimated this task...
Work is still in progress in our fork, linked below (comment here if you want to review the code, so we can update the PR):
https://github.com/Independent-AI-Labs/llama.cpp/commits/debug/build/
Qwen2.5 VL is a total overhaul of the Qwen VLM and introduces a bunch of concepts that require special handling.
I've documented my approach in this paper:
https://github.com/Independent-AI-Labs/local-super-agents/blob/main/res/docs/papers/Implementing%20Qwen2.5VL.pdf
Comments & ideas are welcome!