UPSTREAM PR #19669: test(server): add multi-image and no-image vision API tests #1186

Open

loci-dev wants to merge 1 commit into main from loci/pr-19669-test-vision-api-multiimage-noimage-tests

Conversation

@loci-dev

Note

Source pull request: ggml-org/llama.cpp#19669

Summary

  • Add three new test cases to test_vision_api.py addressing the TODO at line 73 for testing with multiple images and no images
  • Narrow the remaining TODO to audio-only (needs a model that supports audio input, which tinygemma3 lacks)

New Tests

| Test | What it verifies |
| --- | --- |
| test_vision_chat_completion_multiple_images | Server handles multiple image_url content parts in a single request |
| test_vision_chat_completion_no_image | Text-only string messages work on a multimodal model |
| test_vision_chat_completion_no_image_content_parts | Content parts array with only text type (no image_url) works correctly |
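The request shapes these tests exercise follow the OpenAI-compatible chat completions format. Below is a minimal sketch sent with plain `requests` against a locally running llama-server; the base URL and image URLs are placeholders, not the fixtures used in test_vision_api.py:

```python
import requests

BASE_URL = "http://localhost:8080"  # assumption: a local llama-server with a vision model loaded

# Multi-image case: one user message whose content array carries a text part and
# two image_url parts, mirroring test_vision_chat_completion_multiple_images.
multi_image_payload = {
    "max_tokens": 32,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What do these two images have in common?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/a.png"}},
                {"type": "image_url", "image_url": {"url": "https://example.com/b.png"}},
            ],
        }
    ],
}

# No-image case with content parts: the array form is used but contains only a
# text part, mirroring test_vision_chat_completion_no_image_content_parts.
text_only_parts_payload = {
    "max_tokens": 32,
    "messages": [
        {
            "role": "user",
            "content": [{"type": "text", "text": "Describe the weather in one sentence."}],
        }
    ],
}

for payload in (multi_image_payload, text_only_parts_payload):
    resp = requests.post(f"{BASE_URL}/v1/chat/completions", json=payload, timeout=60)
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])
```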

Test plan

  • All 3 new tests pass locally (macOS, Apple M4 Pro, Metal backend)
  • All 20 vision API tests pass with zero regressions
  • Tests use the existing tinygemma3 preset; the multi-image test raises n_ctx to 2048 so both images fit in context (see the fixture sketch below)
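
A rough sketch of how the multi-image test can be wired up with the raised context size, assuming the repository's existing server test helpers (ServerPreset.tinygemma3(), ServerProcess.make_request) behave as in the other vision tests; the import path, image URLs, and assertion below are illustrative assumptions:

```python
import pytest
from utils import ServerPreset, ServerProcess  # llama.cpp server test utilities (assumed import path)

# Placeholder image locations; the real tests use the repository's image fixtures.
IMG_URL_0 = "https://example.com/a.png"
IMG_URL_1 = "https://example.com/b.png"

server: ServerProcess


@pytest.fixture(autouse=True)
def create_server():
    global server
    server = ServerPreset.tinygemma3()
    server.n_ctx = 2048  # enough context to hold the embeddings of both images


def test_vision_chat_completion_multiple_images_sketch():
    server.start()
    res = server.make_request("POST", "/chat/completions", data={
        "max_tokens": 32,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "What do these images have in common?"},
                {"type": "image_url", "image_url": {"url": IMG_URL_0}},
                {"type": "image_url", "image_url": {"url": IMG_URL_1}},
            ],
        }],
    })
    assert res.status_code == 200
```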

Add three new test cases to test_vision_api.py that address the TODO
for testing with multiple images and no images:

- test_vision_chat_completion_multiple_images: verifies the server
  handles multiple image_url content parts in a single request
- test_vision_chat_completion_no_image: verifies text-only messages
  work correctly on a multimodal model
- test_vision_chat_completion_no_image_content_parts: verifies
  content parts with only text type (no image_url) work correctly

The audio test TODO is narrowed to note it needs a model with audio
input support, which the current tinygemma3 test model lacks.

Co-authored-by: Cursor <cursoragent@cursor.com>

loci-review bot commented Feb 17, 2026

No meaningful performance changes were detected across 115587 analyzed functions in the following binaries: build.bin.llama-tts, build.bin.libmtmd.so, build.bin.llama-cvector-generator, build.bin.libllama.so, build.bin.llama-bench, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.libggml-cpu.so, build.bin.libggml-base.so, build.bin.libggml.so, build.bin.llama-tokenize, build.bin.llama-qwen2vl-cli.

🔎 Full breakdown: Loci Inspector.
💬 Questions? Tag @loci-dev.

loci-dev force-pushed the main branch 6 times, most recently from a6ecec6 to 9ea4a65 on February 21, 2026 02:16