Summary
Some OpenAI-compatible providers models support vision, it would be great if an A2A server gets out of the box vision capabilities. For example a browser-agent might want to solve captchas, and it needs vision.
It's important to note that while the Inference Gateway supports those models, not all providers have them, so an error will be thrown when the chosen model is attaching an image to the payload - the operator have to choose the right model for the right tasks.
Acceptance Criteria
Summary
Some OpenAI-compatible providers models support vision, it would be great if an A2A server gets out of the box vision capabilities. For example a browser-agent might want to solve captchas, and it needs vision.
It's important to note that while the Inference Gateway supports those models, not all providers have them, so an error will be thrown when the chosen model is attaching an image to the payload - the operator have to choose the right model for the right tasks.
Acceptance Criteria