[Bug] Error - Can't see the Image #8756

@krsnewwave

Description

What happened?

I'm using Databricks to run the DogPictureSignature example from the site. The answer is always:

Prediction(
    answer="I'm sorry, but I can't see the image. Please upload an image of a dog for me to identify the breed."
)

I've tried multiple releases, from 3.0.3 back to earlier versions.

I can also inspect the trace via MLflow, and the message format makes sense (image_url, url, etc.).

I even copy-pasted the messages into a standard chat completions call, and it answers correctly.

[
  {
    "role": "system",
    "content": "Your input fields are:\n1. `image_1` (Image): An image of a dog\nYour output fields are:\n1. `answer` (str): The dog breed of the dog in the image\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\n[[ ## image_1 ## ]]\n{image_1}\n\n[[ ## answer ## ]]\n{answer}\n\n[[ ## completed ## ]]\nIn adhering to this structure, your objective is: \n        Output the dog breed of the dog in the image."
  },
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "[[ ## image_1 ## ]]\n"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "https://picsum.photos/id/237/200/300"
        }
      },
      {
        "type": "text",
        "text": "\n\nRespond with the corresponding output fields, starting with the field `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`."
      }
    ]
  }
]
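
A minimal sketch of such a manual check, assuming the Databricks serving endpoint is reached through its OpenAI-compatible API with the openai Python client. The environment variable names DATABRICKS_HOST and DATABRICKS_TOKEN, the endpoint name gpt-4-1, and the helper names build_messages and ask_endpoint are placeholders for illustration, not taken from the report. The system message is left out for brevity and can be pasted in from the trace.

```python
import os

IMAGE_URL = "https://picsum.photos/id/237/200/300"


def build_messages(image_url: str) -> list[dict]:
    """Rebuild the user turn from the trace: text part, image_url part, text part."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "[[ ## image_1 ## ]]\n"},
                {"type": "image_url", "image_url": {"url": image_url}},
                {
                    "type": "text",
                    "text": (
                        "\n\nRespond with the corresponding output fields, "
                        "starting with the field `[[ ## answer ## ]]`, and then "
                        "ending with the marker for `[[ ## completed ## ]]`."
                    ),
                },
            ],
        }
    ]


def ask_endpoint(messages: list[dict], endpoint: str = "gpt-4-1") -> str:
    """Send the messages straight to the serving endpoint, bypassing DSPy."""
    # Imported lazily so build_messages works without the openai package.
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DATABRICKS_TOKEN"],  # assumed variable name
        base_url=f"{os.environ['DATABRICKS_HOST']}/serving-endpoints",  # assumed variable name
    )
    response = client.chat.completions.create(model=endpoint, messages=messages)
    return response.choices[0].message.content


if __name__ == "__main__":
    print(ask_endpoint(build_messages(IMAGE_URL)))
```

If this direct call describes the dog while the DSPy call does not, the difference most likely lies in how the request is routed or transformed between DSPy and the endpoint rather than in the message content itself.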

Steps to reproduce

Config

import dspy
import mlflow

mlflow.dspy.autolog()

model_name = 'databricks/gpt-4-1'  # Azure OpenAI serving endpoint
lm = dspy.LM(
    model=model_name,
)

dspy.configure(lm=lm, cache=False)

Actual Call

class DogPictureSignature(dspy.Signature):
    """Output the dog breed of the dog in the image."""
    image_1: dspy.Image = dspy.InputField(desc="An image of a dog")
    answer: str = dspy.OutputField(desc="The dog breed of the dog in the image")

image_url = "https://picsum.photos/id/237/200/300"
classify = dspy.Predict(DogPictureSignature)
classify(image_1=dspy.Image.from_url(image_url))

Output

Prediction(
    answer="I'm sorry, but I can't see the image. Please upload an image of a dog for me to identify the breed."
)
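
To double-check, independently of the MLflow trace, whether the image part is still present in what DSPy hands to the LM, a small helper can summarize the content parts of each message. This is only a sketch: summarize_parts is a hypothetical helper, and it assumes that entries in lm.history expose the request under a "messages" key, which may differ between DSPy versions.

```python
def summarize_parts(messages):
    """Return (role, [part types]) for each message.

    Plain-string content is reported as a single "text" part.
    """
    summary = []
    for message in messages:
        content = message["content"]
        if isinstance(content, str):
            kinds = ["text"]
        else:
            kinds = [part.get("type", "unknown") for part in content]
        summary.append((message["role"], kinds))
    return summary


# Stand-in for the messages recorded in the trace above.
sample = [
    {"role": "system", "content": "Your input fields are: ..."},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "[[ ## image_1 ## ]]\n"},
            {"type": "image_url", "image_url": {"url": "https://picsum.photos/id/237/200/300"}},
            {"type": "text", "text": "\n\nRespond with the corresponding output fields, ..."},
        ],
    },
]

for role, kinds in summarize_parts(sample):
    print(role, kinds)
```

After running the classify call, the same helper could be applied to the last recorded request, for example summarize_parts(lm.history[-1]["messages"]) under the assumption stated above. A user message without an image_url part would point to the image being dropped before the request leaves DSPy.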

DSPy version

3.0.3

Labels: bug
