Problem with including images in the request specifically in the JSON structure #2754

AzizDXT · 2025-02-24T19:05:22Z

📌 Issue Title:

AI Prioritizes Image Requests Over Text Due to JSON Structure

📝 Description:

I am encountering an issue where, when sending both a text request and an image request together, the AI prioritizes the image over the text. This seems to happen because the image request appears at the end of the JSON structure, making the AI treat it as the latest message and focus on it instead of processing the text properly.

🔍 Problem Details:

When sending a combined request (text + image), the AI gives more importance to the image.
It seems that the order of elements in the JSON request affects the AI's behavior.
The text request is being ignored or not given enough weight because the image is positioned at the end.

⚡ Expected Behavior:

The AI should consider both the text and the image equally, rather than prioritizing the image simply because it appears last in the JSON structure.
The text should remain part of the conversation history, even if an image is attached in the request.

💡 Possible Solutions (Suggestions):

Rearrange JSON Structure: Allow defining a priority field or ensure messages are processed based on timestamps instead of their position in JSON.
Explicit Context Handling: Ensure the AI processes all message components (text + image) in order rather than favoring the latest entry.
Modify AI Processing Logic: Ensure the AI weights text and images appropriately rather than prioritizing based on the request order.

hlohaus · 2025-02-26T13:29:10Z

Could you provide an example? Each vendor employs a unique method for image response.

@AzizDXT

AzizDXT · 2025-02-26T14:18:58Z

I will activate Logs and then send you the exact problem and explain it in detail.

AzizDXT · 2025-02-26T17:56:47Z

Could you provide an example? Each vendor employs a unique method for image response.

@AzizDXT

For clarification, the issue occurs when sending a request via the API or in the main interface of G4F. When I upload an image for interpretation or analysis and then ask a related question in a new message within the same conversation, the model forgets the image or does not include it in the request. I want the image to be included in the request and context so that the model can reference it when answering follow-up questions.

AzizDXT · 2025-02-26T18:10:09Z

It would be preferable for G4F to accept a request like the one below, because G4F's current format always places the image at the bottom of the request body.

{
  "model": "gpt-4-vision-preview",
  "messages": [
    {
      "role": "system",
      "content": "You are an AI assistant that can analyze images and retain context."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Analyze this image and describe what you see."
        },
        {
          "type": "image_url",
          "image_url": {"url": "https://example.com/sample-image.jpg"}
        }
      ]
    },
    {
      "role": "assistant",
      "content": "This image contains a sunset over the ocean with a few scattered clouds."
    },
    {
      "role": "user",
      "content": "What colors are in the sunset?"
    }
  ],
  "max_tokens": 500
}
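
A minimal Python sketch of how a client could assemble a request like the one above, so that the image travels inside the user message it was sent with. It assumes the OpenAI-style content-parts layout, where the `image_url` part is an object with a `url` key. `build_image_turn` is a hypothetical helper for illustration, not part of G4F.

```python
import json


def build_image_turn(text, image_url):
    """Return a user message whose content pairs the text with its image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


messages = [
    {
        "role": "system",
        "content": "You are an AI assistant that can analyze images and retain context.",
    },
    # The image is attached to the turn that asked about it.
    build_image_turn(
        "Analyze this image and describe what you see.",
        "https://example.com/sample-image.jpg",
    ),
    {
        "role": "assistant",
        "content": "This image contains a sunset over the ocean with a few scattered clouds.",
    },
    # The follow-up question is plain text; the image stays in the history above.
    {"role": "user", "content": "What colors are in the sunset?"},
]

payload = {"model": "gpt-4-vision-preview", "messages": messages, "max_tokens": 500}
print(json.dumps(payload, indent=2))
```

Whether a given provider keeps the image in context across later turns depends on that provider, so this only shows the request shape, not a guaranteed fix.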

hlohaus · 2025-02-26T18:17:04Z

I believe the issue has been resolved in the latest update. Image selections should persist across requests. A media selection box with image preview is now implemented. @AzizDXT

AzizDXT · 2025-02-26T18:19:29Z

I will retest the interface. I didn't notice there was an update five hours ago.

AzizDXT · 2025-02-26T18:28:04Z

The problem still exists. I will send you an example of the exact problem and it will become clear to you what the problem is.

AzizDXT · 2025-02-26T18:30:31Z

Look at my conversation here, and you’ll see that the image is still prioritized. Based on my analysis, when the image is at the bottom of the JSON section, the AI gives it priority for analysis or explanation. I just want the request to be modified before being sent to the providers so that the image's position aligns with the message, rather than always being at the end of the request.

Blackbox AI with BLACKBOXAI
Hello! How can I assist you today?

(7 words, 26 chars, 9 tokens)

User Input:

"explain this pic"

(3 words, 14 chars, 3 tokens)

Blackbox AI with BLACKBOXAI
Image Description

Type of Image: Pixel Art

Subject: A stylized cat

Color Palette:

Main Color: Orange (fur)

Accent Colors: White (belly, face markings), Blue (eyes)

Background:

Depicts a forest setting with dark tones

Subtle light particles or sparkles in the background

Floral elements at the bottom, adding a pop of color

Visual Elements:

Style:

Pixelated, giving a retro/video game feel

Simplified shapes and vibrant colors

Expression:

The cat appears calm and inquisitive with large, expressive eyes

Overall demeanor suggests a friendly and whimsical nature

Composition:

Framing: The cat is centered, drawing attention

Depth: The background features blurred trees, enhancing focus on the cat

This image combines pixel art charm with a serene forest ambiance, evoking a whimsical and magical atmosphere.

(144 words, 744 chars, 235 tokens)

User Input:

"how are you?"

(3 words, 9 chars, 4 tokens)

Blackbox AI with BLACKBOXAI
I'm here to help! If you have any questions or need assistance, feel free to ask.

Image Analysis

Description:

The image depicts a pixel art style representation of a cat.

The cat is primarily orange with white accents on its chest and face.

It has bright blue eyes and a faint smile, giving it a friendly appearance.

Background Elements:

The background appears to be a dark forest setting with hints of foliage.

Small flowers at the bottom add color to the scene.

Art Style:

The artwork features a pixelated style, characteristic of retro video games.

The use of contrasting colors enhances the cat's features.

If you need more details or have specific questions, just let me know!

(126 words, 574 chars, 163 tokens)

User Input:

"5 * 5?"

(2 words, 2 chars, 5 tokens)

Blackbox AI with BLACKBOXAI
Answer to Mathematical Question:
The result of 5 × 5 is 25.

Image Characteristics:

Content: The image appears to depict a pixel art representation of a cat.

Color Palette: Predominantly orange with white and blue accents.

Setting: The background suggests a whimsical forest environment with flowers and subtle lighting effects.

If you have any specific questions about the image or need further assistance, feel free to ask!

(67 words, 351 chars, 98 tokens)
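
The change requested above (moving the image next to the message it belongs to before the request is sent to the provider) could be sketched roughly as follows. This assumes the image arrives separately from the message list, as described in this thread. `attach_image_to_turn` and the example URL are hypothetical, and nothing here reflects G4F's actual internals.

```python
from copy import deepcopy


def attach_image_to_turn(messages, image_url, turn_index):
    """Return a copy of messages with the image placed inside messages[turn_index]."""
    result = deepcopy(messages)  # leave the caller's history untouched
    target = result[turn_index]
    if target["role"] != "user":
        raise ValueError("images can only be attached to user messages")
    content = target["content"]
    # Plain-string content becomes a single text part; existing parts are kept.
    if isinstance(content, str):
        parts = [{"type": "text", "text": content}]
    else:
        parts = list(content)
    parts.append({"type": "image_url", "image_url": {"url": image_url}})
    target["content"] = parts
    return result


history = [
    {"role": "user", "content": "explain this pic"},
    {"role": "assistant", "content": "Pixel art of an orange cat in a forest."},
    {"role": "user", "content": "5 * 5?"},
]

# Attach the image to the first user turn instead of the end of the request.
fixed = attach_image_to_turn(history, "https://example.com/cat.png", 0)
```

With this layout the last message (`"5 * 5?"`) carries no image, which is the alignment asked for above. Whether each provider honors that placement is not established in this thread.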

AzizDXT · 2025-02-26T19:18:48Z

I found a simple solution: add a system message stating that an image has been uploaded, which the model can use for the current user request or refer to later.
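
A rough sketch of that workaround in Python. The exact wording of the system message is not given in this thread, so `IMAGE_NOTE` is an assumed example, and `with_image_note` is a hypothetical helper rather than an existing G4F function.

```python
# Assumed wording; the thread does not quote the actual system message.
IMAGE_NOTE = (
    "The user has uploaded an image. Use it only when the current request "
    "refers to it; otherwise keep it as context for later questions."
)


def with_image_note(messages, note=IMAGE_NOTE):
    """Return a new message list that starts with the image note, added once."""
    already_present = any(
        m["role"] == "system" and m["content"] == note for m in messages
    )
    if already_present:
        return list(messages)
    return [{"role": "system", "content": note}, *messages]


conversation = [{"role": "user", "content": "how are you?"}]
patched = with_image_note(conversation)
```

Calling the helper again on `patched` returns the same messages without adding a second note.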

hlohaus · 2025-02-27T09:56:47Z

Understood, @AzizDXT . However, I will not implement this solution as it is only compatible with Blackbox. We require a more universally applicable feature.

AzizDXT · 2025-02-27T09:59:00Z

Understood, @AzizDXT . However, I will not implement this solution as it is only compatible with Blackbox. We require a more universally applicable feature.

Exactly. I will try to find a comprehensive solution to this problem, and if I find one, I may contribute it to G4F.

AzizDXT added the bug Something isn't working label Feb 24, 2025

AzizDXT assigned xtekky Feb 24, 2025

hlohaus added the respond label Feb 27, 2025
