Current browser screenshot:为空 #939

0three · 2025-03-25T07:32:45Z

我在3.24日最新代码的llm.py的738行打印了传给大模型的params，发现Current browser screenshot:为空，模型似乎并没有接收到当前浏览器的截图，以下是具体的参数：

{
  "model": "qwen-max",
  "messages": [
    {
      "role": "system",
      "content": "You are OpenManus, an all-capable AI assistant, aimed at solving any task presented by the user. You have various tools at your disposal that you can call upon to efficiently complete complex requests. Whether it's programming, information retrieval, file processing, or web browsing, you can handle it all. The initial directory is: C:\\Users\\Govern-Nation\\Downloads\\OpenManus-324\\OpenManus-main\\workspace"
    },
    {
      "role": "user",
      "content": "帮我查询xxx的个人信息，并生成一份详细的简历，以doc格式保存"
    },
    {
      "role": "user",
      "content": "\nWhat should I do next to achieve my goal?\n\nWhen you see [Current state starts here], focus on the following:\n- Current URL and page title\n- Available tabs\n- Interactive elements and their indices\n- Content above or below the viewport (if indicated)\n- Any action results or errors\n\nFor browser interactions:\n- To navigate: browser_use with action=\"go_to_url\", url=\"...\"\n- To click: browser_use with action=\"click_element\", index=N\n- To type: browser_use with action=\"input_text\", index=N, text=\"...\"\n- To extract: browser_use with action=\"extract_content\", goal=\"...\"\n- To scroll: browser_use with action=\"scroll_down\" or \"scroll_up\"\n\nConsider both what's visible and what might be beyond the current viewport.\nBe methodical - remember your progress and what you've learned so far.\n"
    },
    {
      "role": "assistant",
      "content": "为了查询xxx的个人信息并生成一份详细的简历，我需要首先在网络上搜索相关信息。不过，在开始之前，我需要澄清一些事项：\n\n1. 我们将使用公开可用的信息来创建简历。如果xxx有公开的社交媒体账号、个人网站或LinkedIn页面，我们可能会找到一些有用的信息。\n2. 由于隐私问题，某些个人信息可能无法获取，而且我们应该尊重个人隐私。\n3. 生成的简历将基于找到的信息，并且可能不会包含所有细节，因为不是所有的个人信息都在线上公开。\n\n现在，我将开始搜索“xxx”的个人信息。请稍等片刻。",
      "tool_calls": [
        {
          "id": "call_0fbefb0714b142b1906593",
          "type": "function",
          "function": {
            "name": "browser_use",
            "arguments": "{\"action\": \"web_search\", \"query\": \"xxx\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "Observed output of cmd `browser_use` executed:\nSearched for 'xxx' and navigated to first result: https://scholar.google.com/citations?user=KYCxGy4AAAAJ&hl=zh-CN\nAll results:https://scholar.google.com/citations?user=KYCxGy4AAAAJ&hl=zh-CN\nhttps://zhuyr97.github.io/\nhttps://openreview.net/profile?id=~Yurui_Zhu1\nhttps://github.com/zhuyr97\n\nhttps://be.linkedin.com/in/yurui-zhu-8a5022336\nhttps://dblp.org/pid/282/6550\nhttps://ieeexplore.ieee.org/author/37088565620\nhttps://www.semanticscholar.org/author/Yurui-Zhu/2109439130\nhttps://www.researchgate.net/profile/Yurui-Zhu",
      "name": "browser_use",
      "tool_call_id": "call_0fbefb0714b142b1906593"
    },
    {
      "role": "user",
      "content": "Current browser screenshot:"
    },
    {
      "role": "user",
      "content": "\nWhat should I do next to achieve my goal?\n\nWhen you see [Current state starts here], focus on the following:\n- Current URL and page title\n   URL: https://scholar.google.com/citations?user=KYCxGy4AAAAJ&hl=zh-CN\n   Title: \u202aYurui Zhu\u202c - \u202aGoogle 学术搜索\u202c\n- Available tabs\n   1 tab(s) available\n- Interactive elements and their indices\n- Content above or below the viewport (if indicated)\n- Any action results or errors\n\nFor browser interactions:\n- To navigate: browser_use with action=\"go_to_url\", url=\"...\"\n- To click: browser_use with action=\"click_element\", index=N\n- To type: browser_use with action=\"input_text\", index=N, text=\"...\"\n- To extract: browser_use with action=\"extract_content\", goal=\"...\"\n- To scroll: browser_use with action=\"scroll_down\" or \"scroll_up\"\n\nConsider both what's visible and what might be beyond the current viewport.\nBe methodical - remember your progress and what you've learned so far.\n"
    }
  ],
  "tools": [...],
  "tool_choice": "auto",
  "timeout": 300,
  "max_tokens": 8192,
  "temperature": 0.0
}

但是我之前下载的一份代码是有截图的，输出内容如下：

    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Current browser screenshot:"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/jpeg;base64,iVBORw0KGgoAAAANSUhEUgAABQA...."
          }
        }
      ]
    },

想问下这是怎么回事？

The text was updated successfully, but these errors were encountered:

LiKeJian · 2025-03-26T02:38:43Z

感觉问题太多不如 OWL

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Current browser screenshot:为空 #939

Current browser screenshot:为空 #939

0three commented Mar 25, 2025 •

edited

Loading

LiKeJian commented Mar 26, 2025

Current browser screenshot:为空 #939

Current browser screenshot:为空 #939

Comments

0three commented Mar 25, 2025 • edited Loading

LiKeJian commented Mar 26, 2025

0three commented Mar 25, 2025 •

edited

Loading