Add image file viewing support to FileEditor #1016

SmartManoj · 2025-11-04T09:54:12Z

FileEditor now supports viewing image files (.png, .jpg, .jpeg, .gif, .webp, .bmp) by returning base64-encoded image data and displaying it as ImageContent. Updated FileEditorObservation and tool description to handle image data, and added tests to verify image handling and backward compatibility with text files.

Before:

After:

FileEditor now supports viewing image files (.png, .jpg, .jpeg, .gif, .webp, .bmp) by returning base64-encoded image data and displaying it as ImageContent. Updated FileEditorObservation and tool description to handle image data, and added tests to verify image handling and backward compatibility with text files.

simonrosenberg · 2025-11-04T11:46:26Z

Thank you @SmartManoj !

Perhaps we should wait for #929 to be merged before merging this though. Will let you know very soon

blacksmith-sh · 2025-11-06T13:01:23Z

[Automatic Post]: I have assigned @xingyaoww as a reviewer based on git blame information. Thanks in advance for the help!

xingyaoww

Can we add a testcase that actually load an image file and assert the file editor will return ImageContent without error?

Introduces a test to verify that viewing a PNG image file with FileEditor returns an ImageContent object containing a base64-encoded image URL. This ensures image files are handled correctly by the view command.

Replaces single quotes with double quotes for byte string literals in the PNG image data and updates an attribute check to use double quotes for consistency in test_view_image_file_returns_image_content.

enyst · 2025-11-08T00:23:11Z

This is very nice, thank you for the work on it!

simonrosenberg · 2025-11-08T08:22:57Z

openhands-tools/openhands/tools/file_editor/editor.py

+                mime_type = "image/png"  # default
+                if image_base64.startswith("/9j/"):
+                    mime_type = "image/jpeg"
+                elif image_base64.startswith("iVBORw0KGgo"):
+                    mime_type = "image/png"
+                elif image_base64.startswith("R0lGODlh"):
+                    mime_type = "image/gif"
+                elif image_base64.startswith("UklGR"):
+                    mime_type = "image/webp"
+                elif image_base64.startswith("Qk"):
+                    mime_type = "image/bmp"


Thank you for doing this!

I am no expert but all those if/elifs seem a bit hacky here. Would it make sense to use a library instead?

import mimetypes mime_type, _ = mimetypes.guess_type(path) mime_type = mime_type or "application/octet-stream"

or

import imghdr image_type = imghdr.what(path) mime_type = f"image/{image_type}" if image_type else "application/octet-stream"

Would it make sense to use a library instead?

Yes.

Similar snippet in codebase:

software-agent-sdk/openhands-tools/openhands/tools/browser_use/definition.py

Lines 50 to 59 in 3aa647e

if self.screenshot_data:

mime_type = "image/png"

if self.screenshot_data.startswith("/9j/"):

mime_type = "image/jpeg"

elif self.screenshot_data.startswith("iVBORw0KGgo"):

mime_type = "image/png"

elif self.screenshot_data.startswith("R0lGODlh"):

mime_type = "image/gif"

elif self.screenshot_data.startswith("UklGR"):

mime_type = "image/webp"

@OpenHands /codereview-roasted read the comments above
the following block of code is 1. duplicated (copy-pasted from software-agent-sdk/openhands-tools/openhands/tools/browser_use/definition.py#L50-L59) 2. a bit hacky.
What are you suggestions to improve this?

if self.screenshot_data: mime_type = "image/png" if self.screenshot_data.startswith("/9j/"): mime_type = "image/jpeg" elif self.screenshot_data.startswith("iVBORw0KGgo"): mime_type = "image/png" elif self.screenshot_data.startswith("R0lGODlh"): mime_type = "image/gif" elif self.screenshot_data.startswith("UklGR"): mime_type = "image/webp"

I'm on it! simonrosenberg can track my progress at all-hands.dev

Code Review Summary

I've analyzed the duplicated and hacky MIME type detection code in PR #1016. Here's my assessment:

Problems Identified:

Code Duplication: The exact same MIME type detection logic (lines 344-354 in file_editor/editor.py) is copy-pasted from browser_use/definition.py (lines 50-59)

Wrong Approach: The code detects MIME types by pattern-matching base64-encoded strings rather than checking actual file data. This is brittle because you're inspecting an encoding artifact, not the actual file format.

Ignoring Available Data: In file_editor/editor.py, you have the file path but still read the entire file, encode it to base64, and pattern-match strings - when Python's mimetypes.guess_type(path) would do this correctly in one line.

Recommended Solutions:

For file_editor.py (SIMPLEST FIX):
Replace lines 344-354 with:

import mimetypes mime_type, _ = mimetypes.guess_type(path) if not mime_type or not mime_type.startswith('image/'): mime_type = 'application/octet-stream'

This is simpler (3 lines vs 10), faster (no file inspection needed), more correct (uses standard library), and more maintainable (handles all formats automatically).

For long-term maintainability:
Create a shared utility function in openhands-tools/openhands/tools/utils/mime_type.py that:

Uses mimetypes.guess_type() when a file path is available

Falls back to checking actual magic numbers (not base64 prefixes) for raw data

Can be used by both file_editor and browser_use to eliminate duplication

Verdict:

❌ Needs rework - Violates DRY principle and uses incorrect approach for type detection

The current approach works but is unnecessarily complex and will become a maintenance burden. Using Python's standard library is the pragmatic, simple solution that eliminates special cases.

_{View full conversation}

Perhaps we can implement the recommended solution @SmartManoj ? I dont think we need the long-term maintainability solution yet

mime_type = 'application/octet-stream'

@simonrosenberg, what do @OpenHands think about quote-style?

Indeed " is preferred!

Replaced base64 prefix checks with mimetypes.guess_type for determining image MIME type when reading image files. Falls back to 'image/png' if MIME type is not detected or not an image. This simplifies and improves accuracy of MIME type detection.

Replaces single quotes with double quotes in MIME type string comparison and assignment for consistency in the file editor.

xingyaoww · 2025-11-08T19:02:07Z

@OpenHands review the changes introduce in the current PR, then help me add an integration test under tests/integration/tests very similar to its existing format.

The test should work similar to examples/01_standalone_sdk/17_image_input.py:

it sets up the workspace by downloading the logo in example 17
it ask the agent what the color of the logo is (blue, yellow, or green)
in check phase, it should check the agent answered logo have yellow color

openhands-ai · 2025-11-08T19:02:17Z

I'm on it! xingyaoww can track my progress at all-hands.dev

This test verifies that the FileEditor can view and display image files by having an agent analyze the OpenHands logo and identify its colors. The test: 1. Downloads the OpenHands logo to the workspace 2. Asks the agent to view the logo.png file and identify its colors 3. Verifies that the agent correctly identifies yellow as one of the colors This test requires a vision-capable LLM model. Co-authored-by: openhands <[email protected]>

openhands-ai · 2025-11-08T19:07:17Z

Summary

I've successfully completed your request! Here's what was done:

✅ Reviewed PR #1016 Changes

The PR adds image file viewing support to FileEditor:

FileEditor now supports viewing image files (.png, .jpg, .jpeg, .gif, .webp, .bmp)
Images are returned as base64-encoded data wrapped in ImageContent
File validation was updated to allow image files (previously rejected as binary)
Unit tests verify image handling and backward compatibility

✅ Created Integration Test

Added tests/integration/tests/t08_image_file_viewing.py that:

Setup Phase: Downloads the OpenHands logo from the same URL used in example 17
Instruction: Asks the agent to view logo.png and identify if it's blue, yellow, or green
Verification Phase: Checks that the agent's response mentions "yellow" color
Vision Check: Ensures the test runs only with vision-capable LLM models (like example 17)

The test follows the existing integration test pattern (extends BaseIntegrationTest) and will be automatically discovered by the test runner.

✅ Quality Checks

All pre-commit hooks passed (formatting, linting, type checking)
Changes committed with proper co-author attribution
Pushed to branch img to update PR Add image file viewing support to FileEditor #1016

The integration test is now ready to be executed either through GitHub Actions (using the integration-test label) or locally via run_infer.py.

_{View full conversation}

Replace if-elif chain with dictionary-based lookup for cleaner, more maintainable code. Extract detection logic into reusable detect_image_mime_type() helper function. Add comprehensive test coverage for JPEG, PNG, GIF, WebP, and unknown formats. Co-authored-by: openhands <[email protected]>

github-actions · 2025-11-10T14:44:58Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2025-11-10T14:47:49Z

🧪 Integration Tests Results

Overall Success Rate: 95.8%
Total Cost: $0.44
Models Tested: 3
Timestamp: 2025-11-10 14:47:47 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

litellm_proxy_gpt_5_mini_2025_08_07: 📥 View & Download Logs
litellm_proxy_deepseek_deepseek_chat: 📥 View & Download Logs
litellm_proxy_claude_sonnet_4_5_20250929: 📥 View & Download Logs

📊 Summary

Model	Success Rate	Tests Passed	Total Tests	Cost
litellm_proxy_gpt_5_mini_2025_08_07	100.0%	8/8	8	$0.05
litellm_proxy_deepseek_deepseek_chat	87.5%	7/8	8	$0.02
litellm_proxy_claude_sonnet_4_5_20250929	100.0%	8/8	8	$0.37

📋 Detailed Results

litellm_proxy_gpt_5_mini_2025_08_07

Success Rate: 100.0% (8/8)
Total Cost: $0.05
Run Suffix: litellm_proxy_gpt_5_mini_2025_08_07_79ad28f_gpt5_mini_run_N8_20251110_144525

litellm_proxy_deepseek_deepseek_chat

Success Rate: 87.5% (7/8)
Total Cost: $0.02
Run Suffix: litellm_proxy_deepseek_deepseek_chat_79ad28f_deepseek_run_N8_20251110_144525

Failed Tests:

t08_image_file_viewing: Test execution failed: This test requires a vision-capable LLM model. Please use a model that supports image input. (Cost: $0.00)

Tests with Errors:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_claude_sonnet_4_5_20250929

Success Rate: 100.0% (8/8)
Total Cost: $0.37
Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_79ad28f_sonnet_run_N8_20251110_144526

xingyaoww · 2025-11-10T15:22:05Z

tests/integration/tests/t08_image_file_viewing.py

+        # Verify that the LLM supports vision
+        if not self.llm.vision_is_active():
+            raise ValueError(
+                "This test requires a vision-capable LLM model. "
+                "Please use a model that supports image input."
+            )


@OpenHands can you implement a way to SKIP integration tests for LLMs that don't support it?

I'm on it! xingyaoww can track my progress at all-hands.dev

✅ Implementation Complete: Skip Integration Tests for Unsupported LLMs

I've successfully implemented a comprehensive solution to gracefully skip integration tests when LLMs don't support required capabilities (e.g., vision). Here's what was done:

Changes Made

1. Added SkipTest Exception (tests/integration/base.py)

New exception class to signal when a test should be skipped

Can be raised by any test requiring specific LLM capabilities

2. Extended Data Models (tests/integration/base.py, tests/integration/schemas.py)

Added skipped: bool field to TestResult and TestResultData

Added skipped_tests: int counter to ModelTestResults

Updated from_eval_outputs() to properly track skipped tests

3. Updated Test Runner (tests/integration/run_infer.py)

Catches SkipTest exception and returns proper EvalOutput with skipped=True

Enhanced console output to display skipped tests with ⊘ symbol

Shows skipped test count in summary when > 0

4. Updated Image Test (tests/integration/tests/t08_image_file_viewing.py)

Changed from ValueError to SkipTest when vision not supported

Test now skips gracefully instead of failing

5. Enhanced Reporting (tests/integration/utils/generate_markdown_report.py)

Added "Skipped" column to summary table

Shows skipped tests count in detailed results

Lists skipped tests separately from failed tests

Benefits

Tests that require specific LLM capabilities (like vision) now skip gracefully instead of failing

Clear visibility into which tests were skipped and why

Skipped tests don't count against success rate

Better test organization and reporting

All changes have passed pre-commit checks (type checking, linting, formatting) and have been pushed to the PR branch.

_{View full conversation}

- Added SkipTest exception in base.py to signal when tests should be skipped - Extended TestResult and TestResultData schemas with 'skipped' field - Updated ModelTestResults to track skipped_tests count - Modified run_infer.py to catch SkipTest and handle skipped tests appropriately - Updated t08_image_file_viewing.py to raise SkipTest instead of ValueError when vision not supported - Enhanced console output to display skipped tests with ⊘ symbol - Updated markdown report generator to show skipped tests in summary and detailed results This allows integration tests to gracefully skip when LLMs lack required capabilities (e.g., vision support) rather than failing. Co-authored-by: openhands <[email protected]>

github-actions · 2025-11-10T16:33:13Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2025-11-10T16:36:39Z

🧪 Integration Tests Results

Overall Success Rate: 95.8%
Total Cost: $0.49
Models Tested: 3
Timestamp: 2025-11-10 16:36:37 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

litellm_proxy_gpt_5_mini_2025_08_07: 📥 View & Download Logs
litellm_proxy_deepseek_deepseek_chat: 📥 View & Download Logs
litellm_proxy_claude_sonnet_4_5_20250929: 📥 View & Download Logs

📊 Summary

Model	Success Rate	Tests Passed	Skipped	Total Tests	Cost
litellm_proxy_gpt_5_mini_2025_08_07	100.0%	8/8	0	8	$0.05
litellm_proxy_deepseek_deepseek_chat	87.5%	7/8	1	8	$0.02
litellm_proxy_claude_sonnet_4_5_20250929	100.0%	8/8	0	8	$0.42

📋 Detailed Results

litellm_proxy_gpt_5_mini_2025_08_07

Success Rate: 100.0% (8/8)
Total Cost: $0.05
Run Suffix: litellm_proxy_gpt_5_mini_2025_08_07_262bd60_gpt5_mini_run_N8_20251110_163346

litellm_proxy_deepseek_deepseek_chat

Success Rate: 87.5% (7/8)
Total Cost: $0.02
Run Suffix: litellm_proxy_deepseek_deepseek_chat_262bd60_deepseek_run_N8_20251110_163339
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_claude_sonnet_4_5_20250929

Success Rate: 100.0% (8/8)
Total Cost: $0.42
Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_262bd60_sonnet_run_N8_20251110_163341

github-actions · 2025-11-10T16:44:35Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2025-11-10T16:56:49Z

🧪 Integration Tests Results

Overall Success Rate: 95.8%
Total Cost: $0.46
Models Tested: 3
Timestamp: 2025-11-10 16:56:47 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

litellm_proxy_deepseek_deepseek_chat: 📥 View & Download Logs
litellm_proxy_gpt_5_mini_2025_08_07: 📥 View & Download Logs
litellm_proxy_claude_sonnet_4_5_20250929: 📥 View & Download Logs

📊 Summary

Model	Success Rate	Tests Passed	Skipped	Total Tests	Cost
litellm_proxy_deepseek_deepseek_chat	100.0%	7/7	1	8	$0.02
litellm_proxy_gpt_5_mini_2025_08_07	100.0%	8/8	0	8	$0.04
litellm_proxy_claude_sonnet_4_5_20250929	100.0%	8/8	0	8	$0.40

📋 Detailed Results

litellm_proxy_deepseek_deepseek_chat

Success Rate: 100.0% (7/7)
Total Cost: $0.02
Run Suffix: litellm_proxy_deepseek_deepseek_chat_4af032e_deepseek_run_N8_20251110_164515
Skipped Tests: 1/8

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_gpt_5_mini_2025_08_07

Success Rate: 100.0% (8/8)
Total Cost: $0.04
Run Suffix: litellm_proxy_gpt_5_mini_2025_08_07_4af032e_gpt5_mini_run_N8_20251110_164515

litellm_proxy_claude_sonnet_4_5_20250929

Success Rate: 100.0% (8/8)
Total Cost: $0.40
Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_4af032e_sonnet_run_N8_20251110_164514

xingyaoww

LGTM! Thank you

xingyaoww · 2025-11-10T20:47:09Z

openhands-tools/openhands/tools/file_editor/definition.py

 TOOL_DESCRIPTION = """Custom editing tool for viewing, creating and editing files in plain-text format
 * State is persistent across command calls and discussions with the user
 * If `path` is a text file, `view` displays the result of applying `cat -n`. If `path` is a directory, `view` lists non-hidden files and directories up to 2 levels deep
+* If `path` is an image file (.png, .jpg, .jpeg, .gif, .webp, .bmp), `view` displays the image content


@OpenHands creates an issue in this repo: we should be able to decide if we want to include this line of instruction based on whether the LLM supports image inputs -- when using OSS models that don't support LLM inputs, we should not include this line

I'm on it! xingyaoww can track my progress at all-hands.dev

Summary

I've successfully addressed the PR comment from @xingyaoww about making the image viewing instruction conditional based on LLM vision support.

Changes Made

✅ Request Checklist:

Image viewing instruction is now conditionally included based on LLM vision support

OSS models without vision support won't see the misleading instruction

Vision-capable models will still get the image viewing instruction

Tests added to verify both scenarios

All pre-commit checks pass

Changes committed and pushed to the img branch

📝 Implementation Details:

Modified openhands-tools/openhands/tools/file_editor/definition.py:

Removed hardcoded image viewing line from TOOL_DESCRIPTION constant

Added conditional logic in FileEditorTool.create() that checks conv_state.agent.llm.vision_is_active()

When vision is supported, the instruction is dynamically inserted at the correct position

When vision is not supported, the instruction is omitted entirely

Added tests in tests/tools/file_editor/test_file_editor_tool.py:

test_file_editor_tool_image_viewing_line_with_vision_enabled(): Verifies instruction appears with gpt-4o-mini

test_file_editor_tool_image_viewing_line_with_vision_disabled(): Verifies instruction is absent with gpt-3.5-turbo

✅ Verification:

All existing tests pass (10/10 in test_file_editor_tool.py)

New tests pass and validate the conditional behavior

Pre-commit hooks pass (Ruff format, Ruff lint, pycodestyle, pyright)

Changes pushed to remote branch img

The FileEditor tool now intelligently adapts its description based on the LLM's capabilities, preventing confusion when using models without vision support.

_{View full conversation}

The FileEditor tool now conditionally includes the image viewing instruction in its description based on whether the LLM supports vision capabilities. This prevents confusion when using OSS models that don't support image inputs. - Remove image viewing line from base TOOL_DESCRIPTION - Add conditional logic in FileEditorTool.create() to include the image viewing line only when conv_state.agent.llm.vision_is_active() returns True - Add tests to verify the behavior with vision-enabled and vision-disabled LLMs Co-authored-by: openhands <[email protected]>

github-actions · 2025-11-11T14:45:27Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2025-11-11T14:49:32Z

🧪 Integration Tests Results

Overall Success Rate: 71.9%
Total Cost: $0.50
Models Tested: 4
Timestamp: 2025-11-11 14:49:30 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

litellm_proxy_gpt_5_mini_2025_08_07: 📥 View & Download Logs
litellm_proxy_moonshot_kimi_k2_thinking: 📥 View & Download Logs
litellm_proxy_deepseek_deepseek_chat: 📥 View & Download Logs
litellm_proxy_claude_sonnet_4_5_20250929: 📥 View & Download Logs

📊 Summary

Model	Success Rate	Tests Passed	Skipped	Total Tests	Cost
litellm_proxy_gpt_5_mini_2025_08_07	100.0%	8/8	0	8	$0.04
litellm_proxy_moonshot_kimi_k2_thinking	0.0%	0/8	1	8	$0.00
litellm_proxy_deepseek_deepseek_chat	87.5%	7/8	1	8	$0.02
litellm_proxy_claude_sonnet_4_5_20250929	100.0%	8/8	0	8	$0.44

📋 Detailed Results

litellm_proxy_gpt_5_mini_2025_08_07

Success Rate: 100.0% (8/8)
Total Cost: $0.04
Run Suffix: litellm_proxy_gpt_5_mini_2025_08_07_fe9257d_gpt5_mini_run_N8_20251111_144555

litellm_proxy_moonshot_kimi_k2_thinking

Success Rate: 0.0% (0/8)
Total Cost: $0.00
Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_fe9257d_kimi_k2_run_N8_20251111_144554
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

t02_add_bash_hello: Test execution failed: Conversation run failed for id=afee178a-e1e2-41be-ac5a-4480fcdc0afc: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t04_git_staging: Test execution failed: Conversation run failed for id=d33cd0e9-152a-48d2-9173-6f2f2a492711: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t07_interactive_commands: Test execution failed: Conversation run failed for id=966723df-e411-4788-9750-436427ed8491: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t03_jupyter_write_file: Test execution failed: Conversation run failed for id=d6ebfde7-5e34-4a3f-a661-a5f1d6217745: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t06_github_pr_browsing: Test execution failed: Conversation run failed for id=406cfae5-c3b5-49fd-b36e-f708a0fed128: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t01_fix_simple_typo: Test execution failed: Conversation run failed for id=2b48e597-e048-46ce-9323-93886f6d11fa: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t05_simple_browsing: Test execution failed: Conversation run failed for id=0d8f6394-7f72-4321-aa1e-84c94993a98d: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)

litellm_proxy_deepseek_deepseek_chat

Success Rate: 87.5% (7/8)
Total Cost: $0.02
Run Suffix: litellm_proxy_deepseek_deepseek_chat_fe9257d_deepseek_run_N8_20251111_144548
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_claude_sonnet_4_5_20250929

Success Rate: 100.0% (8/8)
Total Cost: $0.44
Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_fe9257d_sonnet_run_N8_20251111_144549

github-actions · 2025-11-11T14:58:13Z

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

github-actions · 2025-11-11T15:02:10Z

🧪 Integration Tests Results

Overall Success Rate: 71.9%
Total Cost: $0.46
Models Tested: 4
Timestamp: 2025-11-11 15:02:08 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

litellm_proxy_gpt_5_mini_2025_08_07: 📥 View & Download Logs
litellm_proxy_claude_sonnet_4_5_20250929: 📥 View & Download Logs
litellm_proxy_deepseek_deepseek_chat: 📥 View & Download Logs
litellm_proxy_moonshot_kimi_k2_thinking: 📥 View & Download Logs

📊 Summary

Model	Success Rate	Tests Passed	Skipped	Total Tests	Cost
litellm_proxy_gpt_5_mini_2025_08_07	100.0%	8/8	0	8	$0.04
litellm_proxy_claude_sonnet_4_5_20250929	100.0%	8/8	0	8	$0.38
litellm_proxy_deepseek_deepseek_chat	87.5%	7/8	1	8	$0.04
litellm_proxy_moonshot_kimi_k2_thinking	0.0%	0/8	1	8	$0.00

📋 Detailed Results

litellm_proxy_gpt_5_mini_2025_08_07

Success Rate: 100.0% (8/8)
Total Cost: $0.04
Run Suffix: litellm_proxy_gpt_5_mini_2025_08_07_07ab4b2_gpt5_mini_run_N8_20251111_145839

litellm_proxy_claude_sonnet_4_5_20250929

Success Rate: 100.0% (8/8)
Total Cost: $0.38
Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_07ab4b2_sonnet_run_N8_20251111_145841

litellm_proxy_deepseek_deepseek_chat

Success Rate: 87.5% (7/8)
Total Cost: $0.04
Run Suffix: litellm_proxy_deepseek_deepseek_chat_07ab4b2_deepseek_run_N8_20251111_145833
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

litellm_proxy_moonshot_kimi_k2_thinking

Success Rate: 0.0% (0/8)
Total Cost: $0.00
Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_07ab4b2_kimi_k2_run_N8_20251111_145843
Skipped Tests: 1

Skipped Tests:

t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

t06_github_pr_browsing: Test execution failed: Conversation run failed for id=0dd957eb-36be-49da-b9f3-7bc4d9d29724: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t04_git_staging: Test execution failed: Conversation run failed for id=b7a47619-1a03-460f-84f7-400af28162de: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t03_jupyter_write_file: Test execution failed: Conversation run failed for id=25646974-7e75-4e7c-9671-1880512a750e: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t05_simple_browsing: Test execution failed: Conversation run failed for id=a70d2545-04c8-4751-b2ea-b1f9ad76baa6: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t01_fix_simple_typo: Test execution failed: Conversation run failed for id=3bca5078-179b-4b99-86df-aec054f863db: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t02_add_bash_hello: Test execution failed: Conversation run failed for id=7b612077-20ad-422b-905d-66de6c54e318: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)
t07_interactive_commands: Test execution failed: Conversation run failed for id=c124ed41-afec-4594-ade0-ef2c3ae0a9b6: litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}]. Received Model Group=moonshot/kimi-k2-thinking\nAvailable Model Group Fallbacks=None\nError doing the fallback: litellm.BadRequestError: MoonshotException - Invalid request: Your request exceeded model token limit: 262144No fallback model group found for original model_group=moonshot/kimi-k2-thinking. Fallbacks=[{'qwen3-coder-480b': ['qwen3-coder-480b-or']}, {'glm-4.5': ['glm-4.5-or']}] LiteLLM Retried: 1 times, LiteLLM Max Retries: 2", 'type': 'invalid_request_error', 'param': None, 'code': '400'}} (Cost: $0.00)

SmartManoj added 2 commits November 4, 2025 15:21

Merge branch 'main' into img

68c990a

SmartManoj added 4 commits November 6, 2025 09:49

Merge branch 'main' into img

33f0a3f

ruff

9735516

ruff

2f0e725

Update image file validation test assertion

a883f91

SmartManoj force-pushed the img branch from 0eca9fe to a883f91 Compare November 6, 2025 04:58

format

d1f4666

blacksmith-sh bot requested a review from xingyaoww November 6, 2025 13:01

Merge branch 'main' into img

384522b

xingyaoww reviewed Nov 6, 2025

View reviewed changes

SmartManoj added 3 commits November 7, 2025 02:55

Add test for viewing image files in FileEditor

5fa71b8

Introduces a test to verify that viewing a PNG image file with FileEditor returns an ImageContent object containing a base64-encoded image URL. This ensures image files are handled correctly by the view command.

Merge branch 'main' into img

1b38b0e

Use double quotes for byte strings in PNG test

d7e27e8

Replaces single quotes with double quotes for byte string literals in the PNG image data and updates an attribute check to use double quotes for consistency in test_view_image_file_returns_image_content.

Merge branch 'main' into img

eb9373b

simonrosenberg reviewed Nov 8, 2025

View reviewed changes

SmartManoj and others added 3 commits November 8, 2025 16:55

Use double quotes for image MIME type check

a1801b8

Replaces single quotes with double quotes in MIME type string comparison and assignment for consistency in the file editor.

Merge branch 'main' into img

3f908b9

simonrosenberg force-pushed the img branch from 3f6d031 to ffcaabc Compare November 9, 2025 20:10

OpenHands deleted a comment from openhands-ai bot Nov 9, 2025

xingyaoww added the integration-test Runs the integration tests and comments the results label Nov 10, 2025

xingyaoww reviewed Nov 10, 2025

View reviewed changes

xingyaoww force-pushed the img branch from 79ad28f to 262bd60 Compare November 10, 2025 16:30

xingyaoww added integration-test Runs the integration tests and comments the results and removed integration-test Runs the integration tests and comments the results labels Nov 10, 2025

xingyaoww approved these changes Nov 10, 2025

View reviewed changes

xingyaoww reviewed Nov 10, 2025

View reviewed changes

xingyaoww force-pushed the img branch from 4af032e to dac9357 Compare November 11, 2025 01:29

Merge branch 'main' into img

fe9257d

xingyaoww added integration-test Runs the integration tests and comments the results and removed integration-test Runs the integration tests and comments the results labels Nov 11, 2025

Merge branch 'main' into img

07ab4b2

xingyaoww added integration-test Runs the integration tests and comments the results and removed integration-test Runs the integration tests and comments the results labels Nov 11, 2025

xingyaoww merged commit 488806e into OpenHands:main Nov 11, 2025
24 checks passed

xingyaoww mentioned this pull request Nov 11, 2025

Support kimi-k2 extended thinking, fix prompt caching stats, fix max output #1133

Merged

	if self.screenshot_data:
	mime_type = "image/png"
	if self.screenshot_data.startswith("/9j/"):
	mime_type = "image/jpeg"
	elif self.screenshot_data.startswith("iVBORw0KGgo"):
	mime_type = "image/png"
	elif self.screenshot_data.startswith("R0lGODlh"):
	mime_type = "image/gif"
	elif self.screenshot_data.startswith("UklGR"):
	mime_type = "image/webp"

Add image file viewing support to FileEditor #1016

Add image file viewing support to FileEditor #1016

Conversation

SmartManoj commented Nov 4, 2025

Uh oh!

simonrosenberg commented Nov 4, 2025

Uh oh!

blacksmith-sh bot commented Nov 6, 2025

Uh oh!

xingyaoww left a comment

Choose a reason for hiding this comment

Uh oh!

enyst commented Nov 8, 2025

Uh oh!

simonrosenberg Nov 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SmartManoj Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

simonrosenberg Nov 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

openhands-ai bot Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

openhands-ai bot Nov 8, 2025

Choose a reason for hiding this comment

Code Review Summary

Problems Identified:

Recommended Solutions:

Verdict:

Uh oh!

simonrosenberg Nov 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

SmartManoj Nov 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

simonrosenberg Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

xingyaoww commented Nov 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openhands-ai bot commented Nov 8, 2025

Uh oh!

openhands-ai bot commented Nov 8, 2025

Summary

✅ Reviewed PR #1016 Changes

✅ Created Integration Test

✅ Quality Checks

Uh oh!

github-actions bot commented Nov 10, 2025

Uh oh!

github-actions bot commented Nov 10, 2025

🧪 Integration Tests Results

📁 Detailed Logs & Artifacts

📊 Summary

📋 Detailed Results

litellm_proxy_gpt_5_mini_2025_08_07

litellm_proxy_deepseek_deepseek_chat

litellm_proxy_claude_sonnet_4_5_20250929

Uh oh!

xingyaoww Nov 10, 2025

Choose a reason for hiding this comment

Uh oh!

openhands-ai bot Nov 10, 2025

Choose a reason for hiding this comment

Uh oh!

openhands-ai bot Nov 10, 2025

Choose a reason for hiding this comment

✅ Implementation Complete: Skip Integration Tests for Unsupported LLMs

Changes Made

Benefits

Uh oh!

simonrosenberg Nov 8, 2025 •

edited

Loading

simonrosenberg Nov 8, 2025 •

edited

Loading

simonrosenberg Nov 8, 2025 •

edited

Loading

SmartManoj Nov 8, 2025 •

edited

Loading

xingyaoww commented Nov 8, 2025 •

edited

Loading