
@ryanhoangt
Collaborator

@ryanhoangt ryanhoangt commented Dec 3, 2025


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:641e720-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-641e720-python \
  ghcr.io/openhands/agent-server:641e720-python

All tags pushed for this build

ghcr.io/openhands/agent-server:641e720-golang-amd64
ghcr.io/openhands/agent-server:641e720-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:641e720-golang-arm64
ghcr.io/openhands/agent-server:641e720-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:641e720-java-amd64
ghcr.io/openhands/agent-server:641e720-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:641e720-java-arm64
ghcr.io/openhands/agent-server:641e720-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:641e720-python-amd64
ghcr.io/openhands/agent-server:641e720-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:641e720-python-arm64
ghcr.io/openhands/agent-server:641e720-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:641e720-golang
ghcr.io/openhands/agent-server:641e720-java
ghcr.io/openhands/agent-server:641e720-python
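
The naming pattern in the tag list above can be sketched in Python. This is an illustration only: the "_s_" and "_tag_" substitutions are inferred from the listed tags, and `sanitize_base_image` and `tags_for_build` are hypothetical names, not functions from the project's build scripts.

```python
# Sketch only: the substitution rules below are inferred from the published
# tag list, not copied from the real build tooling.
REPO = "ghcr.io/openhands/agent-server"
ARCHES = ("amd64", "arm64")
VARIANTS = {
    "golang": "golang:1.21-bookworm",
    "java": "eclipse-temurin:17-jdk",
    "python": "nikolaik/python-nodejs:python3.12-nodejs22",
}


def sanitize_base_image(base_image: str) -> str:
    """Turn a base image reference into a tag-safe slug (assumed rules)."""
    return base_image.replace("/", "_s_").replace(":", "_tag_")


def tags_for_build(short_sha: str) -> list:
    """Return per-architecture tags first, then the multi-arch manifest tags."""
    tags = []
    for variant, base_image in VARIANTS.items():
        slug = sanitize_base_image(base_image)
        for arch in ARCHES:
            tags.append(f"{REPO}:{short_sha}-{variant}-{arch}")
            tags.append(f"{REPO}:{short_sha}-{slug}-{arch}")
    # One manifest tag per variant, without an architecture suffix.
    tags.extend(f"{REPO}:{short_sha}-{variant}" for variant in VARIANTS)
    return tags


if __name__ == "__main__":
    for tag in tags_for_build("641e720"):
        print(tag)
```

Each variant contributes two tags per architecture (one by variant name, one by sanitized base image) plus one manifest tag.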

About Multi-Architecture Support

  • Each variant tag (e.g., 641e720-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 641e720-python-amd64) are also available if needed
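
If a script needs to pin an architecture-specific tag explicitly, the suffix can be derived from the local machine type. The sketch below is a hedged illustration: `arch_specific_tag` is a hypothetical helper, the mapping table is an assumption, and the machine running the script is not necessarily the platform the Docker daemon targets.

```python
import platform

# Assumed mapping from platform.machine() values to the tag suffixes above.
_MACHINE_TO_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def arch_specific_tag(variant_tag, machine=None):
    """Append the architecture suffix to a variant tag such as '641e720-python'."""
    machine = (machine or platform.machine()).lower()
    arch = _MACHINE_TO_ARCH.get(machine)
    if arch is None:
        raise ValueError(f"no published image for machine type {machine!r}")
    return f"{variant_tag}-{arch}"


if __name__ == "__main__":
    print(arch_specific_tag("641e720-python"))
```

In most cases the plain variant tag is enough, since Docker resolves the manifest to the right architecture on its own.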

SmartManoj and others added 26 commits November 4, 2025 14:55
Enhanced _check_chromium_available to check common Windows installation paths for Chrome and Edge executables. Also added support for Playwright cache detection on Windows using LOCALAPPDATA.
Removed redundant os.name check and always define Windows Chromium and Edge paths. Simplified Playwright cache candidate logic by unconditionally including the Windows path.
Simplifies and generalizes the construction of Windows browser executable paths by iterating over environment variables and browser definitions. This reduces code duplication and improves maintainability.
Cleaned up unnecessary trailing whitespace in the _check_chromium_available function for improved code style.
Remove incorrect skip logic that prevented checking %LOCALAPPDATA% for
Microsoft Edge installations. While Edge is typically installed system-wide,
it can also be installed per-user in enterprise environments at
%LOCALAPPDATA%\Microsoft\Edge\Application\msedge.exe. This fix ensures
the browser detection can find Edge in both installation scenarios.
Renamed the 'browsers' variable to 'windows_browsers' for clarity and updated its usage in the Chromium availability check function.
Introduces WindowsBrowserToolExecutor in a new impl_windows.py for improved Chromium detection on Windows. Updates BrowserToolSet to use the Windows-specific executor when running on Windows, and refactors impl.py to remove Windows-specific code paths.
Moved Chromium detection functions into BrowserToolExecutor as instance methods and provided a Windows-specific override in WindowsBrowserToolExecutor. This improves platform extensibility and removes reliance on module-level function overrides.
Updated tests to instantiate BrowserToolExecutor and WindowsBrowserToolExecutor for Chromium detection methods, replacing direct function calls. This aligns tests with the refactored implementation and improves test accuracy for platform-specific logic.
Updated test cases in browser cleanup and initialization tests to use patch.object on BrowserToolExecutor for _ensure_chromium_available, instead of patching the function by import path. This improves test robustness and clarity by directly targeting the class method.
Refactored browser path search to short-circuit on first found executable for efficiency and clarified environment variable handling. Updated test to use a more accurate mock for os.environ.get.
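
The Windows detection approach described in the commits above (iterate over install-root environment variables and browser definitions, return on the first hit) might look roughly like the following. This is a minimal sketch, not the code from impl_windows.py: `find_windows_browser`, `WINDOWS_BROWSERS`, and `INSTALL_ROOT_ENV_VARS` are hypothetical names, and the exact set of environment variables checked is an assumption.

```python
import ntpath
import os

# Hypothetical browser definitions: path parts relative to an install root.
WINDOWS_BROWSERS = (
    ("Google", "Chrome", "Application", "chrome.exe"),
    ("Microsoft", "Edge", "Application", "msedge.exe"),
)

# Assumed install roots; LOCALAPPDATA covers per-user installs.
INSTALL_ROOT_ENV_VARS = ("PROGRAMFILES", "PROGRAMFILES(X86)", "LOCALAPPDATA")


def find_windows_browser(environ=None, exists=os.path.isfile):
    """Return the first existing browser executable, or None if none is found."""
    environ = os.environ if environ is None else environ
    for env_var in INSTALL_ROOT_ENV_VARS:
        root = environ.get(env_var)
        if not root:
            continue  # Variable unset on this machine; skip this root.
        for parts in WINDOWS_BROWSERS:
            # ntpath keeps Windows separators even when run on another OS.
            candidate = ntpath.join(root, *parts)
            if exists(candidate):
                return candidate  # Short-circuit on the first match.
    return None


if __name__ == "__main__":
    print(find_windows_browser())
```

Passing `environ` and `exists` as parameters keeps the function testable without patching `os.environ` or the filesystem.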
@ryanhoangt ryanhoangt added the test-examples Run all applicable "examples/" files. Expensive operation. label Dec 3, 2025
@github-actions
Contributor

github-actions bot commented Dec 3, 2025

🔄 Running Examples with litellm_proxy/claude-sonnet-4-5-20250929

Generated: 2025-12-03 16:46:20 UTC

Example Status Duration Cost
01_standalone_sdk/02_custom_tools.py ✅ PASS 48.6s $0.07
01_standalone_sdk/03_activate_skill.py ✅ PASS 36.2s $0.03
01_standalone_sdk/05_use_llm_registry.py ✅ PASS 20.0s $0.01
01_standalone_sdk/07_mcp_integration.py ✅ PASS 1m 25s $0.10
01_standalone_sdk/09_pause_example.py ✅ PASS 24.8s $0.03
01_standalone_sdk/10_persistence.py ✅ PASS 55.9s $0.07
01_standalone_sdk/11_async.py ✅ PASS 1m 19s $0.09
01_standalone_sdk/12_custom_secrets.py ✅ PASS 25.6s $0.02
01_standalone_sdk/13_get_llm_metrics.py ✅ PASS 44.6s $0.04
01_standalone_sdk/14_context_condenser.py ✅ PASS 2m 31s $0.45
01_standalone_sdk/17_image_input.py ✅ PASS 28.1s $0.03
01_standalone_sdk/18_send_message_while_processing.py ✅ PASS 39.9s $0.03
01_standalone_sdk/19_llm_routing.py ✅ PASS 24.5s $0.05
01_standalone_sdk/20_stuck_detector.py ✅ PASS 34.6s $0.03
01_standalone_sdk/21_generate_extraneous_conversation_costs.py ✅ PASS 18.3s $0.01
01_standalone_sdk/22_anthropic_thinking.py ✅ PASS 29.7s $0.03
01_standalone_sdk/23_responses_reasoning.py ✅ PASS 37.4s $0.01
01_standalone_sdk/24_planning_agent_workflow.py ✅ PASS 9m 5s $0.97
01_standalone_sdk/25_agent_delegation.py ✅ PASS 3m 22s $0.40
01_standalone_sdk/26_custom_visualizer.py ✅ PASS 37.6s $0.06
01_standalone_sdk/28_ask_agent_example.py ✅ PASS 1m 4s $0.08
01_standalone_sdk/29_llm_streaming.py ✅ PASS 1m 14s $0.06
01_standalone_sdk/30_tom_agent.py ❌ FAIL (Missing EXAMPLE_COST marker in stdout) 42.8s --
02_remote_agent_server/01_convo_with_local_agent_server.py ✅ PASS 2m 0s $0.16
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py ✅ PASS 3m 24s $0.10
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py ✅ PASS 2m 1s $0.22
02_remote_agent_server/04_convo_with_api_sandboxed_server.py ✅ PASS 3m 42s $0.15

❌ Some tests failed

Total: 27 | Passed: 26 | Failed: 1 | Total Cost: $3.28

Failed examples:

  • examples/01_standalone_sdk/30_tom_agent.py: Missing EXAMPLE_COST marker in stdout
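
For context on this failure mode, a runner that requires a cost marker in each example's stdout could be sketched as below. The marker format (`EXAMPLE_COST: <number>` on its own line) and the helper names are assumptions for illustration; the actual workflow script may parse the output differently.

```python
import re
from typing import Optional

# Assumed marker format: a stdout line such as "EXAMPLE_COST: 0.04".
_COST_MARKER = re.compile(r"^EXAMPLE_COST:\s*\$?(\d+(?:\.\d+)?)\s*$", re.MULTILINE)


def extract_example_cost(stdout: str) -> Optional[float]:
    """Return the last reported cost, or None when no marker was printed."""
    matches = _COST_MARKER.findall(stdout)
    return float(matches[-1]) if matches else None


def summarize(stdout: str) -> str:
    """Build a one-line status similar to the rows in the report above."""
    cost = extract_example_cost(stdout)
    if cost is None:
        return "FAIL: Missing EXAMPLE_COST marker in stdout"
    return f"PASS ${cost:.2f}"


if __name__ == "__main__":
    print(summarize("agent finished\nEXAMPLE_COST: 0.04\n"))
    print(summarize("agent finished\n"))
```

Under this assumption, an example that runs to completion but never prints the marker is still reported as failed, which matches the row for 30_tom_agent.py.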

View full workflow run

@github-actions
Contributor

github-actions bot commented Dec 3, 2025

Coverage

Coverage Report
File | Stmts | Miss | Cover | Missing
openhands-tools/openhands/tools/browser_use
   definition.py | 118 | 17 | 85% | 43–46, 62, 65–66, 69–71, 82–83, 85–86, 88, 567, 571
   impl.py | 184 | 123 | 33% | 25, 34, 36–38, 40–41, 48–50, 52–56, 61, 98, 107–110, 113, 118–121, 123, 132–135, 148, 178–179, 190–191, 206, 212, 226–227, 229–238, 241–250, 252–253, 259, 264–267, 275, 277–278, 283–284, 288–289, 294–295, 299–300, 304–305, 309, 311–312, 314–317, 320–321, 327, 329, 331, 340–341, 345–346, 350–351, 356–357, 363–367, 371–376, 380–383, 385–387, 390, 394–397
TOTAL | 12869 | 6058 | 52% |

@ryanhoangt ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Dec 3, 2025
@github-actions
Contributor

github-actions bot commented Dec 3, 2025

🔄 Running Examples with litellm_proxy/claude-sonnet-4-5-20250929

Generated: 2025-12-03 16:55:10 UTC

Example Status Duration Cost
01_standalone_sdk/02_custom_tools.py ✅ PASS 44.8s $0.07
01_standalone_sdk/03_activate_skill.py ✅ PASS 30.5s $0.04
01_standalone_sdk/05_use_llm_registry.py ✅ PASS 19.1s $0.02
01_standalone_sdk/07_mcp_integration.py ✅ PASS 1m 12s $0.08
01_standalone_sdk/09_pause_example.py ✅ PASS 27.6s $0.03
01_standalone_sdk/10_persistence.py ✅ PASS 59.9s $0.07
01_standalone_sdk/11_async.py ✅ PASS 1m 10s $0.08
01_standalone_sdk/12_custom_secrets.py ✅ PASS 22.2s $0.03
01_standalone_sdk/13_get_llm_metrics.py ✅ PASS 46.7s $0.04
01_standalone_sdk/14_context_condenser.py ✅ PASS 4m 5s $0.75
01_standalone_sdk/17_image_input.py ✅ PASS 33.4s $0.05
01_standalone_sdk/18_send_message_while_processing.py ✅ PASS 30.2s $0.03
01_standalone_sdk/19_llm_routing.py ✅ PASS 24.5s $0.05
01_standalone_sdk/20_stuck_detector.py ✅ PASS 26.8s $0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py ✅ PASS 18.0s $0.01
01_standalone_sdk/22_anthropic_thinking.py ✅ PASS 29.1s $0.03
01_standalone_sdk/23_responses_reasoning.py ✅ PASS 36.5s $0.01
01_standalone_sdk/24_planning_agent_workflow.py ✅ PASS 6m 8s $0.60
01_standalone_sdk/25_agent_delegation.py ✅ PASS 3m 12s $0.49
01_standalone_sdk/26_custom_visualizer.py ✅ PASS 45.9s $0.07
01_standalone_sdk/28_ask_agent_example.py ✅ PASS 49.3s $0.08
01_standalone_sdk/29_llm_streaming.py ✅ PASS 1m 14s $0.08
01_standalone_sdk/30_tom_agent.py ❌ FAIL (Missing EXAMPLE_COST marker in stdout) 51.2s --
02_remote_agent_server/01_convo_with_local_agent_server.py ✅ PASS 1m 50s $0.16
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py ✅ PASS 2m 28s $0.11
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py ✅ PASS 3m 5s $0.16
02_remote_agent_server/04_convo_with_api_sandboxed_server.py ✅ PASS 2m 22s $0.11

❌ Some tests failed

Total: 27 | Passed: 26 | Failed: 1 | Total Cost: $3.29

Failed examples:

  • examples/01_standalone_sdk/30_tom_agent.py: Missing EXAMPLE_COST marker in stdout

View full workflow run

@ryanhoangt ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Dec 3, 2025
@github-actions
Contributor

github-actions bot commented Dec 3, 2025

🔄 Running Examples with litellm_proxy/claude-sonnet-4-5-20250929

Generated: 2025-12-03 17:26:32 UTC

Example Status Duration Cost
01_standalone_sdk/02_custom_tools.py ✅ PASS 47.2s $0.08
01_standalone_sdk/03_activate_skill.py ✅ PASS 29.2s $0.04
01_standalone_sdk/05_use_llm_registry.py ✅ PASS 18.0s $0.02
01_standalone_sdk/07_mcp_integration.py ✅ PASS 1m 43s $0.12
01_standalone_sdk/09_pause_example.py ✅ PASS 26.8s $0.03
01_standalone_sdk/10_persistence.py ✅ PASS 58.3s $0.07
01_standalone_sdk/11_async.py ✅ PASS 1m 6s $0.10
01_standalone_sdk/12_custom_secrets.py ✅ PASS 23.4s $0.04
01_standalone_sdk/13_get_llm_metrics.py ✅ PASS 43.6s $0.04
01_standalone_sdk/14_context_condenser.py ✅ PASS 5m 36s $0.95
01_standalone_sdk/17_image_input.py ✅ PASS 28.3s $0.05
01_standalone_sdk/18_send_message_while_processing.py ✅ PASS 25.6s $0.02
01_standalone_sdk/19_llm_routing.py ✅ PASS 21.8s $0.04
01_standalone_sdk/20_stuck_detector.py ✅ PASS 29.4s $0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py ✅ PASS 17.4s $0.01
01_standalone_sdk/22_anthropic_thinking.py ✅ PASS 26.5s $0.03
01_standalone_sdk/23_responses_reasoning.py ✅ PASS 1m 18s $0.02
01_standalone_sdk/24_planning_agent_workflow.py ✅ PASS 6m 15s $0.61
01_standalone_sdk/25_agent_delegation.py ✅ PASS 3m 15s $0.51
01_standalone_sdk/26_custom_visualizer.py ✅ PASS 34.2s $0.06
01_standalone_sdk/28_ask_agent_example.py ✅ PASS 53.2s $0.08
01_standalone_sdk/29_llm_streaming.py ✅ PASS 1m 5s $0.07
01_standalone_sdk/30_tom_agent.py ✅ PASS 38.5s $0.04
02_remote_agent_server/01_convo_with_local_agent_server.py ✅ PASS 1m 50s $0.17
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py ✅ PASS 2m 52s $0.11
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py ✅ PASS 2m 58s $0.17
02_remote_agent_server/04_convo_with_api_sandboxed_server.py ✅ PASS 2m 4s $0.12

✅ All tests passed!

Total: 27 | Passed: 27 | Failed: 0 | Total Cost: $3.61

View full workflow run

SmartManoj and others added 6 commits December 4, 2025 05:20
BrowserToolExecutor now checks common Linux and macOS installation paths for Chromium and Chrome if not found in PATH. Corresponding tests were added to verify detection via these standard paths.
Renamed the example script to match the updated numbering in the standalone SDK examples directory.
Adds a print statement to display the accumulated cost from the LLM metrics at the end of the Tom agent consultation example.
Changed references from 30_windows.py to 31_windows.py in the exemption list and workflow configuration to reflect the new filename.
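
The Linux and macOS fallback described in the first commit of this batch (check PATH first, then common installation paths) could be sketched as follows. The executable names, candidate paths, and the `find_chromium` helper are assumptions chosen for illustration, not the list used by BrowserToolExecutor.

```python
import os
import shutil
import sys

# Assumed executable names to look up on PATH.
PATH_NAMES = ("chromium", "chromium-browser", "google-chrome", "google-chrome-stable")

# Assumed standard install locations, keyed by sys.platform value.
STANDARD_PATHS = {
    "linux": (
        "/usr/bin/chromium",
        "/usr/bin/chromium-browser",
        "/usr/bin/google-chrome",
        "/snap/bin/chromium",
    ),
    "darwin": (
        "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
        "/Applications/Chromium.app/Contents/MacOS/Chromium",
    ),
}


def find_chromium(platform=sys.platform, which=shutil.which, exists=os.path.isfile):
    """Return a Chromium or Chrome executable path, or None if none is found."""
    # Prefer whatever the user has on PATH.
    for name in PATH_NAMES:
        found = which(name)
        if found:
            return found
    # Fall back to well-known locations for the current platform.
    for candidate in STANDARD_PATHS.get(platform, ()):
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print(find_chromium())
```

Injecting `which` and `exists` mirrors how such a function can be tested with simple stand-ins rather than a real browser install.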
@ryanhoangt ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Dec 4, 2025
@openhands-ai

openhands-ai bot commented Dec 4, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Run Examples Scripts

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1309 at branch `smj-chromium`

Feel free to include any additional details that might help me get this PR into a better state.


@ryanhoangt ryanhoangt added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Dec 4, 2025
@github-actions
Contributor

github-actions bot commented Dec 4, 2025

🔄 Running Examples with litellm_proxy/claude-haiku-4-5-20251001

Generated: 2025-12-04 13:03:09 UTC

Example Status Duration Cost
01_standalone_sdk/02_custom_tools.py ✅ PASS 30.7s $0.03
01_standalone_sdk/03_activate_skill.py ✅ PASS 19.4s $0.02
01_standalone_sdk/05_use_llm_registry.py ✅ PASS 16.3s $0.01
01_standalone_sdk/07_mcp_integration.py ✅ PASS 1m 6s $0.04
01_standalone_sdk/09_pause_example.py ✅ PASS 21.1s $0.01
01_standalone_sdk/10_persistence.py ✅ PASS 43.8s $0.02
01_standalone_sdk/11_async.py ✅ PASS 44.4s $0.03
01_standalone_sdk/12_custom_secrets.py ✅ PASS 15.3s $0.01
01_standalone_sdk/13_get_llm_metrics.py ✅ PASS 38.4s $0.01
01_standalone_sdk/14_context_condenser.py ✅ PASS 2m 31s $0.27
01_standalone_sdk/17_image_input.py ✅ PASS 18.4s $0.02
01_standalone_sdk/18_send_message_while_processing.py ✅ PASS 25.2s $0.01
01_standalone_sdk/19_llm_routing.py ✅ PASS 17.8s $0.02
01_standalone_sdk/20_stuck_detector.py ✅ PASS 30.0s $0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py ✅ PASS 10.2s $0.00
01_standalone_sdk/22_anthropic_thinking.py ✅ PASS 14.9s $0.01
01_standalone_sdk/23_responses_reasoning.py ✅ PASS 41.4s $0.01
01_standalone_sdk/24_planning_agent_workflow.py ✅ PASS 4m 11s $0.27
01_standalone_sdk/25_agent_delegation.py ✅ PASS 2m 12s $0.16
01_standalone_sdk/26_custom_visualizer.py ✅ PASS 21.9s $0.02
01_standalone_sdk/28_ask_agent_example.py ✅ PASS 36.9s $0.02
01_standalone_sdk/29_llm_streaming.py ✅ PASS 38.3s $0.02
01_standalone_sdk/30_tom_agent.py ✅ PASS 25.1s $0.01
01_standalone_sdk/31_windows.py ✅ PASS 1m 1s $0.05
02_remote_agent_server/01_convo_with_local_agent_server.py ✅ PASS 1m 2s $0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py ✅ PASS 2m 4s $0.08
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py ✅ PASS 53.7s $0.05
02_remote_agent_server/04_convo_with_api_sandboxed_server.py ✅ PASS 1m 33s $0.03

✅ All tests passed!

Total: 28 | Passed: 28 | Failed: 0 | Total Cost: $1.28

View full workflow run
