@xingyaoww xingyaoww commented Nov 6, 2025

Summary

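This PR implements Ray's suggestion to set shell: bash explicitly in the GitHub Actions workflow that runs the examples. With shell: bash, GitHub Actions invokes the script as bash --noprofile --norc -eo pipefail, so a failure in the script, including one in the middle of a pipeline, is no longer masked.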

Changes

  1. Added shell: bash to the "Run examples" step: This is the main long-running script that tests all examples with a real LLM
  2. Removed redundant set -e: shell: bash already applies -e (together with -o pipefail), so the explicit set -e is no longer needed
  3. Added shell: bash to the "Read examples report" step: For consistency and safety in bash script execution

Benefits

  • Better error handling: The -e flag makes the script exit as soon as a command fails
  • Pipefail safety: The -o pipefail option makes a pipeline fail if any command in it fails, not only the last one, so failures in piped commands are not masked

Note that shell: bash does not enable -u. A script that should treat unset variables as errors still needs an explicit set -u.
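To make the pipefail point concrete, here is a minimal sketch that runs the same one-line script under the two invocations the GitHub Actions documentation describes for Linux and macOS runners: bash -e when no shell is set, and bash --noprofile --norc -eo pipefail when shell: bash is set. The snippet and the variable names are illustrative only, and -c is used here in place of the temporary script file that Actions actually passes.

```shell
# Illustrative script body: the first command of the pipeline fails.
snippet='false | cat; echo "reached end"'

# No shell key on a Linux/macOS runner: bash -e
# The pipeline's status is that of `cat` (0), so the failure is masked.
default_out=$(bash -e -c "$snippet" 2>&1) && default_rc=0 || default_rc=$?

# shell: bash -> bash --noprofile --norc -eo pipefail
# pipefail makes the pipeline return 1, and -e then stops the script.
strict_out=$(bash --noprofile --norc -eo pipefail -c "$snippet" 2>&1) && strict_rc=0 || strict_rc=$?

echo "default: rc=$default_rc out='$default_out'"
echo "strict:  rc=$strict_rc out='$strict_out'"
```

Under the default invocation the script runs to the end and exits 0. With the flags from shell: bash it stops at the failing pipeline and exits non-zero.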

Context

From the conversation thread, Ray noted:

BTW when using a long script in a GitHub action, you usually want to use shell: bash which adds
set -euo pipefail
So failures won't get masked.

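This change follows that recommendation. One correction to the flags as quoted: per the GitHub Actions documentation, shell: bash runs the script with -eo pipefail, without -u.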
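Whether unset variables are caught depends on -u, which is not among the flags GitHub Actions documents for shell: bash. The sketch below checks this locally by running a snippet that expands an unset variable, first with those documented flags only and then with an explicit set -u added. DEMO_VAR and the other names are hypothetical and chosen only for this illustration.

```shell
# Illustrative script body: expands a variable that is guaranteed to be unset.
snippet='unset DEMO_VAR; echo "value=[$DEMO_VAR]"'

# Flags documented for shell: bash; the unset variable expands to an empty string.
plain_out=$(bash --noprofile --norc -eo pipefail -c "$snippet" 2>/dev/null) && plain_rc=0 || plain_rc=$?

# Same flags plus an explicit `set -u`; the expansion is now a fatal error.
nounset_out=$(bash --noprofile --norc -eo pipefail -c "set -u; $snippet" 2>/dev/null) && nounset_rc=0 || nounset_rc=$?

echo "without -u: rc=$plain_rc out='$plain_out'"
echo "with -u:    rc=$nounset_rc out='$nounset_out'"
```

A workflow step that wants the stricter behavior can put set -u on the first line of its run script.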

Testing

  • ✅ Pre-commit hooks passed (yamlfmt)
  • The workflow syntax remains valid (no changes to the actual script logic)

Co-authored-by: openhands [email protected]



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:92d13d3-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-92d13d3-python \
  ghcr.io/openhands/agent-server:92d13d3-python

All tags pushed for this build

ghcr.io/openhands/agent-server:92d13d3-golang-amd64
ghcr.io/openhands/agent-server:v1.0.0_golang_tag_1.21-bookworm_binary-amd64
ghcr.io/openhands/agent-server:92d13d3-golang-arm64
ghcr.io/openhands/agent-server:v1.0.0_golang_tag_1.21-bookworm_binary-arm64
ghcr.io/openhands/agent-server:92d13d3-java-amd64
ghcr.io/openhands/agent-server:v1.0.0_eclipse-temurin_tag_17-jdk_binary-amd64
ghcr.io/openhands/agent-server:92d13d3-java-arm64
ghcr.io/openhands/agent-server:v1.0.0_eclipse-temurin_tag_17-jdk_binary-arm64
ghcr.io/openhands/agent-server:92d13d3-python-amd64
ghcr.io/openhands/agent-server:v1.0.0_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_binary-amd64
ghcr.io/openhands/agent-server:92d13d3-python-arm64
ghcr.io/openhands/agent-server:v1.0.0_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_binary-arm64
ghcr.io/openhands/agent-server:92d13d3-golang
ghcr.io/openhands/agent-server:92d13d3-java
ghcr.io/openhands/agent-server:92d13d3-python

About Multi-Architecture Support

  • Each variant tag (e.g., 92d13d3-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 92d13d3-python-amd64) are also available if needed

Add 'shell: bash' to bash script steps in the run-examples workflow.
This runs the scripts with '-eo pipefail', which ensures failures
won't get masked, as suggested by Ray.

Also removed the redundant 'set -e' since shell: bash already applies
'-e' together with '-o pipefail'.

Co-authored-by: openhands <[email protected]>
@xingyaoww added the test-examples label (Run all applicable "examples/" files. Expensive operation.) on Nov 6, 2025

github-actions bot commented Nov 6, 2025

🔄 Running Examples with openhands/claude-haiku-4-5-20251001

Last updated: 2025-11-06 23:23:47 UTC

Example | Status | Duration | Cost
01_standalone_sdk/02_custom_tools.py | ✅ PASS | 82s | $0.12
01_standalone_sdk/03_activate_skill.py | ✅ PASS | 11s | $0.01
01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 9s | $0.01
01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 40s | $0.02
01_standalone_sdk/09_pause_example.py | ✅ PASS | 11s | $0.01
01_standalone_sdk/10_persistence.py | ✅ PASS | 32s | $0.02
01_standalone_sdk/11_async.py | ✅ PASS | 31s | $0.02
01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 18s | $0.01
01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 29s | $0.01
01_standalone_sdk/14_context_condenser.py | ✅ PASS | 178s | $0.34
01_standalone_sdk/17_image_input.py | ✅ PASS | 16s | $0.02
01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 20s | $0.01
01_standalone_sdk/19_llm_routing.py | ✅ PASS | 13s | $0.01
01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 15s | $0.01
01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 11s | $0.01
01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 14s | $0.01
01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 46s | $0.01
01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 255s | $0.27
01_standalone_sdk/25_agent_delegation.py | ❌ FAIL (exit: 1) | 56s | $0.00
01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 21s | $0.00N/A
02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 65s | $0.05
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 107s | $0.05
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 57s | $0.04

❌ Some tests failed

Total: 23 | Passed: 22 | Failed: 1
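The report lists one example failing with exit code 1 while the other examples still ran. Under -e, that requires recording each example's exit code explicitly so that a single failure does not abort the loop. The following is a minimal sketch of that pattern, not the workflow's actual script. The run_example function and the example names are hypothetical stand-ins.

```shell
set -eo pipefail

# Hypothetical stand-in for running one example; "bad" simulates a failure.
run_example() {
  case "$1" in
    bad) return 1 ;;
    *)   return 0 ;;
  esac
}

passed=0
failed=0
for name in good_a bad good_b; do
  rc=0
  # `|| rc=$?` keeps -e from aborting the loop and records the exit code.
  run_example "$name" || rc=$?
  if [ "$rc" -eq 0 ]; then
    passed=$((passed + 1))
    echo "$name PASS"
  else
    failed=$((failed + 1))
    echo "$name FAIL (exit: $rc)"
  fi
done

echo "Total: $((passed + failed)) | Passed: $passed | Failed: $failed"
```

A real step would typically exit non-zero after the summary line when the failure count is above zero, so that the job is still marked as failed.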

View full workflow run

openhands-ai bot commented Nov 6, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Run Examples Scripts

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1057 at branch `use-shell-bash-in-workflow`

Feel free to include any additional details that might help me get this PR into a better state.
