
fix: guard print() calls in run_conversation() against OSError #858

Closed
teyrebaz33 wants to merge 1 commit into NousResearch:main from teyrebaz33:fix/845-guard-print-oserror

Conversation

@teyrebaz33
Contributor

Closes #845

Problem

When hermes-agent runs as a systemd service or headless daemon, the stdout pipe can become unavailable (idle timeout, buffer exhaustion, socket reset), causing print() to raise OSError: [Errno 5] Input/output error. This crashes run_conversation() and causes cron jobs to be marked as failed.
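
The failure mode can be reproduced without systemd. The sketch below is only an illustration, not code from run_agent.py: `BrokenStdout` and `print_status` are hypothetical names, and the stream simply raises the same errno that the issue reports.

```python
import errno
import io


class BrokenStdout(io.TextIOBase):
    """Hypothetical stand-in for a stdout pipe that has become unavailable."""

    def writable(self):
        return True

    def write(self, text):
        # Simulate the failure described above: OSError [Errno 5].
        raise OSError(errno.EIO, "Input/output error")


def print_status(stream):
    """Print a status line; return the errno if the write fails, else None."""
    try:
        print("retrying API call", file=stream)
    except OSError as exc:
        return exc.errno
    return None
```

Calling `print_status(BrokenStdout())` returns `errno.EIO`, which shows that `print()` propagates the stream's `OSError` to the caller unless something catches it.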

Changes

Wrap 7 affected print() calls in try/except OSError in run_agent.py:

| Line  | Description                  | Fallback       |
|-------|------------------------------|----------------|
| ~3664 | Context length discovery     | pass           |
| ~3694 | Interrupt during API call    | pass           |
| ~3737 | API retry warning            | logger.warning |
| ~3922 | API failed after N attempts  | logger.warning |
| ~3958 | All retries exhausted        | logger.error   |
| ~4191 | quiet_mode 💬 display        | pass           |
| ~4360 | Error handler ❌ message     | logger.error   |

Cosmetic lines are silently dropped. Error handler lines fall back to logger so the message is preserved.
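
A minimal sketch of the two fallback behaviours listed above. The PR wraps each call site inline in try/except; the helper `guarded_print`, the logger name, and the `BrokenStream` class here are hypothetical and exist only to keep the example short and testable.

```python
import errno
import io
import logging

# Hypothetical logger name; the logger used in run_agent.py may differ.
logger = logging.getLogger("run_agent_sketch")


class BrokenStream(io.TextIOBase):
    """Hypothetical stream whose writes always fail with EIO."""

    def writable(self):
        return True

    def write(self, text):
        raise OSError(errno.EIO, "Input/output error")


def guarded_print(message, *, fallback_level=None, file=None):
    """Print message; on OSError either drop it or send it to the logger.

    fallback_level=None models the cosmetic lines (silently dropped).
    A logging level models the error lines (message is preserved).
    """
    try:
        print(message, file=file)
    except OSError:
        if fallback_level is not None:
            logger.log(fallback_level, message)
```

For example, `guarded_print("💬 ...", file=BrokenStream())` drops the line, while `guarded_print("All retries exhausted", fallback_level=logging.ERROR, file=BrokenStream())` routes the same text to the logger.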

Testing

3 new tests in TestPrintOSErrorGuard. Full suite: 2863 passed.

When hermes-agent runs as a systemd service or headless daemon,
the stdout pipe can become unavailable (idle timeout, buffer
exhaustion, socket reset), causing print() to raise OSError
[Errno 5] Input/output error.

Wrap 7 affected print() calls in try/except OSError:
- Cosmetic lines (quiet_mode display, interrupt/retry status,
  context length discovery): silently swallowed with pass
- Error handler lines: fall back to logger.error/warning so
  the message is not lost

Closes NousResearch#845
@teyrebaz33
Contributor Author

The failing check (test_vision_tools.py::TestErrorLoggingExcInfo::test_analysis_error_logs_exc_info) is unrelated to this PR; the fix for it is in #804.

@teknium1
Contributor

Closing — the approach of wrapping individual print() calls doesn't scale. There are 68 print() calls in run_conversation() alone, and this PR guards only 7 of them (leaving adjacent prints in the same blocks unguarded). We're implementing a systematic fix that covers all of them without touching each call site. Thanks for bringing this to our attention — the issue is real and the root cause analysis in #845 was excellent.
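
The thread does not describe the systematic fix itself. One way to cover every `print()` without touching each call site is to wrap the output stream once. The sketch below is an assumption about how that could look, not the maintainers' implementation; `SafeWriter` and `BrokenStream` are hypothetical names.

```python
import errno
import io


class BrokenStream(io.TextIOBase):
    """Hypothetical stream whose writes always fail with EIO."""

    def writable(self):
        return True

    def write(self, text):
        raise OSError(errno.EIO, "Input/output error")


class SafeWriter:
    """Hypothetical proxy that drops output when the wrapped stream fails."""

    def __init__(self, inner):
        self._inner = inner

    def write(self, text):
        try:
            return self._inner.write(text)
        except OSError:
            # Pretend the text was written so callers such as print() continue.
            return len(text)

    def flush(self):
        try:
            self._inner.flush()
        except OSError:
            pass

    def __getattr__(self, name):
        # Delegate everything else (encoding, isatty, ...) to the real stream.
        return getattr(self._inner, name)
```

Installed once at startup with `sys.stdout = SafeWriter(sys.stdout)`, a wrapper like this would apply to all 68 `print()` calls in `run_conversation()`. The trade-off is that every failed write is dropped silently, so error messages that must survive would still need a separate logger path.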

Development

Successfully merging this pull request may close these issues.

fix: guard print() calls in run_conversation() against OSError when stdout is unavailable (systemd/headless)

2 participants