@enyst enyst commented Nov 4, 2025

Summary

  • Extend telemetry `_safe_json` to serialize any Pydantic `BaseModel` via `model_dump(mode="json", exclude_none=True)`. This avoids circular-reference errors and Pydantic serialization warnings when logging complex objects (e.g., tools/specs) that can appear in the log context in remote agent/server scenarios
  • Add targeted test ensuring logging succeeds when the log context includes a self-referencing Pydantic model with excluded runtime-only fields

Problem
When running the agent remotely (Docker/remote runtime), enabling `log_completions` caused telemetry logging to fail:

  • If the default log directory wasn't present inside the container: `log_dir does not exist: logs/completions`
  • After creating the directory, log files were created but left empty, with warnings like `Circular reference detected`.
    This happened because `Telemetry.log_llm_call` attempted to JSON-serialize arbitrary objects from the logging context (e.g., Pydantic-based tool/config objects) using generic fallbacks (`dict`), which can include self-references or non-serializable fields (such as executors), leading to circular references and failed writes.

Changes

  • telemetry.py: update `_safe_json`
    • Handle `ModelResponse` and `ResponsesAPIResponse` via `model_dump(mode="json", exclude_none=True)` (unchanged)
    • NEW: handle any Pydantic `BaseModel` via `model_dump(mode="json", exclude_none=True)` to ensure safe, cycle-free serialization that respects field exclusions
    • Fallback to `dict` or `str(obj)` remains for plain objects
  • tests/sdk/llm/test_llm_telemetry.py: add test_log_completion_with_pydantic_objects_in_context
    • Simulates a Pydantic model with a self-referencing, excluded field (executor)
    • Verifies no circular reference warnings and that a valid JSON log is produced

Compatibility / Impact

  • Backward compatible: only improves robustness of telemetry serialization. Existing behavior for non-Pydantic objects remains unchanged.
  • No public API changes.

Testing

  • Ran pre-commit hooks on changed files
  • Ran telemetry tests: all 35 pass, including the new one
  • Existing integration tests for log_completions also pass locally

Docs

  • No docs changes required; this is an internal robustness fix

Fixes #1027



Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

| Variant | Architectures | Base Image | Docs / Tags |
| --- | --- | --- | --- |
| java | amd64, arm64 | eclipse-temurin:17-jdk | Link |
| python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link |
| golang | amd64, arm64 | golang:1.21-bookworm | Link |

Pull (multi-arch manifest)

```shell
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:523e655-python
```

Run

```shell
docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-523e655-python \
  ghcr.io/openhands/agent-server:523e655-python
```

All tags pushed for this build

```
ghcr.io/openhands/agent-server:523e655-golang-amd64
ghcr.io/openhands/agent-server:v1.0.0a5_golang_tag_1.21-bookworm_binary-amd64
ghcr.io/openhands/agent-server:523e655-golang-arm64
ghcr.io/openhands/agent-server:v1.0.0a5_golang_tag_1.21-bookworm_binary-arm64
ghcr.io/openhands/agent-server:523e655-java-amd64
ghcr.io/openhands/agent-server:v1.0.0a5_eclipse-temurin_tag_17-jdk_binary-amd64
ghcr.io/openhands/agent-server:523e655-java-arm64
ghcr.io/openhands/agent-server:v1.0.0a5_eclipse-temurin_tag_17-jdk_binary-arm64
ghcr.io/openhands/agent-server:523e655-python-amd64
ghcr.io/openhands/agent-server:v1.0.0a5_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_binary-amd64
ghcr.io/openhands/agent-server:523e655-python-arm64
ghcr.io/openhands/agent-server:v1.0.0a5_nikolaik_s_python-nodejs_tag_python3.12-nodejs22_binary-arm64
ghcr.io/openhands/agent-server:523e655-golang
ghcr.io/openhands/agent-server:523e655-java
ghcr.io/openhands/agent-server:523e655-python
```

About Multi-Architecture Support

  • Each variant tag (e.g., 523e655-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 523e655-python-amd64) are also available if needed

- Extend _safe_json to handle any Pydantic BaseModel via model_dump(mode="json", exclude_none=True)
  to avoid circular references when logging tool/context objects in remote agent/server scenarios
- Add test that simulates a self-referencing Pydantic object in log context to ensure no warnings

Fixes #1027

Co-authored-by: openhands <[email protected]>
github-actions bot commented Nov 4, 2025

Coverage Report

| File | Stmts | Miss | Cover | Missing |
| --- | ---: | ---: | ---: | --- |
| openhands-sdk/openhands/sdk/llm/utils/telemetry.py | 135 | 57 | 57% | 87, 93, 102, 106, 109, 114–115, 150, 157, 179, 183–184, 192–194, 200, 217–219, 222–225, 227, 235–236, 240–242, 245–248, 253, 258, 261, 266, 269, 275, 281, 283, 286–287, 292, 297–301, 308–309, 312, 314, 317–320 |
| TOTAL | 11664 | 5407 | 53% | |

enyst and others added 3 commits November 4, 2025 21:46
- Remove fallback/local import of BaseModel; rely on top-level import
- Keep robust serialization for Pydantic models via model_dump(mode="json", exclude_none=True)

Co-authored-by: openhands <[email protected]>
- Import BaseModel and Field at module top
- Remove mid-function import

Co-authored-by: openhands <[email protected]>
@blacksmith-sh blacksmith-sh bot requested a review from xingyaoww November 7, 2025 12:59
blacksmith-sh bot commented Nov 7, 2025

[Automatic Post]: I have assigned @xingyaoww as a reviewer based on git blame information. Thanks in advance for the help!

@enyst enyst changed the title Fix: robust LLM telemetry logging for remote agent/server runs Fix: LLM telemetry logging Nov 8, 2025
@enyst enyst merged commit 572ea21 into main Nov 8, 2025
21 checks passed
@enyst enyst deleted the openhands/fix-remote-llm-telemetry-logging branch November 8, 2025 19:10


Development

Successfully merging this pull request may close these issues.

LLM logging does not work when running agent remotely
