Add lightweight action trace artifacts for simulator/WebKit failures #744

@shaun0927

Description

Why OpenSafari should reflect this

Playwright and OpenTelemetry show that reliable automation needs action-level evidence: step name, timing, context, retry/timeout state, and links to screenshots/logs. OpenSafari currently has metrics and several reports, but failed iOS Safari/native/Flutter live validations still require reconstructing what happened from scattered tool output.

This matters because OpenSafari's differentiator is direct iOS Safari and simulator control. When private APIs, WebKit sockets, AX trees, or Flutter VM Service calls fail, merge reviewers need a compact artifact that explains the failure, and producing it must not require heavy tracing to be enabled by default.

Scope / how to implement

  • Add a lightweight OpenSafari trace artifact contract, preferably JSON-first and dependency-free.
  • Capture at minimum: run id, device id, tool/action name, start/end timestamps, duration, status, timeout, retry count if known, context source (webkit/native/flutter), and optional artifact paths (screenshot, console log, HAR, crash log).
  • Keep collection opt-in for live validation or failure-only paths; do not add always-on heavy tracing.
  • Reuse existing metrics/reporting surfaces where possible (src/metrics/*, src/qa/*, src/orchestration/*).
  • Add unit tests for serialization/redaction/bounded size.
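
As a starting point for discussion, the minimum field list above could be sketched as a plain TypeScript type plus a small builder. Every name here (`TraceEvent`, `buildTraceEvent`, the field names, the status values, the example action and path) is a hypothetical placeholder, not an agreed schema or existing OpenSafari code.

```typescript
// Hypothetical trace event contract; field names and enum values are illustrative only.
type TraceContext = "webkit" | "native" | "flutter";
type TraceStatus = "passed" | "failed" | "timeout";

interface TraceArtifacts {
  screenshot?: string;
  consoleLog?: string;
  har?: string;
  crashLog?: string;
}

interface TraceEvent {
  runId: string;
  deviceId: string;
  action: string; // tool/action name
  startedAt: string; // ISO 8601 timestamp
  endedAt: string; // ISO 8601 timestamp
  durationMs: number; // derived from the two timestamps
  status: TraceStatus;
  timeoutMs?: number;
  retryCount?: number; // omitted when unknown
  context: TraceContext;
  artifacts?: TraceArtifacts;
}

// Derives durationMs so callers cannot record a duration that disagrees with the timestamps.
function buildTraceEvent(input: Omit<TraceEvent, "durationMs">): TraceEvent {
  const durationMs = Date.parse(input.endedAt) - Date.parse(input.startedAt);
  if (isNaN(durationMs) || durationMs < 0) {
    throw new Error("invalid start/end timestamps for action " + input.action);
  }
  return { ...input, durationMs };
}

// Example values are made up for illustration.
const exampleTraceEvent = buildTraceEvent({
  runId: "run-001",
  deviceId: "SIM-0000",
  action: "navigate",
  startedAt: "2026-01-01T00:00:00.000Z",
  endedAt: "2026-01-01T00:00:05.250Z",
  status: "timeout",
  timeoutMs: 5000,
  retryCount: 1,
  context: "webkit",
  artifacts: { screenshot: "test-output/opensafari-traces/run-001/navigate.png" },
});
```

Because the event is a plain object, `JSON.stringify(exampleTraceEvent)` is enough to write it out, which keeps the writer dependency-free.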

Decisions needed before implementation

  1. Artifact location: test-output/opensafari-traces/, tests/*/output/, or caller-provided path?
  2. Default policy: opt-in env var, failure-only, or scenario-runner-only first?
  3. Redaction policy: which labels/URLs/headers are safe to include by default?
  4. First PR scope: schema + docs only, or also scenario-runner integration?
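
To make decisions 1 and 2 concrete, here is one possible shape for a policy resolver, assuming an opt-in environment variable and a caller-provided path that takes precedence. The variable names `OPENSAFARI_TRACE` and `OPENSAFARI_TRACE_DIR`, the mode values, and the default directory are all assumptions made for illustration; none of them has been decided.

```typescript
// Hypothetical policy resolver; env var names, modes, and default path are placeholders.
type TraceMode = "off" | "failure-only" | "always";

interface TracePolicy {
  mode: TraceMode;
  outputDir: string;
}

function resolveTracePolicy(
  env: { [key: string]: string | undefined },
  callerPath?: string
): TracePolicy {
  const raw = (env["OPENSAFARI_TRACE"] || "").toLowerCase();
  // Anything unrecognized falls back to "off" so tracing stays opt-in.
  let mode: TraceMode = "off";
  if (raw === "always") {
    mode = "always";
  } else if (raw === "failure-only") {
    mode = "failure-only";
  }
  // A caller-provided path wins over the env var, which wins over the default.
  const outputDir =
    callerPath || env["OPENSAFARI_TRACE_DIR"] || "test-output/opensafari-traces";
  return { mode, outputDir };
}

function shouldWriteTrace(policy: TracePolicy, runFailed: boolean): boolean {
  if (policy.mode === "always") return true;
  if (policy.mode === "failure-only") return runFailed;
  return false;
}
```

Taking `env` as a parameter rather than reading a global keeps the function easy to unit test and leaves the decision about where the values come from to the caller.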

Success criteria

  • A documented trace schema exists and is stable enough for CI artifacts.
  • A targeted test proves trace events are bounded, redactable, and preserve timing/status.
  • Existing MCP tool behavior is unchanged when tracing is disabled.
  • No new production dependency is introduced.
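
The "bounded, redactable" criterion could be exercised with something like the sketch below. It assumes a character budget rather than a byte budget, a small header deny-list, and a rule that strips URL query strings and fragments; the redaction policy itself is still an open decision, so `SENSITIVE_HEADERS`, `redactEvent`, and `serializeBounded` are hypothetical names and rules.

```typescript
// Hypothetical redaction and size-bounding helpers; the deny-list and rules are placeholders.
interface MinimalEvent {
  action: string;
  status: string;
  durationMs: number;
  url?: string;
  headers?: { [name: string]: string };
}

const REDACTED = "[redacted]";
const SENSITIVE_HEADERS = ["authorization", "cookie", "set-cookie"];

function redactEvent(event: MinimalEvent): MinimalEvent {
  // Timing and status are copied untouched so they survive redaction.
  const copy: MinimalEvent = {
    action: event.action,
    status: event.status,
    durationMs: event.durationMs,
  };
  if (event.url !== undefined) {
    // Drop the query string and fragment, which often carry tokens.
    const cut = event.url.search(/[?#]/);
    copy.url = cut === -1 ? event.url : event.url.slice(0, cut);
  }
  const source = event.headers;
  if (source !== undefined) {
    const headers: { [name: string]: string } = {};
    const names = Object.keys(source);
    for (let i = 0; i < names.length; i++) {
      const sensitive = SENSITIVE_HEADERS.indexOf(names[i].toLowerCase()) !== -1;
      headers[names[i]] = sensitive ? REDACTED : String(source[names[i]]);
    }
    copy.headers = headers;
  }
  return copy;
}

// Keeps the newest events that fit within maxChars and records how many were dropped.
// If not even the newest event fits, only the empty envelope is returned.
function serializeBounded(events: MinimalEvent[], maxChars: number): string {
  let kept: MinimalEvent[] = [];
  for (let i = events.length - 1; i >= 0; i--) {
    const candidate = [redactEvent(events[i])].concat(kept);
    // If we stop after including event i, exactly i older events are dropped.
    const json = JSON.stringify({ dropped: i, events: candidate });
    if (json.length > maxChars) break;
    kept = candidate;
  }
  return JSON.stringify({ dropped: events.length - kept.length, events: kept });
}
```

Keeping the newest events is a design choice: in a failing run, the last few actions before the failure are usually the ones a reviewer needs.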

Post-merge OpenSafari live validation

  • Run a failing/timeout scenario against a booted simulator and confirm the trace identifies the failing action, device, timeout, and last artifact path.
  • Run a passing scenario with tracing disabled and confirm no trace artifact is emitted.
  • Attach the trace artifact to a PR/check run or local test-output directory so maintainers can inspect it after failure.
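
For the first validation step, a reviewer-facing check might look like the following sketch, which pulls the failing action, device, timeout, and last artifact path out of a list of trace steps. The `TraceStep` shape and `summarizeFailure` helper are hypothetical and only mirror the fields named in this issue.

```typescript
// Hypothetical helper for live validation; the step shape mirrors the fields named in this issue.
interface TraceStep {
  action: string;
  deviceId: string;
  status: "passed" | "failed" | "timeout";
  timeoutMs?: number;
  artifactPaths?: string[];
}

interface FailureSummary {
  action: string;
  deviceId: string;
  status: "failed" | "timeout";
  timeoutMs?: number;
  lastArtifactPath?: string;
}

// Returns a summary of the first non-passing step, or undefined when every step passed.
function summarizeFailure(steps: TraceStep[]): FailureSummary | undefined {
  for (let i = 0; i < steps.length; i++) {
    const step = steps[i];
    if (step.status === "passed") continue;
    const paths = step.artifactPaths || [];
    return {
      action: step.action,
      deviceId: step.deviceId,
      status: step.status,
      timeoutMs: step.timeoutMs,
      lastArtifactPath: paths.length > 0 ? paths[paths.length - 1] : undefined,
    };
  }
  return undefined;
}
```

A validation script could assert that the summary is defined for the failing scenario and that no trace file exists at all for the passing scenario with tracing disabled.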

Ambiguity review

This issue intentionally excludes adopting Playwright's trace format or OpenTelemetry SDK exporters. The first mergeable unit is OpenSafari-native trace evidence only.

Direction and necessity review (2026 OSS comparison)

  • Aligned: yes — trace artifacts support OpenSafari's simulator/WebKit reliability without re-platforming to Playwright/Appium.
  • Necessary: yes — current logs/metrics are useful but not sufficient as a single failure artifact for merge/post-merge live validation.
  • Minimal first PR: schema + dependency-free writer + scenario-runner integration only; no OpenTelemetry SDK/exporter.

Metadata

Labels

automation-roadmap (OpenSafari automation roadmap work items), enhancement (New feature or request), reliability (Reliability and stability)
