[observability] Agentic Observability Report — 2026-03-26 #23082

2026-03-26T10:32:24Z

github-actions[bot]
bot Mar 26, 2026

Executive Summary

This report is scoped to the last 14 days of agentic workflow activity in github/gh-aw. Because run volume is high, the analysis is capped at the 100 most recent runs, all of which fall within 2026-03-26T07:13–10:23 UTC (~3 hours), so the findings below describe that window only. No escalation-eligible episodes were detected. All 100 episodes are standalone and classified with high confidence; there are no orchestration chains and no MCP failures. The primary operational signals are a cluster of single-run failures in workflows that lack baselines, and one high-severity resource-heavy run of Go Fan.

Key Metrics

Metric	Value
Date range analyzed	2026-03-26T07:13 → 10:23 UTC (~3 hrs)
Workflows analyzed	28 unique
Runs analyzed	100
Episodes analyzed	100 (all standalone)
High-confidence episodes	100 (100%)
Runs: success	17
Runs: failure	6
Runs: skipped	74
Runs: in-progress/null	3
Runs classified "risky"	0
High/medium severity assessments	1 (Go Fan — resource_heavy, high)
Low severity assessments	15 (overkill_for_agentic)
Escalation-eligible episodes	0
Total tokens consumed	9,061,317
Estimated cost	~$1.19
Total Actions minutes	194
MCP failures	0
Missing tools	0
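
As a quick consistency check on the table above, the run conclusions can be tallied and turned into rates. This is a minimal sketch: the counts are copied from the table, while the `share` helper and the idea of an "executed" subset (success plus failure) are illustrative assumptions, not something the Agentic Observability Kit is known to compute.

```python
# Run conclusions copied from the Key Metrics table.
outcomes = {"success": 17, "failure": 6, "skipped": 74, "in_progress_or_null": 3}

total = sum(outcomes.values())


def share(count: int, whole: int) -> float:
    """Percentage of `whole` represented by `count`, rounded to one decimal."""
    return round(100 * count / whole, 1)


# Runs that actually executed to a conclusion (assumption: success + failure).
executed = outcomes["success"] + outcomes["failure"]
failure_rate_executed = share(outcomes["failure"], executed)

print(f"total runs: {total}")
print(f"skip rate: {share(outcomes['skipped'], total)}%")
print(f"failure rate among executed runs: {failure_rate_executed}%")
```

Measured against all 100 runs the failure rate is 6%, but against the 23 runs that ran to a conclusion it is about 26%, which may be the more useful figure when most runs are skipped.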

Highest Risk Episodes

No episodes crossed the escalation threshold. The one high-severity signal is a single run of Go Fan:

Go Fan — resource_heavy_for_domain (high severity)

Run: §23582472622
1,735,634 tokens · 7.8m · 8 action-minutes · exploratory/broad/selective_write/heavy
Assessment: 12 tool types used, 1 write action, heavy resource profile for its domain (General Automation)
No cohort baseline found — this is the only recent run for this workflow
Verdict: Single occurrence, no repeated pattern. Monitor for recurrence.

Episode Regressions

No regressions detected. All episodes with baselines were classified stable:

AI Moderator: 6 runs (1 failure), cohort-matched baseline, stable. The failed run shows the same posture, turns, and blocked requests as the baseline, so the failure is likely an upstream trigger failure rather than agent misbehavior.
Grumpy Code Reviewer 🔥: 6 runs (0 failures), cohort-matched baseline, stable.

The following 5 workflows failed with no baseline found — first-run or low-frequency workflows where regression detection is not yet possible:

Workflow	Duration	Tokens	Status
Auto-Triage Issues	3.1m	102K	failure
Daily News	2.1m	0	failure
Dev	13.5m	1.37M	failure
Daily MCP Tool Concurrency Analysis	6.6m	465K	failure
Architecture Diagram Generator	9.1m	553K	failure

Recommended Actions

Go Fan — Review the run for unnecessary tool calls. 12 tool types and a heavy resource profile for a General Automation task is atypical. If this becomes a pattern after more runs, consider narrowing tool scope.
Failing single-run workflows — Investigate the 5 workflows above that failed without a baseline. Once they succeed, baselines will be established and regression detection will engage.
Observability coverage gap — All 100 runs land in a ~3-hour window. This suggests very high daily volume. Consider increasing count to 300+ for future reports, or running the observability kit on a longer cadence with date-bounded queries to ensure multi-day coverage.
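
The coverage gap can be estimated from the window bounds reported above. The sketch below assumes the observed run rate stays constant over a longer period, which is a strong assumption since trigger volume may vary by day and hour; the helper names are hypothetical and are not part of the observability kit.

```python
import math
from datetime import datetime, timedelta, timezone

# Window bounds and intended scope, taken from the report text.
FIRST_RUN = datetime(2026, 3, 26, 7, 13, tzinfo=timezone.utc)
LAST_RUN = datetime(2026, 3, 26, 10, 23, tzinfo=timezone.utc)
INTENDED = timedelta(days=14)
RUNS_SAMPLED = 100

span = LAST_RUN - FIRST_RUN
coverage = span / INTENDED  # fraction of the intended window actually sampled
runs_per_hour = RUNS_SAMPLED / (span / timedelta(hours=1))


def hours_covered(count: int) -> float:
    """Hours a run limit would span, ASSUMING the observed rate is constant."""
    return count / runs_per_hour


def count_needed(window: timedelta) -> int:
    """Run limit needed to span `window` under the same constant-rate assumption."""
    return math.ceil(runs_per_hour * (window / timedelta(hours=1)))


print(f"coverage of intended window: {coverage:.2%}")
print(f"hours covered by 300 runs: {hours_covered(300):.1f}")
print(f"runs needed for 14 days: {count_needed(INTENDED)}")
```

Under that assumption, a limit of 300 runs would span roughly 9.5 hours, and spanning the full 14 days would take on the order of ten thousand runs. That suggests date-bounded queries are the more practical route to multi-day coverage than raising the count alone.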

Optimization Candidates (overkill_for_agentic, low severity)

The following 15 workflows consistently produce zero turns, zero tool types, and read-only posture. They are assessed as potentially replaceable with deterministic automation. All are low severity and no action is required — listed for portfolio awareness only.

Workflow	Runs in window	Conclusions
/cloclo	12	skipped
Q	12	skipped
Scout	12	skipped
Archie	11	skipped
AI Moderator	6	mixed (see above)
Issue Monster	5	success
ACE Editor Session	3	skipped
Documentation Unbloat	3	skipped
Mergefest	3	skipped
Plan Command	3	skipped
Glossary Maintainer	1	in-progress
Poem Bot	1	skipped
Resource Summarizer Agent	4	skipped
Workflow Craft Agent	1	skipped
Auto-Triage Issues	1	failure

The high skip rate (74 of 100 runs) is likely by design — workflows activate only on matching triggers. No action recommended unless persistent skipping on intended triggers is suspected.

Resource-Heavy Runs Detail

Workflow	Tokens	Duration	Cost	Actuation	Result
Functional Pragmatist	3,051,841	17.4m	est. $0	selective_write	success
Contribution Check	1,776,117	15.4m	est. $0	write_heavy	success
Go Fan	1,735,634	7.8m	$1.19	selective_write	success
Dev	1,376,007	13.5m	est. $0	read_only	failure
Architecture Diagram Generator	553,782	9.1m	est. $0	read_only	failure
Daily MCP Tool Concurrency Analysis	465,693	6.6m	est. $0	read_only	failure

Go Fan is the only run with actual cost logged ($1.19). All other runs show $0 estimated cost — cost attribution may be missing for Claude/Codex engines or the runs did not use billable tokens.
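
To put the attribution gap in proportion, the token counts in the table above can be compared with the total from Key Metrics. This is a rough sketch: the figures are copied from this report, and it assumes, as stated above, that no run other than Go Fan logged a cost.

```python
TOTAL_TOKENS = 9_061_317  # "Total tokens consumed" from Key Metrics

# (workflow, tokens, logged cost in USD) from the Resource-Heavy Runs table.
heavy_runs = [
    ("Functional Pragmatist", 3_051_841, 0.0),
    ("Contribution Check", 1_776_117, 0.0),
    ("Go Fan", 1_735_634, 1.19),
    ("Dev", 1_376_007, 0.0),
    ("Architecture Diagram Generator", 553_782, 0.0),
    ("Daily MCP Tool Concurrency Analysis", 465_693, 0.0),
]

heavy_tokens = sum(tokens for _, tokens, _ in heavy_runs)
costed_tokens = sum(tokens for _, tokens, cost in heavy_runs if cost > 0)

# Share of all tokens consumed by the six heavy runs.
heavy_share = heavy_tokens / TOTAL_TOKENS
# Share of all tokens that carry no logged cost.
uncosted_share = (TOTAL_TOKENS - costed_tokens) / TOTAL_TOKENS

print(f"heavy runs: {heavy_share:.1%} of all tokens")
print(f"tokens without logged cost: {uncosted_share:.1%}")
```

These six runs account for about 98.9% of all tokens in the window, and about 80.8% of tokens have no cost attached, so the ~$1.19 estimate is best read as a lower bound until cost attribution is confirmed for the other engines.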

Episode Model Observations

0 edges recorded — no inter-run lineage detected in this window
0 agentic_assessments at episode level — assessments are only present at run level
All 100 episodes are standalone kind with no_shared_lineage_markers reason
This is consistent with a short snapshot (~3 hours within a single day) in which no workflow_run chains had time to form

If orchestrator→worker patterns exist in this repository (e.g., a scheduling workflow that triggers others), they would only appear if both the parent and child runs fell within the same 100-run window. A wider time window or larger count is needed for DAG lineage analysis.
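
The window effect described above can be illustrated with a toy model. Everything here is hypothetical: the `Run` type, the `parent_run_id` field, and the run IDs are invented for illustration and do not reflect the kit's actual episode schema or lineage markers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Run:
    run_id: int
    parent_run_id: Optional[int] = None  # hypothetical lineage marker


def edges_in_window(runs: List[Run]) -> List[Tuple[int, int]]:
    """Return (parent, child) edges where BOTH runs are inside the window."""
    ids = {r.run_id for r in runs}
    return [(r.parent_run_id, r.run_id) for r in runs if r.parent_run_id in ids]


# Toy data: run 2 was triggered by run 1, run 3 is standalone.
window = [Run(2, parent_run_id=1), Run(3)]  # parent fell outside the window
wider = [Run(1)] + window                   # parent included

print(edges_in_window(window))  # no edge: the parent is not visible
print(edges_in_window(wider))   # the edge appears once the window is wider
```

In the narrow window the child run looks standalone because its parent is not visible, which is one way a report could show 0 edges even if orchestration exists.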

References:

§23582472622 — Go Fan (resource_heavy, high)
§23588592720 — AI Moderator (failure, stable baseline)
§23586783098 — Dev (failure, no baseline)

AI generated by Agentic Observability Kit · history

expires on Apr 2, 2026, 10:32 AM UTC

2026-03-30T08:54:46Z

github-actions[bot]
bot Mar 30, 2026
Author

This discussion has been marked as outdated by Agentic Observability Kit.

A newer discussion is available at Discussion #23527.
