Spec: consume getTopEventEmitters in diagnostics UX #41

Description

@affandar

@Sharjeel09877 before we wire getTopEventEmitters into the TUI/portal UX, please start with a short spec for how this diagnostic is going to be used.

Context:

What the API returns:

  • workerNodeId
  • eventType
  • eventCount
  • sessionCount
  • firstSeenAt
  • lastSeenAt
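
For the spec, it may help to pin down the row shape in one place. A minimal sketch, assuming the counts are numbers and the timestamps are ISO 8601 strings (the field names are from the list above; the types, the `TopEventEmitterRow` name, and the `isTopEventEmitterRow` guard are hypothetical and not confirmed by the API):

```typescript
// Hypothetical row shape. Field names come from the issue; types are assumed.
interface TopEventEmitterRow {
  workerNodeId: string;
  eventType: string;
  eventCount: number;
  sessionCount: number;
  firstSeenAt: string; // assumed ISO 8601 timestamp
  lastSeenAt: string; // assumed ISO 8601 timestamp
}

// Runtime guard a UX surface could use before rendering a row.
function isTopEventEmitterRow(value: unknown): value is TopEventEmitterRow {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.workerNodeId === "string" &&
    typeof v.eventType === "string" &&
    typeof v.eventCount === "number" &&
    typeof v.sessionCount === "number" &&
    typeof v.firstSeenAt === "string" &&
    typeof v.lastSeenAt === "string"
  );
}
```

If the real API returns `Date` objects or bigint counts, the spec should say so and this shape would change accordingly.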

Please write a spec covering:

  1. Which UX surface should consume it first:
    • TUI stats inspector
    • portal stats/admin panel
    • agent-tuner inspect tool
    • some combination of the above
  2. What operator question it answers. Examples:
    • Which worker/event bucket is spamming CMS writes?
    • Is one worker responsible for most noisy events?
    • Is noisy event volume concentrated in one session or spread across many sessions?
  3. Default time windows and limits:
    • e.g. last 15m / 1h / 24h
    • default and max row count
  4. How rows should be grouped/rendered:
    • worker + event type
    • event count
    • session count
    • first/last seen
  5. What actions, if any, should be available from the surface:
    • filter by worker
    • jump to sessions/events
    • copy diagnostic JSON
  6. Whether the signal also needs an agent-tuner inspect tool per the repo observability rule.
  7. Tests expected for the final implementation.
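
To make items 2 and 4 concrete, here is a rough sketch of how a surface might answer "is one worker responsible for most noisy events?" from the returned rows. `workerShares` and `eventsPerSession` are hypothetical helpers, not existing code, and the row type is trimmed to the fields they need:

```typescript
// Trimmed, assumed row shape; only the fields used below.
interface EmitterRow {
  workerNodeId: string;
  eventType: string;
  eventCount: number;
  sessionCount: number;
}

interface WorkerShare {
  workerNodeId: string;
  eventCount: number;
  share: number; // fraction of all events in the result set, 0..1
}

// Sum event counts per worker and rank workers by their share of the total.
function workerShares(rows: EmitterRow[]): WorkerShare[] {
  const totals = new Map<string, number>();
  let grandTotal = 0;
  for (const row of rows) {
    totals.set(row.workerNodeId, (totals.get(row.workerNodeId) ?? 0) + row.eventCount);
    grandTotal += row.eventCount;
  }
  return Array.from(totals.entries())
    .map(([workerNodeId, eventCount]) => ({
      workerNodeId,
      eventCount,
      share: grandTotal === 0 ? 0 : eventCount / grandTotal,
    }))
    .sort((a, b) => b.eventCount - a.eventCount);
}

// Events per session for a single worker + event type row.
// A high value suggests volume concentrated in few sessions.
function eventsPerSession(row: EmitterRow): number {
  return row.sessionCount === 0 ? 0 : row.eventCount / row.sessionCount;
}
```

Note that `sessionCount` is only compared per row here. Summing it across event types for one worker could double-count sessions, so the spec should state whether a per-worker session total is needed from the API.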

Important constraints:

  • Keep the read bounded.
  • Do not introduce an unbounded all-time query path.
  • Do not add raw SQL outside the catalog/provider layer.
  • Do not make this a visual-only portal feature if the agent tuner also needs it for investigations.
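
One way the "keep the read bounded" constraint could be enforced at the caller, sketched under assumptions: the window presets mirror the examples in item 3, while the default window, default limit, max limit, and the `resolveBoundedQuery` name are placeholders for the spec to decide.

```typescript
// Closed set of windows: there is deliberately no "all time" option.
type WindowPreset = "15m" | "1h" | "24h";

const WINDOW_MS: Record<WindowPreset, number> = {
  "15m": 15 * 60 * 1000,
  "1h": 60 * 60 * 1000,
  "24h": 24 * 60 * 60 * 1000,
};

const DEFAULT_WINDOW: WindowPreset = "1h"; // assumed default
const DEFAULT_LIMIT = 20; // assumed default row count
const MAX_LIMIT = 100; // assumed hard cap

interface BoundedQuery {
  since: Date;
  limit: number;
}

// Turn user-supplied options into a query that is always time- and row-bounded.
function resolveBoundedQuery(
  opts: { window?: WindowPreset; limit?: number } = {},
  now: Date = new Date(),
): BoundedQuery {
  const window = opts.window ?? DEFAULT_WINDOW;
  const requested = opts.limit ?? DEFAULT_LIMIT;
  // Non-finite input falls back to the default; everything else is clamped to 1..MAX_LIMIT.
  const limit = Number.isFinite(requested)
    ? Math.min(Math.max(Math.trunc(requested), 1), MAX_LIMIT)
    : DEFAULT_LIMIT;
  return { since: new Date(now.getTime() - WINDOW_MS[window]), limit };
}
```

Every surface (TUI, portal, inspect tool) would pass the resulting `since` and `limit` to the catalog/provider layer rather than building its own query, which keeps raw SQL out of the UX code.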

Follow-up implementation should wait until this usage spec is clear.
