Skip to content

fix(tui): use nonblocking send for ListSubAgents refresh events (#3802)#3813

Merged
Hmbown merged 1 commit into
mainfrom
codex/0866-3802-channel-backpressure
Jun 30, 2026
Merged

fix(tui): use nonblocking send for ListSubAgents refresh events (#3802)#3813
Hmbown merged 1 commit into
mainfrom
codex/0866-3802-channel-backpressure

Conversation

@Hmbown

@Hmbown Hmbown commented Jun 30, 2026

Copy link
Copy Markdown
Owner

Closes #3802 (child of #3800).

Problem

High sub-agent fanout creates status storms that put pressure on the bounded engine op channel (32) and event channel (256). Because Op::ListSubAgents sends used .send().await, the engine or TUI event loop could stall when the receiver was not draining fast enough.

Change

Switched all ListSubAgents refresh-type send paths from blocking send().await to non-blocking try_send():

  1. crates/tui/src/core/engine.rsOp::ListSubAgents handler now uses tx_event.try_send(Event::AgentList{...}) with a tracing::debug! on channel-full drop.
  2. crates/tui/src/core/engine/handle.rs — Added EngineHandle::try_send() non-blocking method that delegates to self.tx_op.try_send(op).
  3. crates/tui/src/tui/ui.rs — Converted 3 engine_handle.send(Op::ListSubAgents).await call sites to engine_handle.try_send(Op::ListSubAgents).

Guardrails (per the issue)

  • Critical events unchanged: approval prompts, tool completions/results, fatal errors, cancellation, and parent completion all remain on send().await.
  • Only refresh-type events use try_send: ListSubAgents is a sidebar refresh that converges on the next drain cycle.
  • No capacity increase: channel buffer sizes unchanged.

Tests

  • engine_handle_try_send_does_not_block_when_op_channel_is_full
  • list_subagents_event_try_send_does_not_block_when_event_channel_full

@claude

claude Bot commented Jun 30, 2026

Copy link
Copy Markdown

Claude finished @Hmbown's task in 1m 23s —— View job


Code Review — PR #3813: fix(tui): use nonblocking send for ListSubAgents refresh events

Summary: Correct fix for the backpressure problem. The guardrail is respected — critical event paths remain on .send().await, only the refresh ListSubAgents ops switch to try_send. No logic errors or race conditions found. Two minor observations below.


Findings

1. Closed and Full are conflated in the debug log (engine.rs:1534–1538)

if let Err(_e) = self.tx_event.try_send(Event::AgentList { agents }) {
    tracing::debug!(
        "Event channel full; dropping ListSubAgents refresh (will retry next drain)"
    );
}

TrySendError has two variants: Full (expected under load) and Closed (receiver dropped — TUI is gone). Both hit the same message. A Closed error will repeat every drain cycle until the engine itself shuts down and is harder to diagnose from logs. Consider including _e in the message or branching on the variant:

match self.tx_event.try_send(Event::AgentList { agents }) {
    Ok(()) => {}
    Err(tokio::sync::mpsc::error::TrySendError::Full(_)) => {
        tracing::debug!("event channel full; dropping ListSubAgents refresh");
    }
    Err(tokio::sync::mpsc::error::TrySendError::Closed(_)) => {
        tracing::warn!("event channel closed; dropping ListSubAgents refresh");
    }
}

Fix this →


2. Second test uses #[tokio::test] but has no await points (tests.rs:6055–6075)

#[tokio::test]
async fn list_subagents_event_try_send_does_not_block_when_event_channel_full() {

This test is purely synchronous — try_send doesn't need a runtime. Using #[test] / fn is cheaper and clearer about the non-async nature. Not a blocker, but worth tidying.


What's correct

  • All three ListSubAgents call sites in ui.rs are correctly converted; no other Op variants are touched.
  • The EngineHandle::try_send wrapper is clean and appropriately thin.
  • The first test (engine_handle_try_send_does_not_block_when_op_channel_is_full) correctly exercises the handle's method and uses a synchronous #[test] as it should.
  • Critical paths (approval, tool completions, fatal errors, cancellation) are verified untouched by inspection of the diff.
  • The one-refresh-per-drain-batch invariant (subagent_list_refresh_requested flag) is preserved under try_send.

@Hmbown Hmbown merged commit 0f98170 into main Jun 30, 2026
19 checks passed
@Hmbown Hmbown deleted the codex/0866-3802-channel-backpressure branch June 30, 2026 01:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

v0.8.66: Remove engine/TUI channel backpressure from sub-agent status storms

1 participant