selectTurnThinkingLevel makes a Haiku LLM call to pick a complexity level for the upcoming turn. It is called unconditionally every time generateAssistantReply runs — including on every timeout-resume and auth-resume slice, where the turn context is identical to the original invocation.
Root cause: respond.ts:843 has no resumedFromSessionRecord guard, and AgentTurnSessionRecord does not persist thinkingSelection, so resumed slices can't skip or reuse the prior result.
Impact: each resume slice pays an extra Haiku round-trip for a classification that will return the same answer (temperature: 0, same input). With AGENT_TURN_TIMEOUT_RESUME_MAX_SLICES = 48, a long sandbox task can generate 10–48× the expected router calls per turn.
Affected files:
- packages/junior/src/chat/respond.ts: unconditional selectTurnThinkingLevel call at line 843
- packages/junior/src/chat/state/turn-session.ts: AgentTurnSessionRecord has no thinkingSelection field
- packages/junior/src/chat/services/turn-session-record.ts: persistRunningSessionRecord / persistAuthPauseSessionRecord / persistTimeoutSessionRecord need to carry the selection forward
Proposed fix:
- Add an optional thinkingSelection field to AgentTurnSessionRecord and its storage/parse path (backward-compatible; fall back to recomputing if absent)
- Persist the selection at the same time the first safe Pi boundary is written
- Skip selectTurnThinkingLevel on resume when a persisted selection is available, and add an "app.ai.thinking_level_source": "session_record" | "router" span attribute for traceability
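
A minimal sketch of the reuse-or-recompute step, to make the proposal concrete. Everything here is illustrative: the level names, the shape of ThinkingSelection, the sessionId field, and the helpers parseThinkingSelection and resolveThinkingSelection are assumptions, not the actual code in respond.ts or turn-session.ts. Only the optional thinkingSelection field and the two source values come from the proposal above.

```typescript
// Hypothetical shapes; the real definitions in turn-session.ts may differ.
type ThinkingLevel = "low" | "medium" | "high"; // assumed level names

interface ThinkingSelection {
  level: ThinkingLevel;
}

interface AgentTurnSessionRecord {
  sessionId: string; // placeholder for the record's existing fields
  // Optional so records written before this change still parse.
  thinkingSelection?: ThinkingSelection;
}

// Values proposed for the "app.ai.thinking_level_source" span attribute.
type ThinkingLevelSource = "session_record" | "router";

interface ResolvedThinking {
  selection: ThinkingSelection;
  source: ThinkingLevelSource;
}

const KNOWN_LEVELS: readonly string[] = ["low", "medium", "high"];

// Parse path: a missing or malformed field yields undefined, so the caller
// falls back to recomputing instead of failing the resume.
function parseThinkingSelection(raw: unknown): ThinkingSelection | undefined {
  if (typeof raw !== "object" || raw === null) return undefined;
  const level = (raw as { level?: unknown }).level;
  if (typeof level !== "string" || KNOWN_LEVELS.indexOf(level) === -1) {
    return undefined;
  }
  return { level: level as ThinkingLevel };
}

// Reuse the persisted selection on resume; call the router only when no
// valid selection was persisted (first slice, or an older record).
async function resolveThinkingSelection(
  resumedFrom: AgentTurnSessionRecord | undefined,
  runRouter: () => Promise<ThinkingSelection>,
): Promise<ResolvedThinking> {
  const persisted = parseThinkingSelection(resumedFrom?.thinkingSelection);
  if (persisted !== undefined) {
    return { selection: persisted, source: "session_record" };
  }
  return { selection: await runRouter(), source: "router" };
}
```

In this sketch the caller would pass a closure around selectTurnThinkingLevel as runRouter and copy the returned source onto the span, so traces show which slices actually paid for a router call.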
See also: #431 (compaction rework, related session-record work)
Action taken on behalf of David Cramer.