feat: add stopOnRunFinished + improve activity tracking #1191
Gkrumbach07 wants to merge 1 commit into main
New feature: spec.stopOnRunFinished. When true, the backend auto-stops the session on RUN_FINISHED event. Enables one-shot automation sessions that stop cleanly without inactivity timeout.

Changes:
- CRD: add stopOnRunFinished boolean to spec
- Backend types: add to AgenticSessionSpec and CreateAgenticSessionRequest
- Backend session handler: pass through to CR on create
- AG-UI proxy: detect RUN_FINISHED + check spec → trigger stop with in-memory cache to avoid repeated k8s API calls
- AG-UI proxy: all events now reset inactivity timer (was only 4 types)
- AG-UI proxy: reduce activity debounce from 60s to 10s
- Amber GHA: use stop-on-run-finished, fix/custom prompt split, shell-driven batch, session reuse, security fixes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough
Changes introduce auto-stop functionality for agentic sessions. When a session is created with `stopOnRunFinished` set to true, the backend stops it once a `RUN_FINISHED` event arrives.
Sequence Diagram(s)
sequenceDiagram
participant Agent as Agent/Client
participant Proxy as WebSocket Proxy
participant Cache as stopOnRunFinished<br/>Cache
participant API as Kubernetes API
Agent->>Proxy: Emit RUN_FINISHED event
Proxy->>Proxy: persistStreamedEvent()<br/>detects event.type
Proxy->>Cache: checkAndStopOnRunFinished(project, session)
Cache-->>Proxy: Check if cached?
alt Not cached
Proxy->>API: Query AgenticSession spec
API-->>Proxy: Return stopOnRunFinished flag
Proxy->>Cache: Store result
else Cached
Cache-->>Proxy: Return cached value
end
alt stopOnRunFinished = true
Proxy->>API: Update session annotations<br/>(phase: Stopped,<br/>stop-reason: run-finished)
API-->>Proxy: Acknowledged
end
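The flow in the diagram can be sketched in Go as follows. This is a minimal illustration, not the PR's code: `specReader`, `sessionStopper`, `stopFlagCache`, and `handleRunFinished` are hypothetical names, the two function types stand in for the Kubernetes API calls, and the cache is keyed by project and session together.

```go
package main

import (
	"fmt"
	"sync"
)

// specReader and sessionStopper are hypothetical stand-ins for the Kubernetes
// calls in the diagram; they are not the PR's actual functions.
type specReader func(project, session string) (bool, error)
type sessionStopper func(project, session string) error

// stopFlagCache memoizes the stopOnRunFinished flag per project/session.
var stopFlagCache sync.Map

// handleRunFinished follows the diagram: consult the cache, fall back to the
// API on a miss, store the result, and stop the session only if the flag is set.
func handleRunFinished(project, session string, read specReader, stop sessionStopper) error {
	key := project + "/" + session
	flag, cached := stopFlagCache.Load(key)
	if !cached {
		v, err := read(project, session)
		if err != nil {
			return fmt.Errorf("read spec for %s: %w", key, err)
		}
		stopFlagCache.Store(key, v)
		flag = v
	}
	if !flag.(bool) {
		return nil
	}
	return stop(project, session)
}

func main() {
	reads, stops := 0, 0
	read := func(project, session string) (bool, error) { reads++; return true, nil }
	stop := func(project, session string) error { stops++; return nil }

	_ = handleRunFinished("proj-a", "nightly", read, stop)
	_ = handleRunFinished("proj-a", "nightly", read, stop)
	fmt.Println(reads, stops) // the second call is served from the cache
}
```

Passing the reader and stopper in as functions keeps the sketch testable without a cluster; the real proxy would presumably call its Kubernetes client directly.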
Important
Pre-merge checks failed
Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 2 warnings)
✅ Passed checks (3 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/amber-issue-handler.yml (1)
68-99: ⚠️ Potential issue | 🟠 Major
Pin `ambient-action` to commit SHA
Lines 68–99 and 218–277 use `ambient-code/ambient-action@v0.0.4` (mutable tag). Per coding guidelines, GitHub Actions must be pinned to commit SHAs to prevent tag retargeting and supply-chain drift. Replace with a specific commit SHA.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/amber-issue-handler.yml around lines 68 - 99, The workflow currently uses a mutable tag for the action (uses: ambient-code/ambient-action@v0.0.4); replace that tag with the corresponding immutable commit SHA (e.g., ambient-code/ambient-action@<COMMIT_SHA>) in every occurrence (the `uses:` lines in the Create session step and the other block around lines 218–277) so the action is pinned; update both `uses: ambient-code/ambient-action@v0.0.4` instances to the specific commit SHA and verify the SHA is the exact commit from the action repo before committing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/backend/handlers/sessions.go`:
- Around line 744-746: parseSpec() is not reading spec["stopOnRunFinished"], so
a create-time write via req.StopOnRunFinished only persists in the CR but never
surfaces through GetSession/ListSessions; update parseSpec(spec) to detect the
"stopOnRunFinished" key (handle both bool and pointer semantics as used
elsewhere), set the corresponding field in the returned session spec struct
(matching the type used by GetSession/ListSessions), and ensure parseSpec
returns the true value when the CR stores true so the API round-trips correctly.
In `@components/backend/websocket/agui_proxy.go`:
- Around line 60-63: The cache stopOnRunFinishedCache is keyed by sessionName
only which collides across namespaces; change it to use a composite key that
includes project/namespace plus session name (e.g. fmt.Sprintf("%s/%s",
session.GetNamespace() or session.Project, sessionName) or a small struct key)
wherever the map is set or read (the lazy population on first RUN_FINISHED, the
early-return check, and subsequent lookups around the RUN_FINISHED handling).
Update the comment to reflect "Key: project/sessionName" and ensure all
references that read or write stopOnRunFinishedCache (including the code paths
currently using sessionName alone) are updated to compute and use the composite
key.
- Around line 542-553: The stop-on-RUN_FINISHED flow in
checkAndStopOnRunFinished has two issues: it must retry on resource version
conflicts and must key the cache by namespace; change stopOnRunFinishedCache to
use projectName + "/" + sessionName (same form as updateLastActivityTime) and
wrap the fetch-modify-update sequence inside a RetryOnConflict (or equivalent
fetch-and-patch loop) in checkAndStopOnRunFinished so concurrent
UpdateStatus/Update races are retried instead of failing silently.
---
Outside diff comments:
In @.github/workflows/amber-issue-handler.yml:
- Around line 68-99: The workflow currently uses a mutable tag for the action
(uses: ambient-code/ambient-action@v0.0.4); replace that tag with the
corresponding immutable commit SHA (e.g.,
ambient-code/ambient-action@<COMMIT_SHA>) in every occurrence (the `uses:` lines
in the Create session step and the other block around lines 218–277) so the
action is pinned; update both `uses: ambient-code/ambient-action@v0.0.4`
instances to the specific commit SHA and verify the SHA is the exact commit from
the action repo before committing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 534cb425-4b0a-4862-ac9e-26ae6973dbf1
📒 Files selected for processing (5)
.github/workflows/amber-issue-handler.yml
components/backend/handlers/sessions.go
components/backend/types/session.go
components/backend/websocket/agui_proxy.go
components/manifests/base/crds/agenticsessions-crd.yaml
if req.StopOnRunFinished != nil && *req.StopOnRunFinished {
	spec["stopOnRunFinished"] = true
}
stopOnRunFinished is write-only right now
Line 744 adds the create-time write, but parseSpec() still never reads spec.stopOnRunFinished. GetSession, ListSessions, and the other handlers that return parseSpec(spec) will still surface this field as false/omitted even when the CR stores true, so the new API does not round-trip.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/handlers/sessions.go` around lines 744 - 746, parseSpec()
is not reading spec["stopOnRunFinished"], so a create-time write via
req.StopOnRunFinished only persists in the CR but never surfaces through
GetSession/ListSessions; update parseSpec(spec) to detect the
"stopOnRunFinished" key (handle both bool and pointer semantics as used
elsewhere), set the corresponding field in the returned session spec struct
(matching the type used by GetSession/ListSessions), and ensure parseSpec
returns the true value when the CR stores true so the API round-trips correctly.
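A rough sketch of the round trip this comment asks for, under stated assumptions: `sessionSpec`, `writeSpec`, and `parseSpecSketch` are hypothetical names, the struct is a trimmed-down stand-in for the backend's spec type, and the real `parseSpec()` may be shaped differently.

```go
package main

import "fmt"

// sessionSpec is a hypothetical, trimmed-down stand-in for the backend's
// AgenticSessionSpec; only the field under discussion is modeled.
type sessionSpec struct {
	StopOnRunFinished bool `json:"stopOnRunFinished,omitempty"`
}

// writeSpec mirrors the create-time write quoted above: the key is set only
// when the request pointer is non-nil and true.
func writeSpec(req *bool) map[string]interface{} {
	spec := map[string]interface{}{}
	if req != nil && *req {
		spec["stopOnRunFinished"] = true
	}
	return spec
}

// parseSpecSketch reads the key back. The comma-ok type assertion tolerates a
// missing key or a non-bool value by leaving the field false.
func parseSpecSketch(spec map[string]interface{}) sessionSpec {
	var out sessionSpec
	if v, ok := spec["stopOnRunFinished"].(bool); ok {
		out.StopOnRunFinished = v
	}
	return out
}

func main() {
	on := true
	fmt.Println(parseSpecSketch(writeSpec(&on)).StopOnRunFinished)
	fmt.Println(parseSpecSketch(writeSpec(nil)).StopOnRunFinished)
}
```

The comma-ok assertion is used so that an absent or malformed value degrades to `false` rather than panicking.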
// stopOnRunFinishedCache tracks which sessions have stopOnRunFinished set.
// Populated lazily on first RUN_FINISHED event, avoids repeated k8s API calls.
// Key: sessionName, Value: bool
var stopOnRunFinishedCache sync.Map
Cache this by project/session, not sessionName alone
AgenticSessions are namespaced. With the current keying, a cached false for one project can make Line 1010 return early for a same-named session in another project, and that RUN_FINISHED will never even read its CR.
Also applies to: 1008-1013, 1025-1026
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/websocket/agui_proxy.go` around lines 60 - 63, The cache
stopOnRunFinishedCache is keyed by sessionName only which collides across
namespaces; change it to use a composite key that includes project/namespace
plus session name (e.g. fmt.Sprintf("%s/%s", session.GetNamespace() or
session.Project, sessionName) or a small struct key) wherever the map is set or
read (the lazy population on first RUN_FINISHED, the early-return check, and
subsequent lookups around the RUN_FINISHED handling). Update the comment to
reflect "Key: project/sessionName" and ensure all references that read or write
stopOnRunFinishedCache (including the code paths currently using sessionName
alone) are updated to compute and use the composite key.
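To make the collision concrete, here is a small sketch comparing a bare-name key with a composite one. The project names `team-a` and `team-b`, the session name `nightly`, and the helper `cacheKey` are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey builds the "project/sessionName" form the review asks for.
func cacheKey(project, session string) string {
	return project + "/" + session
}

func main() {
	var bare, namespaced sync.Map

	// Two sessions share the name "nightly" but live in different projects
	// (hypothetical names) and carry different flags.
	bare.Store("nightly", false) // written for team-a
	bare.Store("nightly", true)  // written for team-b, overwrites team-a

	namespaced.Store(cacheKey("team-a", "nightly"), false)
	namespaced.Store(cacheKey("team-b", "nightly"), true)

	v, _ := bare.Load("nightly")
	a, _ := namespaced.Load(cacheKey("team-a", "nightly"))
	b, _ := namespaced.Load(cacheKey("team-b", "nightly"))
	fmt.Println(v, a, b)
}
```

With the bare key, only the last write survives; with the composite key, each project keeps its own entry. Kubernetes namespace and object names cannot contain `/`, so the separator should not introduce ambiguity.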
// Update lastActivityTime on CR for any event (debounced).
if eventType != "" {
	if projectName, ok := sessionProjectMap.Load(sessionID); ok {
		updateLastActivityTime(projectName.(string), sessionID, eventType == types.EventTypeRunStarted)
	}
}

// Stop session on RUN_FINISHED if stopOnRunFinished is set.
if eventType == types.EventTypeRunFinished {
	if projectName, ok := sessionProjectMap.Load(sessionID); ok {
		go checkAndStopOnRunFinished(projectName.(string), sessionID)
	}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '538,556p' components/backend/websocket/agui_proxy.go
sed -n '1000,1105p' components/backend/websocket/agui_proxy.go
rg -n 'RetryOnConflict|Patch\(' components/backend/websocket/agui_proxy.go || true
Repository: ambient-code/platform
Length of output: 4576
checkAndStopOnRunFinished needs conflict retry and namespaced cache key
The stop-on-RUN_FINISHED path has two functional bugs:
- Silent failure on resource conflict: `checkAndStopOnRunFinished()` fetches the object, modifies it, then calls `Update()` with no `RetryOnConflict` wrapper. Concurrent writes (e.g., from `updateLastActivityTime()` calling `UpdateStatus()`) can advance the resource version, causing `Update()` to fail silently with a logged error, leaving the session running.
- Cache key collision across namespaces: `stopOnRunFinishedCache` uses bare `sessionName` as the key, but AgenticSession resources are namespace-scoped by `projectName`. Sessions with identical names in different projects will collide in cache, causing incorrect stop decisions. For comparison, `updateLastActivityTime` correctly uses `projectName + "/" + sessionName` as its cache key.
Use RetryOnConflict (or fetch-and-patch) for the stop operation and key the cache as projectName + "/" + sessionName.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/backend/websocket/agui_proxy.go` around lines 542 - 553, The
stop-on-RUN_FINISHED flow in checkAndStopOnRunFinished has two issues: it must
retry on resource version conflicts and must key the cache by namespace; change
stopOnRunFinishedCache to use projectName + "/" + sessionName (same form as
updateLastActivityTime) and wrap the fetch-modify-update sequence inside a
RetryOnConflict (or equivalent fetch-and-patch loop) in
checkAndStopOnRunFinished so concurrent UpdateStatus/Update races are retried
instead of failing silently.
Summary
Adds `stopOnRunFinished` CRD field and improves inactivity detection. Builds on #1180.

Changes
`stopOnRunFinished` (new CRD field)
- When `true`, the backend auto-stops the session on the `RUN_FINISHED` event
- In-memory cache (`sync.Map`) to avoid a k8s API call on every `RUN_FINISHED` for sessions without the flag

Activity tracking improvements
- All events now reset the inactivity timer (previously only `RUN_STARTED`, `TEXT_MESSAGE_START`, `TEXT_MESSAGE_CONTENT`, `TOOL_CALL_START`)

Amber GHA updates
- Use `stop-on-run-finished: 'true'` with `timeout: '0'` (no inactivity timeout)
- `@amber` alone → fix prompt, `@amber <text>` → custom
- `||` fallback instead of string concatenation

Test plan
- `stopOnRunFinished: true`: verify it stops on `RUN_FINISHED`
- `@amber` comment on a PR: verify fix prompt runs
- `@amber do something` on an issue: verify custom prompt runs

🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Improvements