Main 3 by raphaeltm · Pull Request #27 · DefangLabs/simple-agent-manager

raphaeltm · 2026-05-21T16:13:09Z

No description provided.

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

…tainer-cache-experiments-01krb4 Experiment with Cloudflare devcontainer cache backends

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Four production fixes: disable unattended-upgrades on VMs, deduplicate workspace creation race condition, extract task callback route to fix 401 auth, implement MCP token sliding window with 8h TTL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tm#966)

* fix: extract task callback route before projectsRoutes

The task callback route (POST /:projectId/tasks/:taskId/status/callback) was blocked by projectsRoutes.use('/*', requireAuth()) which leaks session auth middleware to all sibling subrouters at the same base path. The leaked requireAuth() ran BEFORE the callback route's own verifyCallbackToken JWT auth, rejecting the VM agent's Bearer token request with 401.

Fix: Extract the callback route into its own Hono subrouter (callback.ts) and mount it at /api/projects BEFORE projectsRoutes in index.ts, following the same pattern used for deploymentIdentityTokenRoute and nodeAcpHeartbeatRoute. This is the fourth instance of the Hono middleware scope leak bug class.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add MCP token sliding window refresh + 8h TTL

MCP tokens had a 4-hour TTL with no refresh mechanism. Agents running tasks longer than 4 hours lost MCP tool access permanently.

Changes:
- Increase DEFAULT_MCP_TOKEN_TTL_SECONDS from 4h to 8h (inactivity timeout)
- Add DEFAULT_MCP_TOKEN_MAX_LIFETIME_SECONDS (24h hard cap)
- Add sliding window to validateMcpToken(): refresh KV TTL on each use, throttled to >50% of TTL elapsed to avoid excessive KV writes
- Add lastRefreshedAt field to McpTokenData for throttle tracking
- Fail-closed: malformed createdAt causes immediate token revocation
- Add MCP_TOKEN_MAX_LIFETIME_SECONDS env var (configurable per principle XI)
- Pass env through to all validateMcpToken call sites

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add callback auth routing + MCP sliding window tests

- Integration test proving task callback accepts Bearer JWT through combined app routes (not blocked by session auth middleware)
- Unit tests for MCP token sliding window: throttle, max lifetime, capped TTL, fail-closed on malformed createdAt
- Fix existing mcp-token test for new max-lifetime behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: add task callback middleware leak post-mortem + update .env.example

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: move task to active

* fix: update test fixtures for MCP token max-lifetime validation

MCP token sliding window adds a createdAt-based max lifetime check. Test fixtures using hardcoded dates from months ago now exceed the 24h cap. Update all MCP token fixtures to use new Date().toISOString(). Also update task-runner-completion source contract test to read from callback.ts instead of crud.ts after route extraction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: archive completed task

* fix: address security review findings

- Enforce expectedScope: 'workspace' on task callback JWT verification (prevents node-scoped tokens from reaching workspace-mutation endpoint)
- Return updatedData from validateMcpToken after sliding window refresh (callers now receive current state matching what is persisted in KV)
- Add non-atomicity comment to sliding window refresh (documents known KV race condition consistent with checkMcpRateLimit pattern)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make KV delete best-effort in token expiry paths

kv.delete() on expired/malformed tokens is a cleanup courtesy; the KV TTL will expire the entry anyway. Making it fire-and-forget prevents a KV service hiccup from turning a token expiry into an unhandled 500.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
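The sliding-window rules described in this commit series (8h inactivity TTL, 24h hard cap, refresh only after more than 50% of the TTL has elapsed, fail-closed on a malformed createdAt) can be sketched as a pure decision function. This is a minimal illustration, not the repository's validateMcpToken(): the `decideRefresh` name, the `Decision` type, and the trimmed-down `McpTokenData` shape are assumptions, and the actual KV read/write is left out.

```typescript
// Hypothetical, simplified token record; the real McpTokenData has more fields.
interface McpTokenData {
  createdAt: string;        // ISO timestamp set when the token is issued
  lastRefreshedAt?: string; // ISO timestamp of the most recent TTL refresh
}

const TTL_SECONDS = 8 * 60 * 60;           // inactivity timeout (8h)
const MAX_LIFETIME_SECONDS = 24 * 60 * 60; // hard cap (24h)

type Decision =
  | { action: "revoke"; reason: string }
  | { action: "keep" }
  | { action: "refresh"; ttlSeconds: number; data: McpTokenData };

// Decide what to do with a token at time nowMs (milliseconds since epoch).
function decideRefresh(data: McpTokenData, nowMs: number): Decision {
  const createdMs = Date.parse(data.createdAt);
  // Fail closed: an unparseable createdAt revokes the token.
  if (Number.isNaN(createdMs)) {
    return { action: "revoke", reason: "malformed createdAt" };
  }
  const ageSeconds = (nowMs - createdMs) / 1000;
  if (ageSeconds >= MAX_LIFETIME_SECONDS) {
    return { action: "revoke", reason: "max lifetime exceeded" };
  }
  // Throttle: only rewrite the record once more than half the TTL has elapsed
  // since the last refresh (or since creation if never refreshed).
  const parsedLast = data.lastRefreshedAt ? Date.parse(data.lastRefreshedAt) : createdMs;
  const lastMs = Number.isNaN(parsedLast) ? createdMs : parsedLast;
  if ((nowMs - lastMs) / 1000 <= TTL_SECONDS / 2) {
    return { action: "keep" };
  }
  // Cap the new TTL so the record never outlives the hard cap.
  const remaining = MAX_LIFETIME_SECONDS - ageSeconds;
  const ttlSeconds = Math.floor(Math.min(TTL_SECONDS, remaining));
  return {
    action: "refresh",
    ttlSeconds,
    data: { ...data, lastRefreshedAt: new Date(nowMs).toISOString() },
  };
}
```

A caller would presumably persist `data` with `ttlSeconds` on "refresh" and return the updated record, which matches the later "return updatedData" fix; as the commit notes, that read-then-write is not atomic.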

…aphaeltm#968)

* fix: disable apt-daily timers and harden IPv6 firewall in cloud-init

- Disable apt-daily.timer, apt-daily-upgrade.timer, and unattended-upgrades in cloud-init runcmd before vm-agent starts. These Ubuntu timers can trigger systemd daemon-reexec which kills the vm-agent mid-work. Ephemeral VMs gain nothing from auto-upgrades.
- Load ip6_tables kernel module before ip6tables commands in the firewall script. Some Hetzner images ship without the module loaded, causing all ip6tables commands to fail silently. The IPv6 firewall block is now conditional: if the module can't be loaded, IPv6 rules are skipped with a log warning instead of failing the entire script.
- Make ip6tables-save error-tolerant for systems without IPv6 support.
- Add 5 new tests covering timer disables and IPv6 module loading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: move cloud-init firewall hygiene task to active

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: address review findings (conditionality test, YAML-parsed timer tests)

- Add test verifying ip6tables DROP/ACCEPT rules are inside the modprobe conditional block, not executed unconditionally (MEDIUM finding)
- Refactor timer ordering tests to use YAML.parse instead of string splitting
- Update stale ip6tables-save assertion to match current contract

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add retry policy for transient Hetzner capacity failures (422)

Hetzner VM provisioning can return HTTP 422 when capacity is temporarily exhausted for a server type/region. These are now retried with bounded exponential backoff (15s initial, 2min max, 5 attempts default) while permanent 422s (invalid config) are thrown immediately.

- Add isTransientCapacityError() to classify 422s by message pattern
- Wrap placement loop with capacity retry in createVM()
- Log retry attempts with server_type, location, attempt#, delay
- Distinguish "capacity exhausted" from "invalid configuration" errors
- All retry params configurable via constructor/HetznerProviderConfig
- Add comprehensive tests for retry, backoff, exhaustion, and logging

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: wire capacity retry env vars to API layer

Add HETZNER_CAPACITY_RETRY_INITIAL_DELAY_MS, HETZNER_CAPACITY_RETRY_MAX_DELAY_MS, and HETZNER_CAPACITY_RETRY_MAX_ATTEMPTS to Env interface and buildProviderConfig. Operators can now tune retry behavior without code changes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: address test-engineer review findings for capacity retry

- Add tests for untested regex variants (resources temporarily unavailable, resource unavailable, could not find)
- Add maxAttempts=1 edge case test
- Add .cause assertion on capacity exhaustion error
- Add mixed 412+422 scenario test
- Add assertion that console.warn is NOT emitted on final exhaustion attempt
- Clean up double-call pattern in non-capacity 422 test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: reduce test duplication in capacity retry tests

Extract helper functions (capacityErrorResponse, placementErrorResponse, successResponse, mockAlwaysCapacityError) to eliminate repeated Response construction patterns flagged by SonarCloud duplication check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
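As a rough sketch of the retry policy above: classify a 422 by its message, then compute a capped exponential delay per attempt. The regex, the doubling factor, and the `backoffDelayMs`/`retrySchedule` helpers are assumptions made for illustration; the commit only states the defaults (15s initial, 2min max, 5 attempts) and the message variants covered by tests.

```typescript
// Assumed pattern covering the message variants named in the tests:
// "resources temporarily unavailable", "resource unavailable", "could not find".
const CAPACITY_PATTERN = /resources? (temporarily )?unavailable|could not find/i;

// Only 422 responses whose message looks like a capacity problem are retried;
// any other 422 (for example an invalid configuration) is treated as permanent.
function isTransientCapacityError(status: number, message: string): boolean {
  return status === 422 && CAPACITY_PATTERN.test(message);
}

// Delay before retrying after the given 1-based failed attempt.
// Doubling per attempt is an assumption; the source only says "bounded exponential".
function backoffDelayMs(attempt: number, initialMs = 15_000, maxMs = 120_000): number {
  return Math.min(maxMs, initialMs * 2 ** (attempt - 1));
}

// Delays that would be slept between attempts for a given attempt budget.
// With maxAttempts = 1 there are no retries, so the schedule is empty.
function retrySchedule(maxAttempts = 5, initialMs = 15_000, maxMs = 120_000): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(backoffDelayMs(attempt, initialMs, maxMs));
  }
  return delays;
}
```

Under these assumptions the default budget waits 15s, 30s, 60s and 120s between its five attempts, and the final failure is surfaced without a further delay.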

Auto-committed by SAM on agent completion.

raphaeltm#971) Add two tests to the task callback auth routing regression suite: - Invalid Bearer token is handled by callback auth (not session auth) - Workspace ID mismatch returns 403 These tests verify the callback route's own auth gates work correctly, complementing the existing regression tests that verify session auth middleware doesn't leak onto callback routes. Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Auto-committed by SAM on agent completion.

When a task fails, the session can stay "Active" because the stopSession RPC to the ProjectData DO is best-effort and can fail silently. The UI should cross-reference task status. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

)

* chore: move task to active

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add priority/updates/all tabs to notification panel

Replace the existing All/Unread tabs with three new tabs:
- Priority: shows needs_input and task_complete notifications
- Updates: shows progress, error, session_ended, pr_created
- All: shows everything (unchanged behavior)

The Priority tab is the default, helping users quickly find agent input requests and completed tasks without scrolling through status updates.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add Playwright visual audit for notification panel tabs

Covers Priority/Updates/All tab filtering, empty states, long text, many items, and multi-project grouping at mobile (375x667) and desktop (1280x800) viewports. Asserts no horizontal overflow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: archive notification-panel-priority-tabs task

* test: add bell badge vs priority badge distinction test

Adds a unit test verifying the bell icon shows total unread count (4) while the Priority tab badge shows only priority unread count (2). Also checks off completed task file items.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add ARIA tab semantics, updates badge, and touch targets

- Add role="tablist", role="tab", aria-selected, aria-controls to notification filter tabs for screen reader accessibility
- Add role="tabpanel" with aria-labelledby to notification list
- Add arrow key navigation between tabs
- Add updatesUnreadCount badge to Updates tab
- Increase tab touch targets to 44px minimum (min-h-[44px])
- Update test selectors from getByRole('button') to getByRole('tab')

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add 99+ badge test, desktop All tab and Updates empty state tests

Addresses test-engineer review findings:
- Unit test for 99+ badge truncation branch
- Playwright desktop All tab test with overflow assertion
- Playwright desktop Updates empty state test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: trigger CI re-run for preflight markers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
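The tab split and the two badge counts described above could look roughly like the following. The notification type names come from the commit message; the `Notification` shape, the function names, and the exact 99+ threshold (anything above 99) are assumptions for illustration.

```typescript
type NotificationType =
  | "needs_input" | "task_complete"                      // Priority tab
  | "progress" | "error" | "session_ended" | "pr_created"; // Updates tab

type Tab = "priority" | "updates" | "all";

// Hypothetical minimal notification record.
interface Notification {
  id: string;
  type: NotificationType;
  read: boolean;
}

const PRIORITY_TYPES = new Set<NotificationType>(["needs_input", "task_complete"]);

// True when the notification belongs on the given tab.
function matchesTab(n: Notification, tab: Tab): boolean {
  if (tab === "all") return true;
  const isPriority = PRIORITY_TYPES.has(n.type);
  return tab === "priority" ? isPriority : !isPriority;
}

// Unread count for one tab; the bell icon would use the "all" count.
function unreadCount(items: Notification[], tab: Tab): number {
  return items.filter((n) => !n.read && matchesTab(n, tab)).length;
}

// Badge label, truncated once the count exceeds 99 (assumed threshold).
function formatBadge(count: number): string {
  return count > 99 ? "99+" : String(count);
}
```

With four unread notifications of which two are needs_input or task_complete, `unreadCount(items, "all")` gives the bell total and `unreadCount(items, "priority")` gives the smaller Priority badge, which is the distinction the unit test in this series checks.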

…eltm#974)

* chore: move task to active

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: reconcile session state when task reaches terminal status

Three-layer fix for sessions staying "Active" when tasks fail:
1. UI: getSessionState() and isActiveSession() now cross-reference task.status: if the task is failed/completed/cancelled, the session shows as terminated regardless of the session DO status.
2. Backend: Add failSession() (distinct from stopSession) so sessions record failure explicitly. failTask() now calls failSession with a single retry before giving up.
3. Handle 'failed' session status in the UI alongside 'stopped'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update test fixture to reflect active session with in-progress task

The test for "does not show fork button for active sessions" had a session with status='active' but task.status='failed'. With the new task-status cross-referencing, this is correctly treated as terminated. Updated to use task.status='in_progress' for a truly active session.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: archive task file

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use failSession in crud.ts task-failure path + update docs

Address review findings:
- crud.ts task status update route now calls failSession() instead of stopSession() when transitioning to 'failed' status
- Added 'failed' session status to workspace-lifecycle.md docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address cloudflare specialist review findings

HIGH fixes:
- isTerminated in parseChatSessionListRow now includes 'failed' status
- session.failed added to SESSION_LIFECYCLE_EVENTS for sidebar refresh
- useChatWebSocket handles session.failed alongside session.stopped
- failSession() uses cursor.rowsWritten to skip duplicate events

MEDIUM fixes:
- ActivityFeed formats session.failed events with error message
- Retry sleep reduced from 1000ms to 100ms (less DO event loop blocking)
- Removed unused _errorMessage param from sessions.failSession()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add missing tests from test-engineer review

- Priority ordering: task terminal takes precedence over idle/agentCompleted
- isActiveSession: direct failed status path
- Fork button reconciliation: session active + task failed shows fork button

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: extract shared terminateSession helper to reduce duplication

Reduces code duplication between stopSession and failSession flagged by SonarCloud (30.1% duplication on new code).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: use it.each to reduce test duplication

Consolidates structurally similar getSessionState and isActiveSession tests into parameterized test tables to reduce SonarCloud duplication.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
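The UI-side reconciliation in layer 1 boils down to letting a terminal task status override whatever the session record says. A simplified sketch follows; the `SessionLike` shape and the two-value return type are assumptions, and the real getSessionState() also distinguishes states such as idle that are omitted here.

```typescript
type TaskStatus = "in_progress" | "failed" | "completed" | "cancelled";
type SessionStatus = "active" | "stopped" | "failed";

// Hypothetical minimal shape: a session with an optional linked task.
interface SessionLike {
  status: SessionStatus;
  task?: { status: TaskStatus };
}

const TERMINAL_TASK_STATUSES = new Set<TaskStatus>(["failed", "completed", "cancelled"]);

// Simplified: returns only "active" or "terminated".
function getSessionState(session: SessionLike): "active" | "terminated" {
  // A terminal task wins, even if the best-effort stop/fail call never
  // reached the session record and it still says "active".
  if (session.task && TERMINAL_TASK_STATUSES.has(session.task.status)) {
    return "terminated";
  }
  // 'failed' is handled alongside 'stopped'.
  if (session.status === "stopped" || session.status === "failed") {
    return "terminated";
  }
  return "active";
}

function isActiveSession(session: SessionLike): boolean {
  return getSessionState(session) === "active";
}
```

Checking the task first is what makes the fixture change above necessary: a session with status 'active' and task.status 'failed' is no longer a valid example of an active session.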

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ltm#976) * blur prototype * feat: add blur overlay, speed and noise size controls to WebGL background - Add u_scale uniform to shader for configurable noise granularity - Add speed/noiseSize options to useWebGLBackground hook (defaults: 0.4x, 1.02) - Add blur+dim overlay div to ProjectAgentChat and SamPrototype (10px blur, 0.10 dim) - Update prototype with Speed and Noise size sliders, defaults matching preferred settings Settings: Blur 10px, Dim 0.10, Green 1.00, Speed 0.4x, Noise size 1.02 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: retrigger checks with updated PR body * ci: retrigger with proper preflight markers --------- Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: keep cancel prompt sessions follow-up ready * chore: refresh PR evidence checks * fix: restart opencode after prompt cancel --------- Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

)

* task: move DO-only chat task to active

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: DO-only chat architecture with typewriter animation

Remove the direct ACP WebSocket connection from WorkspaceChatView. Route ALL messages through the Durable Object WebSocket (single source). Send prompts via REST API (POST /sessions/:sessionId/prompt).

- Add TypewriterText component for word-by-word animation of batched content
- Derive agent state (idle/prompting/responding) from message flow
- Remove useProjectAgentSession import from WorkspaceChatView
- Remove dual-source conversationItems merge (was causing React #185)
- Un-deprecate sendFollowUpPrompt REST API helper

The hook file useProjectAgentSession.ts stays in the codebase; it's still used by ProjectMessageView via useSessionLifecycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct export ordering in acp-client index

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* task: archive DO-only chat task

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address review findings (idle stuck state and a11y)

- Reset agentActivity to 'idle' when sendFollowUpPrompt REST call fails, preventing the input from being stuck in "Agent is working..." forever
- Transition to 'responding' on any assistant message (not just from 'prompting'), handling reconnect with in-progress agent output
- Add prefers-reduced-motion guard to TypewriterText (WCAG 2.1 SC 2.3.3)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: convert ProjectMessageView to DO-only chat architecture

Remove ACP WebSocket from ProjectMessageView, matching the DO-only architecture already applied to WorkspaceChatView. All messages now flow through the Durable Object WebSocket; prompts are sent via REST API.

- Simplify useSessionLifecycle: remove useProjectAgentSession, derive agent activity state (idle/prompting/responding) from message flow
- Simplify useConnectionRecovery: remove 6-mechanism ACP recovery, keep DO WebSocket reconnection and idle resume via REST API
- Wire follow-up prompts through sendFollowUpPrompt REST API
- Integrate TypewriterText for latest assistant message animation
- Remove AgentErrorBanner (depended on ACP session types)
- Remove ACP-specific tests (cancel button, ACP connecting, agent offline banner, DO+ACP merge, scroll position stability)
- Update resume tests to remove ACP sendPrompt/reconnect assertions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
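Deriving idle/prompting/responding from the message flow can be pictured as a small reducer over chat events. The event names below are hypothetical, and in particular the `turn_complete` event that returns the agent to idle is an assumption: the commits describe how the state enters 'prompting' and 'responding' and how a failed prompt resets it, but not what signals the end of a response.

```typescript
type AgentActivity = "idle" | "prompting" | "responding";

// Hypothetical event names standing in for what the UI observes.
type ChatEvent =
  | { kind: "prompt_sent" }        // REST prompt request issued
  | { kind: "prompt_failed" }      // REST prompt request rejected or errored
  | { kind: "assistant_message" }  // assistant content arrived over the DO WebSocket
  | { kind: "turn_complete" };     // assumed end-of-response signal

// The previous state is unused because each event fully determines the next
// state; it is kept as a parameter so the function can be used with reduce().
function nextActivity(_current: AgentActivity, event: ChatEvent): AgentActivity {
  switch (event.kind) {
    case "prompt_sent":
      return "prompting";
    case "prompt_failed":
      // Reset so the input is not stuck showing "Agent is working...".
      return "idle";
    case "assistant_message":
      // Applies from any state, which covers reconnecting mid-response.
      return "responding";
    case "turn_complete":
      return "idle";
  }
}

// Fold a sequence of events into the resulting activity, starting from idle.
function deriveActivity(events: ChatEvent[]): AgentActivity {
  return events.reduce<AgentActivity>(nextActivity, "idle");
}
```

Because an assistant message moves the state to 'responding' regardless of the previous state, a client that reconnects while output is still streaming ends up in the right state without ever having seen the prompt.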

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* task: move restore cancel button to active

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: restore cancel button in agent working indicator

PR raphaeltm#978 removed the ACP WebSocket from ProjectMessageView, which also removed the cancel button because it relied on sending session/cancel over that WebSocket. The VM agent already has a REST cancel endpoint and the API service function exists; this commit wires them together.

- Add POST /sessions/:sessionId/cancel API route in chat.ts
- Add cancelAgentPrompt() client API function
- Add handleCancelPrompt to useSessionLifecycle hook
- Restore cancel button in ProjectMessageView working indicator
- Add cancel button to WorkspaceChatView working indicator
- Add integration tests for the cancel API route

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: sort exports alphabetically in api/index.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* task: archive restore cancel button task

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address cloudflare specialist review findings

- Add userId filter to agentSessions query for defence-in-depth (HIGH)
- Add double-tap guard (cancellingRef) to prevent duplicate cancel requests
- Don't clear agentActivity on catch; keep spinner visible on real errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: add behavioral test for cancel button in ProjectMessageView

Renders the component, triggers 'responding' state via WebSocket message, clicks the Cancel button, and asserts cancelAgentPrompt was called with the correct project/session IDs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: improve cancel button accessibility and touch targets

- Add role="status" to agent working indicator bar for screen readers
- Add aria-label="Cancel agent" to both cancel buttons
- Increase touch target to min-h-[44px] for mobile usability

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Harden Go CLI quality gates * Document CLI quality gate gap --------- Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

…wn blur, update Mistral models (raphaeltm#1069) - Remove sam-glass-card-motion and glass-card-glow from Card glass variant so glass cards no longer behave like buttons (hover scale + active press) - Add backdrop-blur-xl to ModelSelect dropdown for proper glassmorphic blur - Update Mistral model catalog with latest IDs: Medium 3.5, Small 4, Large 3, Medium 3.1, Devstral 2, Codestral, Magistral Medium 1.2, Ministral 3 (14B/8B/3B) Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

… models (raphaeltm#1071) * feat: dynamic Vibe model config + fix Mistral API model IDs Generate a dynamic [[models]] entry in Vibe's config.toml when the user selects a model that isn't a built-in alias. This uses the raw Mistral API model ID as both the TOML alias and API name, so the UI model catalog can list any Mistral model without requiring vm-agent changes. Also fixes model catalog IDs to match actual Mistral API identifiers (e.g. mistral-medium-3-5-2604, not mistral-medium-3-5-26-04). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: consolidate Vibe config tests into table-driven test to reduce duplication Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* task: move fix-amp-agent-cli-install to active

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: install @sourcegraph/amp CLI alongside acp-amp bridge

The acp-amp Python package is only the ACP bridge wrapper. The actual amp CLI binary (@sourcegraph/amp npm package) must also be installed for acp-amp to function. Chain npm install after uv install in the amp agent install command.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: update amp install command assertion in shared tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* task: archive fix-amp-agent-cli-install

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: drop which-amp guard and add install script passthrough test

Address go-specialist review findings:
- Remove conditional `which amp` guard from install command; npm install -g is idempotent, so the guard only creates a partial-install trap
- Add TestAgentInstallScriptAmpPassesThrough to verify the non-npm code path leaves the amp install command unchanged
- Add comment on isNpmBased: false explaining the implicit npm dependency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: set amp isNpmBased=true so Node.js bootstrap runs in containers

The amp install command chains `npm install -g @sourcegraph/amp` after the uv install of acp-amp. With isNpmBased=false, the agentInstallScript function skipped the Node.js bootstrap preamble, causing `npm: not found` in devcontainers that don't ship with Node.js pre-installed. Setting isNpmBased=true ensures the preamble installs nodejs/npm when missing before the install command runs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Auto-committed by SAM on agent completion.

PR raphaeltm#1068 set go-version to 1.25.0 which doesn't exist, breaking all CI runs since the workflow file fails validation before any jobs start. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

PR raphaeltm#1068 added a duplicate `cli:` key in the changes job outputs, causing YAML parse failure and breaking all CI runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The cli filter was defined twice in the paths-filter config — once with basic paths and once with additional paths (sonar, rules). Merged into a single entry to fix the YAML duplicate key error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1. model-catalog.test.ts: devstral model ID was 'devstral-2512' but catalog has 'devstral-2-2512', and it's in groups[1] not groups[0]. Fixed to search all groups with flatMap.
2. packages/cli/go.mod: specified go 1.25.0 which doesn't exist yet, causing 'no such tool "covdata"' error. Changed to go 1.24.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… agent support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: explicit SAM provider selection for Claude Code and Codex

Users must now explicitly opt-in to SAM as an AI provider for Claude Code and OpenAI Codex agents. No more silent platform proxy fallback.

- Add AgentProviderMode type ('sam' | 'user-api-key' | 'oauth') to shared types
- Add provider_mode column to agent_settings table (migration 0054)
- Gate platform proxy on providerMode === 'sam' in runtime.ts (no-silent-fallback)
- Update agent catalog to show configured status based on providerMode
- Add admin AI allowance API (GET/PUT/DELETE per user) for ceilings
- Enforce admin allowance ceilings in user budget validation
- Add provider mode selector UI in AgentSettingsCard for claude-code/codex
- Add tests for budget ceiling enforcement and agent status display

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* task: move explicit SAM provider selection active
* test: fix SAM provider status import order
* test: fix SAM provider API import order
* fix: require explicit SAM provider only for Claude and Codex
* docs: record explicit SAM provider workflow
* test: cover explicit SAM provider workflow
* docs: sync explicit SAM provider setup
* refactor: reduce AI allowance route duplication
* refactor: share AI budget limit helpers
* refactor: consolidate AI budget limits
* refactor: simplify provider budget helpers
* refactor: simplify AI budget parser
* refactor: inline AI budget limit parsing
* fix: route explicit SAM providers through proxy
* fix: use responses wire api for Codex proxy
* fix: proxy OpenAI responses API for Codex
* refactor: share AI proxy request guards
* fix: satisfy ai proxy import order

* fix: add auth to /models endpoint, update CLAUDE.md for provider modes

- Add prepareAIProxyRequest() gate to GET /ai/v1/models, which was previously unauthenticated, leaking the list of allowed models to any caller
- Update CLAUDE.md Agent Authentication section for three-mode system (user-api-key, oauth, sam) and add explicit-sam-provider-selection to Recent Changes
- Document monthly cost cap fail-open window with risk context
- File backlog task for allowedModelTiers enforcement at proxy gate

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…phaeltm#1084)

* feat: add gpt-5.4-mini and gpt-5.4 to platform AI proxy allowed models

Users selecting gpt-5.4-mini from the model dropdown got: "Model 'gpt-5.4-mini' is not available"

The model catalog (model-catalog.ts) listed gpt-5.4-mini as a selectable option for Codex, but the platform AI proxy allowlist (PLATFORM_AI_MODELS in ai-services.ts) didn't include it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: sync PLATFORM_AI_MODELS with all dropdown models for Claude Code and Codex

Adds all 14 missing models from the UI model catalog dropdown to the platform AI models allowlist. Previously only 6 OpenAI and 3 Anthropic models were in the allowlist; users selecting any other model from the dropdown would get a "model not allowed" error when using SAM provider mode.

Added Anthropic: claude-opus-4-7, claude-sonnet-4-5-20250514, claude-sonnet-4-20250514, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, claude-3-opus-20240229

Added OpenAI: gpt-5.5-pro, gpt-5.5, gpt-5.3-codex, gpt-5.2-codex, gpt-5.1-codex-max, gpt-5.1-codex-mini, o4-mini, o3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: sync model catalog with current API offerings

Update both the UI dropdown (model-catalog.ts) and the platform AI proxy allowlist (ai-services.ts) to match what's actually available from the Anthropic and OpenAI APIs as of May 2026.

Anthropic changes:
- Fix claude-sonnet-4-5 ID: 20250514 → 20250929 (correct dated version)
- Remove retired models: claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022, claude-3-opus-20240229 (all no longer available via API)
- Add claude-opus-4-5-20251101, claude-opus-4-1-20250805 (available legacy)
- Fix context windows: Opus 4.7/4.6 and Sonnet 4.6 are 1M tokens, not 200k

OpenAI changes:
- Add gpt-5.4-pro ($30/$180), gpt-5.4-nano ($0.20/$1.25) as current models
- Add gpt-5-mini to dropdown (still available, deprecating Aug 2026)
- Remove gpt-5.2 (not a valid model ID; gpt-5.2-codex is the correct one)
- Fix pricing from official API docs (gpt-5.5: $5/$30, gpt-5.4: $2.50/$15, gpt-5.4-mini: $0.75/$4.50, o4-mini: $0.55/$2.20, o3: $2/$8)
- Fix context windows (5.4 series: 400k, not 1M)
- Better grouping: Latest / Older / Reasoning / Legacy

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: move task to active

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use standard tier for gpt-5.4-nano (low-cost reserved for Workers AI)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: archive completed task

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: model routing for o3/o4-mini and test maintenance

- Fix isOpenAIModel() to match 'o3' and 'o4-*' prefixes; previously these were misrouted to Workers AI instead of OpenAI AI Gateway
- Update ai-proxy test to reference gpt-5.5 instead of removed gpt-5.2
- Add cross-catalog invariant test ensuring every dropdown model has a corresponding PLATFORM_AI_MODELS entry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: compact isOpenAIModel to stay under 800-line file size limit

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
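Two ideas from this series lend themselves to a short sketch: routing o-series IDs to OpenAI, and the cross-catalog invariant that every dropdown model must be in the proxy allowlist. The prefix rules below (including the `gpt-` check) and the `missingFromAllowlist` helper are assumptions for illustration, not the repository's isOpenAIModel() or its test; the model IDs in the usage are taken from the commit messages.

```typescript
// Assumed routing rule: "gpt-" models plus the o3 and o4-* reasoning models
// are treated as OpenAI; everything else is routed elsewhere.
function isOpenAIModel(modelId: string): boolean {
  return (
    modelId.startsWith("gpt-") ||
    modelId === "o3" ||
    modelId.startsWith("o3-") ||
    modelId.startsWith("o4-")
  );
}

// Cross-catalog invariant: returns dropdown model IDs that the proxy allowlist
// does not contain. An empty result means the two lists are in sync.
function missingFromAllowlist(dropdownIds: string[], allowlist: string[]): string[] {
  const allowed = new Set(allowlist);
  return dropdownIds.filter((id) => !allowed.has(id));
}
```

A test built on `missingFromAllowlist` would have caught the original report directly: with gpt-5.4-mini in the dropdown but absent from the allowlist, the function returns that ID instead of an empty array.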

…ltm#1085) The glassy transparency (backdrop-filter blur) on the session header tab was broken by commit 6e00d96 which moved the header from absolute positioning into the normal document flow. Messages no longer scrolled behind the header, so there was nothing to blur. Changes: - Remove `glass-composited` from SessionHeader and ErrorBanner (the `transform: translateZ(0)` created a new stacking context that interfered with backdrop-filter rendering) - Make FloatingHeader absolutely positioned over the scroll content so messages pass behind it and the blur effect is visible - Add spacer in Virtuoso Header to prevent messages from hiding behind the overlaid header Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…#1086) The glass-composited class was intentionally removed in the prior PR because its transform: translateZ(0) broke backdrop-filter blur. Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…aphaeltm#1087)

* fix: Codex OAuth (BYO auth.json) crashes with missing OPENAI_API_KEY

Two-sided fix for Codex users who bring their own OAuth token (auth.json):

API side: The passthrough proxy exclusion on runtime.ts:133 only checked for Claude Code + OAuth, allowing Codex + OAuth users to receive an inferenceConfig with provider "openai-passthrough" they shouldn't get. Extended the condition to also exclude Codex OAuth credentials.

VM agent side: Belt-and-suspenders guard in codexProxyProviderConfigFromCredential: when the credential kind is "oauth-token" (auth-file injection), skip generating a proxy provider config that would write env_key = "OPENAI_API_KEY" to config.toml. That env var is never set in auth-file mode, causing Codex to crash.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: trigger re-run with updated PR description

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: trigger with corrected preflight section names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

* Show recovery container status in chat header
* Reduce recovery badge test duplication
* Address recovery badge quality findings
* Consolidate workspace badge test setup

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

Publish SAM's 2026-05-21 daily development journal about the node readiness freshness fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

raphaeltm and others added 30 commits May 11, 2026 14:42

feat: use Cloudflare registry for devcontainer cache

0af2c51

docs: add devcontainer cache validation evidence

f936ecb

refactor: split task runner workspace creation

5aa4b63

chore: clear devcontainer cache sonar hotspots

5f0a9e3

docs: add daily journal — every task needs one owner

ce4a6a2

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

Merge pull request raphaeltm#963 from raphaeltm/sam/cloudflare-devcon…

1a475e9

…tainer-cache-experiments-01krb4 Experiment with Cloudflare devcontainer cache backends

task: add fix-agent-auth-failures

3e8ef22

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

task: add notification-panel-priority-tabs

c113eab

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

task: add prevent-duplicate-workspace-dispatch

49ec7be

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

task: add vm-agent-cloud-init-firewall-hygiene

b44ab72

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

task: add hetzner-capacity-retry

be05aa0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: save agent work

299be0a

Auto-committed by SAM on agent completion.

chore: save agent work

8816c83

Auto-committed by SAM on agent completion.

task: add vm agent container recovery

b02e11a

task: add DO-only chat architecture with typewriter animation

99b8a98

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: keep cancelled prompts ready for follow-up (raphaeltm#975)

ce39b03

* fix: keep cancel prompt sessions follow-up ready
* chore: refresh PR evidence checks
* fix: restart opencode after prompt cancel

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

docs: add SAM journal state catch-up

3bfe57f

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

task: add fork/retry new chat screen redesign

614f5cd

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

task: add restore cancel button task

c5ff03f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

simple-agent-manager Bot and others added 29 commits May 19, 2026 14:37

Harden Go CLI quality gates (raphaeltm#1068)

746129e

* Harden Go CLI quality gates
* Document CLI quality gate gap

---------

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

docs: publish SAM journal for May 19 (raphaeltm#1070)

2e5c199

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

chore: save agent work

211152f

Auto-committed by SAM on agent completion.

fix(ci): use Go 1.24 for CLI test job (1.25 doesn't exist)

a89cc18

PR raphaeltm#1068 set go-version to 1.25.0 which doesn't exist, breaking all CI runs since the workflow file fails validation before any jobs start. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: pre-persist orchestration prompts (raphaeltm#1074)

f44da48

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

ci: trigger fresh run after go-version fix

12995f8

fix(ci): remove duplicated 'cli' output key in ci.yml

ca25ae2

PR raphaeltm#1068 added a duplicate `cli:` key in the changes job outputs, causing YAML parse failure and breaking all CI runs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

blog: publish "The Workspace Was the Wrong Shape"

49914ea

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

blog: reframe tab transition as managing files to guiding agents

9424769

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

blog: editorial revisions — workspace visibility, provisioning speed,…

96caae4

… agent support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

task: add sync-model-catalog-api-offerings

7f1d458

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

task: add amp project chat mcp wiring

1197bc9

docs: publish SAM journal for May 20 (raphaeltm#1088)

b4a3f8f

Co-authored-by: Raphaël Titsworth-Morin <raphael@raphaeltm.com>

task: add amp sam mcp bridge

093e9b2

fix: prevent false stale node readiness gating (raphaeltm#1092)

68655a1

docs: publish sam readiness journal

bb19887

Publish SAM's 2026-05-21 daily development journal about the node readiness freshness fix.

task: add url-driven ui state for linkable project resources

182c05e

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

merge

c32b45e

raphaeltm merged commit 7cd7afd into main May 21, 2026
11 of 13 checks passed

raphaeltm merged 600 commits into main from main-3

raphaeltm commented May 21, 2026

3 participants
