Skip to content

fix(junior): stabilize durable Slack conversation work#537

Merged
dcramer merged 26 commits into
mainfrom
fix/resume-requester-identity-from-session
Jun 9, 2026
Merged

fix(junior): stabilize durable Slack conversation work#537
dcramer merged 26 commits into
mainfrom
fix/resume-requester-identity-from-session

Conversation

@sentry-junior

@sentry-junior sentry-junior Bot commented Jun 5, 2026

Copy link
Copy Markdown
Contributor

This PR stabilizes durable Slack conversation continuation work and tightens the requester identity contract used across that flow. Continuation paths now rebuild runtime Requester state from persisted turn requester data instead of a fresh Slack profile lookup. Requester is now a strong Slack workspace-scoped type, mirroring the Destination pattern: public plugin hooks, scheduler tools, runtime tools, and continuation code all carry platform, teamId, and userId, while durable storage uses an explicit stored Slack requester shape at persistence boundaries.

It also fixes the subscribed Slack skip loop in durable conversation work. Skipped subscribed messages now commit their inbound queue item through the same lease-safe path as injected messages, and skip bookkeeping only clears an active turn when the skipped message owns that turn. That keeps completeConversationWork() from seeing the skipped inbound message as pending forever while preserving unrelated awaiting-resume continuations.

The durable worker and heartbeat paths now maintain active conversation indexes for recovery instead of blindly scanning conversation or turn-session state. Stale destination-mismatched queue payloads are rejected in a way that acks the queue message, and the upgrade migration seeds existing active timeout/yield continuation sessions into durable conversation work so pre-upgrade paused runs are still recovered by the active index. Slack user profile lookup is cached by workspace and user id, and OAuth/MCP resume paths fail closed if stored requester team/user data does not match the active Slack destination.

The junior upgrade command stays in place, with the Redis conversation-state migration isolated under packages/junior/src/cli/upgrade/migrations/redis-conversation-state.ts so it can be removed later without deleting the command.

Validated with targeted requester, Slack, task-execution, continuation, plugin API, OAuth, heartbeat, and upgrade migration tests, package typechecks/builds, repo lint, and git diff --check.

Fixes #561

@vercel

vercel Bot commented Jun 5, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
junior-docs Ready Ready Preview, Comment Jun 9, 2026 11:19pm

Request Review

Comment thread packages/junior/src/chat/respond.ts Outdated
Comment thread packages/junior/src/chat/slack/user.ts Outdated
Comment thread packages/junior/src/chat/slack/user.ts
Comment thread packages/junior/src/chat/runtime/timeout-resume-runner.ts Outdated
@sentry-junior sentry-junior Bot changed the title fix(respond): complete requester identity from session record in resume turns fix(resume): use stored session requester identity at continuation endpoints Jun 6, 2026
Comment thread packages/junior/src/chat/state/session-log.ts
sentry-junior Bot and others added 5 commits June 9, 2026 11:00
…me turns

When a Slack turn is resumed in a new serverless invocation (timeout or
OAuth), the caller re-fetches the Slack profile via lookupSlackActorIdentity.
If that call fails (API error, transient failure, or missing users:read.email
scope) it returns only { userId } — stripping email, fullName, and userName.
This causes the GitHub plugin's beforeToolExecute hook to deny git commits with:

  AgentPluginHookDeniedError: could not determine a resolved requester
  name and email for commit attribution.

The AgentTurnSessionRecord already stores a full requester captured during
the original successful Slack lookup. Use that as a fallback to fill any
gaps in the live-derived identity, guarded against cross-user blending.

The fix converts requester and actorRequester from const to let so they
can be completed once — immediately after loadTurnSessionRecord resolves
existingSessionRecord — and before any consumers (plugin hooks, tools,
prompt, all session persistence paths). This means the completion applies
to both success paths inside the try block and error paths in the catch
block (yield, timeout, auth-pause), which were silently writing the
degraded identity back to the session record.

One new helper: completeActorRequesterFromSession. It fills ActorIdentityInput
gaps from AgentTurnRequester (the stored shape). The AgentTurnRequester form
of the completed requester is derived via the existing requesterFromContext,
keeping field-name conversion in one place.

Co-Authored-By: claude-opus-4-5 <claude-opus-4-5@anthropic.com>

Co-authored-by: immutable dcramer <david@sentry.io>
…me turns

When a Slack turn is resumed in a new serverless invocation (timeout or
OAuth), the caller re-fetches the Slack profile via lookupSlackActorIdentity.
If that call fails (API error, transient failure, or missing users:read.email
scope) it returns only { userId } — stripping email, fullName, and userName.
This causes the GitHub plugin's beforeToolExecute hook to deny git commits with:

  AgentPluginHookDeniedError: could not determine a resolved requester
  name and email for commit attribution.

The AgentTurnSessionRecord already stores a full requester captured during
the original successful Slack lookup. Use that as a fallback to fill any
gaps in the live-derived identity.

The fix converts requester and actorRequester from const to let so they
can be completed once — immediately after loadTurnSessionRecord resolves
existingSessionRecord — and before any consumers (plugin hooks, tools,
prompt, all session persistence paths). This means the completion applies
to both success paths inside the try block and error paths in the catch
block (yield, timeout, auth-pause), which were silently writing the
degraded identity back to the session record.

completeActorRequesterFromSession fills ActorIdentityInput gaps from the
stored AgentTurnRequester. It only fills display/email fields when both
user ids are present and equal — refusing to infer display identity from
a stored actor the live request has not proven to be. The merged result
is renormalized through actorRequesterFromContext so stored display fields
pass the same cleanActorDisplayName/cleanActorEmail gates as live context.
The AgentTurnRequester form is then derived via requesterFromContext,
keeping field-name conversion in one place.

Tests verify: (1) degraded live lookup preserves email/name from session
record in the catch-path timeout persistence, and (2) a different stored
userId does not bleed display fields onto the live requester.

Co-Authored-By: claude-opus-4-5 <claude-opus-4-5@anthropic.com>

Co-authored-by: immutable dcramer <david@sentry.io>
When a Slack turn resumes in a new serverless invocation, the callers
re-fetch the Slack user profile via lookupSlackActorIdentity. If that
call fails — API error, transient failure, or missing users:read.email
scope — only { userId } is returned, stripping email, fullName, and
userName. This caused the GitHub plugin's beforeToolExecute hook to deny
git commits with:

  AgentPluginHookDeniedError: could not determine a resolved requester
  name and email for commit attribution.

The fix is explicit rather than generic: continuation endpoints already
know they are resuming a specific session. They have the session record
in hand and are therefore the right place to reconstruct the full
requester, not generateAssistantReply which cannot distinguish a
degraded resume context from any other context.

lookupSlackResumeRequester is added to user.ts alongside the existing
Slack identity helpers. It takes the resumed userId and the stored
session requester, fetches the live profile, and merges: live data wins,
stored data fills gaps — but only when the stored slackUserId matches
the resumed userId, preventing display fields from one actor bleeding
onto another.

Two continuation endpoints are updated:
- timeout-resume-runner.ts: timeout and cooperative-yield resume
- oauth-callback.ts (beforeStart path): auth-resume continuation

resumePendingOAuthMessage is intentionally unchanged — it replays a
pending user message after OAuth connects and has no prior running
session to recover identity from.

respond.ts is reverted to its original state. generateAssistantReply
receives a complete requester from its callers and does not need to
infer or repair identity internally.

Co-Authored-By: claude-opus-4-5 <claude-opus-4-5@anthropic.com>

Co-authored-by: immutable dcramer <david@sentry.io>
…oints

Continuation endpoints were re-fetching the Slack user profile via
lookupSlackActorIdentity on every resume. The original motivation was
freshness, but this created the bug: if the live Slack lookup fails
at continuation time, the requester is reduced to { userId } alone —
stripping email, fullName, and userName — and the GitHub plugin denies
git commits because it cannot establish commit attribution.

A continuation is the same turn. The identity captured at turn start
and persisted in the session record (AgentTurnSessionRecord.requester)
is the correct identity for that turn. There is no reason to perform a
new Slack API call when we already have the proven identity.

resolveSlackResumeRequester replaces lookupSlackResumeRequester with
stored-primary behavior:
- When stored.slackUserId matches the resumed userId, return the
  stored identity directly via slackActorIdentity normalization.
  No Slack API call is made.
- When stored is absent or for a different user, fall back to a live
  Slack lookup. This covers old/missing session records and any case
  where stored ownership cannot be proven.

Call sites updated:
- timeout-resume-runner.ts: timeout and cooperative-yield resume
- oauth-callback.ts (beforeStart path): auth-resume continuation

resumePendingOAuthMessage is intentionally unchanged — it replays a
pending user message after OAuth connects and has no existing agent
turn to resume identity from.

Tests verify that the Slack API is not called when a matching stored
requester is available, and that mismatched or absent stored requester
correctly falls back to live lookup.

Co-Authored-By: claude-opus-4-5 <claude-opus-4-5@anthropic.com>

Co-authored-by: immutable dcramer <david@sentry.io>
The stored and resumed user ids are always the same value — both are
derived from message.author.userId in the original Slack event:

- stored.slackUserId is set via requesterFromContext(..., requesterId)
  where requesterId = context.correlation.requesterId = message.author.userId

- the userId parameter at continuation time is userMessage.author.userId
  persisted from that same Slack event

There is no scenario where they diverge in normal operation. The only
realistic gap is stored being absent (old session records that predate
requester storage) — not a different-user mismatch.

Simplify to: if stored is present, use it; otherwise fall back to a
live Slack lookup. The slackUserId field is removed from the parameter
type since it is neither needed for comparison nor used from stored.

Co-Authored-By: claude-opus-4-5 <claude-opus-4-5@anthropic.com>

Co-authored-by: immutable dcramer <david@sentry.io>
Drop the unused loadProjection import left after rebasing the requester session changes onto main so CI lint passes cleanly.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Comment thread packages/junior/src/chat/state/session-log.ts Outdated
Record requester identity in the session log even when a turn-session update appends no new Pi messages. This keeps resume records from losing Slack profile metadata after timeout or auth pause commits reuse the same projection.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Comment thread packages/junior/src/chat/state/turn-session.ts Outdated
Carry the active session-log requester into projection resets when a reset does not provide a new requester. This prevents context compaction from erasing Slack profile metadata needed by continuation runs.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Comment thread packages/junior/src/chat/task-execution/slack-work.ts
Port the cleaner Slack skip, upgrade migration, heartbeat, and continuation work onto PR #537.

Mark skipped subscribed Slack inputs committed through the durable work hook so completed work does not requeue forever. Keep the upgrade command while isolating the Redis conversation-state migration in a deletable migration file.

Use active conversation indexes for heartbeat recovery to avoid blind state scans from completed work, and resume continuation/OAuth flows from stored requester identity with fail-closed mismatch handling.

Fixes GH-561

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Skipped subscribed messages should commit their durable input without clearing an unrelated active turn. Only clear activeTurnId when the skipped message owns the deterministic turn id, so awaiting continuations can still resume after passive Slack skips.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Treat destination-mismatched conversation queue payloads as permanent queue rejects so the Vercel callback acknowledges stale messages instead of retrying them.

Give dashboard reporting sample-cap tests explicit timeouts because they intentionally write more than five thousand records and exceed the default Vitest timeout under coverage.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
@dcramer dcramer changed the title fix(resume): use stored session requester identity at continuation endpoints fix(junior): stabilize durable Slack conversation work Jun 9, 2026
Move requester parsing and Slack conversion into a canonical chat requester module. This mirrors the Destination pattern: one strong runtime type, one stored Slack shape, and explicit conversion at continuation boundaries.

Remove the old requester-identity helper layer so resume code reads as requester construction from stored Slack requester state instead of identity normalization around an identity-shaped object.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Make Requester a domain type with Slack platform, team, and user identity instead of the old plugin-shaped requester surface. Carry that shape through plugin hooks, scheduler tools, resume flows, and Slack lookup paths so continuation identity is explicit and workspace scoped.

Add regression coverage for plugin requester schema validation, full hook context propagation, team-scoped Slack user caching, and stored requester team mismatch failures.

Refs #561
Co-Authored-By: GPT-5 Codex <noreply@openai.com>

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit e14894e. Configure here.

Comment thread packages/junior/src/chat/agent-dispatch/heartbeat.ts
Seed active timeout and yield continuation sessions into the durable conversation work record during the Redis conversation-state upgrade. This keeps heartbeat recovery active-index based while still repairing paused runs created before the new conversation work index existed.

Refs #561
Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Assert the full workspace-scoped requester contract in the edited DM integration test so the mock no longer throws and masks the reply path with the fallback error response.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
dcramer and others added 2 commits June 9, 2026 15:48
Use StoredSlackRequester directly at durable requester storage boundaries instead of wrapping it in per-module aliases. This keeps the schema-owned type as the single name for persisted Slack requester state and avoids confusing it with the runtime Requester type.

Refs #561
Co-Authored-By: GPT-5 Codex <noreply@openai.com>
Record the current turn's transcript start index when durable Pi input is checkpointed, and use that boundary when dashboard reporting scopes a run transcript. This keeps steering messages from making the dashboard hide the original prompt in the same run.

Also preserve requester metadata through turn-session commits into the session log.

Co-Authored-By: GPT-5 Codex <noreply@openai.com>
@dcramer dcramer merged commit 1df6ebb into main Jun 9, 2026
16 checks passed
@dcramer dcramer deleted the fix/resume-requester-identity-from-session branch June 9, 2026 23:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Heartbeat scans conversation state blindly to find recovery candidates

1 participant