feat(flue): port skill-drift system to Flue framework (side-by-side) #127
HazAT wants to merge 24 commits into
Scaffolds the Node.js/Flue project structure at repo root as the first
step of porting the skill-drift agents from GitHub Agentic Workflows to
Flue. See .pi/plans/2026-05-12-flue-skill-drift/plan.md for the full plan.
Dependencies (pinned to exact versions):
@flue/sdk 0.5.3
@flue/cli 0.5.3
valibot 1.4.0
Files created:
package.json — type:module, Node 22 engine, pinned deps
tsconfig.json — ES2022 target, NodeNext module resolution
flue.config.ts — defineConfig({ target: 'node' })
.flue/agents/ — placeholder for agent handlers (T02-T04)
.flue/roles/ — placeholder for role markdown (T02-T04)
.gitignore — added node_modules/ and dist/
Verification:
- npm ci: clean install, 361 packages, no errors
- npx tsc --noEmit: passes (flue.config.ts compiles cleanly)
- npx flue --help: CLI responds correctly
.agents/skills/ auto-discovery check:
Flue discovers .agents/skills/ at runtime only when agent code
explicitly calls session.skill(). The 30 SDK skills in this repo
are NOT auto-injected into agent context — they are only loaded
on-demand by name. The disable-model-invocation frontmatter flag
is irrelevant to Flue's runtime (Flue does not parse it). No
mitigation needed in T02-T04; the skills won't interfere with
the skill-drift agents unless explicitly invoked.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Ports the SDK Skill Drift Detector agent from the existing gh-aw workflow
(.github/workflows/skill-drift-check.md) to the Flue harness.
- .flue/roles/detector.md — carries the full ported prompt verbatim:
the SDK-to-repo-to-team mapping table, Steps 1-5 (gather PRs, filter,
compare, decide, return), and the decision rules for create_pr vs
create_issue vs skip. Instructs the agent to use the `gh` CLI for
GitHub access (no MCP) and to never run git write commands — patches
are computed as unified diffs and returned in the `patch` field.
- .flue/agents/skill-drift-detector.ts — thin handler (~50 lines).
Accepts an optional `{ since?: string }` payload for overriding the
7-day window. Initialises a local sandbox session with
`anthropic/claude-opus-4-6`, delegates to the detector role, and
returns Valibot-typed JSON: `{ actions: Action[], summary: string }`
where Action is `create_pr | create_issue | skip`.
The output schema is the contract for T05 (actuator). The handler itself
does no GitHub writes — it only computes and returns the action list.
This runs side-by-side with the existing gh-aw detector; no changes to
.github/workflows/skill-drift-check.md or its lock file.
Plan: .pi/plans/2026-05-12-flue-skill-drift/plan.md
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Mirror of T02 pattern: role carries the full prompt, handler is a thin orchestration shim (52 lines).
- `.flue/roles/updater.md`: ported from `.github/agents/skill-updater.agent.md` with the 8-step drift-fix flow, 5-file knowledge-base loading instruction, targeted-updates guardrail, and verification block (`./scripts/build-skill-tree.sh --check`)
- `.flue/agents/skill-drift-updater.ts`: accepts issue payload, runs with `anthropic/claude-opus-4-6`, returns structured UpdaterOutput metadata only
- Output schema is the contract for T06 (actuator): skill, summary, files_changed, sdk_pr_references, optional skipped
- Knowledge base loaded at runtime via the agent's read tool, not inlined
- No git operations in handler or role; actuator handles commit/push/PR
- Runs side-by-side with the existing Copilot custom-agent (gh-aw path)
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Mirrors the T02/T03 pattern (detector + updater):
- .flue/agents/skill-creator.ts: workflow_dispatch handler with platform + prompt inputs; validates output with valibot CreatorOutput schema
- .flue/roles/creator.md: 6-phase creator workflow ported from .github/agents/skill-creator.agent.md
Output schema returns metadata only (files_created, files_modified, router_updated, skill, platform, summary, skipped). No git operations in the handler or role; the actuator (T07) handles commit/push/PR.
Key behaviours in the role:
- Existence check first (skips if SDK or skill already exists)
- Loads 5 knowledge-base files at the start of every run
- Requires updating the sentry-sdk-setup router table before validation
- ./scripts/build-skill-tree.sh --check must pass; failure sets skipped
- SDK-to-repo mapping table carried over from the Copilot agent
Parallel run: .github/agents/skill-creator.agent.md stays untouched.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Adds .github/workflows/flue-skill-drift-detector.yml, a two-job GitHub Actions workflow that runs the Flue skill-drift-detector agent and applies its output actions (create_pr / create_issue / skip).
Key design points:
- Cron trigger: Monday 22:42 UTC (42 22 * * 1), matching gh-aw cadence, plus workflow_dispatch with optional `since` date input
- Two-job split: `detect` is read-only (contents: read) and runs the agent; `actuate` has write permissions (contents, pull-requests, issues) to apply the results
- Protected-files enforcement: if a proposed patch touches package.json, lockfiles, tsconfig.json, flue.config.ts, AGENTS.md, CLAUDE.md, SKILL_TREE.md, scripts/build-skill-tree.sh, .github/, .agents/, or .flue/, the actuator downgrades the action from PR to issue
- Patch apply uses `git apply --check` first; failures are counted and logged without aborting the run
- Runs side-by-side with existing gh-aw detector (different name + concurrency group: flue-skill-drift-detector)
- Does not touch skill-drift-check.md/.lock.yml or skill-drift-assign-reviewers.yml (which already handles label-based reviewer assignment for any PR with skill-drift label)
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Adds .github/workflows/flue-skill-drift-updater.yml, the GitHub Actions workflow that processes skill-drift issues opened by the Detector.
Key design points:
- Triggers on issues.labeled/opened with 'skill-drift' label, replacing the gh-aw 'assignees: [copilot]' mechanism; also supports workflow_dispatch
- Two-job split: read-only 'update' job runs the Flue agent and captures a git patch artifact; write-permissioned 'actuate' job applies it and opens a PR
- Patch-based artifact handoff (git diff --cached > changes.patch) between jobs, mirroring T05's detector pattern
- Protected-files gate in actuator blocks commits to lock files, config, scripts, and .github/** (same regex as flue-skill-drift-detector.yml)
- Skill-tree validator: regenerates SKILL_TREE.md then runs --check; bails with an issue comment on real validation failures
- Commit message includes 'Closes #N' for auto-close on PR merge
- Skipped agent results post a comment on the originating issue
- Concurrency keyed on issue number so parallel issues don't race
Runs side-by-side with the existing Copilot custom-agent workflow.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
workflow_dispatch trigger (manual-only) with platform + prompt inputs.
Two-job split mirroring Detector/Updater:
- `create` job (read-only, 90min timeout): runs the skill-creator Flue agent, captures result.json + changes.patch as artifacts
- `actuate` job (write, 15min): downloads artifacts, applies patch, runs protected-files gate and skill-tree validator, commits and opens PR
Key design decisions:
- Concurrency group keyed on platform to prevent parallel runs for the same platform
- Protected-files violations open an issue instead of silently failing
- Skill-tree validator failure also opens an issue with stderr output
- `skill-drift` label applied so the reviewer-assign workflow fires
- PR title uses feat(<scope>) (no [skill-drift] prefix; creator action)
- Branch sanitizes platform input (lowercase, non-alnum -> dash)
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
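The branch sanitization mentioned above (lowercase, non-alphanumeric to dash) could look roughly like the following. This is a minimal sketch and not the workflow's actual code: the helper name `sanitize_platform` and the `flue/skill-creator/` branch prefix are hypothetical, and collapsing runs of separators and trimming edge dashes are extra assumptions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: lowercase the input, turn every run of
# non-alphanumeric characters into one dash, and trim dashes at both ends.
sanitize_platform() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# The branch prefix is an assumption made for this example only.
branch="flue/skill-creator/$(sanitize_platform 'React Native (iOS)')"
echo "$branch"
```

Sanitizing before the value reaches `git switch -c` or a concurrency key keeps free-form dispatch input from producing invalid ref names.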
Three interactive shell scripts for local development testing of the three Flue agents:
- scripts/test-flue-detector.sh: runs skill-drift-detector with a configurable 'since' window
- scripts/test-flue-updater.sh: runs skill-drift-updater from a fixture file or a real GH issue (--issue N)
- scripts/test-flue-creator.sh: runs skill-creator with a platform arg and optional prompt
Each script:
- Checks ANTHROPIC_API_KEY and GH_TOKEN/GITHUB_TOKEN before proceeding
- Prominently warns about API costs (Detector: $0.20-$1.00, Creator: $2-$10)
- Prompts for confirmation before invoking the live model
- Saves output to /tmp/flue-<agent>-result.json and pretty-prints via jq
- Validates the result against the expected output schema (PASS/FAIL)
Also adds scripts/fixtures/flue-updater-issue.json, a realistic but clearly fake drift issue (issue #9999, PR #99999) for offline testing of the Updater.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Verified that skill-drift-assign-reviewers.yml fires correctly on PRs opened by all Flue agents (Detector, Updater, Creator):
- Trigger: on.pull_request.types[opened] + paths[skills/sentry-*-sdk/**] matches all Flue-opened PRs since they all modify skill files.
- Label filter: all three Flue workflows apply --label "skill-drift" to their PRs (Creator uses it at line 263 of flue-skill-creator.yml).
- SKILL_TEAMS map: covers all 19 current skills in skills/sentry-*-sdk/; 100% match, no gaps for existing platforms.
- No-op path: script logs and exits cleanly when no matching skill dir found (safe for brand-new platforms created by Flue Creator).
- Permissions: pull-requests:write is sufficient for requestReviewers.
No code changes needed. Added a top-of-file comment documenting the source-agnostic behavior and the brand-new-platform no-op case.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Adds a short Flue Subproject section describing the file layout (.flue/agents, .flue/roles, package.json, etc.) and how to run the agents locally with npx flue run. Placed before the Skill Tree Navigation section. Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
…cfcf-g5fm)
Advisory: GHSA-3q49-cfcf-g5fm (Critical, all versions, no patched version exists)
The dep chain @flue/sdk@0.5.3 → @mariozechner/pi-ai@0.73.1 → @mistralai/mistralai@2.2.1 pulls in a malicious package. All versions of @mistralai/mistralai are flagged with no upstream fix available. The package has no install hooks so npm ci itself is safe; the risk is dormant and only triggered if pi-ai's lazy import('./mistral.js') fires (i.e., if a Mistral model is invoked). Our three agents all hardcode anthropic/claude-opus-4-6, so Mistral never loads under current code paths.
This mitigation eliminates the latent risk entirely. A postinstall script now physically removes the @mistralai directory from node_modules after every npm install or npm ci. Both the specific package dir and the @mistralai scope dir are removed in case other packages from that scope are pulled in later.
Note: npm audit will continue to report this advisory because it reads the lockfile, not disk. This is expected and documents the upstream issue.
The fix is a workaround pending upstream Flue/pi-ai dropping the dependency. Track at:
- https://github.com/mariozechner/pi-ai/issues (drop @mistralai/mistralai dep)
- https://github.com/badlogic/flue/issues (upgrade to patched pi-ai once available)
Co-Authored-By: claude-sonnet-4-5 <claude-sonnet-4-5@anthropic.com>
All three role files (.flue/roles/detector.md, updater.md, creator.md) contained the literal substring 'MCP' in constraints that prohibit using MCP servers. While the intent was correct, the presence of the substring caused the plan's grep contract to fail. Rewrote each constraint positively, replacing 'Do NOT use any MCP server or external GitHub integration' with 'Use the gh CLI for all GitHub access. Do not connect to external services for GitHub operations.' The semantic meaning is preserved — agents are still instructed to use gh CLI only. The grep contract (grep -ri "mcp" .flue/roles/ returns zero matches) is now satisfied. Addresses P0 #3 from the review at .pi/plans/2026-05-12-flue-skill-drift/review.md. Co-Authored-By: claude-sonnet-4-5 <claude-sonnet-4-5@anthropic.com>
The original schemas marked `skipped` as an optional field while requiring all success fields (skill, summary, files_changed, etc.). This meant Valibot rejected legitimate skip-only responses because the required fields were absent.
Changes:
- Replaced UpdaterOutput flat object with v.union([UpdaterSuccess, UpdaterSkipped]) discriminated by status: 'success' | 'skipped' literals
- Replaced CreatorOutput flat object with v.union([CreatorSuccess, CreatorSkipped]) with the same discriminant pattern
- Updated flue-skill-drift-updater.yml: skip detection now checks .status == 'skipped' and reads .reason; actuate job if: clause branches on .status == 'success'
- Updated flue-skill-creator.yml: removed the skipped step output (multiline hazard); now emits status= only; actuate if: clause uses needs.create.outputs.status == 'success'
- Lightly reworded Output sections in updater.md and creator.md to describe the new discriminated-union shape instead of the optional skipped field
Fixes P0 #2 from the review at .pi/plans/2026-05-12-flue-skill-drift/review.md.
Co-Authored-By: Claude claude-sonnet-4-5 via pi worker agent
The happy-path `gh pr create` call had `--label` flags placed after the heredoc EOF terminator. In bash, anything after the heredoc terminator is parsed as a separate command, so the flags were silently dropped: the PR opened with no labels, which meant the reviewer-assign workflow never fired.
Fixed by writing the PR body to a temp file (/tmp/pr-body.md) with a standalone `cat > ... <<EOF` block, then calling `gh pr create` as a normal argument-style command with all flags on the same logical line.
Also removed three references to the non-existent `skill-creator` label:
- one in the happy-path `gh pr create` call
- one in the protected-files violation `gh issue create` call
- one in the skill-tree validation failure `gh issue create` call
These would have caused `gh` to exit non-zero when the workflows ran. Replaced the two issue-create labels with `skill-drift` (already in use by the other Flue workflows) to keep labelling consistent.
Addresses the P1 finding in .pi/plans/2026-05-12-flue-skill-drift/review.md.
Co-Authored-By: Claude (claude-sonnet-4-5 via Pi worker agent)
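The body-file pattern described in this commit can be sketched as follows. It is an illustration under stated assumptions, not the workflow's code: `fake_gh_pr_create` is a hypothetical stand-in that only prints its arguments so the sketch runs offline, and `mktemp` replaces the fixed /tmp/pr-body.md path. The `--title`, `--body-file`, and `--label` flags are the ones `gh pr create` accepts.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for `gh pr create`: prints each argument on its own
# line instead of talking to GitHub.
fake_gh_pr_create() { printf '%s\n' "$@"; }

body_file="$(mktemp)"

# 1. Write the body with a standalone heredoc. Nothing follows the terminator.
cat > "$body_file" <<'EOF'
## Summary
Example body text.
EOF

# 2. Call the command normally, so every flag belongs to the same command.
args="$(fake_gh_pr_create \
  --title "feat(example): add skill" \
  --body-file "$body_file" \
  --label "skill-drift")"

echo "$args"
rm -f "$body_file"
```

Keeping the heredoc and the command invocation separate means a misplaced flag becomes a visible syntax or argument error rather than a silently dropped line.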
a009515 to 4ec438c
…c edge cases
- Move inputs.* interpolations from bash run: blocks into env: blocks across all three workflows: Creator uses PLATFORM/PROMPT env vars, Updater uses GH_EVENT_NAME/ISSUE_NUMBER_INPUT/ISSUE_NUMBER_EVENT/ISSUE_NUMBER. This is the canonical fix for GitHub Actions script-injection sinks (Warden FGH-435): template substitution now happens at the env layer, not inside bash, so shell metacharacters in user-supplied input are never executed.
- Bump Updater's update job from issues: read to issues: write so the 'Post skip comment if agent skipped' step can call gh issue comment without a 403. The actuate job's issues: write was already correct and is unchanged. Flagged by Seer Bug Prediction and Cursor code review on PR #127.
- Replace fixed EOF heredoc delimiters with random 32-hex delimiters via openssl rand -hex 16 when writing multi-line JSON to $GITHUB_OUTPUT across all three workflows. A bare EOF line in LLM-generated output (e.g. inside a summary field) would otherwise truncate the heredoc early and corrupt fromJSON() parsing in downstream jobs. Flagged by Seer on PR #127.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
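The random-delimiter change can be illustrated with a short sketch. It assumes the standard `name<<DELIMITER` multi-line syntax for `$GITHUB_OUTPUT`; the output name `result`, the sample value, the temp file standing in for `$GITHUB_OUTPUT`, and the `od` fallback for machines without openssl are all assumptions added for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# In a workflow this would be "$GITHUB_OUTPUT"; a temp file keeps the sketch
# runnable anywhere.
out_file="$(mktemp)"

# Pretend this text came from the model. Note the bare EOF line inside it.
value='first line
EOF
third line'

# Random 32-hex delimiter. The od fallback is only an assumption for hosts
# that lack openssl.
delim="$(openssl rand -hex 16 2>/dev/null || od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"

{
  echo "result<<${delim}"
  printf '%s\n' "$value"
  echo "${delim}"
} >> "$out_file"

cat "$out_file"
```

With a fixed `EOF` delimiter the second line of the value would have closed the block early; an unpredictable delimiter makes that collision practically impossible.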
…n BRT-8PC)
Per Warden security review BRT-8PC: deny-lists are structurally weak for LLM-emitted patches, because any path the agent emits that isn't explicitly listed slips through. The old regex also missed common sensitive paths: .husky/, .npmrc, Dockerfile, .env*, renovate.json, .changeset/, .devcontainer/, top-level *.sh, commitlint.config.*, vitest.config.*, eslint.config.*.
New allow-list: only paths matching ^skills/ are accepted; everything else triggers the existing downgrade-to-issue path. This captures the invariant that agents are only supposed to edit skill files, protecting current paths, future paths, and paths nobody thought to enumerate.
Also strips leading ./ prefix defensively before the pattern match, eliminating any doubt about ^ anchor bypass via ./prefixed paths.
Updated docs/agent-port/04-flue-implementation.md §6 to describe the new allow-list approach and §12.5 noting the source of the change (PR #127 Warden review BRT-8PC).
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
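A minimal sketch of the allow-list check described above, assuming the touched paths arrive one per line. The helper name `first_violation` and the sample paths are hypothetical; only the `^skills/` pattern and the leading `./` strip come from the commit message.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: reads paths on stdin, prints the first one outside
# skills/ and returns 1, or prints nothing and returns 0 if all are allowed.
first_violation() {
  local path
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    path="${path#./}"              # strip a leading ./ before matching
    if [[ ! "$path" =~ ^skills/ ]]; then
      printf '%s\n' "$path"
      return 1
    fi
  done
  return 0
}

# Sample input: two allowed paths and one that must be rejected.
touched=$'skills/sentry-go-sdk/SKILL.md\n./skills/sentry-elixir-sdk/SKILL.md\n.husky/pre-commit'

if violation="$(first_violation <<< "$touched")"; then
  echo "all paths allowed"
else
  echo "downgrade to issue: $violation"
fi
```

Because the check names what is permitted instead of what is forbidden, a new sensitive file added to the repository later is rejected without anyone updating the pattern.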
…g protected files
- Added `sentry-cloudflare-sdk` and `sentry-elixir-sdk` rows to mapping tables in all three role files (Cursor bot flagged 17/19; `SKILL_TEAMS` has 19 entries). Cloudflare lives in the JavaScript monorepo (packages/cloudflare/, packages/core/); Elixir is its own repo (getsentry/sentry-elixir), no path filter.
- Removed Creator role's instructions to run `build-skill-tree.sh` and modify `AGENTS.md`/`SKILL_TREE.md`. The workflow's actuator regenerates `SKILL_TREE.md` after the allowlist check, so the agent's regeneration was both redundant and harmful: any successful Creator run would be downgraded to an issue because its patch touched a protected file outside `skills/`. Updated the output schema example to omit `SKILL_TREE.md` from `files_modified`.
- Replaced the Phase 5 `build-skill-tree.sh --check` verification in creator.md with a safe `grep` sanity-check against the router table. Updater's `--check` reference (read-only mode) is left intact. Detector has no `build-skill-tree` reference.
- Net effect: Creator can now successfully open PRs end-to-end; Detector/Updater can identify drift on Cloudflare and Elixir SDKs.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
if ! ./scripts/build-skill-tree.sh --check 2>/tmp/skill-tree-err; then
  echo "::error::Skill tree validation failed after regeneration"
  ERR=$(cat /tmp/skill-tree-err)
  gh issue comment "$ISSUE_NUMBER" \
Bug: The workflow captures stderr from build-skill-tree.sh, but the script writes its errors to stdout, resulting in empty error reports on validation failure.
Severity: MEDIUM
Suggested Fix
Modify the command in the workflow to capture both stdout and stderr. Change ./scripts/build-skill-tree.sh --check 2>/tmp/skill-tree-err to ./scripts/build-skill-tree.sh --check &> /tmp/skill-tree-err. This will redirect both streams to the file, ensuring the $ERR variable contains the actual error messages.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: .github/workflows/flue-skill-drift-updater.yml#L189-L192
Potential issue: The `build-skill-tree.sh` script writes its validation error messages
to `stdout`. However, the calling GitHub workflow only captures `stderr` by using the
redirection `2>/tmp/skill-tree-err`. When the script fails due to validation errors, the
captured `$ERR` variable is empty. Consequently, the GitHub issue comment created to
report the failure contains an empty code block, providing no actionable information for
developers to debug why the skill tree validation failed.
Also affects:
.github/workflows/flue-skill-creator.yml:188~191
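The difference between the two redirections can be reproduced with a small sketch. `fake_check` is a hypothetical stand-in for a validator that, as the comment above reports for build-skill-tree.sh, prints its errors to stdout and exits non-zero; the message text is invented.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in: reports the problem on stdout and fails.
fake_check() { echo "SKILL_TREE.md is out of date"; return 1; }

err_file="$(mktemp)"

# stderr only: the message went to stdout, so the file stays empty.
fake_check 2>"$err_file" >/dev/null || true
stderr_only="$(cat "$err_file")"

# stdout and stderr together (the portable spelling of bash's &>).
fake_check >"$err_file" 2>&1 || true
both="$(cat "$err_file")"

echo "stderr only: '${stderr_only}'"
echo "both streams: '${both}'"
rm -f "$err_file"
```

Capturing both streams costs nothing when a script's error channel is uncertain, and it keeps the issue comment from rendering an empty code block.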
# Protected files check
local touched
touched=$(git diff --name-only)
Bug: The allowlist check uses git diff --name-only, which does not detect new untracked files. This could allow a patch to create and commit files outside the allowed directory.
Severity: MEDIUM
Suggested Fix
Replace git diff --name-only with a command that can detect all changes, including untracked files. Use git status --porcelain and parse its output to get a list of all modified, staged, and untracked files to ensure the allowlist check is comprehensive.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: .github/workflows/flue-skill-drift-detector.yml#L123
Potential issue: The workflow's security check uses `git diff --name-only` to identify
modified files and verify they are within an allowed directory. However, this command
does not list new, untracked files. A patch applied via `git apply` can create new
files. If a patch creates a new, untracked file outside the allowed `skills/` directory,
it will bypass this security check. The subsequent `git add -A` command will then stage
and commit this untracked file, potentially introducing unauthorized code.
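The gap described above can be demonstrated in a throwaway repository. This sketch assumes `git` is installed; the file names are invented, and the combined listing (`git diff --name-only HEAD` plus `git ls-files --others --exclude-standard`) is one way to cover untracked files, shown here as an illustration and not as the workflow's exact code.

```shell
#!/usr/bin/env bash
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir skills
echo "v1" > skills/a.md
git add -A
git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
  commit -q -m "init"

# Simulate an applied patch: one tracked edit plus one brand-new file.
echo "v2" > skills/a.md
echo "echo hi" > evil.sh

# What the original check saw: tracked modifications only.
tracked_only="$(git diff --name-only)"

# Tracked changes relative to HEAD plus untracked, non-ignored files.
all_touched="$( { git diff --name-only HEAD; \
                  git ls-files --others --exclude-standard; } | sort -u )"

echo "git diff --name-only:"; echo "$tracked_only"
echo "combined listing:";     echo "$all_touched"
```

Only the combined listing contains `evil.sh`, which is the file a later `git add -A` would have staged and committed.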
…LI tools
Removed the Updater and Skill Creator GitHub Action workflow files because these agents are invoked locally by humans via smoke scripts and manual PR flow. .flue TypeScript handlers and role markdowns remain unchanged and continue to be the authoritative implementations. Invocation now goes through ./scripts/test-flue-updater.sh and ./scripts/test-flue-creator.sh, while CI will only keep detector-driven automation handled separately.
This commit only addresses Solo todo #360. Re-architecture of detector workflow behavior is deferred to follow-up todos #361 and #362.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
- Updated Detector input payload to: { skill_name, sdk_repo, pr_number, pr_url, sdk_repo_path }.
- Removed per-action skill field from DetectorOutput and simplified output contracts to full-run single-skill scope.
- Rewrote detector role for one merged PR flow: removed 19-row mapping table, 7-day date-window logic, and monorepo path-filter framing.
- Kept duplicate-check guidance tied to open skill-drift PRs/issues in getsentry/sentry-for-ai and lowered action caps to 5 create_pr / 5 create_issue.
- Updated flue detector smoke test to accept <skill_name> <sdk_repo> <pr_number> [+sdk_repo_path], added --fixture mode, and added new fixture at scripts/fixtures/flue-detector-pr.json.
- This is U02 of the Detector single-PR rearchitecture; U03 (reusable workflow and repo wrappers) lands next.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
Removed standalone flue-skill-drift-detector.yml (cron-driven, centralized). Added reusable workflow flue-skill-drift-detector-reusable.yml with workflow_call inputs for skill+SDK PR context. Used GitHub App token for cross-repo write operations in getsentry/sentry-for-ai; no GITHUB_TOKEN writes. Added 19 example caller wrappers under docs/agent-port/sdk-repo-wrappers/ and onboarding README. Wrappers use PR closed/merged trigger on target SDK repos with noise filtering. Per-PR flow references PR metadata in generated PR/issue titles and bodies. Reference Solo todo #362. Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
…r inverted architecture
Rewrite 04-flue-implementation.md architecture sections for the inverted Flue flow (per-SDK wrapper -> reusable workflow), including diagram, mapping table, file layout, detector schema, local run guidance, cutover plan, risks, open questions, and new SDK repo onboarding section.
Update AGENTS.md Flue subproject section to describe the new architecture: reusable detector workflow in this repo, local-only Updater/Creator CLI invocation, and onboarding via 19 wrapper templates under docs/agent-port/sdk-repo-wrappers/.
Update PR #127 body via gh pr edit 127 to mirror the inverted architecture and local-first Updater/Creator path.
Closes #363.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
PAYLOAD=$(jq -c \
  --arg skill_name "$SKILL_NAME" \
  --arg sdk_repo "$SDK_REPO" \
  --argjson pr_number "$PR_NUMBER" \
  --arg pr_url "https://github.com/${SDK_REPO}/pull/${PR_NUMBER}" \
  --arg sdk_repo_path "$SDK_REPO_PATH" \
  '{skill_name:$skill_name,sdk_repo:$sdk_repo,pr_number:$pr_number,pr_url:$pr_url,sdk_repo_path:$sdk_repo_path}')
Bug: The jq command in test-flue-detector.sh is missing the -n flag, causing the script to hang indefinitely when run without the --fixture option.
Severity: HIGH
Suggested Fix
Add the -n flag to the jq command on line 47 to prevent it from reading from standard input. Change jq -c \ to jq -c -n \.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: scripts/test-flue-detector.sh#L47-L53
Potential issue: The `scripts/test-flue-detector.sh` script will hang indefinitely when
invoked in its primary, documented, non-fixture mode. The `jq` command on line 47 is
called without the `-n` flag and without any piped input or input file. This causes `jq`
to wait for input from stdin, which is never provided within the `$(...)` command
substitution. As a result, the script execution stalls and never completes.
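A sketch of the suggested fix, assuming `jq` is installed. The sample values are taken from the sentry-go smoke run mentioned elsewhere in this PR, and the `sdk_repo_path` field is left out only to keep the example short.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sample values; in the smoke script these come from the command line.
SKILL_NAME="sentry-go-sdk"
SDK_REPO="getsentry/sentry-go"
PR_NUMBER=1302

# -n makes jq use null as its input instead of waiting on stdin.
PAYLOAD="$(jq -c -n \
  --arg skill_name "$SKILL_NAME" \
  --arg sdk_repo "$SDK_REPO" \
  --argjson pr_number "$PR_NUMBER" \
  --arg pr_url "https://github.com/${SDK_REPO}/pull/${PR_NUMBER}" \
  '{skill_name:$skill_name,sdk_repo:$sdk_repo,pr_number:$pr_number,pr_url:$pr_url}')"

echo "$PAYLOAD"
```

`--argjson` keeps `pr_number` a JSON number, while `--arg` always produces strings, so the payload matches a schema that expects a numeric PR id.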
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Reviewed by Cursor Bugbot for commit ec41145. Configure here.
run: |
  set -euo pipefail

  RESULT="artifact/result.json"
Actuate job reads artifact from wrong path
High Severity
The download-artifact step (no working-directory) places result.json at $GITHUB_WORKSPACE/artifact/result.json. The "Apply actions" step runs with working-directory: skills-repo and sets RESULT="artifact/result.json", resolving to $GITHUB_WORKSPACE/skills-repo/artifact/result.json — a path that never exists. Under set -euo pipefail, jq will fail immediately and no actions are ever applied.
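The path mismatch can be reproduced locally with a temporary directory standing in for the runner workspace. The directory layout mirrors the description above; overriding `GITHUB_WORKSPACE` and the file contents are assumptions made so the sketch runs outside Actions.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Recreate the layout locally; on a runner GITHUB_WORKSPACE is already set.
GITHUB_WORKSPACE="$(mktemp -d)"
mkdir -p "$GITHUB_WORKSPACE/artifact" "$GITHUB_WORKSPACE/skills-repo"
echo '{"actions":[]}' > "$GITHUB_WORKSPACE/artifact/result.json"

# Equivalent of `working-directory: skills-repo` on the step.
cd "$GITHUB_WORKSPACE/skills-repo"

relative="artifact/result.json"
absolute="${GITHUB_WORKSPACE}/artifact/result.json"

if [ -f "$relative" ]; then relative_found=yes; else relative_found=no; fi
if [ -f "$absolute" ]; then absolute_found=yes; else absolute_found=no; fi

echo "relative: $relative_found, absolute: $absolute_found"
```

Anchoring artifact paths to the workspace root makes the step independent of whichever `working-directory` it happens to run in.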
git switch main
git branch -D "$branch_full"
local issue_body
issue_body="The Detector proposed a change to a protected path (\`$violation\`) for skill \`$SKILL_NAME\`.\n\nOriginal title: $title\n\nOriginal body:\n\n$body\n\nTouched paths:\n\n\`\`\`\n$touched\n\`\`\`\n\nDetected during merge of [${SDK_REPO}#${PR_NUMBER}](${PR_URL})."
Downgrade issue body contains literal \n instead of newlines
Low Severity
The issue_body variable for the protected-path downgrade uses \n inside regular double quotes. Bash doesn't interpret \n as newlines in "..." strings — they stay literal. Use $'...\n...' quoting or printf (as add_reference_footer already does) to produce actual line breaks in the created GitHub issue.
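The quoting behaviour behind this finding can be shown with a short sketch. The message text and the `violation` value are invented for illustration; the `printf` form is one of the two fixes the comment suggests.

```shell
#!/usr/bin/env bash
set -euo pipefail

violation=".github/workflows/ci.yml"   # sample value for the example

# Inside double quotes, \n stays as a backslash followed by the letter n.
literal="Protected path: \`$violation\`\n\nOriginal title: example"

# printf interprets \n in its format string; values are passed as %s
# arguments so their contents are never treated as format directives.
fixed="$(printf 'Protected path: `%s`\n\nOriginal title: %s' "$violation" "example")"

literal_lines="$(printf '%s\n' "$literal" | wc -l | tr -d ' ')"
fixed_lines="$(printf '%s\n' "$fixed" | wc -l | tr -d ' ')"

echo "literal: $literal_lines line(s), fixed: $fixed_lines line(s)"
```

Passing user-influenced text as `%s` arguments also avoids surprises if a title or body happens to contain a percent sign.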
    - '**/__tests__/**'
  paths:
    - "packages/browser/**"
    - "packages/core/**"
JavaScript wrappers use mutually exclusive path filters
High Severity
All 8 sentry-javascript-*.yml wrapper templates specify both paths-ignore and paths on the same pull_request event. GitHub Actions rejects this combination — these filters are mutually exclusive per event. Workflows copied from these templates will fail validation and never trigger. Use a single paths list with ! negation patterns instead (e.g., !**/*.md).
…ppers, untracked allowlist, prompt-injection surface)
Bug A: actuator now reads detector output from "${GITHUB_WORKSPACE}/artifact/result.json" to match download-artifact output location under workspace root.
Bug B: local smoke script adds `jq -c -n` for non-fixture payload construction to avoid stdin blocking.
Bug C: all sentry-javascript wrappers now use a single `paths:` list with GitHub Actions negation patterns (no `paths-ignore`) to satisfy workflow constraints.
Bug D: allow-list check now evaluates `git diff --name-only HEAD` plus untracked files from `git ls-files --others --exclude-standard` to prevent missing staged new files.
Bug E: dropped the `sdk_repo_path` input/payload path and removed local SDK checkout from the detector flow, reducing PR-controlled-path prompt-injection surface.
Co-Authored-By: Claude (claude-opus-4-6 via Pi)
Adds on: workflow_dispatch alongside workflow_call with the same detector inputs for pre-production manual runs.
Moves app-token creation out of detect so manual dispatch can run with only ANTHROPIC_API_KEY.
Skips actuate on workflow_dispatch (only detect runs + result artifact), and adds visible result summarization for manual inspection.
No behavior change for the production workflow_call path, which still performs actuator-based PR/issue creation.
Reference: get started with pilot in getsentry/sentry-go#1308; this test hook is for manual validation before App secrets are fully in place.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
The Flue CLI emits build progress messages ('[flue] Building:',
'[flue] Source root:', etc.) to stdout BEFORE the agent's JSON result,
not to stderr as the docs imply. The smoke script and the reusable
workflow both captured raw stdout to result.json and then ran jq on
it, which failed with 'parse error: Invalid literal at line 1, column 6'.
Fix: capture the raw output to flue-output.log, then extract the
trailing JSON object (everything from the first line starting with
'{') into result.json via 'sed -n /^{/,$p'. The raw log is now
uploaded as a separate artifact alongside result.json so we can debug
build issues post-run.
Verified locally: smoke run against getsentry/sentry-go#1302 now parses
cleanly. The agent correctly emitted a single 'skip' action recognising
that the removed ContextifyFrames integration was never user-facing
and so doesn't create drift in the sentry-go-sdk skill.
Co-Authored-By: Claude (claude-sonnet-4-6 via Pi)
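The extraction step from this commit can be sketched as follows. The `[flue]` prefixes come from the commit message; the rest of the sample log, including the shape of the JSON, is invented for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

log="$(mktemp)"
result="$(mktemp)"

# Sample raw output: progress lines first, then the JSON result.
cat > "$log" <<'EOF'
[flue] Building: example
[flue] Source root: /tmp/example
{
  "actions": [{ "type": "skip" }],
  "summary": "no drift"
}
EOF

# Print from the first line that starts with '{' through the end of the file.
sed -n '/^{/,$p' "$log" > "$result"

cat "$result"
```

The range starts at the first line beginning with `{`, so this relies on no progress line starting with a brace and on the JSON being the last thing written to stdout.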
    "node": ">=22"
  },
  "scripts": {
    "postinstall": "rm -rf node_modules/@mistralai/mistralai && rm -rf node_modules/@mistralai && echo 'Removed @mistralai/mistralai per GHSA-3q49-cfcf-g5fm (malware advisory)'"
Known-malware package executes before postinstall cleanup in CI, exposing GITHUB_TOKEN
The postinstall hook removing @mistralai/mistralai (GHSA-3q49-cfcf-g5fm) runs only after npm has already installed all packages and executed their own lifecycle scripts, so any malicious install hooks in that package fire first — during npm ci in the reusable workflow where GITHUB_TOKEN is available in the runner environment.
Evidence
- package-lock.json confirms @mistralai/mistralai@2.2.1 as a resolved transitive dependency via node_modules/@mariozechner/pi-ai (line 1378 lists "@mistralai/mistralai": "^2.2.0" under its deps; package-lock line 1394 resolves it to 2.2.1).
- npm's lifecycle execution order guarantees all dependency install scripts run before the root package's postinstall, so the malware can execute before rm -rf node_modules/@mistralai/mistralai removes it.
- The reusable workflow flue-skill-drift-detector-reusable.yml runs npm ci (Install Flue step, detect job) without suppressing install scripts; GITHUB_TOKEN is available as an environment variable to all steps in that job, including during package installation.
- The PR description explicitly acknowledges the mitigation is incomplete: "This PR still depends on the @mistralai/mistralai supply-chain mitigation work; keep the PR in Draft until the advisory requirement is fully satisfied."
- The postinstall script does nothing to prevent execution of @mistralai/mistralai's own lifecycle hooks and provides a false sense of mitigation.
Identified by Warden security-review · 238-9D5


Caution
This PR still depends on the @mistralai/mistralai supply-chain mitigation work; keep the PR in Draft until the advisory requirement is fully satisfied.
Summary
Ports Flue skill-drift from a centralized weekly scheduler in this repo to an inverted architecture: each SDK repo runs its own per-PR detector trigger and invokes a shared reusable workflow in sentry-for-ai.
What's in this PR
- .github/workflows/flue-skill-drift-detector-reusable.yml
- docs/agent-port/sdk-repo-wrappers/
- ./scripts/test-flue-updater.sh
- ./scripts/test-flue-creator.sh
- @mistralai/mistralai) remaining intact
Architecture
How to test locally
- ./scripts/test-flue-updater.sh [--issue <N>|--fixture]
- ./scripts/test-flue-creator.sh <platform> [prompt]
- ./scripts/test-flue-detector.sh <skill_name> <sdk_repo> <pr_number> [sdk_repo_path]
Pending follow-ups
Review findings
- 3c0d201
- e47071b
- 27367df