fix: improve old CLI conflict detection in preinstall hook #588

Open
tejaskash wants to merge 6 commits into main from fix/old-cli-conflict-detection

@tejaskash (Contributor)

Summary

Closes #587

  • Replace brittle PATH-based detection (command -v agentcore + --version heuristic) with direct package-manager queries (pip list, pipx list, uv tool list) that check for the bedrock-agentcore-starter-toolkit Python package specifically
  • Keep the old PATH-based check (command -v agentcore / agentcore --version) as a fallback when no package manager reports the toolkit
  • Exit non-zero (process.exit(1)) so npm 8+ always surfaces the error instead of swallowing it on exit 0
  • Add AGENTCORE_SKIP_CONFLICT_CHECK=1 env var bypass for CI and edge cases
  • Move README migration note before the install command and cover pip, pipx, and uv uninstall methods
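The detection order described in the summary could be sketched roughly as follows. This is only an illustration, not the PR's code: `detectOldToolkitSketch`, `CommandRunner`, `Detection`, and the whole-token match are assumptions made for the example. Only the three package-manager commands, the package name, and the PATH fallback (binary found but `--version` fails) come from this thread.

```typescript
// Hypothetical sketch: names and return shape are illustrative, not the PR's code.
// The runner is injected so the logic can be exercised without real subprocesses.
type CommandRunner = (command: string) => string | null; // null: command missing or failed

interface Detection {
  found: boolean;
  source: string | null; // which check reported the old toolkit
}

const OLD_PACKAGE = 'bedrock-agentcore-starter-toolkit';
const INSTALLER_COMMANDS: string[] = ['pip list', 'pipx list', 'uv tool list'];

// Assumption: treat the package as present only when its name appears as a
// whole whitespace-delimited token, so longer names do not match.
function mentionsOldPackage(output: string): boolean {
  return output.split(/\s+/).some((token) => token === OLD_PACKAGE);
}

function detectOldToolkitSketch(run: CommandRunner): Detection {
  // 1. Ask each package manager directly.
  for (const command of INSTALLER_COMMANDS) {
    const output = run(command);
    if (output !== null && mentionsOldPackage(output)) {
      return { found: true, source: command };
    }
  }
  // 2. Fallback: an agentcore binary on PATH whose --version call fails.
  const onPath = run('command -v agentcore') !== null;
  if (onPath && run('agentcore --version') === null) {
    return { found: true, source: 'PATH' };
  }
  return { found: false, source: null };
}
```

Injecting the runner keeps the decision logic separate from process spawning, which is one way to make each branch testable on any machine.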

Test plan

  • npm test passes (20 unit tests covering probeInstaller, probePath, detectOldToolkit, formatErrorMessage, and subprocess integration)
  • npm pack produces a tarball that includes both scripts/check-old-cli.lib.mjs and scripts/check-old-cli.mjs
  • npm install <tarball> on a machine with the old toolkit installed exits 1 with a visible error showing pip uninstall bedrock-agentcore-starter-toolkit
  • AGENTCORE_SKIP_CONFLICT_CHECK=1 npm install <tarball> installs successfully, bypassing the check
  • Clean machine (no old toolkit) installs without issue

Replace brittle PATH-based detection with direct package-manager queries
(pip list, pipx list, uv tool list) and keep the old PATH check as a
fallback. Exit non-zero so npm always surfaces the error. Add
AGENTCORE_SKIP_CONFLICT_CHECK env var bypass. Update README to show all
three uninstall methods before the install command.

Closes #587
@tejaskash requested a review from a team March 20, 2026 15:58
@github-actions github-actions bot added the size/m PR size: M label Mar 20, 2026
github-actions bot commented Mar 20, 2026

Coverage Report

Status  Category    Percentage  Covered / Total
🔵      Lines       43.06%      4899 / 11376
🔵      Statements  42.67%      5189 / 12160
🔵      Functions   42.43%      900 / 2121
🔵      Branches    44.49%      3163 / 7108
Generated in workflow #1104 for commit 3cdba8b by the Vitest Coverage Report Action

@github-actions github-actions bot added size/m PR size: M and removed size/m PR size: M labels Mar 20, 2026
Add platform parameter to probePath() so it uses `where agentcore` on
Windows and `command -v agentcore` elsewhere, matching the original
behavior. Add tests for both platforms.
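A platform-parameterized lookup like the one this commit describes might look like the sketch below. The helper names `pathLookupCommand` and `firstLookupResult` are hypothetical. The `platform` argument is assumed to take the same values as Node's `process.platform`, and it is passed in rather than read globally so both branches can be tested on any OS.

```typescript
// Hypothetical helpers, not the PR's probePath(): they only show how a
// platform parameter selects the lookup command and how its output is read.
function pathLookupCommand(platform: string): string {
  // `where` is the Windows lookup; `command -v` is the POSIX shell builtin.
  return platform === 'win32' ? 'where agentcore' : 'command -v agentcore';
}

// `where` can print several matches, one per line (CRLF on Windows);
// this keeps the first non-empty line, or null when there is none.
function firstLookupResult(output: string): string | null {
  const lines = output
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return lines.length > 0 ? lines[0] : null;
}
```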
@github-actions github-actions bot added size/m PR size: M and removed size/m PR size: M labels Mar 20, 2026
Override @aws-sdk/xml-builder to 3.972.14 which sets
maxTotalExpansions: Infinity when creating its XMLParser. The previous
fast-xml-parser 5.5.7 override (from PR #577) introduced a default
limit of 1000 entities, but large CloudFormation responses for
container stacks exceed this (1175 entities), causing CDK deploy to
fail with "Entity expansion limit exceeded".

Also add deploy retry logic for container e2e tests (up to 3 attempts
with 30s delay) and include stdout in error assertions since the CLI
returns errors as JSON on stdout when using --json.
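The retry behavior mentioned here (up to 3 attempts with a 30s delay) could be sketched as below. `retryWithDelay` is a hypothetical helper, not the e2e suite's actual code. The `sleep` function is a required parameter here purely so the example can be run without waiting; real code would pass something like `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`.

```typescript
// Hypothetical retry helper: runs `action` up to `maxAttempts` times,
// sleeping `delayMs` between failed attempts, and rethrows the last error.
async function retryWithDelay<T>(
  action: () => Promise<T>,
  maxAttempts: number,
  delayMs: number,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  let lastError: unknown = new Error('retryWithDelay: maxAttempts must be at least 1');
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      // No point sleeping after the final attempt.
      if (attempt < maxAttempts) {
        await sleep(delayMs);
      }
    }
  }
  throw lastError;
}
```

With the values from the commit message, a call would look like `retryWithDelay(deploy, 3, 30000, sleep)`, where `deploy` stands in for whatever function runs the container deploy.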
@github-actions github-actions bot removed the size/m PR size: M label Mar 20, 2026
@github-actions github-actions bot added the size/m PR size: M label Mar 20, 2026
Let customers know that npm install will fail with a clear error if the
old toolkit is detected, and mention the AGENTCORE_SKIP_CONFLICT_CHECK
env var for CI environments.
@github-actions github-actions bot added size/m PR size: M and removed size/m PR size: M labels Mar 20, 2026
@notgitika (Contributor) left a comment

scripts/check-old-cli.lib.mjs:20 — Substring match is too broad
output.includes('bedrock-agentcore-starter-toolkit') will match any package whose name contains that string (e.g., a hypothetical bedrock-agentcore-starter-toolkit-extra). pip list output is columnar — matching against a regex like /^bedrock-agentcore-starter-toolkit\s/m would be more precise and avoid false positives.
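The difference between the two checks can be seen in a small sketch. The sample `pip list` output and version numbers below are made up for illustration; the regex and the `-extra` package name are the ones mentioned in this comment.

```typescript
// Made-up pip list output containing only the hypothetical "-extra" package.
const samplePipList: string = [
  'Package                                 Version',
  '--------------------------------------- -------',
  'bedrock-agentcore-starter-toolkit-extra 1.0.0',
  'requests                                2.32.0',
].join('\n');

// Substring check: also matches the longer package name (false positive).
const substringMatch: boolean = samplePipList.includes('bedrock-agentcore-starter-toolkit');

// Anchored check: the name must start a line and be followed by whitespace.
const PACKAGE_LINE = /^bedrock-agentcore-starter-toolkit\s/m;
const regexMatch: boolean = PACKAGE_LINE.test(samplePipList);

// The anchored regex still matches when the real package is listed.
const realListing = 'Package Version\nbedrock-agentcore-starter-toolkit 0.1.0\n';
const regexMatchesReal: boolean = PACKAGE_LINE.test(realListing);
```

Because the regex is anchored to the start of a line, it assumes the package name is the first thing on its line, which holds for columnar `pip list` output but may not hold for every tool's listing format.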

scripts/check-old-cli.mjs:5 — process.exit(1) in a preinstall hook is a breaking change for existing users
If someone has the old toolkit installed and runs npm install in a project that depends on @aws/agentcore (not globally), the install will hard-fail. The old behavior was a warning. The README and PR description acknowledge this, but the severity should be called out: this can break CI pipelines and automated environments that previously worked. Consider whether a --force or grace period would be appropriate, or at minimum ensure the error message is highly visible and actionable (it currently goes to console.error, which npm may suppress depending on log level).
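One way to picture the exit-code behavior under discussion is the sketch below. `runConflictCheck`, `CheckResult`, and the message wording are hypothetical; only the env var name, the exit code, and the `pip uninstall` hint come from this PR. The function returns its outcome instead of calling `process.exit` so the branches can be checked in isolation; a real hook script would print `stderr` with `console.error` and then exit with `exitCode`.

```typescript
interface CheckResult {
  exitCode: number;
  stderr: string[]; // lines a real script would print via console.error
}

// Hypothetical decision function for a preinstall hook.
function runConflictCheck(
  env: Record<string, string | undefined>,
  detectOldToolkit: () => boolean,
): CheckResult {
  // Explicit opt-out for CI and edge cases: skip detection entirely.
  if (env.AGENTCORE_SKIP_CONFLICT_CHECK === '1') {
    return { exitCode: 0, stderr: [] };
  }
  if (!detectOldToolkit()) {
    return { exitCode: 0, stderr: [] };
  }
  // Non-zero exit so the failure cannot be silently swallowed.
  return {
    exitCode: 1,
    stderr: [
      'ERROR: the old bedrock-agentcore-starter-toolkit is installed and conflicts with this CLI.',
      'Remove it first, for example: pip uninstall bedrock-agentcore-starter-toolkit',
      'To bypass this check (for example in CI), set AGENTCORE_SKIP_CONFLICT_CHECK=1.',
    ],
  };
}
```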

scripts/check-old-cli.lib.mjs:47-51 — PATH fallback false positive risk
agentcore --version failing is used as a signal for the old CLI, but any broken install of the new CLI (corrupted node_modules, missing deps) would also fail --version. This could falsely block reinstallation of the new CLI. Consider checking the binary's shebang or location (e.g., is it in a Python site-packages path?) for a stronger signal.

src/cli/external-requirements/__tests__/check-old-cli.test.ts:234-252 - Subprocess integration test is environment-dependent
The "exits 1 with error when old toolkit is detected" test behaves differently depending on whether the old toolkit is installed on the test machine. The test silently passes in both cases (toolkit present → checks stderr, toolkit absent → does nothing). This means the exit-1 path is never verified in clean CI. Consider mocking PATH or using a fixture to guarantee the detection path is exercised.

- Use regex /^bedrock-agentcore-starter-toolkit\s/m instead of
  includes() to avoid matching superstring package names
- Skip node_modules-installed binaries in PATH fallback to prevent
  false positives from broken new CLI installs
- Replace environment-dependent integration test with deterministic
  stub that always exercises the exit-1 path
@github-actions github-actions bot removed the size/m PR size: M label Mar 20, 2026
@tejaskash deployed to e2e-testing March 20, 2026 21:36 with GitHub Actions
@github-actions github-actions bot added the size/m PR size: M label Mar 20, 2026
@tejaskash (Contributor, Author)

Addressed 3 of 4 items from the review, pushing back on one:

1. Substring match too broad -- Fixed. Switched to /^bedrock-agentcore-starter-toolkit\s/m regex to match only the exact package name at the start of a line followed by whitespace. Added a test verifying that a hypothetical bedrock-agentcore-starter-toolkit-extra is not matched.

2. process.exit(1) is a breaking change -- Intentional and the reason for this PR. The old exit 0 behavior caused npm 8+ to hide the warning entirely, which is the bug #587 reports. The AGENTCORE_SKIP_CONFLICT_CHECK=1 env var bypass covers CI pipelines, and both the error message and README document it. Reverting to exit 0 or adding a grace period would reintroduce the original bug.

3. PATH fallback false positive risk -- Fixed. probePath now checks the resolved binary path against node_modules, /npm/, /nvm/, /fnm/ patterns. If the binary lives in a Node.js install directory, it's the new CLI (possibly broken) and we skip it rather than blocking reinstallation. Added a test covering this case.

4. Environment-dependent integration test -- Fixed. Replaced the environment-dependent test with a deterministic stub script that always injects bedrock-agentcore-starter-toolkit into the pip output. The exit-1 path is now exercised in every CI run regardless of what's installed on the machine.
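The path check described in item 3 could be sketched along these lines. `looksNodeManaged` and the exact patterns are assumptions, not the code in `probePath`. In particular, the optional leading dot (so that directories such as `~/.nvm` also match) and the backslash normalization for Windows paths are choices made for this example only.

```typescript
// Hypothetical patterns for "this binary lives in a Node.js install directory".
const NODE_INSTALL_PATTERNS: RegExp[] = [
  /node_modules/,
  /\/npm\//,
  /\/\.?nvm\//, // assumption: also accept dot-directories like ~/.nvm
  /\/\.?fnm\//,
];

function looksNodeManaged(resolvedPath: string): boolean {
  // Normalize Windows separators so one set of patterns covers both styles.
  const normalized = resolvedPath.replace(/\\/g, '/');
  return NODE_INSTALL_PATTERNS.some((pattern) => pattern.test(normalized));
}
```

A binary for which this returns true would be treated as the new CLI (possibly broken) and skipped, while anything else falls through to the `--version` heuristic.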

      ].join('\n')
    );
    try {
      execSync(`node ${wrapperPath}`, { stdio: 'pipe', encoding: 'utf-8' });

Check warning

Code scanning / CodeQL

Shell command built from environment values Medium test

This shell command depends on an uncontrolled absolute path.

Copilot Autofix (AI)

In general, the fix is to avoid constructing a shell command string that embeds an environment-derived path and is executed via a shell. Instead, call Node with the script path as a separate argument, using an API that does not invoke a shell or that accepts an argument array so quoting is handled safely.
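A minimal sketch of the shell-free approach, using Node's `execFileSync(file, args, options)`. Note that `execSync` takes only a command string and always goes through a shell, so the argument-array form belongs to `execFileSync`. The inline `-e` scripts and the `trickyArg` value are made up for the example.

```typescript
import { execFileSync } from 'node:child_process';

// Hypothetical argument containing characters a shell would interpret.
const trickyArg = 'a b; echo injected';

// execFileSync spawns the file directly (no shell by default), so each
// array entry reaches the child process as exactly one argument.
const output: string = execFileSync(
  process.execPath,
  ['-e', 'console.log(JSON.stringify(process.argv.slice(1)))', trickyArg],
  { stdio: 'pipe', encoding: 'utf-8' },
);
const receivedArgs: string[] = JSON.parse(output);

// Like execSync, execFileSync throws on a non-zero exit code, and the
// thrown error carries the child's exit status.
let exitStatus: number | null = null;
try {
  execFileSync(process.execPath, ['-e', 'process.exit(1)'], {
    stdio: 'pipe',
    encoding: 'utf-8',
  });
} catch (err) {
  exitStatus = (err as { status?: number | null }).status ?? null;
}
```

Because the thrown error still exposes `status`, assertions such as `expect(err.status).toBe(1)` keep working after the switch.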

In this specific test, we can change the call at line 268 from execSync with a single template-string command to execFileSync with an argument array: execFileSync('node', [wrapperPath], { ... }). execSync itself does not accept an argument array (its signature is execSync(command[, options])) and always runs the command through a shell. execFileSync(file[, args][, options]) runs the given file directly, without a shell by default, and passes each array entry as a separate argument, preventing shell interpretation of wrapperPath. No behavior change is intended: we still run node with the same script, with the same options, still capture stdout/stderr for assertions, and a non-zero exit still throws an error carrying status. execFileSync must be added to the child_process import if the test file does not already import it; all changes are confined to src/cli/external-requirements/__tests__/check-old-cli.test.ts.

Concretely:

  • Locate the execSync call inside the "exits 1 with actionable error when old toolkit is detected" test.
  • Replace execSync(`node ${wrapperPath}`, { stdio: 'pipe', encoding: 'utf-8' }); with execFileSync('node', [wrapperPath], { stdio: 'pipe', encoding: 'utf-8' });.
  • Leave the surrounding try/catch/finally and assertions unchanged.

Suggested changeset
src/cli/external-requirements/__tests__/check-old-cli.test.ts

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/cli/external-requirements/__tests__/check-old-cli.test.ts b/src/cli/external-requirements/__tests__/check-old-cli.test.ts
--- a/src/cli/external-requirements/__tests__/check-old-cli.test.ts
+++ b/src/cli/external-requirements/__tests__/check-old-cli.test.ts
@@ -265,7 +265,7 @@
       ].join('\n')
     );
     try {
-      execSync(`node ${wrapperPath}`, { stdio: 'pipe', encoding: 'utf-8' });
+      execFileSync('node', [wrapperPath], { stdio: 'pipe', encoding: 'utf-8' });
       expect.unreachable('Should have exited with code 1');
     } catch (err: any) {
       expect(err.status).toBe(1);
EOF
Labels

size/m PR size: M

Development

Successfully merging this pull request may close these issues.

[Critical] CLI command conflict: old starter toolkit silently overrides new CLI - detection and guidance is insufficient