
feat: add benchmark framework for collection mount performance#7915

Open
chirag-bruno wants to merge 6 commits into usebruno:main from chirag-bruno:feat/benchmark-collection-mount

Conversation

@chirag-bruno
Collaborator

@chirag-bruno chirag-bruno commented May 5, 2026

Description

  • Playwright benchmark tests measuring collection mount time across bru/yml formats and sizes (50-5000 requests)
  • IPC listener approach for precise mount-complete signal
  • Generic benchmark utils: stats, results I/O, baseline comparison, PR commenting
  • Collection generator using @usebruno/filestore serializers
  • CI workflow running on ubuntu, macos, and windows with PR comment reporting
  • Regression detection against committed baselines with configurable threshold
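The regression detection listed above boils down to a percentage comparison of each benchmark's measured mean against a committed baseline mean. A minimal sketch, assuming a `{ entries: { "<key>": { mean } } }` baseline shape (the helper name `compareToBaseline` is hypothetical; the actual compare.js may differ):

```javascript
// Hypothetical sketch of the regression check; names and structure in the
// PR's actual compare.js may differ.
function compareToBaseline(results, baseline, thresholdPct) {
  const regressions = [];
  for (const [key, entry] of Object.entries(baseline.entries || {})) {
    const measured = results[key];
    if (!measured) continue; // benchmark missing from this run
    // Percentage change of the measured mean relative to the baseline mean.
    const pct = ((measured.mean - entry.mean) / entry.mean) * 100;
    if (pct > thresholdPct) regressions.push({ key, pct });
  }
  return regressions;
}
```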

Contribution Checklist:

  • I've used AI significantly to create this pull request
  • The pull request only addresses one issue or adds one feature.
  • The pull request does not introduce any breaking changes
  • I have added screenshots or gifs to help explain the change if applicable.
  • I have read the contribution guidelines.
  • Create an issue and link to the pull request.

Note: Keeping the PR small and focused helps make it easier to review and merge. If you have multiple changes you want to make, please consider submitting them as separate pull requests.

Publishing to New Package Managers

Please see here for more information.

Summary by CodeRabbit

  • New Features

    • Added performance benchmarking system for measuring application performance across Ubuntu, macOS, and Windows.
    • Automated benchmark comparison against established baselines with regression detection.
    • Added test:benchmark npm script for running performance tests.
  • Chores

    • Updated project configuration to support benchmark test execution.
    • Added benchmark result directories to .gitignore.

@coderabbitai
Contributor

coderabbitai Bot commented May 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

This PR introduces a comprehensive benchmarking infrastructure for measuring collection mount performance across operating systems. It adds GitHub Actions workflows, Playwright benchmark configuration, collection generation utilities, statistical helpers, result comparison tooling, PR integration, and baseline metrics for Ubuntu, macOS, and Windows.

Changes

Benchmark Infrastructure & Test Suite

Layer / File(s) Summary
Workflow & Action Configuration
.github/workflows/benchmarks.yml, .github/actions/tests/run-benchmark-tests/action.yml, playwright.benchmark.config.ts, package.json, .gitignore
GitHub Actions workflow runs benchmarks on a matrix of operating systems with conditional steps for updating vs. comparing baselines. Composite action abstracts benchmark execution with OS-specific runner logic. Playwright config isolates benchmarks to a single worker with extended timeouts and JSON reporting. npm script and ignore patterns complete the CI/test infrastructure.
Benchmark Utility Modules
tests/benchmarks/utils/stats.ts, tests/benchmarks/utils/results.ts, tests/benchmarks/utils/collection-generator.ts
Statistical functions compute mean, median, percentiles, and standard deviation. Results I/O handles JSON serialization/deserialization with typed schemas. Collection generator produces Bruno/OpenCollection fixtures on disk at configurable sizes and formats for repeatability.
Benchmark Test Implementation
tests/benchmarks/mounting/collection-mount.bench.ts
Playwright test measures collection mount timing across multiple sizes and formats by instrumenting Electron dialog mocking, IPC event listening, and performance.now() measurement. Aggregates per-configuration statistics and writes results to JSON.
Result Processing & Reporting
tests/benchmarks/utils/compare.js, tests/benchmarks/utils/pr-comment.js
compare.js CLI validates results against baseline with configurable regression threshold, supports baseline update mode. pr-comment.js constructs and posts/updates GitHub PR comments showing per-benchmark percentage changes and regression status.
Baseline Data
tests/benchmarks/mounting/baseline.*.json
Platform-specific baseline files (Ubuntu, macOS, Windows) define regression thresholds and per-benchmark mean/p50 targets for detection of performance regressions.
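The statistical helpers summarized above (mean, median, percentiles, standard deviation) can be sketched roughly as follows. This is illustrative JavaScript; the PR's actual stats.ts is TypeScript and may differ in naming and edge-case handling:

```javascript
// Illustrative sketch of the benchmark stats helpers; not the PR's stats.ts.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Nearest-rank percentile over a sorted copy of the input (p in [0, 100]).
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.min(sorted.length - 1, Math.max(0, idx))];
}

function median(values) {
  return percentile(values, 50);
}

// Population standard deviation.
function stddev(values) {
  const m = mean(values);
  return Math.sqrt(mean(values.map((v) => (v - m) ** 2)));
}
```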

Sequence Diagram

sequenceDiagram
    participant GHA as GitHub Actions
    participant CAct as Composite Action
    participant Pw as Playwright Runner
    participant Bench as Benchmark Test
    participant CmpJS as compare.js
    participant CommentJS as pr-comment.js
    participant GHApi as GitHub API

    GHA->>CAct: Trigger with os & update-baseline
    CAct->>Pw: Run benchmarks (with xvfb on Linux)
    Pw->>Bench: Execute mount tests
    Bench->>Bench: Measure timing, generate results.json
    CAct->>CmpJS: Load results & baseline
    alt update-baseline == true
        CmpJS->>CmpJS: Compute new baseline from results
        CmpJS->>Pw: Write baseline.json
    else compare mode
        CmpJS->>CmpJS: Compare against baseline
        CmpJS->>CmpJS: Check for regressions
    end
    GHA->>GHA: Upload results & baseline artifacts
    alt Pull Request event
        GHA->>CommentJS: Load results & baseline
        CommentJS->>CommentJS: Build comparison table
        CommentJS->>GHApi: Post/update PR comment
        GHApi->>GHApi: Display benchmark results
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Suggested labels

perf

Suggested reviewers

  • helloanoop
  • lohit-bruno
  • naman-bruno

Poem

🚀 Benchmarks take flight across every OS,
Collection mounts timed with statistical finesse,
Baselines established, regressions in sight,
Performance metrics shining bright—
Speed insights for all, PR comments in flight! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 16.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The PR title 'feat: add benchmark framework for collection mount performance' accurately and concisely summarizes the primary change.
  • Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (5)
tests/benchmarks/mounting/collection-mount.bench.ts (2)

81-81: ⚡ Quick win

Replace page.waitForTimeout(500) with a deterministic wait.

Per coding guidelines, page.waitForTimeout() should only be used when no suitable expect() locator-based wait is available.

If closeAllCollections leaves a detectable UI change (e.g., the sidebar collection list becomes empty), prefer:

-          await page.waitForTimeout(500);
+          await expect(page.locator('#sidebar-collection-name')).toHaveCount(0);

If there's genuinely no stable locator to await after close, add a brief comment justifying the timeout.

As per coding guidelines: "Try to reduce usage of page.waitForTimeout(); in code unless absolutely necessary and the locator cannot be found using existing expect() playwright calls."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/mounting/collection-mount.bench.ts` at line 81, Replace the
non-deterministic sleep (page.waitForTimeout(500)) in the collection-mount
benchmark with a locator-based expect or a justified comment: after calling
closeAllCollections() wait for a concrete UI change (e.g.,
expect(sidebarCollectionList).toBeEmpty() or
expect(collectionItem.locator('text=...')).not.toBeVisible()) that signals
completion; if there truly is no stable DOM change to await, keep a very brief
comment explaining why a fixed timeout is required and reduce the timeout to the
minimum necessary while noting the reason.

70-95: ⚡ Quick win

Wrap test body steps with test.step for clearer reports.

The inner test body performs distinct phases — generate, measure (× N), record, log — but has no test.step wrapping, making failure traces harder to read.

♻️ Example restructuring with test.step
         test(`mount ${format} collection with ${size} requests`, async ({ page, electronApp, createTmpDir }) => {
           test.setTimeout((2 + Math.ceil(size / 100) * 2) * 60_000);
           const timings: number[] = [];

           for (let i = 0; i < ITERATIONS_PER_SIZE; i++) {
-            const collectionName = `bench-${format}-${size}-iter-${i}`;
-            const collectionDir = await createTmpDir(`bench-${format}-${size}-${i}`);
-            generateCollection({ dir: collectionDir, name: collectionName, requestCount: size, format });
-
-            const elapsed = await measureCollectionMount(page, electronApp, collectionDir, collectionName);
-            timings.push(Math.round(elapsed));
-            await page.waitForTimeout(500);
+            await test.step(`iteration ${i + 1}`, async () => {
+              const collectionName = `bench-${format}-${size}-iter-${i}`;
+              const collectionDir = await createTmpDir(`bench-${format}-${size}-${i}`);
+              await test.step('generate collection', () => {
+                generateCollection({ dir: collectionDir, name: collectionName, requestCount: size, format });
+              });
+              const elapsed = await test.step('measure mount', () =>
+                measureCollectionMount(page, electronApp, collectionDir, collectionName)
+              );
+              timings.push(Math.round(elapsed));
+            });
           }

As per coding guidelines: "Promote the use of test.step as much as possible so the generated reports are easier to read."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/mounting/collection-mount.bench.ts` around lines 70 - 95,
The test body for the `test("mount ${format} collection with ${size} requests",
...)` case should be split into explicit Playwright `test.step` calls to improve
report clarity: wrap the collection generation loop (calls to
`generateCollection` and `createTmpDir`) in a "generate collections" step, wrap
each measurement call to `measureCollectionMount` (and the per-iteration wait)
inside a "measure mount (iteration i)" step or a single "measure mounts" step,
and wrap result aggregation/annotation (uses of `timings`, `resultKey`,
`summarize`, and `test.info().annotations.push`) and the final console log into
a "record results" step; update the test body (the anonymous async function
passed to `test(...)`) to call `await test.step(name, async () => { ... })`
around those phases so failures show as separate steps while preserving existing
logic and variables.
tests/benchmarks/utils/compare.js (1)

62-62: ⚡ Quick win

baseline.collections fallback is undocumented dead code.

The baseline schema only defines entries; the baseline.collections fallback has no corresponding writer and will silently swallow a missing entries key instead of failing loudly.

-const baselineEntries = baseline.entries || baseline.collections || {};
+const baselineEntries = baseline.entries || {};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/utils/compare.js` at line 62, Replace the undocumented
fallback that uses baseline.collections with a strict requirement for
baseline.entries: change the initialization of baselineEntries (the variable
named baselineEntries) to only read baseline.entries and add a guard that throws
(or returns a clear error) when baseline.entries is missing so the function
fails loudly instead of silently using a dead-path; this ensures the code path
in compare.js validates the expected schema rather than swallowing a missing
entries key.
tests/benchmarks/utils/pr-comment.js (1)

29-33: ⚡ Quick win

pct is a string — parse it before comparing to numbers.

.toFixed(1) returns a string, so all comparisons on lines 30–31 and 33 (pct > threshold, pct < -threshold, pct > 0) silently rely on JS coercion. When base.mean is 0, pct becomes "Infinity" (or potentially "NaN") and the coercion is unreliable. Keep pct numeric; format only at the point of interpolation.

♻️ Suggested fix
-  const pct = ((data.mean - base.mean) / base.mean * 100).toFixed(1);
-  const status = pct > threshold ? '🔴 REGRESSION' : pct < -threshold ? '🟢 IMPROVED' : '✅ OK';
-  if (pct > threshold) hasRegression = true;
-
-  body += `| ${key} | ${Math.round(data.mean)} | ${base.mean} | ${pct > 0 ? '+' : ''}${pct}% | ${status} |\n`;
+  const pct = (data.mean - base.mean) / base.mean * 100;
+  const pctStr = pct.toFixed(1);
+  const status = pct > threshold ? '🔴 REGRESSION' : pct < -threshold ? '🟢 IMPROVED' : '✅ OK';
+  if (pct > threshold) hasRegression = true;
+
+  body += `| ${key} | ${Math.round(data.mean)} | ${base.mean} | ${pct > 0 ? '+' : ''}${pctStr}% | ${status} |\n`;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/utils/pr-comment.js` around lines 29 - 33, pct is being set
to a string via .toFixed(1) and then compared numerically; change the code so
pct is a numeric value used for comparisons (compute pct = (data.mean -
base.mean) / base.mean * 100 without .toFixed), perform numeric comparisons
against threshold and -threshold (affecting the checks using pct in the
hasRegression assignment and status calculation), and only format pct with
toFixed(1) when building the output interpolation for body (e.g., use a
formattedPct variable for the string). Ensure the same named symbols are
updated: pct (numeric), hasRegression, status, and the body interpolation where
`${pct}` is inserted.
tests/benchmarks/mounting/baseline.json (1)

1-46: ⚡ Quick win

Placeholder baseline values — bru-* and yml-* entries are identical, which may mask format-specific regressions.

All ten entries share the same mean/p50 figures (e.g., both bru-50 and yml-50 have mean: 2000). Committing equal baselines for two formats that likely have distinct I/O characteristics means the comparison is not meaningful until real numbers replace these. Consider running the suite once on a reference machine and replacing these with the actual measured medians before merging, or at minimum add a prominent "status": "placeholder" field and fail CI baseline comparison when that flag is present.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/mounting/baseline.json` around lines 1 - 46, The baseline
file currently uses placeholder identical values for all entries (see entries
"bru-50"..."bru-5000" and "yml-50"..."yml-5000"), which masks format-specific
regressions; replace these placeholders by running the benchmark on a reference
machine and updating the entries with the real measured mean/p50 (use the
provided script node tests/benchmarks/mounting/compare.js --update-baseline) so
each "bru-*" and "yml-*" has accurate numbers, or if you cannot run benchmarks
yet add a top-level "status":"placeholder" flag in the JSON and update CI to
fail baseline comparisons when status is "placeholder" to prevent merging until
real values are recorded.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/benchmarks.yml:
- Around line 61-73: The "Comment Benchmark Results on PR" workflow step (uses:
actions/github-script@v7 and calling run({ resultsPath:
'tests/benchmarks/results/mounting.json', baselinePath:
'tests/benchmarks/mounting/baseline.json', title: 'Benchmark Results —
Collection Mount' })) can fail with 403 for forked PRs; update that step to
include continue-on-error: true to avoid failing the whole job on permission
errors (alternatively replace the event with a secure
pull_request_target/workflow_run pattern if you need to support writing comments
from privileged runs).

In `@tests/benchmarks/utils/collection-generator.ts`:
- Around line 46-47: The ymlContent produced by stringifyCollection may be
null/undefined and is written directly via fs.writeFileSync, so add a fallback
like the existing one for the 'bru' output: compute ymlContent =
stringifyCollection(...) || `meta: { name: "${name}", type: "collection",
opencollection: "1.0.0" }` (or a minimal valid YAML string), then pass that safe
ymlContent to fs.writeFileSync; update references to stringifyCollection and
ymlContent accordingly so opencollection.yml never contains "undefined"/"null".

In `@tests/benchmarks/utils/compare.js`:
- Line 21: The file uses ESM import syntax (import { existsSync, readFileSync,
writeFileSync } from 'fs') inside tests/benchmarks/utils/compare.js which causes
a SyntaxError under default CommonJS Node; either rename compare.js →
compare.mjs and update .github/workflows/benchmarks.yml to invoke the .mjs path
(Option A), or convert the file to CommonJS by replacing the import with const {
existsSync, readFileSync, writeFileSync } = require('fs') and ensure any other
ESM syntax is changed accordingly (Option B); pick one option and make the
matching change to the CI workflow if you rename the file.

In `@tests/benchmarks/utils/pr-comment.js`:
- Around line 77-78: The code reads baselinePath without checking existence,
causing an uncaught ENOENT; wrap the baseline read in the same guard/handling
used for resultsPath — either check fs.existsSync(baselinePath) before
JSON.parse or perform the fs.readFileSync/JSON.parse inside a try/catch, and on
error emit a clear log/error message and exit or return; update the lines that
call JSON.parse(fs.readFileSync(baselinePath, 'utf-8')) (and any surrounding
logic using baseline) to use this existence check or try/catch so missing or
invalid baseline files produce a helpful message instead of an unhandled
exception.
- Around line 46-52: The current call using github.rest.issues.listComments(...)
and then searching comments.find(...) can miss the existing marker on PRs with
>100 comments; update the call to either include per_page: 100 and handle
pagination, or replace the listComments call with
github.paginate(github.rest.issues.listComments, { owner: context.repo.owner,
repo: context.repo.repo, issue_number: context.issue.number, per_page: 100 })
and then search the returned array for the marker (remove the .data unwrap since
paginate returns items directly); adjust the variable currently named comments
and the comments.find(...) usage accordingly so duplicate benchmark comments are
not created.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b03d3316-2bd5-4d22-9187-5e9061d11d3b

📥 Commits

Reviewing files that changed from the base of the PR and between d332d8e and c536810.

📒 Files selected for processing (13)
  • .github/actions/tests/run-benchmark-tests/action.yml
  • .github/workflows/benchmarks.yml
  • .gitignore
  • package.json
  • playwright.benchmark.config.ts
  • playwright.config.ts
  • tests/benchmarks/mounting/baseline.json
  • tests/benchmarks/mounting/collection-mount.bench.ts
  • tests/benchmarks/utils/collection-generator.ts
  • tests/benchmarks/utils/compare.js
  • tests/benchmarks/utils/pr-comment.js
  • tests/benchmarks/utils/results.ts
  • tests/benchmarks/utils/stats.ts

Comment thread .github/workflows/benchmarks.yml
Comment on lines +46 to +47
const ymlContent = stringifyCollection({ name } as any, { name, type: 'collection', opencollection: '1.0.0' }, { format: 'yml' });
fs.writeFileSync(path.join(dir, 'opencollection.yml'), ymlContent);
Contributor


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing fallback for yml stringifyCollection result.

Line 44 guards against a falsy bru return with an `||` fallback to a minimal `meta {...}` stub, but line 46 writes `ymlContent` directly. If `stringifyCollection` returns `null`/`undefined` for the yml format, `writeFileSync` will write the string `"undefined"` or `"null"`, producing an unreadable fixture file.

🛡️ Proposed fix
-    const ymlContent = stringifyCollection({ name } as any, { name, type: 'collection', opencollection: '1.0.0' }, { format: 'yml' });
-    fs.writeFileSync(path.join(dir, 'opencollection.yml'), ymlContent);
+    const ymlContent = stringifyCollection({ name } as any, { name, type: 'collection', opencollection: '1.0.0' }, { format: 'yml' })
+      || `name: ${name}\ntype: collection\nopencollection: '1.0.0'\n`;
+    fs.writeFileSync(path.join(dir, 'opencollection.yml'), ymlContent);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const ymlContent = stringifyCollection({ name } as any, { name, type: 'collection', opencollection: '1.0.0' }, { format: 'yml' });
fs.writeFileSync(path.join(dir, 'opencollection.yml'), ymlContent);
const ymlContent = stringifyCollection({ name } as any, { name, type: 'collection', opencollection: '1.0.0' }, { format: 'yml' })
|| `name: ${name}\ntype: collection\nopencollection: '1.0.0'\n`;
fs.writeFileSync(path.join(dir, 'opencollection.yml'), ymlContent);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/utils/collection-generator.ts` around lines 46 - 47, The
ymlContent produced by stringifyCollection may be null/undefined and is written
directly via fs.writeFileSync, so add a fallback like the existing one for the
'bru' output: compute ymlContent = stringifyCollection(...) || `meta: { name:
"${name}", type: "collection", opencollection: "1.0.0" }` (or a minimal valid
YAML string), then pass that safe ymlContent to fs.writeFileSync; update
references to stringifyCollection and ymlContent accordingly so
opencollection.yml never contains "undefined"/"null".

* --update-baseline
*/

import { existsSync, readFileSync, writeFileSync } from 'fs';
Contributor


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm root package.json has no "type": "module"
cat package.json | python3 -c "import sys, json; d=json.load(sys.stdin); print('type:', d.get('type', '<not set>'))"

Repository: usebruno/bruno

Length of output: 74


🏁 Script executed:

# Check if the file exists and view its imports
cat tests/benchmarks/utils/compare.js | head -30

Repository: usebruno/bruno

Length of output: 1057


🏁 Script executed:

# Check the GitHub action file for how compare.js is invoked
find .github -name "*benchmark*" -type f

Repository: usebruno/bruno

Length of output: 91


🏁 Script executed:

# If action file found, check the invocation
cat .github/actions/tests/run-benchmark-tests/action.yml | grep -A 5 -B 5 "compare.js" || echo "Pattern not found in action.yml"

Repository: usebruno/bruno

Length of output: 325


ESM import in .js file breaks CI — GitHub Actions benchmark comparison will fail with SyntaxError

Node.js treats .js files as CommonJS by default. Without "type": "module" in package.json, the node tests/benchmarks/utils/compare.js command will immediately throw SyntaxError: Cannot use import statement outside a module, killing the "Compare Against Baseline" CI step.

Choose one fix:

Option A: Rename to `.mjs`

Rename compare.js → compare.mjs and update .github/workflows/benchmarks.yml:

-        node tests/benchmarks/utils/compare.js \
+        node tests/benchmarks/utils/compare.mjs \
Option B: Convert to CommonJS
-import { existsSync, readFileSync, writeFileSync } from 'fs';
+const { existsSync, readFileSync, writeFileSync } = require('fs');
📝 Committable suggestion


Suggested change
import { existsSync, readFileSync, writeFileSync } from 'fs';
const { existsSync, readFileSync, writeFileSync } = require('fs');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/utils/compare.js` at line 21, The file uses ESM import
syntax (import { existsSync, readFileSync, writeFileSync } from 'fs') inside
tests/benchmarks/utils/compare.js which causes a SyntaxError under default
CommonJS Node; either rename compare.js → compare.mjs and update
.github/workflows/benchmarks.yml to invoke the .mjs path (Option A), or convert
the file to CommonJS by replacing the import with const { existsSync,
readFileSync, writeFileSync } = require('fs') and ensure any other ESM syntax is
changed accordingly (Option B); pick one option and make the matching change to
the CI workflow if you rename the file.

Comment on lines +46 to +52
const { data: comments } = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number
});

const existing = comments.find((c) => c.body.startsWith(marker));

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

listComments is not paginated — duplicate benchmark comments will accumulate on long-lived PRs.

The GitHub API returns 30 comments per page by default (100 at most). On a PR whose comment count exceeds one page, the existing benchmark comment may lie on a later page, causing comments.find(...) to return undefined and a brand-new comment to be posted on every benchmark run. Adding per_page: 100 helps as a start; for full robustness, paginate with github.paginate.

🛠️ Suggested minimal fix
-  const { data: comments } = await github.rest.issues.listComments({
-    owner: context.repo.owner,
-    repo: context.repo.repo,
-    issue_number: context.issue.number
-  });
+  const comments = await github.paginate(github.rest.issues.listComments, {
+    owner: context.repo.owner,
+    repo: context.repo.repo,
+    issue_number: context.issue.number,
+    per_page: 100
+  });

Update the find call accordingly (no .data unwrap needed after paginate):

-  const existing = comments.find((c) => c.body.startsWith(marker));
+  const existing = comments.find((c) => c.body && c.body.startsWith(marker));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/utils/pr-comment.js` around lines 46 - 52, The current call
using github.rest.issues.listComments(...) and then searching comments.find(...)
can miss the existing marker on PRs with >100 comments; update the call to
either include per_page: 100 and handle pagination, or replace the listComments
call with github.paginate(github.rest.issues.listComments, { owner:
context.repo.owner, repo: context.repo.repo, issue_number: context.issue.number,
per_page: 100 }) and then search the returned array for the marker (remove the
.data unwrap since paginate returns items directly); adjust the variable
currently named comments and the comments.find(...) usage accordingly so
duplicate benchmark comments are not created.

Comment on lines +77 to +78
const results = JSON.parse(fs.readFileSync(resultsPath, 'utf-8'));
const baseline = JSON.parse(fs.readFileSync(baselinePath, 'utf-8'));
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

baselinePath has no existence guard — crashes with an unhandled exception if the file is missing.

resultsPath is guarded (lines 72–75) but baselinePath is not. A missing or mistyped baseline path throws a synchronous ENOENT that will surface as an opaque CI failure instead of a helpful log message.

🛡️ Suggested fix
+  if (!fs.existsSync(baselinePath)) {
+    console.log(`No baseline found at ${baselinePath}, skipping comment.`);
+    return;
+  }
+
   const results = JSON.parse(fs.readFileSync(resultsPath, 'utf-8'));
   const baseline = JSON.parse(fs.readFileSync(baselinePath, 'utf-8'));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/utils/pr-comment.js` around lines 77 - 78, The code reads
baselinePath without checking existence, causing an uncaught ENOENT; wrap the
baseline read in the same guard/handling used for resultsPath — either check
fs.existsSync(baselinePath) before JSON.parse or perform the
fs.readFileSync/JSON.parse inside a try/catch, and on error emit a clear
log/error message and exit or return; update the lines that call
JSON.parse(fs.readFileSync(baselinePath, 'utf-8')) (and any surrounding logic
using baseline) to use this existence check or try/catch so missing or invalid
baseline files produce a helpful message instead of an unhandled exception.

cchirag added 2 commits May 5, 2026 23:51
…fix review issues

- Same collection mounted/unmounted across iterations for cold vs cached comparison
- workflow_dispatch has update-baseline boolean input for manual baseline updates
- Fix string comparison bug in pr-comment.js (pct was string from toFixed)
- Remove dead baseline.collections fallback in compare.js and pr-comment.js
- Remove unnecessary waitForTimeout between iterations
- Rename pct/pctChange to changePercent/percentChange for readability
…mmit on update-baseline

- Reduce max collection size from 5000 to 3000 to keep CI runtime reasonable
- Update baseline values from actual CI run data (worst case across ubuntu/macos/windows)
- Auto-commit updated baseline.json when update-baseline is triggered via workflow_dispatch
- Reuse same collection across iterations for cold vs cached comparison
- Fix string comparison bug and remove dead code from review feedback
- Rename pct variables to changePercent for readability
- Remove unnecessary waitForTimeout between iterations
@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
.github/workflows/benchmarks.yml (1)

75-87: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent fork PR permission errors from failing the benchmark job

Lines 75–87 can still fail on fork-originated pull_request runs, where the GITHUB_TOKEN lacks permission to write PR comments. Make the comment step non-blocking so benchmark execution and artifacts remain available.

Suggested patch
       - name: Comment Benchmark Results on PR
         if: github.event_name == 'pull_request' && !cancelled() && matrix.os-name == 'ubuntu'
+        continue-on-error: true
         uses: actions/github-script@v7
         with:
           script: |
             const run = require('./tests/benchmarks/utils/pr-comment.js');
             await run({
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/benchmarks.yml around lines 75 - 87, The "Comment
Benchmark Results on PR" step currently calls the pr-comment script and can fail
on forked PRs when comment permissions are blocked; make this step non-blocking
by adding continue-on-error: true to that step (the step with name "Comment
Benchmark Results on PR" that uses actions/github-script@v7 and invokes the
run(...) call) so failures to post the comment won't fail the whole job and
artifacts/results remain available.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In @.github/workflows/benchmarks.yml:
- Around line 75-87: The "Comment Benchmark Results on PR" step currently calls
the pr-comment script and can fail on forked PRs when comment permissions are
blocked; make this step non-blocking by adding continue-on-error: true to that
step (the step with name "Comment Benchmark Results on PR" that uses
actions/github-script@v7 and invokes the run(...) call) so failures to post the
comment won't fail the whole job and artifacts/results remain available.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c9bd73f9-89e6-4048-a88a-9f2e6dad5879

📥 Commits

Reviewing files that changed from the base of the PR and between ae9c9e8 and 80205e9.

📒 Files selected for processing (3)
  • .github/workflows/benchmarks.yml
  • tests/benchmarks/mounting/baseline.json
  • tests/benchmarks/mounting/collection-mount.bench.ts
✅ Files skipped from review due to trivial changes (1)
  • tests/benchmarks/mounting/baseline.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/benchmarks/mounting/collection-mount.bench.ts

cchirag added 2 commits May 6, 2026 01:41
- Add continue-on-error to PR comment step since GITHUB_TOKEN lacks write access on cross-fork PRs
- Split baseline.json into baseline.ubuntu/macos/windows.json with real CI data
- Action and workflow dynamically reference baseline per OS
- PR comment posted per OS with OS-specific comparison
- Auto-commit updated baseline on workflow_dispatch with update-baseline flag
@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/benchmarks.yml:
- Around line 67-73: Replace the per-matrix push in the "Commit Updated
Baseline" step with an artifacts workflow: in the matrix job (the step keyed by
name "Commit Updated Baseline" that currently checks
github.event.inputs.update-baseline == 'true' and references ${ { matrix.os-name
} }), stop running git commit/push and instead upload the updated baseline file
as an artifact named with the OS (use actions/upload-artifact, upload
tests/benchmarks/mounting/baseline.${{ matrix.os-name }}.json). Add a single
downstream job (e.g., "merge-baseline-updates") that runs once when
github.event.inputs.update-baseline == 'true', depends on the matrix job(s) via
needs, downloads all per-OS artifacts with actions/download-artifact, places
them into tests/benchmarks/mounting/, then performs git config, a single git
add/commit -m "chore: update baseline" and git push so one atomic commit handles
all baseline files and avoids concurrent non-fast-forward failures.
- Around line 67-73: The commit step named "Commit Updated Baseline" currently
uses the chained shell expression `git diff --staged --quiet || git commit -m
... && git push`, which can run `git push` even when no commit was created;
replace this operator chaining with an explicit conditional that checks the
staged-diff result and only runs the commit-and-push sequence when changes exist
(i.e., test `git diff --staged --quiet` and, if it indicates changes, run `git
commit -m "chore: update ${{ matrix.os-name }} benchmark baseline"` followed by
`git push`), ensuring push is executed only after a successful commit.

In `@tests/benchmarks/mounting/baseline.windows.json`:
- Around line 3-43: The baseline JSON "entries" currently only includes sizes up
to "bru-3000" and "yml-3000", but the benchmark coverage should span 50–5000;
run the mount benchmarks for the 5000-size collections and add new baseline
records named "bru-5000" and "yml-5000" to the entries object (including mean
and p50 stats) so regression detection covers the largest collection size.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0c3f77dd-22e8-4b5e-bcbc-78927b8e9bd9

📥 Commits

Reviewing files that changed from the base of the PR and between 8e14991 and 33ae801.

📒 Files selected for processing (5)
  • .github/actions/tests/run-benchmark-tests/action.yml
  • .github/workflows/benchmarks.yml
  • tests/benchmarks/mounting/baseline.macos.json
  • tests/benchmarks/mounting/baseline.ubuntu.json
  • tests/benchmarks/mounting/baseline.windows.json
✅ Files skipped from review due to trivial changes (2)
  • tests/benchmarks/mounting/baseline.macos.json
  • tests/benchmarks/mounting/baseline.ubuntu.json

Comment on lines +67 to +73
- name: Commit Updated Baseline
if: github.event.inputs.update-baseline == 'true'
run: |
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git add tests/benchmarks/mounting/baseline.${{ matrix.os-name }}.json
git diff --staged --quiet || git commit -m "chore: update ${{ matrix.os-name }} benchmark baseline" && git push

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify commit/push exists inside matrix benchmark job
rg -n -C3 --type=yaml 'strategy:|matrix:|Commit Updated Baseline|git push' .github/workflows/benchmarks.yml

Repository: usebruno/bruno

Length of output: 1003


🏁 Script executed:

cat -n .github/workflows/benchmarks.yml

Repository: usebruno/bruno

Length of output: 3702


Matrix jobs push concurrently to the same branch, risking non-fast-forward failures.

When update-baseline is true, all three OS jobs push simultaneously to the same branch. Although each job updates a different baseline file, every push after the first is based on a stale remote head, so git rejects it as non-fast-forward, causing intermittent workflow failures.

Safer approach: Have matrix jobs upload per-OS baseline artifacts, then run a single follow-up job to download all baselines and perform one atomic commit/push.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/benchmarks.yml around lines 67 - 73, Replace the
per-matrix push in the "Commit Updated Baseline" step with an artifacts
workflow: in the matrix job (the step keyed by name "Commit Updated Baseline"
that currently checks github.event.inputs.update-baseline == 'true' and
references ${ { matrix.os-name } }), stop running git commit/push and instead
upload the updated baseline file as an artifact named with the OS (use
actions/upload-artifact, upload tests/benchmarks/mounting/baseline.${{
matrix.os-name }}.json). Add a single downstream job (e.g.,
"merge-baseline-updates") that runs once when
github.event.inputs.update-baseline == 'true', depends on the matrix job(s) via
needs, downloads all per-OS artifacts with actions/download-artifact, places
them into tests/benchmarks/mounting/, then performs git config, a single git
add/commit -m "chore: update baseline" and git push so one atomic commit handles
all baseline files and avoids concurrent non-fast-forward failures.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the risky pattern exists in workflow files
rg -n --type=yaml 'git diff --staged --quiet \|\| git commit .*&& git push'

Repository: usebruno/bruno

Length of output: 40


🏁 Script executed:

# Check if the file exists and read the relevant section
if [ -f ".github/workflows/benchmarks.yml" ]; then
  echo "=== File exists, showing lines 60-80 with context ==="
  sed -n '60,80p' .github/workflows/benchmarks.yml | cat -n
else
  echo "=== File not found, listing .github/workflows directory ==="
  ls -la .github/workflows/ 2>/dev/null || echo "Directory not found"
fi

Repository: usebruno/bruno

Length of output: 1106


🏁 Script executed:

# Create a test to demonstrate the operator precedence issue
cat > /tmp/test_operators.sh << 'EOF'
#!/bin/bash

echo "=== Scenario 1: File with no changes (git diff exits 0) ==="
mkdir -p /tmp/git_test1
cd /tmp/git_test1
git init
echo "initial" > file.txt
git add file.txt
git commit -m "initial"

# Simulate: git diff --staged --quiet (no staged changes)
echo "Running: git diff --staged --quiet || echo 'COMMIT' && echo 'PUSH'"
git diff --staged --quiet || echo "COMMIT executed" && echo "PUSH executed"
echo "Exit code of full chain: $?"

echo ""
echo "=== Scenario 2: File with staged changes (git diff exits 1) ==="
cd /tmp/git_test1
echo "modified" > file.txt
git add file.txt

# Simulate: git diff --staged --quiet (has staged changes)
echo "Running: git diff --staged --quiet || echo 'COMMIT' && echo 'PUSH'"
git diff --staged --quiet || echo "COMMIT executed" && echo "PUSH executed"
echo "Exit code of full chain: $?"
EOF

bash /tmp/test_operators.sh

Repository: usebruno/bruno

Length of output: 1387


🏁 Script executed:

# Test bash operator precedence with simple commands
cat > /tmp/test_precedence.sh << 'EOF'
#!/bin/bash

echo "=== Testing: false || echo 'B' && echo 'C' ==="
false || echo 'B' && echo 'C'
echo "---"

echo "=== Testing: true || echo 'B' && echo 'C' ==="
true || echo 'B' && echo 'C'
echo "---"

echo "=== Testing: (false || echo 'B') && echo 'C' [explicit grouping] ==="
(false || echo 'B') && echo 'C'
echo "---"

echo "=== Testing: (true || echo 'B') && echo 'C' [explicit grouping] ==="
(true || echo 'B') && echo 'C'
echo "---"
EOF

bash /tmp/test_precedence.sh

Repository: usebruno/bruno

Length of output: 316


🏁 Script executed:

cat > /tmp/test_fix.sh << 'EOF'
#!/bin/bash

echo "=== Simulating the FIXED version ==="
echo ""
echo "Case 1: No staged changes"
if ! true; then  # git diff --staged --quiet would return 0 (true/success)
  echo "  commit would run"
  echo "  push would run"
else
  echo "  commit skipped"
  echo "  push skipped"
fi

echo ""
echo "Case 2: Staged changes exist"
if ! false; then  # git diff --staged --quiet would return 1 (false/failure)
  echo "  commit would run"
  echo "  push would run"
else
  echo "  commit skipped"
  echo "  push skipped"
fi
EOF

bash /tmp/test_fix.sh

Repository: usebruno/bruno

Length of output: 220


Fix line 73 operator chaining to prevent pushing when nothing was committed.

The git diff --staged --quiet || git commit ... && git push pattern runs git push even when there are no staged changes, making the step unreliable. With matrix jobs pushing simultaneously, this increases failure risk.

Suggested fix
-          git diff --staged --quiet || git commit -m "chore: update ${{ matrix.os-name }} benchmark baseline" && git push
+          if ! git diff --staged --quiet; then
+            git commit -m "chore: update ${{ matrix.os-name }} benchmark baseline"
+            git push
+          fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 - name: Commit Updated Baseline
   if: github.event.inputs.update-baseline == 'true'
   run: |
     git config user.name "github-actions[bot]"
     git config user.email "github-actions[bot]@users.noreply.github.com"
     git add tests/benchmarks/mounting/baseline.${{ matrix.os-name }}.json
-    git diff --staged --quiet || git commit -m "chore: update ${{ matrix.os-name }} benchmark baseline" && git push
+    if ! git diff --staged --quiet; then
+      git commit -m "chore: update ${{ matrix.os-name }} benchmark baseline"
+      git push
+    fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/benchmarks.yml around lines 67 - 73, The commit step named
"Commit Updated Baseline" currently uses the chained shell expression `git diff
--staged --quiet || git commit -m ... && git push`, which can run `git push`
even when no commit was created; replace this operator chaining with an explicit
conditional that checks the staged-diff result and only runs the commit-and-push
sequence when changes exist (i.e., test `git diff --staged --quiet` and, if it
indicates changes, run `git commit -m "chore: update ${{ matrix.os-name }}
benchmark baseline"` followed by `git push`), ensuring push is executed only
after a successful commit.

Comment on lines +3 to +43
"entries": {
"bru-50": {
"mean": 2700,
"p50": 800
},
"bru-200": {
"mean": 1500,
"p50": 1400
},
"bru-500": {
"mean": 3500,
"p50": 3500
},
"bru-1000": {
"mean": 9500,
"p50": 9400
},
"bru-3000": {
"mean": 195000,
"p50": 190000
},
"yml-50": {
"mean": 600,
"p50": 570
},
"yml-200": {
"mean": 1350,
"p50": 1300
},
"yml-500": {
"mean": 3800,
"p50": 3700
},
"yml-1000": {
"mean": 11000,
"p50": 11000
},
"yml-3000": {
"mean": 90000,
"p50": 88000
}

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Missing 5000-size baseline entries reduces promised coverage

entries currently stops at *-3000, but the PR objective says mount benchmarks cover sizes 50–5000. Please add bru-5000 and yml-5000 baseline records (via benchmark run) so regression detection includes the largest collection size too.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/mounting/baseline.windows.json` around lines 3 - 43, The
baseline JSON "entries" currently only includes sizes up to "bru-3000" and
"yml-3000", but the benchmark coverage should span 50–5000; run the mount
benchmarks for the 5000-size collections and add new baseline records named
"bru-5000" and "yml-5000" to the entries object (including mean and p50 stats)
so regression detection covers the largest collection size.

…ults

- writeResults now accepts SuiteMeta with name, unit, and direction
- Results JSON includes suite field for the visualization dashboard to ingest
- Mounting benchmark outputs unit: ms, direction: smaller