fix(supermodel): remove GT and analysis caching — always run fresh (#714) #9
Conversation
Caution: Review failed — the pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration: Organization UI | Review profile: CHILL | Plan: Pro
📒 Files selected for processing (7)
Walkthrough

This PR removes ground truth and analysis caching from the Supermodel benchmark, shifting reliance to server-side idempotency-key deduplication. The API client is streamlined with direct API-key headers instead of temporary config files. Dead-code ground-truth extraction now applies feature-removal false-positive filtering to exclude symbols co-removed with consumers while retaining orphaned deletions. File-path handling differentiates between MCP and baseline runs, and evaluation metrics now support alive-code-based false-positive computation.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 4
🧹 Nitpick comments (1)
src/mcpbr/harness.py (1)
277-281: Re-serialize extra eval fields before passing them through.
`eval_result` has already gone through `dict_to_namespace()`, so a nested extra metric like `{"precision": {"raw": 0.9}}` reaches this loop as nested `SimpleNamespace` objects. Copying that verbatim makes the new pass-through path non-JSON-safe the moment a benchmark returns structured extras.

♻️ Possible fix
```diff
+def _to_plain_json(value: Any) -> Any:
+    if isinstance(value, SimpleNamespace):
+        return {k: _to_plain_json(v) for k, v in vars(value).items()}
+    if isinstance(value, list):
+        return [_to_plain_json(v) for v in value]
+    return value
+
 # Pass through any extra fields (e.g. precision/recall from custom benchmarks)
 _known = {"resolved", "patch_applied", "fail_to_pass", "pass_to_pass", "error"}
 for k, v in vars(eval_result).items():
     if k not in _known and k not in data:
-        data[k] = v
+        data[k] = _to_plain_json(v)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mcpbr/harness.py` around lines 277 - 281, The extra fields from eval_result are still Namespace/SimpleNamespace objects (because dict_to_namespace was applied) and must be converted to JSON-safe structures before adding to data; in the loop that iterates vars(eval_result) (the block referencing _known and for k,v in vars(eval_result).items()), replace the direct assignment data[k] = v with a JSON-safe serialization step: detect Namespace/SimpleNamespace (or other non-primitive types) and recursively convert them to dict/list/primitive types (e.g. via a small helper like namespace_to_dict or using json.dumps+json.loads with a default that calls vars()) so nested metrics (e.g. {"precision": {"raw": 0.9}}) become plain dicts/lists and are safe to pass through.
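A self-contained version of that conversion, runnable outside the harness (function and variable names mirror the suggested diff but are illustrative):

```python
from types import SimpleNamespace
from typing import Any


def to_plain_json(value: Any) -> Any:
    """Recursively convert SimpleNamespace trees into plain dicts/lists."""
    if isinstance(value, SimpleNamespace):
        return {k: to_plain_json(v) for k, v in vars(value).items()}
    if isinstance(value, list):
        return [to_plain_json(v) for v in value]
    return value


# A nested metric as it would look after dict_to_namespace():
metric = SimpleNamespace(precision=SimpleNamespace(raw=0.9), labels=["tp", "fp"])
plain = to_plain_json(metric)
```

After conversion, `plain` is `{"precision": {"raw": 0.9}, "labels": ["tp", "fp"]}` and survives a `json.dumps` round-trip, which is exactly the property the pass-through path needs.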
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@CHANGELOG.md`:
- Around line 10-27: Add a new [Unreleased] CHANGELOG.md entry describing the
user-visible result-semantics change: note that Supermodel now can set fp_mode
to "vs alive set" and that the `resolved` field is no longer gated by recall
(i.e., it may be set regardless of recall), and indicate these affect run output
and result JSON; place this entry alongside the existing cache/GT fixes so users
see both behavioral changes and the caching removal.
In `@src/mcpbr/benchmarks/supermodel/api_client.py`:
- Around line 54-58: The current code appends the API key into the curl command
arguments (via the headers list and upload_cmd) exposing it in process argv;
replace the subprocess/curl usage with a Python HTTP client so the API key is
only set in an in-memory header. Locate the places constructing headers and
upload_cmd and the polling loop that builds curl commands (references: headers,
upload_cmd, zip_path, url, and the polling logic around line ~105) and
reimplement the POST file upload and subsequent status poll using an async HTTP
client (httpx or aiohttp) or synchronous requests, setting "X-Api-Key" in the
request headers rather than any command-line, and remove all
subprocess.Popen/args construction that included the key.
In `@src/mcpbr/benchmarks/supermodel/benchmark.py`:
- Around line 607-627: The current evaluate() implementation reads entryPoints
from a workspace file under env.host_workdir (via analysis_path/hidden_path and
self._endpoint.analysis_filename) which is agent-writable and truncated; instead
load the full, trusted alive set from host-side state prepared in
create_environment() (e.g. store the complete entryPoints there) and use that
when computing metrics. Concretely: stop opening files under env.host_workdir in
evaluate(), remove the analysis_path/hidden_path logic, read alive_code from the
host-side attribute you populate in create_environment() (e.g.
self._analysis_data or a new self._endpoint.host_alive_entrypoints) and pass
that alive_code into compute_prf1(agent_findings, ground_truth, key_fields,
alive_code=alive_code).
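The host-side-state pattern the prompt describes, in miniature; the class and attribute names below are hypothetical, the point is only that evaluation never re-reads agent-writable workspace files:

```python
class DeadCodeEndpoint:
    """Sketch: keep the trusted alive set in host-side state."""

    def __init__(self) -> None:
        self.host_alive_entrypoints: list[dict] = []

    def create_environment(self, analysis_data: dict) -> None:
        # Capture the full, trusted entryPoints on the host before any
        # (agent-writable, possibly truncated) workspace copy is written.
        self.host_alive_entrypoints = list(analysis_data.get("entryPoints", []))

    def alive_code_for_eval(self) -> list[dict]:
        # evaluate() would read this instead of files under env.host_workdir,
        # then call compute_prf1(..., alive_code=self.alive_code_for_eval()).
        return self.host_alive_entrypoints


ep = DeadCodeEndpoint()
ep.create_environment({"entryPoints": [{"symbol": "main"}], "other": "ignored"})
```

The agent can rewrite anything in its workspace; only this host-side copy feeds the metric.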
In `@src/mcpbr/benchmarks/supermodel/endpoints/dead_code.py`:
- Around line 19-21: extract_ground_truth() now emits interfaces/types/enums but
baseline_prompt still instructs agents to exclude them and
SupermodelBenchmark.create_environment() strips those candidates, causing GT vs
prompt/filter mismatch; either remove those symbols from extract_ground_truth()
or (recommended) update baseline_prompt to permit interfaces/types/enums and
stop stripping them in SupermodelBenchmark.create_environment() so GT items are
attainable — specifically, modify baseline_prompt text to remove the “Do NOT
include interface/type/enum” wording and change the filtering logic in
SupermodelBenchmark.create_environment() that currently removes
interface/type/enum candidates (and any regex/filters tied to patterns like
r"^-\s*export\s+interface\s+(\w+)" / r"^-\s*export\s+type\s+(\w+)\s*[={<]" /
r"^-\s*export\s+(?:const\s+)?enum\s+(\w+)") so those symbols are retained when
composing the MCP analysis payload.
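For reference, a quick check of what those three quoted patterns actually strip; the diff lines below are made-up examples:

```python
import re

# The three candidate-stripping patterns quoted above, verbatim.
PATTERNS = [
    re.compile(r"^-\s*export\s+interface\s+(\w+)"),
    re.compile(r"^-\s*export\s+type\s+(\w+)\s*[={<]"),
    re.compile(r"^-\s*export\s+(?:const\s+)?enum\s+(\w+)"),
]


def stripped_symbols(diff_lines: list[str]) -> list[str]:
    """Return symbol names these filters would currently remove."""
    hits = []
    for line in diff_lines:
        for pat in PATTERNS:
            m = pat.match(line)
            if m:
                hits.append(m.group(1))
    return hits


diff = [
    "- export interface WorkflowSettings {",
    "- export type NodeId = string",
    "- export const enum RunState {",
    "- export function helper() {",
]
print(stripped_symbols(diff))  # ['WorkflowSettings', 'NodeId', 'RunState']
```

Interface, type, and enum deletions all match and are dropped; function deletions pass through, which is the GT-vs-filter mismatch the comment flags.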
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e666fe8d-40d5-4333-8c95-cb80378d77ce
📒 Files selected for processing (8)
- CHANGELOG.md
- src/mcpbr/benchmarks/supermodel/api_client.py
- src/mcpbr/benchmarks/supermodel/benchmark.py
- src/mcpbr/benchmarks/supermodel/endpoints/dead_code.py
- src/mcpbr/benchmarks/supermodel/evaluation.py
- src/mcpbr/benchmarks/supermodel/git_utils.py
- src/mcpbr/harness.py
- tests/test_supermodel_dead_code.py
```markdown
### Fixed

- **Remove GT and analysis caching from SupermodelBenchmark** (supermodeltools/supermodel-public-api#714):
  Both caches had no invalidation mechanism, causing stale data to persist silently across runs.
  The GT cache bypassed all fixes applied to `extract_ground_truth` (FP filters, pattern additions).
  The analysis cache was keyed by zip hash, so server-side idempotency key version bumps did not
  bust it — the server was never reached and old results were served indefinitely. Both caches are
  now removed. GT extraction is a single GitHub API call (cheap). The Supermodel API handles
  server-side deduplication via the idempotency key. Also removes the `DEFAULT_GT_DIR` constant
  and `ground_truth_dir` constructor parameter, which existed solely to support the caches.

- **Dead code benchmark: filter feature-removal false positives from ground truth** (supermodeltools/supermodel-public-api#714):
  The ground truth extractor now applies the existing `_is_feature_removal_fp` filter (which
  was implemented but never called). Symbols deleted in a PR that are also imported by other
  files deleted in the same PR are excluded from GT — they were live code co-removed with
  their consumers, not dead code. Genuinely orphaned symbols with no deleted importer are
  kept. This fixes 0-recall scores for PRs like n8n #23572 and prisma #28485 where whole
  files were removed as part of a feature deletion.
```
Document the result-semantics change too.
This PR also changes what Supermodel reports back to users: `fp_mode` can now be `"vs alive set"`, and `resolved` is no longer recall-gated. That shows up in run output and result JSON, so it should have its own `[Unreleased]` entry alongside the cache/GT fixes.
Based on learnings "Applies to CHANGELOG.md : Update CHANGELOG.md with ALL user-visible changes: new features, bug fixes, breaking changes, deprecations, security fixes, performance improvements, and CLI/configuration changes".
```diff
 if api_key:
-    with tempfile.NamedTemporaryFile(
-        mode="w", suffix=".cfg", prefix="mcpbr_curl_", delete=False
-    ) as api_key_fd:
-        api_key_fd.write(f'header = "X-Api-Key: {api_key}"\n')
-        api_key_config_path = api_key_fd.name
-    os.chmod(api_key_config_path, 0o600)
+    headers.extend(["-H", f"X-Api-Key: {api_key}"])

 # Initial request with file upload
 upload_cmd = ["curl", "-s", "-X", "POST", url, "-F", f"file=@{zip_path}", *headers]
```
🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
set -euo pipefail
python - <<'PY'
import os
import subprocess
import time

proc = subprocess.Popen(
    ["python", "-c", "import time; time.sleep(5)", "X-Api-Key: demo-secret"]
)
time.sleep(1)
os.system(f"ps -o command= -p {proc.pid}")
proc.terminate()
proc.wait()
PY
```

Repository: supermodeltools/mcpbr — Length of output: 85

🏁 Script executed:

```bash
find src -name "api_client.py" -type f
```

Repository: supermodeltools/mcpbr — Length of output: 110

🏁 Script executed:

```bash
cat -n src/mcpbr/benchmarks/supermodel/api_client.py | sed -n '45,70p'
```

Repository: supermodeltools/mcpbr — Length of output: 1082

🏁 Script executed:

```bash
cat -n src/mcpbr/benchmarks/supermodel/api_client.py | sed -n '1,65p'
```

Repository: supermodeltools/mcpbr — Length of output: 2434

🏁 Script executed:

```bash
cat -n src/mcpbr/benchmarks/supermodel/api_client.py | sed -n '65,100p'
```

Repository: supermodeltools/mcpbr — Length of output: 1640

🏁 Script executed:

```bash
cat -n src/mcpbr/benchmarks/supermodel/api_client.py | sed -n '97,130p'
```

Repository: supermodeltools/mcpbr — Length of output: 1459
Move API key out of curl command-line arguments.
Right now, the API key gets passed to curl as a `-H` flag (lines 54-55, repeated at line 105 in the polling loop). That means the key ends up in the process command line. While the subprocess is running, anything that inspects running processes — like `ps`, system monitoring tools, or CI/job logs — can see your secret in plain text.
Here's the simple reason why: when you call `subprocess.Popen(["curl", "-s", ..., f"X-Api-Key: {api_key}"])`, the API key becomes part of the process's argv. On Linux, anyone with access can read `/proc/[pid]/cmdline`. On any Unix system, `ps aux` shows it. That's a security regression.
Fix options:

1. **curl config file**: Write headers to a temp config file with restrictive permissions, then pass `--config` to curl. The key stays in the file, not argv.
2. **stdin**: Use `echo "header: value" | curl --config -` to feed config from stdin.
3. **Python HTTP client**: Use `requests` or `httpx` instead of subprocess — the key never touches argv.

I'd suggest option 3 (Python HTTP client) since you're already in async context. Otherwise, option 1 (temp config file) is safer than option 2.
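Option 2 can also be driven from Python while keeping the subprocess/curl approach: build the config line in memory and feed it to curl's stdin, so argv stays clean. A sketch (the URL and paths are placeholders; the command is constructed but not executed here):

```python
def build_curl_upload(url: str, zip_path: str, api_key: str) -> tuple[list[str], str]:
    """Build a curl invocation whose secret travels via stdin, not argv."""
    # curl reads extra options from stdin when --config is given "-".
    config = f'header = "X-Api-Key: {api_key}"\n'
    cmd = [
        "curl", "-s", "--config", "-",
        "-X", "POST", url,
        "-F", f"file=@{zip_path}",
    ]
    # To actually run it (not done in this sketch):
    #   subprocess.run(cmd, input=config, text=True, check=True)
    return cmd, config


cmd, config = build_curl_upload(
    "https://api.example.com/upload", "bundle.zip", "demo-secret"
)
```

Only the harmless parts of the request appear in `cmd`; the header carrying the key exists solely in the `config` string piped over stdin.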
Benchmark evaluate() methods can return arbitrary fields beyond the standard resolved/patch_applied/fail_to_pass set. Pass these through to the stored result dict so metrics like precision, recall, f1_score, true_positives, etc. are captured in incremental_results.jsonl. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e filter

Two changes to eliminate misleading precision numbers:

1. `compute_prf1` now accepts an `alive_code` list. When provided, false positives are pred ∩ alive_set (agent flagged something confirmed alive) rather than pred − gt_set (agent found real dead code the PR didn't happen to remove). `evaluate()` reads entryPoints from the workspace analysis file to populate it.
2. The `enhanced_prompt_v2` filter script now skips medium/low confidence candidates. All GT items across sampled tasks are high-confidence; medium/low are noise.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…de label

Three changes to make benchmark results credible for comparison:

1. benchmark.py: baseline contamination fix — the analysis file is written to a hidden path (`.supermodel_dead_code_analysis.json`) when `is_mcp=False` so the baseline agent cannot see pre-computed results. `evaluate()` checks both paths. An `fp_mode` field is added to the result dict to clarify which FP definition was used.
2. harness.py: duck-typed `is_mcp` — uses `inspect.signature` to pass `is_mcp` only when the benchmark's `create_environment` supports it, so other benchmarks remain unaffected without Protocol changes.
3. evaluation.py: fallback to standard FP when `alive_set` is empty — prevents trivial precision=1.0 on analysis failure (empty entryPoints returns fp=0, which makes the metric meaningless).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ensures the post-deploy benchmark run gets fresh dead-code analysis results from the API rather than serving cached v2 responses. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ruth (#714)

`_is_feature_removal_fp` and `_parse_deleted_imports` were implemented but never called. `extract_ground_truth` now applies them to remove symbols from GT that were imported by other files also deleted in the same PR — these are feature-removal co-deletions, not dead code, so no static analysis tool would ever report them pre-merge. Genuinely orphaned symbols (no deleted importer) are correctly kept in GT and are detectable by the API when analysing the pre-merge commit.

Adds 16 unit tests covering:
- `_parse_deleted_imports`: single-line, multi-line, and default import forms
- `_is_feature_removal_fp`: FP detection keyed to file basename to avoid spurious suppression of unrelated same-named symbols
- Integration: full filter pipeline with the n8n (#23572) and prisma (#28485) benchmark cases from the issue

Fixes supermodeltools/supermodel-public-api#714

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
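The co-deletion filter described above, reduced to a toy version; the real `_parse_deleted_imports` handles more import forms than this sketch:

```python
import re
from pathlib import PurePosixPath


def parse_deleted_imports(deleted_file_diffs: dict[str, str]) -> dict[str, set[str]]:
    """Map each deleted file to the named imports it contained (toy parser)."""
    pattern = re.compile(r"import\s*{([^}]*)}\s*from")
    imports = {}
    for path, text in deleted_file_diffs.items():
        names: set[str] = set()
        for m in pattern.finditer(text):
            names.update(n.strip() for n in m.group(1).split(",") if n.strip())
        imports[path] = names
    return imports


def is_feature_removal_fp(symbol: str, defining_file: str,
                          deleted_imports: dict[str, set[str]]) -> bool:
    """True if some *other* deleted file imported the symbol: it was live code
    co-removed with its consumer, not dead code. Keying on the defining file's
    basename avoids suppressing an unrelated same-named symbol."""
    for path, names in deleted_imports.items():
        if PurePosixPath(path).name != PurePosixPath(defining_file).name and symbol in names:
            return True
    return False


deleted = {
    "src/feature/consumer.ts": "import { PythonSandbox } from './sandbox';",
    "src/feature/sandbox.ts": "export class PythonSandbox {}",
}
imports = parse_deleted_imports(deleted)
```

With this input, `PythonSandbox` is flagged as a feature-removal FP (its consumer was deleted in the same PR), while a symbol with no deleted importer stays in GT.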
…714)

Both caches had no invalidation mechanism:

GT cache (`~/.cache/mcpbr/supermodel_ground_truth/dead_code_*.json`):
- Written once, read forever; bypassed every fix applied to `extract_ground_truth` (FP filters added in c579a18/918e3a1, pattern additions for interface/type/enum, etc.)
- n8n and prisma GT files still contained `PythonSandbox`/`serializeDatasources` as benchmark targets even after the FP filter was implemented

Analysis cache (`*_analysis_{zip_hash}.json`):
- Keyed by zip hash; server-side idempotency key bumps (v1→v2→v3) did not invalidate it — the server was never reached, old results served
- The n8n analysis cache showed 3,834 results with zero Pyodide hits, meaning it was computed from the post-merge commit (before the pre-merge fix) and would serve that result forever on re-runs

Fix: remove both caches entirely. GT extraction is one GitHub API call. The Supermodel API handles server-side dedup via the idempotency key. Also removes the `DEFAULT_GT_DIR` constant and `ground_truth_dir` constructor param (existed solely to support the now-deleted caches).

Fixes supermodeltools/supermodel-public-api#714

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Completes the local-cache purge started in the previous commit. The cached_analysis field allowed a task YAML to point at a pre-fetched analysis file on disk; this path bypassed the API entirely and shared all the same stale-data problems as the zip-hash cache. Now create_environment always calls _get_analysis directly. The Supermodel API handles deduplication server-side via the idempotency key. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
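For illustration, one common way to build such an idempotency key is to hash the upload content together with a scheme version, so a version bump (v1→v2→v3) changes every key and forces fresh server-side computation. This is a hedged sketch, not the actual key scheme used by the Supermodel API:

```python
import hashlib


def idempotency_key(zip_bytes: bytes, version: str = "v3") -> str:
    """Derive a deterministic idempotency key from content plus scheme version."""
    # Same content + same version -> same key -> server-side dedup.
    # Bumping the version changes every key, invalidating prior results.
    digest = hashlib.sha256(zip_bytes).hexdigest()[:16]
    return f"{version}-{digest}"


k1 = idempotency_key(b"payload", "v2")
k2 = idempotency_key(b"payload", "v3")
```

Because the key is derived server-visibly per request, the client needs no local cache at all: repeated uploads of the same content under the same version deduplicate on the server.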
Force-pushed: db54b68 to 731cea6
Summary

Removes both local caches from `SupermodelBenchmark`. They had no invalidation mechanism and silently served stale data through every fix we applied, which is the actual root cause of the 0-recall scores in supermodeltools/supermodel-public-api#714.

Root cause (confirmed)

GT cache (`~/.cache/mcpbr/supermodel_ground_truth/dead_code_*.json`):
- `if gt_path.exists(): return` meant the filter added in 918e3a1 was bypassed for every previously cached task
- `dead_code_n8n_pr23572.json` still had `PythonSandbox` (should be filtered); `dead_code_prisma_pr28485.json` still had `serializeDatasources` (same)

Analysis cache (`*_analysis_{zip_hash}.json`):
- `n8n_pr23572_analysis_7e9bae86b22f.json` had 3,834 results with zero Pyodide/PythonSandbox hits, indicating it was computed from the post-merge commit (before the pre-merge fix in e35a179) and served that result on every subsequent run

What this removes
- `DEFAULT_GT_DIR` constant
- `ground_truth_dir` constructor parameter
- `self.gt_dir` and its `mkdir` call
- `_load_ground_truth`
- Cache logic in `_get_analysis` (including the `hashlib` import)

Why this is safe
- GT extraction is a single `gh api` call to fetch a PR diff — takes ~1s, produces correct results every time

Test plan
- `uv run pytest -m "not integration" --ignore=tests/test_benchmarks.py -q` passes

Fixes supermodeltools/supermodel-public-api#714
🤖 Generated with Claude Code