
Commit a82f0ba

snapsynapse and claude committed

Iterate resolve-issue skill: internal consistency checks, stronger Claude rule

- Add Step 2 internal consistency check (gating vs availability table, talking point vs data, platform completeness, source URL health)
- Remove aggressive "phrasing/terminology" blanket close heuristic
- Make external research mandatory for all Claude/Anthropic issues
- Add audit trail requirement for no-change closes
- Flag phrasing differences that touch gating/pricing for investigation

Based on learnings from resolving 25 issues: the skill was too quick to dismiss conflicts as "phrasing" and missed internal data contradictions (Vision gating=paid vs free availability table, Codex missing terminal/Linux).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 8c97c80

1 file changed: 19 additions & 6 deletions

skills/resolve-issue/src/SKILL.md
````diff
@@ -37,29 +37,40 @@ For batch mode, list all open issues for the platform:
 gh issue list --repo snapsynapse/ai-capability-reference --state open --search "<platform>" --json number,title,labels
 ```
 
-### Step 2: Read current data
+### Step 2: Read current data and check internal consistency
 
 Read the relevant `data/platforms/<platform>.md` file. Identify the feature section that matches the issue.
 
+Before assessing the issue itself, scan the feature's data for internal contradictions — these are often more reliable indicators of errors than the model responses:
+- Does the **Gating** field match the **availability table**? (e.g., `Gating: paid` but Free tier shows ✅ means one of them is wrong)
+- Does the **talking point** match the actual gating/status? (e.g., talking point says "available on all plans" but Gating says `paid`)
+- Is the **platforms table** complete? Check whether major surfaces (terminal/CLI, Linux, API) are missing when they plausibly exist for this feature.
+- Do the **source URLs** still work? A 404 is worth noting even if it's not the issue being resolved.
+
 
````
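The gating-vs-table and talking-point checks that this commit adds to Step 2 are mechanical enough to sketch in code. The following is a minimal illustration, not part of the skill itself: the `feature` dict layout (`gating`, `availability`, `talking_point`) is a hypothetical stand-in for whatever structure you parse out of `data/platforms/<platform>.md`.

```python
def find_contradictions(feature):
    """Return a list of internal contradictions in one feature's data.

    `feature` is a dict with hypothetical keys:
      gating:        "free" or "paid"
      availability:  tier name -> bool (True = a ✅ cell in the table)
      talking_point: presenter-facing summary string
    """
    problems = []

    # Check 1: Gating vs availability table.
    # `Gating: paid` with the Free tier showing ✅ means one of them is wrong.
    if feature["gating"] == "paid" and feature["availability"].get("Free"):
        problems.append("Gating says paid but the Free tier shows available")

    # Check 2: Talking point vs gating.
    # "all plans" wording contradicts paid-only gating.
    tp = feature["talking_point"].lower()
    if feature["gating"] == "paid" and "all plans" in tp:
        problems.append("Talking point claims all plans but Gating says paid")

    return problems


# The Vision example from the commit message: gating=paid vs a free row.
vision = {
    "gating": "paid",
    "availability": {"Free": True, "Plus": True},
    "talking_point": "Vision is available on all plans.",
}
print(find_contradictions(vision))
```

Both checks are deliberately simple string/boolean heuristics; the platform-completeness and URL-health checks need I/O and judgment, so they stay manual in this sketch.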
````diff
 ### Step 3: Assess — apply these heuristics IN ORDER
 
 **Close immediately as duplicate if:**
 - An older `verification-inconclusive` issue exists for the same feature AND a newer `verification-conflict` issue also exists. Close the older one with comment: "Superseded by #[newer]. Consolidating to the newer issue."
 
 **Close with no data change (bump Checked date only) if:**
-- Both models agree on the facts but the "conflict" is about phrasing or terminology
 - One model (usually Perplexity) says "insufficient sources" but the other confirms our existing data is correct
 - The flagged change is an incremental UX improvement, not a change to status, gating, pricing tiers, platform availability, or regional availability
-- The issue asks about a feature that IS correctly reflected in our data
+- The issue asks about a feature that IS correctly reflected in our data AND the internal consistency check (Step 2) found no contradictions
 
 **Research and update data if:**
+- The internal consistency check (Step 2) found contradictions in the data file — even if the models didn't flag them
 - A model reports a genuine change to: gating (free/paid), plan availability, platform support, status (ga/beta/preview/deprecated), or regional availability
 - Both models report something different from what our data says
 - One model makes a specific, sourced claim that contradicts our data
+- The models describe a "phrasing/terminology" difference but the underlying facts touch gating, pricing, or platform availability (e.g., "DALL-E 3 vs GPT-4o" may sound like naming but if one model says free and the other says paid, that's a real discrepancy worth investigating)
+
+### Step 4: Research
+
+**For Claude/Anthropic features: ALWAYS research externally.** You are Claude — you cannot objectively assess your own product's data. Every Claude issue must include at least one Perplexity search or official URL fetch, even if the issue looks like an obvious no-change close. This is non-negotiable.
 
-### Step 4: Research (only if needed)
+**For all other platforms:** research when Step 3 indicates it's needed, or when the internal consistency check found issues.
 
-When research is needed, check official sources first:
+When researching, check official sources first:
 1. Fetch the feature's official URL from the data file
 2. Use Perplexity via curl for broader search:
 ```bash
````
````diff
@@ -134,8 +145,10 @@ After resolving, report to the user:
 
 ### Important rules
 
-- **Never update Claude/Anthropic data based on your own knowledge.** Always verify against official sources. You ARE Claude — this is a conflict of interest.
+- **Never update Claude/Anthropic data based on your own knowledge.** You ARE Claude — every Claude issue requires external verification via Perplexity or official URL fetch. Even for no-change closes, you must show evidence from an external source. See Step 4.
+- **Always check internal consistency before assessing.** The most common real errors are contradictions within the data file itself (gating vs availability table, talking point vs actual data, missing platform rows). The model responses in the issue are a starting point, not the whole picture.
 - **Always present your assessment to the user before making changes** if the issue requires data updates. For no-change closes and duplicate closes, proceed directly.
 - **The talking point must be presenter-ready.** It's used in a classroom setting. Bold the key access/pricing info.
 - **Changelog entries are reverse chronological** — newest at top.
 - **Don't forget to update `last_verified` in the frontmatter** if you're updating any feature for that platform.
+- **When closing with no change, be specific about why.** Don't just say "data is correct." Name which fields you verified and what sources you checked. This creates an audit trail.
````
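The source-URL health check that this commit adds to Step 2 can also be automated. This is a hedged sketch rather than part of the skill's tooling: `check_source_urls` and `default_fetch` are hypothetical helpers, and the fetcher is injectable so the demo below runs without touching the network.

```python
from urllib.request import urlopen
from urllib.error import HTTPError, URLError


def default_fetch(url, timeout=10):
    """Return the HTTP status code for `url`, following redirects."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as e:
        # 4xx/5xx raise HTTPError; the code is still the status we want.
        return e.code


def check_source_urls(urls, fetch=default_fetch):
    """Return (url, problem) pairs for every source URL that is not healthy."""
    dead = []
    for url in urls:
        try:
            status = fetch(url)
        except (URLError, OSError):
            dead.append((url, "unreachable"))
            continue
        if status >= 400:
            # A 404 is worth noting even if it's not the issue being resolved.
            dead.append((url, f"HTTP {status}"))
    return dead


# Offline demo with a stubbed fetcher instead of real network calls.
stub = {"https://example.com/docs": 200, "https://example.com/gone": 404}.get
print(check_source_urls(
    ["https://example.com/docs", "https://example.com/gone"],
    fetch=lambda u: stub(u),
))
```

Injecting `fetch` keeps the check testable; in real use you would pass the data file's source URLs and the default network-backed fetcher.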
