| 1 | +# Capability Pressure Test |
| 2 | + |
| 3 | +Status: Draft 1 |
| 4 | +Last updated: 2026-03-07 |
| 5 | + |
| 6 | +This document checks whether the draft capability taxonomy can coexist with the current 70-feature dataset without forcing unnatural schema decisions too early. |
| 7 | + |
| 8 | +## Current Dataset Snapshot |
| 9 | + |
| 10 | +Current feature count by platform: |
| 11 | + |
| 12 | +- ChatGPT: 12 |
| 13 | +- Claude: 9 |
| 14 | +- Microsoft Copilot: 7 |
| 15 | +- Gemini: 14 |
| 16 | +- Grok: 8 |
| 17 | +- Local/Open Models: 10 |
| 18 | +- Perplexity: 10 |
| 19 | + |
| 20 | +Current feature count by existing `Category`: |
| 21 | + |
| 22 | +- `other`: 17 |
| 23 | +- `coding`: 8 |
| 24 | +- `agents`: 6 |
| 25 | +- `integrations`: 6 |
| 26 | +- `research`: 6 |
| 27 | +- `voice`: 5 |
| 28 | +- `image-gen`: 5 |
| 29 | +- `local-files`: 4 |
| 30 | +- `browser`: 3 |
| 31 | +- `search`: 3 |
| 32 | +- `video-gen`: 3 |
| 33 | +- `vision`: 2 |
| 34 | +- `cloud-files`: 2 |
| 35 | + |
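| 36 | +The category distribution confirms that coexistence is feasible, but the existing category system is not a clean substitute for a capability taxonomy: `other` alone holds 17 of the 70 features (about 24%), so nearly a quarter of the dataset carries no usable category signal. |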
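As a quick sanity check on the snapshot, the two tallies can be verified against each other. This is a minimal sketch that copies the counts from the lists above; the variable names are illustrative and not part of the dataset schema.

```python
# Counts copied from the two lists in the snapshot above.
by_platform = {
    "ChatGPT": 12, "Claude": 9, "Microsoft Copilot": 7, "Gemini": 14,
    "Grok": 8, "Local/Open Models": 10, "Perplexity": 10,
}
by_category = {
    "other": 17, "coding": 8, "agents": 6, "integrations": 6, "research": 6,
    "voice": 5, "image-gen": 5, "local-files": 4, "browser": 3, "search": 3,
    "video-gen": 3, "vision": 2, "cloud-files": 2,
}

# Both breakdowns should account for the same 70 features.
total = sum(by_platform.values())
assert total == sum(by_category.values()) == 70

# Share of features that the current Category field cannot classify.
other_share = by_category["other"] / total
print(f"{total} features, {other_share:.0%} in `other`")  # 70 features, 24% in `other`
```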
| 37 | + |
| 38 | +## High-Confidence Capability Areas |
| 39 | + |
| 40 | +These parts of the draft taxonomy already fit the current feature set well. |
| 41 | + |
| 42 | +### Real-time voice |
| 43 | + |
| 44 | +Clear current implementations: |
| 45 | + |
| 46 | +- ChatGPT Advanced Voice Mode |
| 47 | +- Copilot Voice |
| 48 | +- Gemini Live |
| 49 | +- Grok Voice Mode |
| 50 | +- Perplexity Voice Mode |
| 51 | + |
| 52 | +This area strongly supports splitting listening and speaking into separate capabilities even if they are often implemented together. |
| 53 | + |
| 54 | +### Visual understanding |
| 55 | + |
| 56 | +Clear current implementations: |
| 57 | + |
| 58 | +- Claude Vision |
| 59 | +- Copilot Vision |
| 60 | +- Gemini Project Astra |
| 61 | + |
| 62 | +This is a stable user-facing concept and should remain capability-first. |
| 63 | + |
| 64 | +### Image and video generation |
| 65 | + |
| 66 | +Clear current implementations: |
| 67 | + |
| 68 | +- DALL-E Image Generation |
| 69 | +- Imagen |
| 70 | +- Aurora |
| 71 | +- Designer |
| 72 | +- Sora |
| 73 | +- Veo |
| 74 | +- Grok Imagine |
| 75 | + |
| 76 | +These are already intuitive user-facing capabilities and map well. |
| 77 | + |
| 78 | +### Research, search, and action-taking |
| 79 | + |
| 80 | +Clear current implementations: |
| 81 | + |
| 82 | +- ChatGPT Search |
| 83 | +- ChatGPT Deep Research |
| 84 | +- ChatGPT Agent Mode |
| 85 | +- Claude Cowork Mode |
| 86 | +- Gemini Deep Research |
| 87 | +- Perplexity Pro Search |
| 88 | +- Perplexity Agent Mode |
| 89 | +- Comet Browser |
| 90 | +- DeepSearch |
| 91 | + |
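| 92 | +These need many-to-many mapping, since a single feature can serve several capabilities and each capability has several implementations, but they are editorially strong. |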
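One way to picture the many-to-many relationship is a feature-to-capabilities table plus a derived reverse index. The sketch below is hypothetical: the capability slugs are the ones used later in this document, but the specific assignments per feature are assumptions made for illustration, not decisions recorded in the dataset.

```python
from collections import defaultdict

# Hypothetical feature -> capability assignments, for illustration only.
FEATURE_CAPABILITIES = {
    "ChatGPT Search": ["search-the-web"],
    "ChatGPT Agent Mode": ["search-the-web", "take-actions-and-run-tools"],
    "Comet Browser": ["search-the-web", "take-actions-and-run-tools"],
    "Perplexity Pro Search": ["search-the-web"],
}


def invert(mapping):
    """Build the capability -> features index from feature -> capabilities."""
    index = defaultdict(list)
    for feature, capabilities in mapping.items():
        for capability in capabilities:
            index[capability].append(feature)
    # Sort so that capability pages list implementations in a stable order.
    return {capability: sorted(features) for capability, features in index.items()}


CAPABILITY_FEATURES = invert(FEATURE_CAPABILITIES)
print(CAPABILITY_FEATURES["take-actions-and-run-tools"])
```

Keeping only the forward table as the source of truth and deriving the reverse index avoids the two directions drifting apart.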
| 93 | + |
| 94 | +### Projects, files, and persistent work |
| 95 | + |
| 96 | +Clear current implementations: |
| 97 | + |
| 98 | +- ChatGPT Projects |
| 99 | +- Claude Projects |
| 100 | +- Perplexity Collections |
| 101 | +- Gemini in Workspace |
| 102 | +- Copilot in Office Apps |
| 103 | +- NotebookLM |
| 104 | +- Memory |
| 105 | + |
| 106 | +This is another strong area for capability-first navigation. |
| 107 | + |
| 108 | +## Areas That Need Careful Modeling |
| 109 | + |
| 110 | +These are not blockers, but they are the places where schema pressure is most likely. |
| 111 | + |
| 112 | +### 1. Model access is not the same as a user capability |
| 113 | + |
| 114 | +Features that fit awkwardly into a capability taxonomy: |
| 115 | + |
| 116 | +- GPT-4 Access |
| 117 | +- Gemini Advanced |
| 118 | +- Grok Chat |
| 119 | +- Model Selection |
| 120 | +- Llama 3.3 |
| 121 | +- Llama 4 |
| 122 | +- DeepSeek-V3 / DeepSeek-R1 |
| 123 | +- Mistral Large / Mistral Nemo |
| 124 | +- Mistral Small 3 |
| 125 | +- Qwen 2.5 |
| 126 | +- Qwen 3 |
| 127 | + |
| 128 | +These mostly describe: |
| 129 | + |
| 130 | +- access to stronger base models |
| 131 | +- access to more choice |
| 132 | +- quality or reasoning differences |
| 133 | +- local/open model availability |
| 134 | + |
| 135 | +Recommendation: |
| 136 | + |
| 137 | +- Treat these primarily as implementation or constraint records for now. |
| 138 | +- Do not rush to create a top-level capability like "use good models." |
| 139 | +- Revisit only if a stable user-facing question emerges, such as "Can I choose which model I use?" |
| 140 | + |
| 141 | +### 2. Reasoning depth may be a modifier, not a primary capability |
| 142 | + |
| 143 | +Potentially awkward features: |
| 144 | + |
| 145 | +- Claude Extended Thinking |
| 146 | +- Grok Think Mode |
| 147 | + |
| 148 | +These feel more like quality or depth modifiers on existing capabilities than standalone capabilities. |
| 149 | + |
| 150 | +Recommendation: |
| 151 | + |
| 152 | +- Model them as capability enhancers or constraints before modeling them as first-class capability pages. |
| 153 | + |
| 154 | +### 3. Workspace/build tools often span multiple capabilities |
| 155 | + |
| 156 | +Examples: |
| 157 | + |
| 158 | +- Claude Artifacts |
| 159 | +- Gemini Canvas |
| 160 | +- ChatGPT Canvas |
| 161 | +- AI Studio |
| 162 | +- Claude Code |
| 163 | +- Grok Studio |
| 164 | + |
| 165 | +These are best understood as workspaces or implementation shells that support: |
| 166 | + |
| 167 | +- document creation |
| 168 | +- code work |
| 169 | +- iterative editing |
| 170 | +- reusable context |
| 171 | + |
| 172 | +Recommendation: |
| 173 | + |
| 174 | +- Map them to multiple capabilities. |
| 175 | +- Avoid making "canvas" or "studio" a capability category. |
| 176 | + |
| 177 | +### 4. Browser-like products mix surface and behavior |
| 178 | + |
| 179 | +Examples: |
| 180 | + |
| 181 | +- Atlas Browser |
| 182 | +- Comet Browser |
| 183 | +- Copilot Vision |
| 184 | + |
| 185 | +These combine: |
| 186 | + |
| 187 | +- interface surface |
| 188 | +- search/research behavior |
| 189 | +- action-taking |
| 190 | +- visual context |
| 191 | + |
| 192 | +Recommendation: |
| 193 | + |
| 194 | +- Treat browser products as implementations. |
| 195 | +- Map them into capabilities like `search-the-web`, `take-actions-and-run-tools`, and `see-images-and-screens`. |
| 196 | +- Keep surface-specific information in constraints. |
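The recommendation above could be sketched as an implementation record that lists capabilities and keeps the surface in a separate constraints field. The `Implementation` class, its field names, and the `surface` key are all assumptions for illustration; the capability assigned to each product is likewise hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Implementation:
    """A product or feature that delivers one or more capabilities."""
    name: str
    capabilities: tuple[str, ...]
    # Surface details live here so they never become capability categories.
    constraints: dict[str, str] = field(default_factory=dict)


# Hypothetical records; the assignments are illustrative, not editorial decisions.
comet = Implementation(
    name="Comet Browser",
    capabilities=("search-the-web", "take-actions-and-run-tools"),
    constraints={"surface": "browser"},
)
copilot_vision = Implementation(
    name="Copilot Vision",
    capabilities=("see-images-and-screens",),
    constraints={"surface": "browser"},
)


def capabilities_for_surface(implementations, surface):
    """Collect the capabilities reachable through a given surface."""
    found = set()
    for impl in implementations:
        if impl.constraints.get("surface") == surface:
            found.update(impl.capabilities)
    return sorted(found)


print(capabilities_for_surface([comet, copilot_vision], "browser"))
```

With this shape, "browser" stays a filter over implementations instead of becoming a peer of the capabilities themselves.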
| 197 | + |
| 198 | +## Draft Outcome |
| 199 | + |
| 200 | +The current taxonomy appears viable for coexistence. |
| 201 | + |
| 202 | +What the pressure test suggests: |
| 203 | + |
| 204 | +- The capability-first direction is compatible with the current dataset. |
| 205 | +- The old feature-first site can remain operational during migration. |
| 206 | +- The biggest unresolved question is how to represent model access and reasoning quality without turning marketing terms into top-level capabilities. |
| 207 | + |
| 208 | +## Explicit Risks To Call Out Later |
| 209 | + |
| 210 | +These should be surfaced again if they start affecting implementation choices: |
| 211 | + |
| 212 | +- If model-brand pages begin dominating the capability-first IA, the editorial model is drifting backward toward feature-first. |
| 213 | +- If reasoning modes become top-level categories too early, the taxonomy may become vendor-shaped. |
| 214 | +- If surfaces like browser, desktop, and API are treated as peers of user-intent capabilities, the IA will get muddy. |
| 215 | + |
| 216 | +## Recommendation |
| 217 | + |
| 218 | +Proceed with coexistence. |
| 219 | + |
| 220 | +The next concrete step should be adding a thin capability mapping layer while keeping the current feature records and current dashboard untouched. |
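A thin mapping layer of this kind might look like an overlay keyed by feature id that is joined to the feature records at read time. This is a minimal sketch under stated assumptions: the record fields (`id`, `name`, `category`), the ids, the category values, and the overlay contents are all hypothetical stand-ins for the real dataset.

```python
from types import MappingProxyType

# Stand-in for the existing feature records; field names and values are assumptions.
FEATURES = (
    {"id": "chatgpt-search", "name": "ChatGPT Search", "category": "search"},
    {"id": "claude-vision", "name": "Claude Vision", "category": "vision"},
    {"id": "gpt-4-access", "name": "GPT-4 Access", "category": "other"},
)

# The overlay is stored separately, keyed by feature id, and is read-only.
CAPABILITY_OVERLAY = MappingProxyType({
    "chatgpt-search": ("search-the-web",),
    "claude-vision": ("see-images-and-screens",),
})


def with_capabilities(feature):
    """Return a copy of a feature record with its capabilities attached."""
    return {**feature, "capabilities": CAPABILITY_OVERLAY.get(feature["id"], ())}


def unmapped(features):
    """List feature ids that have no capability mapping yet."""
    return [f["id"] for f in features if f["id"] not in CAPABILITY_OVERLAY]


print(with_capabilities(FEATURES[0])["capabilities"])
print(unmapped(FEATURES))
```

Because `with_capabilities` returns a copy, the original records and anything that reads them, such as the current dashboard, are unaffected. Features left out of the overlay, such as a model-access entry, simply show up in `unmapped` until a modeling decision is made.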