Commit afdf792
fix(benchmarks): Address Claude UI review feedback
Harden the Claude UI benchmark harness so preflight retries transient UI inspection failures, temporary simulators are cleaned up after partial setup failures, and suite config validation fails before expensive runs.
Also report unsettled post-action runtime snapshots as recoverable UI automation warnings instead of returning unstable refs.
Co-Authored-By: OpenAI Codex <codex@openai.com>1 parent bbab5f7 commit afdf792
9 files changed
Lines changed: 396 additions & 94 deletions
File tree
- src
- benchmarks/claude-ui
- __tests__
- mcp/tools/ui-automation
- __tests__
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
25 | 29 | | |
26 | 30 | | |
27 | 31 | | |
| |||
Lines changed: 65 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3 | 3 | | |
4 | 4 | | |
5 | 5 | | |
| 6 | + | |
6 | 7 | | |
7 | 8 | | |
8 | 9 | | |
| |||
199 | 200 | | |
200 | 201 | | |
201 | 202 | | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
202 | 216 | | |
203 | 217 | | |
204 | 218 | | |
| |||
251 | 265 | | |
252 | 266 | | |
253 | 267 | | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
254 | 319 | | |
255 | 320 | | |
256 | 321 | | |
| |||
Lines changed: 87 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
38 | 38 | | |
39 | 39 | | |
40 | 40 | | |
| 41 | + | |
41 | 42 | | |
42 | 43 | | |
43 | 44 | | |
| |||
47 | 48 | | |
48 | 49 | | |
49 | 50 | | |
50 | | - | |
| 51 | + | |
51 | 52 | | |
52 | 53 | | |
53 | 54 | | |
| |||
120 | 121 | | |
121 | 122 | | |
122 | 123 | | |
123 | | - | |
| 124 | + | |
124 | 125 | | |
125 | 126 | | |
126 | 127 | | |
| |||
175 | 176 | | |
176 | 177 | | |
177 | 178 | | |
178 | | - | |
| 179 | + | |
179 | 180 | | |
180 | 181 | | |
181 | 182 | | |
| |||
220 | 221 | | |
221 | 222 | | |
222 | 223 | | |
223 | | - | |
| 224 | + | |
224 | 225 | | |
225 | 226 | | |
226 | 227 | | |
| |||
251 | 252 | | |
252 | 253 | | |
253 | 254 | | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
254 | 337 | | |
255 | 338 | | |
256 | 339 | | |
| |||
Lines changed: 45 additions & 5 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
42 | 42 | | |
43 | 43 | | |
44 | 44 | | |
45 | | - | |
46 | | - | |
47 | | - | |
48 | | - | |
49 | | - | |
50 | 45 | | |
51 | 46 | | |
52 | 47 | | |
| |||
225 | 220 | | |
226 | 221 | | |
227 | 222 | | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
228 | 268 | | |
229 | 269 | | |
230 | 270 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
108 | 108 | | |
109 | 109 | | |
110 | 110 | | |
111 | | - | |
112 | | - | |
113 | | - | |
114 | | - | |
115 | | - | |
116 | | - | |
117 | | - | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
| 126 | + | |
| 127 | + | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
118 | 142 | | |
119 | 143 | | |
120 | 144 | | |
| |||
153 | 177 | | |
154 | 178 | | |
155 | 179 | | |
156 | | - | |
| 180 | + | |
157 | 181 | | |
158 | 182 | | |
159 | 183 | | |
| |||
186 | 210 | | |
187 | 211 | | |
188 | 212 | | |
189 | | - | |
190 | | - | |
191 | | - | |
192 | | - | |
193 | | - | |
194 | | - | |
195 | | - | |
196 | | - | |
197 | | - | |
198 | | - | |
199 | | - | |
200 | | - | |
201 | | - | |
| |||
0 commit comments