Skip to content

Commit dd3e593

Browse files
committed
fix(benchmarks-migrate): route 2-part random-access to Skip, not Unknown
The post-alpha cleanup made `bin_random_access` reject 2-part legacy shapes by returning None, but `classify_outcome` mapped any None bin to `Outcome::Unknown`, so those records counted against the 5% uncategorized gate. Intentional drops must Skip; only truly unrecognized shapes are Unknown. Reuses the existing `Skip::UnsupportedShape` variant, originally added for "get_group succeeded but classify_v2 didn't" — same semantic shape (recognized but unsupported). Regression test in `tests/classifier.rs` exercises `classify_outcome` (not just the top-level `classify`) so Skip vs Unknown is observable. Signed-off-by: Claude <noreply@anthropic.com>
1 parent 85e39f3 commit dd3e593

2 files changed

Lines changed: 27 additions & 1 deletion

File tree

benchmarks-website/migrate/src/classifier.rs

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -537,7 +537,16 @@ pub fn classify_outcome(record: &V2Record) -> Outcome {
537537
return Outcome::Skip(Skip::DerivedRatio);
538538
}
539539
let bin = match &cls.group {
540-
V2Group::RandomAccess => bin_random_access(record),
540+
V2Group::RandomAccess => match bin_random_access(record) {
541+
Some(b) => Some(b),
542+
// Legacy 2-part `random-access/<format>-…` records carry
543+
// no dataset and are intentionally dropped by
544+
// `bin_random_access`. Route them to Skip so the
545+
// `Outcome::Unknown` arm below — and the 5%
546+
// uncategorized gate in `migrate::run` — don't trip on
547+
// them.
548+
None => return Outcome::Skip(Skip::UnsupportedShape),
549+
},
541550
V2Group::Compression => bin_compression_time(&cls, record),
542551
V2Group::CompressionSize => bin_compression_size(&cls, record),
543552
V2Group::Query { .. } => bin_query(&cls, record),

benchmarks-website/migrate/tests/classifier.rs

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -271,6 +271,23 @@ fn unmapped_records_yield_none(#[case] name: &str) {
271271
);
272272
}
273273

274+
#[test]
275+
fn random_access_2_part_legacy_is_skip_not_unknown() {
276+
// The 2-part legacy shape `random-access/<format>-tokio-local-disk`
277+
// carries no dataset, so `bin_random_access` returns None. That
278+
// None must route through `Outcome::Skip` (an intentional drop),
279+
// NOT `Outcome::Unknown`, otherwise these records count against
280+
// the 5% uncategorized gate in `migrate::run`. Top-level
281+
// `classify()` returns None for both Skip and Unknown, so this
282+
// assertion has to go through `classify_outcome`.
283+
let r = record("random-access/parquet-tokio-local-disk");
284+
let outcome = classify_outcome(&r);
285+
assert!(
286+
matches!(outcome, Outcome::Skip(_)),
287+
"2-part legacy random-access must Skip, not Unknown; got {outcome:?}"
288+
);
289+
}
290+
274291
#[test]
275292
fn parquet_zstd_size_is_deprecated() {
276293
// `parquet-zstd` is not on the v3 emitter's format allowlist, so

0 commit comments

Comments
 (0)