[codex] replace backfill chunking with smart planner #107
Draft
Summary
@chkit/plugin-backfill/sdk instead of the package root
Why
The previous backfill planner only handled coarse partition splitting with a single sort-key range fallback. The new smart planner supports string-prefix, temporal, quantile, and equal-width strategies, including secondary-dimension handoff for hot keys, which matches the new prototype behavior and gives much better confidence in expected chunk layouts.
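To make the difference between two of the named strategies concrete, here is a minimal sketch of equal-width and quantile boundary selection over a numeric key. The function names and signatures are hypothetical and chosen for illustration only; they are not the plugin's actual API, and the real planner may sample and split differently.

```typescript
// Hypothetical helpers for illustration; not exports of @chkit/plugin-backfill.

// Equal-width: split [min, max] into `chunks` ranges of identical width.
// Cheap to compute, but on skewed data most rows can land in one chunk.
function equalWidthBoundaries(min: number, max: number, chunks: number): number[] {
  if (chunks < 1 || max < min) throw new RangeError("invalid range or chunk count");
  const step = (max - min) / chunks;
  const out: number[] = [];
  for (let i = 1; i < chunks; i++) out.push(min + step * i);
  return out;
}

// Quantile: pick boundaries from a sorted sample so that each chunk holds
// roughly the same number of rows. Repeated values (hot keys) produce
// duplicate boundaries, which are dropped here.
function quantileBoundaries(sample: number[], chunks: number): number[] {
  if (chunks < 1) throw new RangeError("invalid chunk count");
  if (sample.length === 0) return [];
  const sorted = [...sample].sort((a, b) => a - b);
  const out: number[] = [];
  for (let i = 1; i < chunks; i++) {
    const idx = Math.min(Math.floor((i * sorted.length) / chunks), sorted.length - 1);
    const value = sorted[idx] as number;
    if (out.length === 0 || out[out.length - 1] !== value) out.push(value);
  }
  return out;
}

// A skewed sample: most values are 1, with a long tail.
const skewed = [1, 1, 1, 1, 1, 1, 2, 50, 100, 1000];
const widthCuts = equalWidthBoundaries(0, 1000, 4);
const quantileCuts = quantileBoundaries(skewed, 5);
```

On the skewed sample, the equal-width cuts put nine of the ten values into the first chunk, while the quantile cuts follow the data. The deduplicated quantile result has fewer boundaries than requested, which is the kind of hot-key case where a planner would need to hand off to a secondary dimension.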
Impact
Backfill planning now produces smarter chunk boundaries for skewed data and multi-column sort keys.
Public API note: planner/runtime internals are no longer exported from @chkit/plugin-backfill. They now live under @chkit/plugin-backfill/sdk.
Root Cause
The older planner could only split by partition and a single coarse sort-key range, which broke down on hot keys and skewed distributions. During review, a few edge cases also surfaced in the first port: collapsed equal-width boundaries, brittle sorting_key parsing for expression-based sort keys, and temporal refinement extending past parent slice bounds.
Validation
bun test /Users/marc/Workspace/chkit/packages/plugin-backfill/src
bun run --cwd /Users/marc/Workspace/chkit/packages/plugin-backfill typecheck
bun verify
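Two of the edge cases listed under Root Cause (collapsed equal-width boundaries and refinement extending past parent slice bounds) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the `Range` type, the half-open interval convention, and both function names are hypothetical and do not reflect the plugin's actual implementation.

```typescript
// Hypothetical types and helpers for illustration only.

// Half-open range [from, to), e.g. integer ids or epoch milliseconds.
interface Range {
  from: number;
  to: number;
}

// Equal-width boundaries over an integer key. When the range is narrower
// than the requested chunk count, rounding maps several boundaries to the
// same integer; collapsed duplicates and boundaries on the range edges are
// dropped so no empty chunk is emitted.
function integerEqualWidth(min: number, max: number, chunks: number): number[] {
  if (chunks < 1 || max < min) throw new RangeError("invalid range or chunk count");
  const step = (max - min) / chunks;
  const seen = new Set<number>();
  for (let i = 1; i < chunks; i++) {
    const boundary = Math.floor(min + step * i);
    if (boundary > min && boundary < max) seen.add(boundary);
  }
  return [...seen].sort((a, b) => a - b);
}

// Clamp a refined child slice to its parent so refinement never reaches
// outside the parent's bounds. Returns null when nothing overlaps.
function clampToParent(child: Range, parent: Range): Range | null {
  const from = Math.max(child.from, parent.from);
  const to = Math.min(child.to, parent.to);
  return from < to ? { from, to } : null;
}

// Ten chunks requested over a key range that only spans 0..3.
const collapsed = integerEqualWidth(0, 3, 10);

// A child slice that overshoots the parent's upper bound.
const clamped = clampToParent({ from: 50, to: 150 }, { from: 0, to: 100 });
```

Here the ten requested chunks reduce to the two boundaries that actually separate distinct integer values, and the overshooting child slice is trimmed to the parent's upper bound.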