refactor(undo): iterate affected files child-first via recursive CTE#59
Merged
Conversation
QueryUndoAffectedFiles previously returned affected files ordered by file_id ASC, which approximates parent-first iteration (UUIDv7 = creation time, dirs are created before their contents). This ordering was the precondition for the cascade-artifact bug fixed previously: under parent-first iteration, each child UPSERT's bump_parent_mtime cascade landed on the already-restored parent and wrote a fresh history row -- newer than the parent's own restore row. The earlier post-hoc "newest history row" capture in Step 3 then resolved to the cascade artifact instead of the parent's own restore snapshot, silently corrupting undo-of-undo for directory renames. The inline version_id capture made correctness order-independent, so this rewrite is defense in depth, not a correctness fix. Switch the affected-files query to a recursive CTE that walks each affected file's parent_id chain UP and sorts by depth DESC (deepest first, file_id ASC tiebreaker among siblings). Two-tier ancestor lookup: for ancestors in the affected set, use the parent_id from the entry's snapshot (affected.version_id -> history row); for untouched ancestors, use current source.parent_id. The snapshot path is what makes child-first ordering work for the rm-rf-then-undo case, where all affected source rows are absent and a source-only walk would default everything to depth 0. Result: cascades from child restores either land on absent parents (cascade UPDATE matches zero rows, no row written) or on pre-restore parents (cascade row written but superseded by the parent's own subsequent UPSERT as newest). The parent's own restore row is always "newest" for the parent's file_id at end-of-transaction. If the inline version_id capture ever regresses, child-first iteration preserves the correctness invariant. Cascade-noise reduction in the common rm-rf-then-undo pattern is a bonus. Function comment in query.go has the full design rationale, including the simpler form this rewrite replaces (kept inline as a reference). Adds TestQueryUndoAffectedFiles_TopologicalOrder (unit), and TestSynth_UndoIterationOrder_ChildBeforeParent and TestSynth_UndoChildFirst_MinimizesCascadeArtifacts (integration). Cascade-noise test asserts delta=6 history rows added to d during rm-rf-then-undo; pre-rewrite the same scenario adds 9 (extra 3 cascade rows from Step-2 child UPSERTs onto the just-restored parent).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
QueryUndoAffectedFiles previously returned affected files ordered by file_id ASC, which approximates parent-first iteration (UUIDv7 = creation time, dirs are created before their contents). This ordering was the precondition for the cascade-artifact bug fixed previously: under parent-first iteration, each child UPSERT's bump_parent_mtime cascade landed on the already-restored parent and wrote a fresh history row -- newer than the parent's own restore row. The earlier post-hoc "newest history row" capture in Step 3 then resolved to the cascade artifact instead of the parent's own restore snapshot, silently corrupting undo-of-undo for directory renames.
The inline version_id capture made correctness order-independent, so this rewrite is defense in depth, not a correctness fix. Switch the affected-files query to a recursive CTE that walks each affected file's parent_id chain UP and sorts by depth DESC (deepest first, file_id ASC tiebreaker among siblings). Two-tier ancestor lookup: for ancestors in the affected set, use the parent_id from the entry's snapshot (affected.version_id -> history row); for untouched ancestors, use current source.parent_id. The snapshot path is what makes child-first ordering work for the rm-rf-then-undo case, where all affected source rows are absent and a source-only walk would default everything to depth 0.
Result: cascades from child restores either land on absent parents (cascade UPDATE matches zero rows, no row written) or on pre-restore parents (cascade row written but superseded by the parent's own subsequent UPSERT as newest). The parent's own restore row is always "newest" for the parent's file_id at end-of-transaction. If the inline version_id capture ever regresses, child-first iteration preserves the correctness invariant. Cascade-noise reduction in the common rm-rf-then-undo pattern is a bonus.
Function comment in query.go has the full design rationale, including the simpler form this rewrite replaces (kept inline as a reference).
Adds TestQueryUndoAffectedFiles_TopologicalOrder (unit), and TestSynth_UndoIterationOrder_ChildBeforeParent and TestSynth_UndoChildFirst_MinimizesCascadeArtifacts (integration). Cascade-noise test asserts delta=6 history rows added to d during rm-rf-then-undo; pre-rewrite the same scenario adds 9 (extra 3 cascade rows from Step-2 child UPSERTs onto the just-restored parent).