Skip to content

fix(ooxml): move-sheet silent data corruption#213

Merged
arcaputo3 merged 1 commit intomainfrom
fix-move-sheet-corruption
Feb 8, 2026
Merged

fix(ooxml): move-sheet silent data corruption#213
arcaputo3 merged 1 commit intomainfrom
fix-move-sheet-corruption

Conversation

@arcaputo3
Copy link
Copy Markdown
Contributor

Summary

  • P0 bug fix: move-sheet caused silent data corruption where sheet names pointed to wrong data after reorder
  • Root cause: Workbook.reorder() set reorderedSheets=true but the surgical writer only checked modifiedMetadata to decide whether to regenerate all sheets
  • Unmodified sheets were copied from the source ZIP in their original positions while workbook.xml referenced them in the new order
  • Added regression test that writes/reads/reorders/writes/reads and verifies data integrity

Test plan

  • New regression test in XlsxWriterCorruptionRegressionSpec — creates 3 sheets with distinct data, reorders, round-trips, verifies each sheet name maps to correct data
  • All 174 OOXML tests pass
  • All 127 core tests pass
  • Manual verification via CLI: move-sheet "C" --to 0 now preserves correct data mapping
  • CI passes

Closes #207

🤖 Generated with Claude Code

…odification tracking (#207)

Workbook.reorder() set reorderedSheets=true but the surgical writer only
checked modifiedMetadata to decide whether to regenerate all sheets.
Unmodified sheets were copied from the source ZIP in original positions
while workbook.xml referenced them in the new order — name/data mismatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Feb 8, 2026

Code Review - PR #213: fix(ooxml): move-sheet silent data corruption

Summary

Verdict: ✅ LGTM - Excellent P0 bug fix

This PR correctly addresses a critical data corruption bug where sheet reordering caused sheet names to point to wrong data. The fix is minimal, surgical, and precisely targets the root cause.


What This PR Does Right

1. Precise Root Cause Analysis

The issue description correctly identifies that Workbook.reorder() sets reorderedSheets=true but NOT modifiedMetadata=true, and the surgical writer only checked modifiedMetadata to decide whether to regenerate sheets. This caused unmodified sheets to be copied from source ZIP in original positions while workbook.xml referenced them in the new order.

2. Minimal, Surgical Fix

The fix touches exactly 2 locations in XlsxWriter.scala:

  • Line 1014: determineRegenerateParts() - now checks tracker.reorderedSheets
  • Line 1411: sheetsToRegenerate logic - now checks tracker.reorderedSheets

Both locations add the same logical condition: || tracker.reorderedSheets, forcing regeneration of ALL sheets when reordering occurs. This ensures sheet XML files are written to positions matching the new order in workbook.xml.

3. Comprehensive Regression Test

The test in XlsxWriterCorruptionRegressionSpec.scala (68 lines) is exemplary:

This test will catch any future regressions of this specific issue.

4. Correct Understanding of Architecture

The PR author correctly understands that:

  • Line 1318 (parsePreservedStructure) does NOT need reorderedSheets check because OoxmlWorkbook.updateSheets() handles sheet reordering in the preserved structure
  • The critical issue is regenerating sheet XML files at new positions, not the workbook structure itself
  • Sheet reordering is fundamentally different from metadata modification (add/remove/rename)

Code Quality Assessment

Adherence to CLAUDE.md Principles

  • Purity & Totality: No new effects, maintains pure functional design
  • Strong Typing: Uses existing opaque types and tracker flags
  • Deterministic Output: Fix ensures sheet positions match workbook.xml order
  • Law-Governed: Maintains round-trip invariants for write → read → reorder → write
  • Minimal Changes: No over-engineering, focuses on the exact bug

Style & Conventions

  • ✅ Comments updated to explain "reordered sheets must be written to new positions"
  • ✅ Parallel structure maintained: both locations use identical || tracker.reorderedSheets logic
  • ✅ Test follows regression spec naming and structure conventions

Potential Considerations (Minor)

1. Performance Impact (Low Priority)

When reorderedSheets=true, ALL sheets are regenerated even if only metadata changed. For workbooks with 100+ sheets where only one was moved, this could be slower than necessary.

However: This is the correct conservative behavior for a P0 data corruption fix. Premature optimization here would risk reintroducing the bug. If performance becomes an issue, a future enhancement could track per-sheet reordering mappings, but that's significantly more complex.

2. SharedStrings and Styles Regeneration (Question)

Lines 1004-1009 regenerate styles.xml and sharedStrings.xml when modifiedMetadata is true, but NOT when reorderedSheets is true.

Analysis: This is correct because:

  • Sheet reordering doesn't change style indices (styles are sheet-independent)
  • Sheet reordering doesn't change shared string indices (SST is global)
  • Only the sheet XML file positions need to match the new order

No change needed here.

3. Test Coverage for Edge Cases (Nice-to-Have)

The regression test covers the basic case (3 sheets, move one to front). Additional edge cases could be tested:

  • Reorder all sheets in reverse
  • Reorder with formulas referencing other sheets
  • Reorder with shared strings / rich text
  • Reorder very large workbooks (performance)

However: The current test is sufficient for the P0 fix. Additional tests can be added if issues arise.


Security & Safety

No security concerns: Fix is purely internal logic, no user input handling changes
No data loss risk: Fix prevents data corruption rather than introducing new risks
Backwards compatible: Existing workbooks will continue to work correctly


Test Execution Plan

The PR description states:

  • ✅ New regression test passes
  • ✅ All 174 OOXML tests pass
  • ✅ All 127 core tests pass
  • ✅ Manual CLI verification: move-sheet "C" --to 0 preserves data
  • ⏳ CI pending

Recommendation: Merge after CI passes. The fix is sound and well-tested.


Final Assessment

Category Rating Notes
Correctness ⭐⭐⭐⭐⭐ Precisely fixes root cause
Test Coverage ⭐⭐⭐⭐⭐ Comprehensive regression test
Code Quality ⭐⭐⭐⭐⭐ Minimal, clear, well-documented
Performance ⭐⭐⭐⭐ Conservative approach (acceptable tradeoff)
Documentation ⭐⭐⭐⭐⭐ Excellent comments and commit message

Overall: 5/5 - Exemplary P0 bug fix. Merge immediately after CI.


References

@arcaputo3 arcaputo3 merged commit dadea15 into main Feb 8, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(ooxml): move-sheet causes silent data corruption — sheet names point to wrong data

1 participant