Skip to content

feat(cli): Batch completeness — 17 operations (GH-88)#219

Merged
arcaputo3 merged 3 commits intomainfrom
gh88-batch-completeness
Feb 10, 2026
Merged

feat(cli): Batch completeness — 17 operations (GH-88)#219
arcaputo3 merged 3 commits intomainfrom
gh88-batch-completeness

Conversation

@arcaputo3
Copy link
Copy Markdown
Contributor

Summary

  • Expand batch from 7 to 17 operations, making it the primary interface for atomic multi-step workbook modifications
  • Fix pre-existing OOXML bug where row/col hidden state could never be cleared (OR logic → domain override)

New batch operations (10)

Operation Example Streaming
comment {"op":"comment","ref":"A1","text":"Note","author":"User"} No
remove-comment {"op":"remove-comment","ref":"A1"} No
clear {"op":"clear","range":"A1:B10","all":true} No
col-hide {"op":"col-hide","col":"C"} Yes
col-show {"op":"col-show","col":"C"} Yes
row-hide {"op":"row-hide","row":5} Yes
row-show {"op":"row-show","row":5} Yes
autofit {"op":"autofit","columns":"A:F"} No
add-sheet {"op":"add-sheet","name":"Summary","after":"Sheet1"} No
rename-sheet {"op":"rename-sheet","from":"Old","to":"New"} No

OOXML bugfix

applyDomainRowProps used props.hidden || row.hidden which made unhiding impossible — the original XML hidden="1" always won. Domain properties now always override preserved XML attributes.

Test plan

  • All 837 tests pass (./mill __.test)
  • Format check clean (./mill __.checkFormat)
  • CLI installed and dogfooded all 10 new ops locally
  • Row/col hide→show round-trip verified (was broken before OOXML fix)
  • Streaming mode: col/row visibility ops work; unsupported ops produce clear error
  • Output xlsx files opened in Excel — no corruption

🤖 Generated with Claude Code

arcaputo3 and others added 2 commits February 9, 2026 16:46
…ation (GH-88)

Expand batch from 7 to 17 operations, making it the primary interface
for atomic multi-step workbook modifications. This is especially
impactful for LLM agents where each tool call is expensive.

New operations:
- comment/remove-comment: cell comment management
- clear: clear contents/styles/comments with flags
- col-hide/col-show/row-hide/row-show: visibility control
- autofit: auto-fit column widths
- add-sheet/rename-sheet: workbook-level sheet management

Streaming mode supports col/row visibility ops; other new ops
require full workbook mode with a clear error message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
applyDomainRowProps used OR logic (props.hidden || row.hidden) which
made it impossible to unhide a row once hidden — the original XML
attribute always won. Domain properties should always override the
preserved XML attributes, matching the documented behavior.

Found during dogfooding of batch row-hide/row-show operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Feb 10, 2026

Code Review: PR #219 - Batch Completeness (17 Operations)

Summary

This PR significantly expands the batch operations interface from 7 to 17 operations and includes an important OOXML bugfix for row/column visibility. Overall, this is high-quality, well-architected code that adheres to the project's strict functional programming principles.


Strengths

1. Architecture & Design

  • Excellent separation of concerns: The new operations are cleanly integrated into the existing BatchParser architecture
  • Streaming vs. full-workbook distinction: Clear error messages guide users when operations aren't supported in streaming mode (StreamingWriteCommands.scala:691-694)
  • Deterministic ordering: Operations applied sequentially via foldLeft ensures predictable behavior
  • Type safety: All operations use proper domain types (ARef, Column, Row, SheetName)

2. Functional Purity

  • Effect isolation: All operations return IO[Workbook] with proper error handling
  • Totality: No partial functions, no null, comprehensive Either-based error handling
  • Immutability: Workbook transformations are pure - each operation returns a new workbook

3. OOXML Bugfix (Critical)

The fix in WorksheetHelpers.scala:100,102 is essential:

// Before: hidden = props.hidden || row.hidden  ❌ (OR logic made unhiding impossible)
// After:  hidden = props.hidden                ✅ (Domain always overrides XML)

This correctly implements domain-over-XML semantics, enabling row/col hide→show round-trips. Well caught during dogfooding!

4. Documentation

  • Clear examples in CLAUDE.md and cli.md
  • Comprehensive table of all 17 operations with required/optional fields
  • Skill documentation updated with proper guidance

5. Error Handling

  • Descriptive error messages (e.g., "This batch operation is not supported in streaming mode")
  • Proper validation of sheet names, column letters, and row numbers
  • Helpful hints when operations fail

🔍 Observations & Suggestions

1. Test Coverage ⚠️

While the PR mentions "837 tests pass", I could not find explicit tests for the 10 new batch operations in the diff or test files:

  • No tests for comment, remove-comment, clear, col-hide, col-show, row-hide, row-show, autofit, add-sheet, rename-sheet
  • Existing BatchPutSpec.scala only covers put, putf, style, merge, unmerge, colwidth, rowheight

Recommendation: Add a BatchNewOpsSpec.scala test file covering:

test("batch comment round-trip")
test("batch clear with all flag")
test("batch col-hide then col-show")
test("batch autofit calculates width correctly")
test("batch add-sheet with 'after' positioning")
test("batch rename-sheet error on duplicate")

Given the OOXML bugfix was found via dogfooding, automated tests would prevent regressions.

2. AutoFit Width Calculation (BatchParser.scala:1176-1204)

The autofit algorithm looks reasonable, but consider:

  • Bold factor (1.1): Is this empirically validated against Excel's autofit? Excel uses font metrics, not just length × 1.1
  • Magic constants: 0.90 + 1.5 in line 1202 should be documented (why these values?)
  • Character-width assumption: Assumes proportional width, but Excel uses font metrics (different for 'i' vs 'W')

Suggestion: Add a comment explaining the heuristic:

// Heuristic: baseWidth * 0.90 (accounts for font spacing) + 1.5 (Excel padding)
// This approximates Excel's autofit but doesn't use actual font metrics

3. Clear Operation Logic (BatchParser.scala:1126-1148)

The unmerge-on-clear logic is correct but worth documenting:

// Unmerge overlapping regions when clearing contents
if clearContents then
  val overlapping = s3.mergedRanges.filter(_.intersects(range))
  overlapping.foldLeft(s3)((s, mr) => s.unmerge(mr))

Why is this correct? Excel also unmerges when you clear a merged range. Document this Excel-parity behavior.

4. AddSheet After Logic (BatchParser.scala:1269-1285)

Clean implementation, but edge case question:

  • What if after sheet doesn't exist? ✅ Handled (line 1278: proper error)
  • What if sheet already exists? ✅ Handled (line 1262: proper error)

Nice work on validation!

5. Streaming Mode Error Messages (StreamingWriteCommands.scala:691-694)

Excellent UX - the error clearly explains why the operation failed and how to fix it:

"This batch operation is not supported in streaming mode. " +
  "Remove --stream to use full workbook mode."

🐛 Potential Issues

1. AutoFit with Empty Column (BatchParser.scala:1176-1204)

If a column has no cells, maxCharWidth.maxOption returns None, resulting in width 8.43. Is this intended?

if cellsInColumn.isEmpty then 8.43  // Default width

Question: Should autofit skip empty columns (preserve existing width) or set to default? Excel's behavior?

2. RichText in AutoFit (BatchParser.scala:1199)

case RichText(rt) => rt.toPlainText.length.toDouble

This doesn't account for RichText formatting (bold runs, font size changes). Not critical, but might under-estimate width for mixed-format cells.

3. Clear Operation: What About Formulas with Cached Values?

The removeRange call will delete cells, but does it properly handle formulas that reference the cleared range? (This is more of a core library question than a batch-specific issue.)


🔒 Security

No security concerns:

  • Input validation is comprehensive (sheet names, column letters, row numbers)
  • No path traversal risks
  • No injection vulnerabilities
  • Proper error handling prevents panics

Performance

  • Sequential processing: foldLeft over operations is fine for batch sizes < 1000. For massive batches, consider chunking, but current approach is idiomatic for Cats Effect.
  • Streaming mode support: Row/col visibility operations correctly support streaming (no workbook load), excellent for large files.
  • AutoFit: O(n) per column where n = cells in column. Acceptable.

📋 Compliance with CLAUDE.md

Purity & Totality: All functions are pure, no exceptions thrown
Strong Typing: Opaque types used correctly (Column, Row, ARef)
Deterministic Output: Operations applied in order
Effect Isolation: Only IO in xl-cli, core remains pure
Error Handling: Proper XLResult / IO.raiseError patterns
Code Style: WartRemover compliant, imports follow conventions


🎯 Final Recommendation

Approve with minor suggestions:

  1. Add explicit tests for the 10 new batch operations
  2. Document autofit heuristic (character width calculation)
  3. Document unmerge-on-clear behavior (Excel parity)

This PR represents excellent work:

  • Critical bugfix for row/col visibility
  • Clean architecture for 10 new operations
  • Proper error handling and streaming support
  • Strong adherence to functional programming principles

LGTM pending test coverage addition. The OOXML bugfix alone justifies merging, and the new batch operations are production-ready.


📊 Metrics

  • Files changed: 7 (focused, surgical changes)
  • Additions: 462 lines (mostly new batch operations)
  • Deletions: 9 lines (OOXML bugfix)
  • Complexity: Low (functional composition, no cyclomatic complexity issues)

Great work! 🚀

…rvation

15 new in-memory batch operation tests covering comment, remove-comment,
clear, col-hide/show, row-hide/show, autofit, add-sheet, rename-sheet
including error cases (duplicate sheet, missing sheet, invalid column spec).

Streaming improvements (agent contribution):
- SAX-based readExistingWorksheetMetadata reads col/row properties from
  source worksheet before applying patches, preserving widths, outline
  levels, styles, and visibility state during streaming writes
- StreamingTransform column writer now conditionally emits width/customWidth
  and properly handles styleId and collapsed attributes
- Autofit column spec parsing extracted to pure Either-based helper with
  fail-fast error reporting

3 streaming-specific tests verify metadata preservation for col-hide
and row-show round-trips.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude bot commented Feb 10, 2026

Code Review - PR #219: Batch Completeness

Summary

This PR successfully expands the batch API from 7 to 17 operations and fixes a critical OOXML bug preventing row/column unhiding. The implementation is solid, well-tested, and aligns with XL's architectural philosophy.


✅ Strengths

1. Critical Bug Fix

The OOXML fix in WorksheetHelpers.scala (lines 100, 102) is excellent:

// Before: props.hidden || row.hidden  (OR logic prevented unhiding)
// After:  props.hidden                (domain always wins)

This fix properly allows unhiding rows/columns that were hidden in the original XML. The same pattern is correctly applied to collapsed state.

2. Comprehensive Test Coverage

  • 199 new test lines in MainSpec.scala covering all 10 new operations
  • 78 new test lines in StreamingWriteSpec.scala for streaming mode edge cases
  • Tests verify both success cases and error conditions (duplicate sheets, missing refs, etc.)
  • Metadata preservation tests ensure existing column/row properties survive streaming operations

3. Smart Streaming Mode Handling

The implementation correctly identifies which operations work in streaming mode:

  • Streaming-safe: col-hide/show, row-hide/show (metadata-only)
  • Requires full workbook: comment, clear, autofit, add-sheet, rename-sheet

The explicit error message (lines 238-243 in StreamingWriteCommands.scala) guides users properly.

4. Metadata Preservation

The readExistingWorksheetMetadata function (lines 260-335 in StreamingWriteCommands.scala) is well-designed:

  • Uses SAX parsing with proper security configurations
  • Early exit via StopWorksheetScan exception prevents unnecessary parsing
  • Only reads metadata when needed (needsColumnMetadata, targetRows)
  • Performance optimization: stops parsing once all target rows are found

5. AutoFit Implementation

The autoFitWidth algorithm (lines 735-771 in BatchParser.scala) is reasonable:

  • Considers cell value types, number formatting, and bold styling
  • Uses 0.90 scaling factor + 1.5 padding (empirical constants)
  • Minimum width of 5.0 prevents pathologically narrow columns

⚠️ Issues & Suggestions

1. Column Streaming Bug (Minor)

Location: StreamingTransform.scala:111-117

The column element writer unconditionally writes attributes even when they shouldn't be present:

props.width.foreach { width =>
  writer.writeAttribute("width", width.toString)
  writer.writeAttribute("customWidth", "1")
}
if props.hidden then writer.writeAttribute("hidden", "1")

Issue: When col-show sets hidden = false, the existing hidden="1" attribute from the source XML should be removed, but this code doesn't write anything. This means the original hidden="1" might persist if it was already in the XML.

However, looking at line 837-843 in the old diff, I see the code rebuilds column elements from scratch, so this may not be an issue. Worth verifying in manual testing.

2. Null Annotation Inconsistency (Style)

Location: StreamingWriteCommands.scala:259

@SuppressWarnings(Array("org.wartremover.warts.Null"))

This suppression is used for attributes.getValue() which can return null in SAX. However, the code already uses Option(attributes.getValue(...)) everywhere, which safely handles null. Consider whether this annotation is necessary or if it's cargo-culted.

3. AutoFit Magic Numbers (Documentation)

Location: BatchParser.scala:770

val width = BigDecimal(maxCharWidth * 0.90 + 1.5)
  .setScale(2, BigDecimal.RoundingMode.HALF_UP).toDouble

Suggestion: Add a comment explaining the 0.90 scaling factor and 1.5 padding. These constants affect user experience and future maintainers will wonder where they came from.

// Scale by 0.90 to approximate character width in Excel units
// Add 1.5 for padding (empirically determined to match Excel autofit)
val width = BigDecimal(maxCharWidth * 0.90 + 1.5)

4. Potential Performance Issue in applyClear (Low Priority)

Location: BatchParser.scala:650-654

if clearContents then
  val overlapping = s3.mergedRanges.filter(_.intersects(range))
  overlapping.foldLeft(s3)((s, mr) => s.unmerge(mr))
else s3

Issue: This creates intermediate Sheet instances for each unmerge operation. For large workbooks with many merged ranges, this could be inefficient.

Suggestion (future optimization): Add a bulk Sheet.unmergeMany(ranges: Set[CellRange]) method that applies all unmerges in one pass.

5. Missing Qualified Range Support (Enhancement)

Location: BatchParser.scala:658-677

Operations like col-hide, row-hide require --sheet but don't support qualified syntax (e.g., "Sheet1\!B"). This inconsistency with other operations (like comment, which does support qualified refs at line 596) could confuse users.

Impact: Low - most users will use --sheet flag anyway.


🔒 Security

Excellent SAX parser hardening (lines 278-287 in StreamingWriteCommands.scala):

  • Doctype declaration disabled
  • External entities disabled
  • XInclude disabled
  • Namespace-aware parsing enabled

This follows best practices for XML security.


🎯 Performance

Streaming Mode

  • ✅ Metadata reading is optimized with early exit
  • ✅ Only parses what's needed (columns or specific rows)
  • ✅ O(1) memory for supported operations

Full Workbook Mode

  • ⚠️ AutoFit iterates all cells in each column (O(rows × cols))
  • ⚠️ Clear operation filters merged ranges (O(merged ranges))

Both are acceptable for typical workbook sizes.


📋 Code Quality

Adherence to CLAUDE.md

  • ✅ Purity: All functions return IO or Either
  • ✅ Totality: No partial functions, all errors handled
  • ✅ Strong typing: Proper use of Column, Row, SheetName opaque types
  • ✅ Error handling: Consistent use of IO.fromEither and IO.raiseError
  • ✅ Code style: Passes checkFormat

Test Quality

  • ✅ Positive and negative test cases
  • ✅ Edge cases covered (duplicate names, missing sheets, invalid specs)
  • ✅ Round-trip verification (hide→show)
  • ✅ Metadata preservation verified

🚀 Recommendation

APPROVE with minor suggestions.

This PR is production-ready. The OOXML bug fix alone justifies merging. The 10 new batch operations are well-implemented and tested. Suggested improvements are minor polish items that can be addressed in future PRs if needed.

Post-Merge Checklist

  • Verify make install updates CLI with new operations
  • Manual test: hide column → save → show column → verify visible in Excel
  • Update STATUS.md if batch operations are mentioned

Reviewed according to XL project guidelines (CLAUDE.md)

@arcaputo3 arcaputo3 merged commit 2b9a213 into main Feb 10, 2026
2 checks passed
@arcaputo3 arcaputo3 deleted the gh88-batch-completeness branch February 10, 2026 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant