Open
Description
Problem
Currently, DataFusion's struct casting includes a positional fallback that allows casting between structs with completely different field names as long as the field counts match. This violates DuckDB's semantics and can silently corrupt data.
Example of problematic behavior:
// This currently succeeds but should fail
source: struct<left int, right varchar>
target: struct<alpha int, beta varchar>
// Positional fallback causes:
// left → alpha, right → beta
// Despite zero matching field names!
Current Behavior
The implementation in validate_struct_compatibility() currently allows casting when:
- No field name overlap exists AND
- Field counts match
The cast is correctly rejected only when there is no name overlap and the field counts differ, with the error:
Cannot cast struct with X fields to Y fields without name overlap; positional mapping is ambiguous
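The behavior described above can be modelled in a few lines of Rust. This is only a simplified sketch working on plain field-name slices, not the actual body of validate_struct_compatibility(); the `Mapping` enum, the `current_rule` function, and the exact-string name comparison are all assumptions made for illustration, and field types are ignored.

```rust
/// Hypothetical outcome of the compatibility check (illustration only).
#[derive(Debug, PartialEq)]
enum Mapping {
    ByName,
    Positional,
}

/// Hypothetical model of the current rule as described in this issue.
/// Only field names and counts are considered; field types are ignored.
fn current_rule(source: &[&str], target: &[&str]) -> Result<Mapping, String> {
    // Assumption: names are compared as exact strings.
    let has_overlap = target.iter().any(|name| source.contains(name));

    if has_overlap {
        // At least one shared name: fields are matched by name.
        Ok(Mapping::ByName)
    } else if source.len() == target.len() {
        // No shared names but equal counts: the positional fallback applies.
        Ok(Mapping::Positional)
    } else {
        // No shared names and different counts: the only case rejected today.
        Err(format!(
            "Cannot cast struct with {} fields to {} fields without name overlap; positional mapping is ambiguous",
            source.len(),
            target.len()
        ))
    }
}
```

Under this model, `current_rule(&["left", "right"], &["alpha", "beta"])` yields `Ok(Mapping::Positional)`, which is exactly the case this issue argues should be an error.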
Proposed Solution
Align with DuckDB's semantics by requiring at least one matching field name for any struct cast to succeed:
- Require name-based matching — at least one field name must match between source and target
- Remove positional fallback — even when field counts are equal
- Clear error message when no field names match
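The proposed rule could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a patch against DataFusion: the function name `require_name_overlap`, the use of plain name slices instead of Arrow fields, the exact-string comparison, and the error wording are all hypothetical.

```rust
/// Hypothetical check for the proposed rule (not DataFusion's actual API).
/// Succeeds only if at least one field name appears in both structs.
fn require_name_overlap(source: &[&str], target: &[&str]) -> Result<(), String> {
    // Assumption: names are compared as exact strings.
    if target.iter().any(|name| source.contains(name)) {
        Ok(())
    } else {
        // No positional fallback, regardless of whether the field counts match.
        // The message text below is illustrative only.
        Err(format!(
            "Cannot cast struct with fields [{}] to struct with fields [{}]: no field names match",
            source.join(", "),
            target.join(", ")
        ))
    }
}
```

With this check, casting `struct<left int, right varchar>` to `struct<alpha int, beta varchar>` is rejected even though both structs have two fields, while a cast between structs sharing at least one field name still proceeds to name-based matching.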
Impact
- Breaking change — Any code relying on positional casting with no name overlap will fail
- Safety improvement — Prevents silent data corruption from accidental field misalignment
- Consistency — Aligns with DuckDB's well-designed struct casting semantics and the principle of "at least one field name must match"