[SPARK-53032] Fix parquet format of shredded timestamp values within arrays
### What changes were proposed in this pull request?
This PR extends the [previous PR](#51609), which did not account for the fact that timestamps within Variant arrays can also be shredded. It ensures that these timestamps are stored in compliance with the shredding spec.
### Why are the changes needed?
Variants representing arrays of timestamps can be shredded, and the written format must conform to the Parquet spec.
### Does this PR introduce _any_ user-facing change?
This PR must go in the same version as the [previous PR](#51609). The physical format of shredded timestamps within Parquet files will change.
### How was this patch tested?
Incorporated `array<timestamp>` into the previous unit test.
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #51734 from harshmotw-db/harsh-motwani_data/shredding_timestamp_fix.
Authored-by: Harsh Motwani <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetVariantShreddingSuite.scala
20 additions & 3 deletions
@@ -47,17 +47,21 @@ class ParquetVariantShreddingSuite extends QueryTest with ParquetTest with Share