Contributor

@comphead comphead commented Aug 18, 2025

Which issue does this PR close?

Closes #2157.

Rationale for this change

What changes are included in this PR?

This update introduces support for nested ARRAY literals in the Comet query plan serialization logic. The changes include both code refactoring and enhanced test coverage.

Highlights

  • Nested ARRAY Literal Support:

The code now handles array literals containing other arrays (i.e., arrays of arrays), enabling use of multi-dimensional array literals in queries.

  • Refactored Serialization Logic:

Serialization of array literals (including primitive and nested arrays) is now handled by a new recursive helper function, makeListLiteral.
The logic for handling array types has been extracted from a large match-case block into this helper, improving maintainability and extensibility.

  • Improved Type Handling:

The logic now recognizes and correctly serializes arrays whose element type is itself an array.
Type checks and element conversions are clarified and improved.

Detailed Changes

Code (QueryPlanSerde.scala)

  • Added import java.lang for better handling of boxed primitives.
  • The code path for handling ArrayType now supports both primitives and nested arrays.
  • Introduced a private makeListLiteral(array, arrayType) method:
      • Recursively serializes nested arrays.
      • Handles nulls and all Spark SQL primitive types.
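
The recursive shape described above can be sketched in a few lines. This is a simplified illustration, not the code from this PR: the `DType` and `Lit` hierarchies are hypothetical stand-ins for Spark's `DataType` and Comet's protobuf literal messages, only two primitive types are modeled, and the real `makeListLiteral` signature and internals may differ.

```scala
object NestedArrayLiteralSketch {
  // Hypothetical, simplified type model (stand-in for Spark's DataType).
  sealed trait DType
  case object IntT extends DType
  case object StringT extends DType
  final case class ArrayT(element: DType) extends DType

  // Hypothetical serialized form (stand-in for the protobuf literal messages).
  sealed trait Lit
  case object NullLit extends Lit
  final case class IntLit(value: Int) extends Lit
  final case class StringLit(value: String) extends Lit
  final case class ListLit(values: Seq[Lit]) extends Lit

  // Serializes one value; arrays are delegated to makeListLiteral.
  def makeLiteral(value: Any, dataType: DType): Lit = (value, dataType) match {
    case (null, _) => NullLit // nulls are allowed at every nesting level
    case (i: Int, IntT) => IntLit(i)
    case (s: String, StringT) => StringLit(s)
    case (xs: Seq[_], t: ArrayT) => makeListLiteral(xs, t)
    case (other, t) =>
      throw new IllegalArgumentException(s"Unsupported value $other for type $t")
  }

  // Recursion happens here: each element is serialized with the array's
  // element type, which may itself be an ArrayT.
  def makeListLiteral(array: Seq[Any], arrayType: ArrayT): ListLit =
    ListLit(array.map(makeLiteral(_, arrayType.element)))

  def main(args: Array[String]): Unit = {
    // Same shape as array(array(1, 2, null), array(), array(10), null).
    val nested: Seq[Any] =
      Seq(Seq[Any](1, 2, null), Seq.empty[Any], Seq[Any](10), null)
    println(makeListLiteral(nested, ArrayT(ArrayT(IntT))))
  }
}
```

Because the element type drives the recursion, deeper nesting needs no additional cases in this sketch; only new primitive types would.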

@codecov-commenter

codecov-commenter commented Sep 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.53%. Comparing base (f09f8af) to head (19226b1).
⚠️ Report is 516 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2181      +/-   ##
============================================
+ Coverage     56.12%   57.53%   +1.40%     
- Complexity      976     1295     +319     
============================================
  Files           119      147      +28     
  Lines         11743    13477    +1734     
  Branches       2251     2354     +103     
============================================
+ Hits           6591     7754    +1163     
- Misses         4012     4457     +445     
- Partials       1140     1266     +126     


@comphead comphead marked this pull request as ready for review September 2, 2025 22:03
// can be tracked https://github.com/apache/datafusion-comet/issues/1937
// now supports only Array of primitive
(Seq(CometConf.SCAN_NATIVE_ICEBERG_COMPAT, CometConf.SCAN_NATIVE_DATAFUSION)
.contains(CometConf.COMET_NATIVE_SCAN_IMPL.get()) && dataType
Member

CometConf.COMET_NATIVE_SCAN_IMPL.get() returns auto by default, so this logic would not get triggered in the default case?

Contributor Author

That's a good point. For auto, is there a way to figure out the current scan impl?

Contributor Author

I think there is no way to know it at runtime, so I created PR #2295.
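
The concern in this thread can be reproduced in isolation. The sketch below is an assumption-laden illustration, not Comet code: the three string values are assumed to match the `CometConf` constants, and `guardTriggers` is a hypothetical helper that mirrors only the `Seq(...).contains(...)` part of the condition under review.

```scala
object ScanImplCheckSketch {
  // Assumed values for the CometConf scan implementation constants.
  val ScanAuto = "auto"
  val ScanNativeIcebergCompat = "native_iceberg_compat"
  val ScanNativeDataFusion = "native_datafusion"

  // Hypothetical helper mirroring the membership test in the reviewed guard:
  // true only when the configured value explicitly names a native scan.
  def guardTriggers(configuredScanImpl: String): Boolean =
    Seq(ScanNativeIcebergCompat, ScanNativeDataFusion).contains(configuredScanImpl)

  def main(args: Array[String]): Unit = {
    // With the default "auto" setting the membership test is false, even if
    // a native scan would end up being selected later.
    println(s"auto -> ${guardTriggers(ScanAuto)}")
    println(s"native_datafusion -> ${guardTriggers(ScanNativeDataFusion)}")
  }
}
```

A check written against the configured value sees only what the user set, not which scan implementation is eventually chosen, which is why the default case falls through here.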

"""
|select 1 a
|""".stripMargin,
"select array(array(1, 2, null), array(), array(10), null) from tbl")
Member

👍

@comphead
Contributor Author

comphead commented Sep 5, 2025

Test failed because of #2321

@comphead
Contributor Author

comphead commented Sep 6, 2025

Depends on #2286

@comphead
Contributor Author

@andygrove PTAL again. All tests passed, including #2321.

@comphead comphead requested a review from andygrove September 18, 2025 00:49
Member

@andygrove andygrove left a comment

LGTM. Thanks @comphead

@andygrove andygrove merged commit 12f0fdc into apache:main Sep 18, 2025
94 checks passed
Development

Successfully merging this pull request may close these issues.

feat: Support nested Array literals
3 participants