feat: add APPLY operator representation #363

Closed
ashvina wants to merge 3 commits

@ashvina ashvina commented Oct 21, 2022

This PR adds a new relation type for the APPLY operation. APPLY is similar to the
JOIN operator since it joins two table sources. However, in APPLY, the table source
on the right depends on the result of computing the table source on the left.
Since the right source cannot be computed without the left, the right source is
represented by an Expression in Substrait and not a relation. The right source
will most frequently be a SubQuery.
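
To make the dependency of the right source on the left concrete, here is a minimal Python sketch of APPLY semantics. It is not Substrait code: the function names `cross_apply` and `outer_apply`, the sample tables, and the CROSS/OUTER split (borrowed from SQL Server's `CROSS APPLY` and `OUTER APPLY`) are assumptions made for illustration. The right side is modeled as a callable that takes a left row, which mirrors why it cannot be a standalone relation.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
RightSide = Callable[[Row], Iterable[Row]]  # hypothetical: right source as a function of the left row


def cross_apply(left: Iterable[Row], right: RightSide) -> Iterator[Row]:
    """Evaluate `right` once per left row; left rows with no right rows are dropped."""
    for l_row in left:
        for r_row in right(l_row):
            yield {**l_row, **r_row}


def outer_apply(left: Iterable[Row], right: RightSide, right_columns: List[str]) -> Iterator[Row]:
    """Like cross_apply, but keep unmatched left rows, padding right columns with None."""
    for l_row in left:
        matched = False
        for r_row in right(l_row):
            matched = True
            yield {**l_row, **r_row}
        if not matched:
            yield {**l_row, **{col: None for col in right_columns}}


# Illustrative data only.
customers = [{"cust_id": 1}, {"cust_id": 2}]
orders = [{"order_cust": 1, "amount": 10}, {"order_cust": 1, "amount": 25}]


def orders_for(row: Row) -> List[Row]:
    # Correlated right side: it reads `cust_id` from the current left row.
    return [o for o in orders if o["order_cust"] == row["cust_id"]]


cross_rows = list(cross_apply(customers, orders_for))
outer_rows = list(outer_apply(customers, orders_for, ["order_cust", "amount"]))
```

In this sketch customer 2 has no orders, so it disappears from `cross_rows` but survives in `outer_rows` with the right-hand columns set to `None`.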

Closes #357

This PR adds a new relation type for the APPLY operation. APPLY is similar
to the JOIN operator since it joins two table sources. However, the table
source on the right depends on the computation of the table source on the left.
Since the right source cannot be computed without the left, it is
represented by an Expression and not a relation. In Substrait the right
source will most likely be a SubQuery.

@jacques-n
Contributor

@ashvina, thanks for proposing this. It seems like we need an additional subquery type to support this pattern, no?

@hannes, I believe your team worked extensively on these patterns. Do you have any thoughts on how we should express this?

@jacques-n
Contributor

Following up here @ashvina, did you see my question about the need for an additional subquery type?

@ashvina
Author

ashvina commented Nov 27, 2022

Hi @jacques-n. Sorry, I missed your question earlier.

I agree: to properly support APPLY, additional subquery types will be needed. I have added example queries in substrait-java/pull/106. In the examples, the correlated right subquery (in the FROM clause) produces a rel. A subquery_type for this rel is needed. I think the new type would be similar to subquery:in_predicate, as it returns a rel.

That said, I think the extension of the subquery expression is needed independently of this PR.

Assuming a new subquery type which produces a rel is defined, what do you think about the specification of APPLY?

Note that APPLY is generally used with table-valued functions (TVFs). IIUC, the TVF spec is missing. Also, it is tricky to get Java unit tests to work with TVFs, hence I have omitted TVF query examples. I believe the proposed APPLY spec should be able to support TVFs whenever their expression spec is added.
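
As a rough illustration of the TVF case mentioned above, the following Python sketch applies a table-valued function to each left row. The function `split_tags` and the `posts` data are hypothetical and stand in for whatever TVF an engine might expose. Nothing here reflects an actual Substrait TVF spec, which the comment notes is missing.

```python
from typing import Dict, List

Row = Dict[str, object]


def split_tags(row: Row) -> List[Row]:
    """Hypothetical TVF: emit one output row per comma-separated tag in the input row."""
    return [{"tag": t.strip()} for t in str(row["tags"]).split(",") if t.strip()]


# Illustrative data only.
posts = [
    {"post_id": 1, "tags": "sql, substrait"},
    {"post_id": 2, "tags": ""},  # the TVF returns no rows for this input
]

# APPLY with a TVF on the right: the function is evaluated once per left row,
# and each returned row is combined with that left row.
rows = [{**post, **tag_row} for post in posts for tag_row in split_tags(post)]
```

Post 2 contributes nothing to `rows` because the TVF returns an empty table for it, which is the cross-style behavior. An outer-style variant would keep the post with a null `tag`.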

ApplyType type = 4;

enum ApplyType {
Apply_TYPE_UNSPECIFIED = 0;

Why lower case Apply here and APPLY for the next two?

@westonpace
Member

This has been on my "to do" list for a while but I finally got around to looking at it because some behavior like cross apply is being discussed in Arrow (apache/arrow#14682).

@jacques-n I am curious if this really is a subquery. @ashvina I am also uncertain about expressing the right side as an "expression". That being said, I am very much a novice here :)

Correlated subqueries traditionally return a single value (maybe a single row if multiple aggregates) for each input row:

  • WHERE X IN ... (returns a single bool for each row)
  • WHERE X < ANY(...) (returns a single bool for each row)
  • SELECT X, (SUM(...) FROM ...) (returns a single int64 sum for each row)

However, the right side in a cross apply could return a complete table for each input row. Or it could (I think) return nothing at all.
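
The distinction drawn above can be sketched in Python as follows. The data and the helper names `scalar_sum` and `matching_rows` are made up for illustration. The point is only the shape of the result: a scalar correlated subquery adds one value per input row, while an apply-style right side may contribute zero, one, or many rows per input row.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]

# Illustrative data only.
left: List[Row] = [{"x": 1}, {"x": 2}, {"x": 3}]
right: List[Row] = [
    {"k": 1, "v": 10},
    {"k": 1, "v": 20},
    {"k": 1, "v": 30},
    {"k": 3, "v": 5},
]


def scalar_sum(row: Row) -> Optional[int]:
    """Scalar correlated subquery: exactly one value per left row (None when nothing matches)."""
    vals = [r["v"] for r in right if r["k"] == row["x"]]
    return sum(vals) if vals else None


def matching_rows(row: Row) -> List[Row]:
    """Apply-style right side: a whole (possibly empty) table per left row."""
    return [r for r in right if r["k"] == row["x"]]


# The scalar subquery behaves like a projection: the row count is preserved.
projected = [{**row, "total": scalar_sum(row)} for row in left]

# A cross apply behaves like a join: the row count depends on the right side.
applied = [{**row, **r} for row in left for r in matching_rows(row)]
```

Here `projected` always has as many rows as `left`, whereas `applied` has three rows for `x = 1`, none for `x = 2`, and one for `x = 3`.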

That being said, I may be significantly misunderstanding any of these points and would appreciate any clarification.

@westonpace
Member

@ashvina is this still in progress?

@gforsyth gforsyth added the awaiting-user-input This issue is waiting on further input from users label Jun 27, 2023
@jacques-n
Contributor

No progress has been made in more than six months. Closing without prejudice.

@jacques-n jacques-n closed this Aug 8, 2024
Successfully merging this pull request may close these issues.

Add support for SQL APPLY operation
6 participants