-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for SQL APPLY operation #357
Labels
enhancement
New feature or request
Comments
The difference between the join and APPLY operator becomes evident there is a table-valued expression on the right side and the table-valued expression is to be evaluated for each row from the left relation. |
ashvina
added a commit
to ashvina/substrait-java
that referenced
this issue
Nov 24, 2022
This change adds support for parsing of SQL queries with APPLY (join with correlated subquery), and to build OuterReferences map of correlated variables present in the query's join predicates. The OuterRefs will be used while constructing Substrait plans to bind correlated variables. The change also adds few example queries which depend on APPLY / LATERAL operators. This change still does not map calcite-correlated-join to Substrait, as the spec for APPLY is still not approved. As such, while the parsing of calcite query plans will succeed after this change, the unit tests and run time conversion will continue to fail in the final step of building the Substrait plan. Additional changes are needed to support APPLY. Refs #substrait-io/substrait/issues/357
ashvina
added a commit
to ashvina/substrait-java
that referenced
this issue
Nov 27, 2022
This change adds support for parsing of SQL queries with APPLY (join with correlated subquery), and to build OuterReferences map of correlated variables present in the query's join predicates. The OuterRefs will be used while constructing Substrait plans to bind correlated variables. The change also adds few example queries which depend on APPLY / LATERAL operators. This change still does not map calcite-correlated-join to Substrait, as the spec for APPLY is still not approved. As such, while the parsing of calcite query plans will succeed after this change, the unit tests and run time conversion will continue to fail in the final step of building the Substrait plan. Additional changes are needed to support APPLY. Refs #substrait-io/substrait/issues/357
jinfengni
pushed a commit
to substrait-io/substrait-java
that referenced
this issue
Dec 2, 2022
* feat: support parsing of SQL queries with APPLY This change adds support for parsing of SQL queries with APPLY (join with correlated subquery), and to build OuterReferences map of correlated variables present in the query's join predicates. The OuterRefs will be used while constructing Substrait plans to bind correlated variables. The change also adds few example queries which depend on APPLY / LATERAL operators. This change still does not map calcite-correlated-join to Substrait, as the spec for APPLY is still not approved. As such, while the parsing of calcite query plans will succeed after this change, the unit tests and run time conversion will continue to fail in the final step of building the Substrait plan. Additional changes are needed to support APPLY. Refs #substrait-io/substrait/issues/357 * fix: unit test cases to validate correlated vars This change addresses review comments, the unit tests validate the outer reference map built from calcite plans of APPLY queries. * fix: add test for nested APPLY This change addresses review comments. A new test case to validate nested APPLY join parsing is added. Also added validation of depth information in existing tests.
ajegou
pushed a commit
to ajegou/substrait-java
that referenced
this issue
Mar 29, 2024
* feat: support parsing of SQL queries with APPLY This change adds support for parsing of SQL queries with APPLY (join with correlated subquery), and to build OuterReferences map of correlated variables present in the query's join predicates. The OuterRefs will be used while constructing Substrait plans to bind correlated variables. The change also adds few example queries which depend on APPLY / LATERAL operators. This change still does not map calcite-correlated-join to Substrait, as the spec for APPLY is still not approved. As such, while the parsing of calcite query plans will succeed after this change, the unit tests and run time conversion will continue to fail in the final step of building the Substrait plan. Additional changes are needed to support APPLY. Refs #substrait-io/substrait/issues/357 * fix: unit test cases to validate correlated vars This change addresses review comments, the unit tests validate the outer reference map built from calcite plans of APPLY queries. * fix: add test for nested APPLY This change addresses review comments. A new test case to validate nested APPLY join parsing is added. Also added validation of depth information in existing tests.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Related Substrait issues #119 #206
APPLY
[1], introduced in SQL Server in 2005, is frequently used in practice. In Calcite the operation isLATERAL
[2]. This operator looks similar to aJOIN
, i.e. it executes an expression on the two input relations. However it is semantically different. In aJOIN
, both sets to be joined must be self-sufficient. Otherwise it may result in errors like "multi-part identifier could not be bound". If dependencies between sets exists, a user can use correlated-subqueries. However, nested subqueries result in recursive evaluation strategies which incur O(n2) runtime [3].Example [4]:
Intent: For each row from
table1
select firstrowcount
rows fromtable2
, ordered bytable2.id
Query using
APPLY
SELECT * FROM table1, CROSS APPLY ( SELECT TOP (table1.rowcount) * FROM table2 ORDER BY id ) t2
Queries without using
APPLY
As benchmarked in [4], execution of query with
APPLY
in SQL Server is 2 orders of magnitude faster than the query without apply. I.e.APPLY
yields efficiency and is not replaceable with traditionalJOIN
s.Currently APPLY operation cannot be represented in Substrait. Which means, optimizations in various engines cannot be leveraged if
APPLY
like operations are deconstructed and mapped into existing operations supported bySubstrait
. To efficiently support full SQL, operation forAPPLY
is needed in Substrait. De-correlation should not be enforced.CC: @jcamachor
[1] https://www.mssqltips.com/sqlservertip/1958/sql-server-cross-apply-and-outer-apply/
[2] CALCITE-1472
[3] https://15721.courses.cs.cmu.edu/spring2018/papers/16-optimizer2/hyperjoins-btw2017.pdf
[4] https://explainextended.com/2009/07/16/inner-join-vs-cross-apply/
The text was updated successfully, but these errors were encountered: