-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Allows table scans to be disabled, even for multi key lookups #9631
base: master
Are you sure you want to change the base?
fix: Allows table scans to be disabled, even for multi key lookups #9631
Conversation
@@ -210,6 +210,12 @@ private PullPhysicalPlanType getPlanType() { | |||
} else if (lookupConstraints.stream().anyMatch(lc -> lc instanceof NonKeyConstraint)) { | |||
lookupConstraints = Collections.emptyList(); | |||
return PullPhysicalPlanType.TABLE_SCAN; | |||
// For issue #7174. Temporarily enable table scans if there are more than one key or if that | |||
// key is a multi-column key. | |||
} else if (lookupConstraints.stream().anyMatch(lc -> ((KeyConstraint) lc).getKey().size() > 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is just checking whether any of the key constraints involve more than one column right now. Should we also check if the table has multiple key columns, or is that handled elsewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is the PR description still accurate? The description mentions the config ksql.query.pull.table.scan.enabled
but the actual implementation seems different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is just checking whether any of the key constraints involve more than one column right now. Should we also check if the table has multiple key columns, or is that handled elsewhere?
Extracting the key information has already happened at this point and should match the schema for the table. A KeyConstraint will only be created if the AST makes references to columns which fully resolve a key, otherwise we throw an error. So no addition error checking should be required at this point. It's already happened at the logical planner level.
Also, is the PR description still accurate? The description mentions the config ksql.query.pull.table.scan.enabled but the actual implementation seems different.
queryPlannerOptions.getTableScansEnabled()
is doing that config check in PullQueryConfigPlannerOptions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
queryPlannerOptions.getTableScansEnabled() is doing that config check in PullQueryConfigPlannerOptions
But the change in this PR means that pull queries involving multiple key constraints against a table with a single key column will never be executed as a table scan now, regardless of whether the config is set to true or false, right? The PR description makes it sound like these types of queries will only be executed as non-table-scans if the config is set to false.
And for pull queries with key constraints against tables with multiple key columns, these queries will fail if the config is not enabled. (Is that what's already happening today?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
Allows table scans to be disabled, even when there are multiple keys or multi-column keys. Table scans were introduced to fix a serde bug in https://github.com/confluentinc/ksql/pull/8876/files#diff-ebcd3b261b6ef2904811e7d63a7158f0c4273b0f814c81bbfb8c8fc139a31322R125 . This worked, but always stopped the key lookup optimization for multiple keys or multi-column keys.
Now, this PR gives the option to set
ksql.query.pull.table.scan.enabled=false
in a request to still get the optimization, if the serde bug isn't an issue. Ideally, we would fix the issue so that table scans weren't attempted in this scenario anyway, but that's for a followup PR.Testing done
Added unit tests, ran manually.
Reviewer checklist