Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery #7897

m3hm3t · 2025-02-12T14:05:20Z

This PR fixes an issue #7891 in the Citus planner where an UPDATE on a local table with a subquery referencing a reference table could produce a 0-task plan. Historically, the planner sometimes failed to detect that both the target and referenced tables were effectively “local,” assigning INVALID_SHARD_ID and yielding a no-op plan.

Root Cause

In the Citus router logic (PlanRouterQuery), we relied on shardId to determine whether a query should be routed to a single shard.
If shardId == INVALID_SHARD_ID, but we also had not marked the query as a “local table modification,” the code path would produce zero tasks.
Local + reference tables do not require multi-shard routing. Failing to detect this “purely local” scenario caused Citus to incorrectly route to zero tasks.

Changes

Enhanced Local Table Detection

Updated IsLocalTableModification and related checks to consider both local and reference tables as “local” for planning, preventing the 0-task scenario.
Expanded ContainsOnlyLocalOrReferenceTables to return true if there are no fully distributed tables in the query.

Added Regress Test

Introduced a new regress test (issue_7891.sql) which reproduces the scenario.
Verifies we get a valid single- or local-task plan rather than a 0-task plan.

codecov · 2025-02-12T14:21:26Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.42%. Comparing base (c55bc8c) to head (ce9915d).
Report is 20 commits behind head on release-13.0.

Additional details and impacted files

@@               Coverage Diff                @@
##           release-13.0    #7897      +/-   ##
================================================
- Coverage         89.48%   89.42%   -0.06%     
================================================
  Files               276      276              
  Lines             60063    59895     -168     
  Branches           7524     7510      -14     
================================================
- Hits              53747    53561     -186     
- Misses             4166     4188      +22     
+ Partials           2150     2146       -4

m3hm3t · 2025-02-13T08:07:15Z

src/backend/distributed/planner/merge_planner.c

@@ -1583,7 +1583,7 @@ IsLocalTableModification(Oid targetRelationId, Query *query, uint64 shardId,
 		return true;
 	}

-	if (shardId == INVALID_SHARD_ID && ContainsOnlyLocalTables(rteProperties))
+	if (shardId == INVALID_SHARD_ID && ContainsOnlyLocalOrReferenceTables(rteProperties))


Should I update the function name and description based on the adjustments here?

colm-mchugh · 2025-02-14T15:20:48Z

src/test/regress/sql/issue_7891.sql

+ WHERE EXISTS (
+   SELECT (SELECT c15 FROM t2_ref)
+   FROM t4_pg
+);


Can the test also show the query plan ? EXPLAIN (VERBOSE, COSTS OFF) output for this query, thanks.

I tried adding it, but it behaves differently across different PostgreSQL versions. In this case, I don't think it's worth maintaining separate test outputs. Do you have any suggestions on how to handle this?

https://github.com/citusdata/citus/actions/runs/13288741578

The only way to handle this is to use a helper function explain_with_pg17_initplan_format() (see the test columnar_chunk_filtering.sql for an example of its usage), but even with that I think this difference in EXPLAIN format requires more than one test output, and I agree its not worth maintaining. Query correctness is the main concern here. So let's drop this suggestion if it means 2 output files.

colm-mchugh · 2025-02-14T15:30:51Z

src/test/regress/sql/issue_7891.sql

+
+-- Show final data
+SELECT 't6_pg after' AS label, * FROM t6_pg;
+


Can we include a test that ensures updates on a reference table are handled correctly ? That is, all copies of the reference table are updated.

colm-mchugh · 2025-02-14T15:34:55Z

src/backend/distributed/planner/multi_router_planner.c

 {
-	return !rteProperties->hasDistributedTable && !rteProperties->hasReferenceTable;
+	/* If hasDistributedTable is false, then all tables are either local or reference. */
+	return !rteProperties->hasDistributedTable;


Does this enable a reference table to be updated ? e.g. update ref_table where <clause using local and ref tables> ?

UPDATE t2_ref SET c15 = '2099-01-01 00:00:00'::timestamp WHERE EXISTS ( SELECT 1 FROM t6_pg ); ERROR: relation t6_pg is not distributed

I executed the query above and encountered an error. Should we keep it? May they be duplicated or redundant?

We get the same error with main, so it is expected. I'm in favor of keeping, even if the error condition is tested elsewhere, as it serves to document the current behavior. Can we add a test query where t6_pg is replaced by a reference table ? Something like:

create table t3_ref(pkey int, c15 timestamp); select create_reference_table('t3_ref'); insert into t3_ref values (99, now()); UPDATE t2_ref SET c15 = '2088-08-08 00:00:00'::timestamp WHERE EXISTS ( SELECT 1 FROM t3_ref );

and verify that t2_ref is updated on all nodes ? I think it should be, because the modify planner will not be looking at an invalid shard-id, but just to cover all bases... Thanks!

@m3hm3t there was a typo in my previous comment - it should have read: verify that t2_ref is updated on all nodes.

I checked this case using your branch and it is okay; the reference table is updated correctly - the check I used is below, if you want to include it in the regress test.

This was the one remaining case I wanted to check, the PR now looks good to me overall.

create table t3_ref(pkey int, c15 text); select create_reference_table('t3_ref'); insert into t3_ref values (99, 'Initial Data'); UPDATE t2_ref SET c15 = '2088-08-08 00:00:00'::timestamp WHERE EXISTS ( SELECT 1 FROM t3_ref); SELECT 't2_ref after UPDATE' AS label, * FROM t2_ref; SELECT citus_remove_node('localhost', :worker_2_port); SELECT 't2_ref after UPDATE - without worker 2' AS label, * FROM t2_ref; SELECT 1 FROM citus_add_node('localhost', :worker_2_port); SELECT citus_remove_node('localhost', :worker_1_port); SELECT 't2_ref after UPDATE - without worker 1' AS label, * FROM t2_ref; SELECT 1 FROM citus_add_node('localhost', :worker_1_port);

colm-mchugh · 2025-02-17T10:10:42Z

src/test/regress/sql/issue_7891.sql

+--    - The outer subquery iterates over every row in t4_pg.
+--    - The inner subquery selects c15 from t2_ref.
+-- 5. The update should occur if the nested subquery returns any row, effectively updating t6_pg's vkey to 43.
+-- 6. The final state of t6_pg is displayed to confirm that the update was applied.


Is the issue present for DELETE and MERGE ? Can we include tests for those ? Thanks.

Considering the restrictions in the MERGE section, do you think keeping these tests might make them duplicated?

MERGE INTO t6_pg AS tgt USING t4_pg AS src ON (tgt.pkey = 14000) -- trivial condition to "match" row pkey=14000 WHEN MATCHED THEN UPDATE SET c26 = 'merged_' || (SELECT pkey FROM t2_ref WHERE pkey=24000 LIMIT 1) WHEN NOT MATCHED THEN INSERT (vkey, pkey, c26) VALUES (99, src.pkey, 'inserted_via_merge'); ERROR: non-IMMUTABLE functions are not yet supported in MERGE sql with distributed tables MERGE INTO t6_pg AS tgt USING t4_pg AS src ON (tgt.pkey = src.pkey) WHEN MATCHED THEN UPDATE SET c26 = 'merged_value' WHEN NOT MATCHED THEN INSERT (vkey, pkey, c26) VALUES (src.vkey, src.pkey, 'inserted_via_merge'); SELECT 't6_pg after MERGE' AS label, * FROM t6_pg; label | vkey | pkey | c26 --------------------------------------------------------------------- t6_pg after MERGE | 43 | 12000 | initial t6_pg after MERGE | 43 | 14000 | to_merge t6_pg after MERGE | 5 | 15000 | inserted_via_merge (3 rows)

Per previous comment, I'm in favor of keeping because it documents the current behavior. But I'll leave it to your call. I've a couple of test suggestions though, can we add:
MERGE into a reference table using a postgres table as source
MERGE into a reference table using another reference table as source

…ed tables

…for clarity and update related documentation

…f local and reference tables

…nt clarity

…dling of local and reference tables

…nting DELETE and MERGE operations to validate reference table handling

colm-mchugh

Lgtm, thanks for accommodating the test requests.

m3hm3t self-assigned this Feb 12, 2025

m3hm3t changed the title ~~Simplify ContainsOnlyLocalTables function to check only for distribut…~~ Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery Feb 12, 2025

m3hm3t added the bug label Feb 12, 2025

m3hm3t commented Feb 13, 2025

View reviewed changes

m3hm3t added the cherry-pick-13.0 label Feb 13, 2025

m3hm3t marked this pull request as ready for review February 13, 2025 08:16

m3hm3t requested review from colm-mchugh, naisila and onurctirtir February 13, 2025 08:16

colm-mchugh reviewed Feb 14, 2025

View reviewed changes

colm-mchugh reviewed Feb 17, 2025

View reviewed changes

m3hm3t added 10 commits February 19, 2025 13:52

Simplify ContainsOnlyLocalTables function to check only for distribut…

ad67932

…ed tables

Rename ContainsOnlyLocalTables to ContainsOnlyLocalOrReferenceTables …

ecb8992

…for clarity and update related documentation

Add regression tests for issue 7891 to validate reference table behavior

ab45fa2

Enhance regression test for issue 7891 to verify planner's handling o…

cc9ca4c

…f local and reference tables

Refactor ContainsOnlyLocalOrReferenceTables function to improve comme…

d113c75

…nt clarity

whitespace

247b469

normalize test

fa9db87

Update reference table test to correct vkey value in update statement

2220908

Enhance regression test for issue 7891 to clarify nested subquery han…

ddf8517

…dling of local and reference tables

Enhance test data for issue 7891 by adding sample entries and impleme…

ce9915d

…nting DELETE and MERGE operations to validate reference table handling

m3hm3t force-pushed the m3hm3t/issue_7891 branch from a843139 to ce9915d Compare February 19, 2025 13:54

m3hm3t requested a review from colm-mchugh February 19, 2025 13:54

colm-mchugh approved these changes Feb 20, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery #7897

Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery #7897

m3hm3t commented Feb 12, 2025 •

edited

Loading

codecov bot commented Feb 12, 2025 •

edited

Loading

m3hm3t Feb 13, 2025

colm-mchugh Feb 14, 2025

m3hm3t Feb 18, 2025

colm-mchugh Feb 18, 2025

colm-mchugh Feb 14, 2025

m3hm3t Feb 19, 2025

colm-mchugh Feb 14, 2025

m3hm3t Feb 19, 2025 •

edited

Loading

colm-mchugh Feb 19, 2025 •

edited

Loading

colm-mchugh Feb 20, 2025

colm-mchugh Feb 17, 2025

m3hm3t Feb 19, 2025

colm-mchugh Feb 19, 2025

colm-mchugh left a comment


		-- Show final data
		SELECT 't6_pg after' AS label, * FROM t6_pg;

Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery #7897

Are you sure you want to change the base?

Fix 0-Task Plans in Single-Shard Router When Updating a Local Table with Reference Table in Subquery #7897

Conversation

m3hm3t commented Feb 12, 2025 • edited Loading

Root Cause

Changes

codecov bot commented Feb 12, 2025 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

m3hm3t Feb 19, 2025 • edited Loading

Choose a reason for hiding this comment

colm-mchugh Feb 19, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

colm-mchugh left a comment

Choose a reason for hiding this comment

m3hm3t commented Feb 12, 2025 •

edited

Loading

codecov bot commented Feb 12, 2025 •

edited

Loading

m3hm3t Feb 19, 2025 •

edited

Loading

colm-mchugh Feb 19, 2025 •

edited

Loading