Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shard count affects MERGE ... WHEN NOT MATCHED BY SOURCE #7890

Open
naisila opened this issue Feb 6, 2025 · 0 comments · May be fixed by #7900
Open

Shard count affects MERGE ... WHEN NOT MATCHED BY SOURCE #7890

naisila opened this issue Feb 6, 2025 · 0 comments · May be fixed by #7900

Comments

@naisila
Copy link
Member

naisila commented Feb 6, 2025

Steps to reproduce:

set citus.shard_count = 4;

CREATE TABLE citus_distributed_target (tid integer, balance float, val text);
CREATE TABLE citus_reference_source (sid integer, delta float);
SELECT create_distributed_table('citus_distributed_target', 'tid');
SELECT create_reference_table('citus_reference_source');
INSERT INTO citus_distributed_target SELECT id, id * 100, 'initial' FROM generate_series(1,5,2) AS id;
INSERT INTO citus_reference_source SELECT id, id * 10  FROM generate_series(1,4) AS id;

-- this is the correct behaviour
BEGIN;
MERGE INTO citus_distributed_target t
  USING citus_reference_source s
  ON t.tid = s.sid
  WHEN MATCHED THEN
    UPDATE SET balance = balance + delta, val = val || ' updated by merge'
  WHEN NOT MATCHED THEN
    INSERT VALUES (sid, delta, 'inserted by merge')
  WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET val = val || ' not matched by source';

-- MERGE 5, meaning merge operated on 5 rows

select * from citus_distributed_target order by tid;

 tid | balance |              val
-----+---------+-------------------------------
   1 |     110 | initial updated by merge
   2 |      20 | inserted by merge
   3 |     330 | initial updated by merge
   4 |      40 | inserted by merge
   5 |     500 | initial not matched by source
(5 rows)

rollback;

-- increase shard count
SELECT alter_distributed_table('citus_distributed_target', shard_count:=8);

-- suddenly, WHEN NOT MATCHED BY SOURCE  is ineffective
BEGIN;
MERGE INTO citus_distributed_target t
  USING citus_reference_source s
  ON t.tid = s.sid
  WHEN MATCHED THEN
    UPDATE SET balance = balance + delta, val = val || ' updated by merge'
  WHEN NOT MATCHED THEN
    INSERT VALUES (sid, delta, 'inserted by merge')
  WHEN NOT MATCHED BY SOURCE THEN
    UPDATE SET val = val || ' not matched by source';

-- MERGE 4, meaning merge operated on only 4 rows
-- last entry tid = 5 should have been updated
select * from citus_distributed_target order by tid;
 tid | balance |           val
-----+---------+--------------------------
   1 |     110 | initial updated by merge
   2 |      20 | inserted by merge
   3 |     330 | initial updated by merge
   4 |      40 | inserted by merge
   5 |     500 | initial
(5 rows)
rollback;

It seems that the issue happens when we use a reference table or a Citus local table as a source, not a distributed table.
Also, it seems that the issue doesn't reproduce when there is enough data to populate all the shards of the table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants