Skip to content

Conversation

egormanga
Copy link

@egormanga egormanga commented Jun 23, 2025

Description of Changes

Previously, only INNER and CROSS joins were possible.
This adds support for LEFT OUTER joins with certain limitations (basically, same as for INNER).

Doesn't play so well with RLS currently, though (I assume it's something to do with query optimization for similar joins, or that RLS discards already right-matched rows without substituting them with null-matched instead).

API and ABI breaking changes

Non-present column values come in as () units (for null) instead of the expected type, so it's like everything suddenly became Optional. This might require some additional handling on the clients.

Probably won't work with the PG wire protocol for the same reason, although the fix for that will be server-side, as the protocol itself is fine with nulls.

Expected complexity level and risk

3.

Testing

  • Manual testing
  • Needs unit tests

@bfops
Copy link
Collaborator

bfops commented Jun 24, 2025

Thank you for opening this! Some initial questions while we work on getting a reviewer for this:

  1. Is this blocking a particular use case for you?
  2. When you say "Doesn't play so well with RLS", what exactly do you mean? What incorrect behavior do you see?
  3. What kind of manual testing did you perform? Are updates delivered incrementally to clients?

@egormanga
Copy link
Author

egormanga commented Jun 24, 2025

@bfops Thanks for the quick response! Firstly, I'd like to use this opportunity to ask for you to notice my other PRs as well, if possible. Most of them are there for a solid reason, too.

  1. Yes, this is a major blocker for multiple of my cases, because a 3-way joining query with a 2-3-way joining RLS for each of the joined tables causes major slowdown due to the necessity of using cross product joins to emulate outer joins.
  2. First of all, I only tested it with my earlier RLS patch applied so far, due to the complexity of the cases required to properly test it (I didn't have time to write synthetic tests for it).
    With this in mind: RLS seems to discard extra lhs rows that matched the non-visible rhs rows, not giving them a chance to not match rhs (i.e. match against null rows) instead. This results in less rows overall being reported than without the patch (albeit in my particular current case I may trade off correctness for multiple magnitudes' speed increase).
  3. I only tested plain queries for correctness (e.g. seeing partially matched rows with outer vs omission of such rows with inner) and heavy complex RLS-enabled queries for drastic speed improvement and lack of duplicate rows (those were from outer join emulation via cross joins, for obvious reasons).

There's also currently a slight issue with the resulting format, such that the official CLI tool fails to parse nulls (() units) in place of another expected column type (see below). API itself works fine, though. I have not tested this with client SDKs at all, because I'm not using those in the current project at the moment.

@bfops
Copy link
Collaborator

bfops commented Jun 25, 2025

Thank you for all the helpful info! We're still waiting on some reviewers for this part of the codebase to become available - right now they're all hands on deck with supporting the BitCraft launch, but all of your PRs are on our radar! I think they will probably have more thoughts about making sure we get corresponding support added in the other parts of the codebase (e.g. CLI, client SDKs)

@egormanga
Copy link
Author

Update: the CLI seems to handle nulls just fine now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants