Description
Context
We have a multi-tenant SaaS setup where each tenant has a database in a managed database cluster (Scaleway/DigitalOcean/...). Currently we are using a single cluster. The goal for us for using pgcat is that we can limit the amount of connections to our database cluster as the cluster has a limited amount of connections available.
Problem
As we have one cluster we essentially only need one pgcat pool, the only issue I'm bouncing against is that it is impossible for us to allow multiple different databases inside a single pool. And if I were to create separate pools for each tenant, then the implementation with pgcat has no benefit as we can't limit the connections as a while towards the database cluster. A simple example:
- Max database cluster connections: 100
- Tenant pools: 200
- Pgcat connections to our cluster: 200
Potential solution
It would be awesome if we could allow a regex pattern for which databases should connect to a certain pool, in that way we could say that tenant-1
until tenant-50
should connect to pool 1 and tenant-51
until tenant-100
should connect to pool 2.
[general]
# ...
[pools]
[pools.postgres]
# ...
[pools.postgres.users.0]
# ...
[pools.postgres.shards.0]
servers = [["...", 5432, "primary"]]
database = "^tenant-(?:[1-9]|[1-4][0-9]|50)$"
Or is all this unneeded and am I missing something?
Activity
nealoke commentedon Sep 8, 2024
I just noticed that we can have a shard per tenant and that would fix the issue where we can't limit the connections overall but these are not possible to be added through the pgcat admin api which would mean that we need to manually add them to the
pgcat.toml
file and restart the service everytime a tenant gets created which seems like a hassle and will cause interuptions in service of tenants using the application.magec commentedon Sep 12, 2024
Hey, I just read the issue. I see you've already kind of found the answer. Although both things seem 'doable' there is no way of having 'dynamic db names' on pools neither adding new pool config through admin api (which is mostly for statistics and a few commands). Either way, you can leverage pgcat live configuration reloading functionality to achive what you want. The only piece left is a simple service that handles the config changes, rewrite the config using some kind of templating system and signal pgcat to reload the configuration, that would be the way I'd do it. Also, bear in mind that you have to execute SET SHARD TO 'SHARD_NUMBER'; before each query/transaction for this to work.
nealoke commentedon Sep 12, 2024
Thanks for the reply, we are working with Prisma so we can't really alter the SQL query in there so I'm affraid this won't be an option. If you have any other ideas that would be great, or other tools that could work as I'm not finding anything 🥲
magec commentedon Sep 16, 2024
There is also a possibility, If you can 'annotate' with comments your query, pgcat can extract that information and do the sharding selection.
rvsoni commentedon Nov 17, 2024
@nealoke
You can try the #855 fix
this would allow having a PGBouncer-type proxy behavior, and only on pool config would it allow connecting many databases; if the database is not pooled, it would be created dynamically on the fly and start serving.
Vad1mo commentedon Jan 14, 2025
@nealoke @rvsoni, we have the same challenges, multiple tenants each has own DB with username/pwd, and multiple tenants share a DB cluster. We are currently looking to make it work with PgCat, PgBouncer or Supavisor. (in that order)
On top there are a few less known pooler (at least to me) odyssey
pgagroal
cf-pgbouncer
@rvsoni, why did you try to make it work with pgcat, hence your pr?
It seems there are quite a few options out there already.
rvsoni commentedon Jan 16, 2025
Hello @Vad1mo
This fix works as you described: set one Pool on pgcat with Auth_quey config and make proxy = true (in my PR). This would dynamically create a pool of connections for each user/db on the same cluster.
From the client side, you need to maintain just only one pool to pgcat and keep login using your user/pass map to database (user name and database name must need to be same)
Ravi
nealoke commentedon Jan 24, 2025
@Vad1mo I'm not that familiar with all that is related to revers proxies and half of what the pgcat tries to do 😅. So I read your comment and also looked at the messages in the #855 and the PR but can't seem to understand how this would solve my issue or how I would even need to implement this.
I don't directly see the link to how I can map
tenant-1
totenant-50
toshard 0
for example.rvsoni commentedon Jan 25, 2025
Hello @Vad1mo, @nealoke
The most important point this discussion misses is the need for an authentication configuration on PgCat based on an
auth_query
.You need to set a Pool on pgcat points to a Postgre cluster with auth method of
auth_query
;on each new authentication, pgcat query postgres cluster
pg_shadow
table likeauth_query="SELECT usename, passwd FROM pg_shadow WHERE usename='$1'
.This query returns a password from the PostgreSQL cluster for each tenant user you try to authenticate over the pgcat pool setting. This allows you to connect to many databases on one pgcat pool setting.
Refer https://github.com/rvsoni/pgcat/blob/main/pgcat.proxy.toml and https://hub.docker.com/r/rvsoni/pgcat docker image to test this dynamic connection pool creation for all tenant database,
szguoxz commentedon May 5, 2025
So, you PR is not merged yet?
I thought pgcat is a drop-in replacement of pgbouncer. I am really suprised to find out it does not have this feature.
But I don't quite like your design, it's kindof hacky, I still like pgbouncer implementation. At least, the enduser does not need to mess around with username/databasename. What do you think?
Maybe also implement a default pool which will accept all different database names?
Calvin
rvsoni commentedon May 19, 2025
@Vad1mo
I implemented this fix in PgCat to handle opening many connections on the client application.
I am working on a Java application and using PGBouncer with an
Auth_query
based setup would open too many connections on the Java application too,for example: 100 Tenant database, 10 Pool for tech tenant scenario,
The Java application connection pool would be set to 1000 connections for all 100 Active tenants. In my case, I had to support 30k to 50k active tenants. PGBouncer would also not be able to work on my case,
With PgCat, I get a single proxy database (same name as PgCat Pool name), so my Java app would need to manage the pool only with the proxy DB of the PgCat pool. Behind PgPool, I can open and manage as many databases as I require per tenant database.
PgCat uses bb8 rust lib to create a connection pool internally, https://docs.rs/bb8/latest/bb8/index.html
This fix helps me manage just a few connections on a Java application. Using the PgCat pool, I can switch to any tenant database connection on the fly. I can also use the same PgCat pool against many PostgreSQL Cluster setups, reducing my resource requirements.
With
auth_query
mode, PGBouncer would connect to one PostgreSQL cluster only,Having 10 PostgreSQL clusters set up would need 10 PGBouncer,
rvsoni commentedon May 19, 2025
@szguoxz
PGBouncer and a transparent Proxy setup, all tenant databases are directly visible to the client,
PgCat uses a proxy database (same as pool Name) and hides all the tenant databases behind it so that you can do a connection routine with pgcat for each resource requirement on the client side for opening and managing connections.