feat(db): per-query session variable scoping via SET LOCAL [2/2] #27
rlindgren wants to merge 6 commits into
Replace *pgxpool.Pool with a DBTX interface in all ~60 package-level db functions. DBTX is satisfied by both *pgxpool.Pool and pgx.Tx, following the sqlc convention. This enables future session variable scoping via SET LOCAL in transactions without changing callers. Zero behavior change: all Client methods continue to pass c.pool, which satisfies DBTX. No new dependencies, no test changes required.
Enable multi-tenant RLS from a shared connection pool by scoping
PostgreSQL session variables per-query using SET LOCAL inside
transactions. Compatible with all PG deployment topologies:
direct PG, PgBouncer (session + transaction mode), and RDS Proxy.
Key design:
- acquireDBTX(ctx) transparently wraps queries in BEGIN/SET LOCAL/COMMIT
when session vars are present; returns pool directly (zero overhead)
when they're not
- Baseline vars from config/CLI merged with per-request context vars
- Import methods inject SET LOCAL into their existing transactions
- context.Background() used for commit/rollback to prevent silent
write loss on request context cancellation
API:
- Library: db.WithSessionVars(ctx, db.SessionVars{"app.user_id": id})
- CLI: tigerfs mount --session-var app.user_id=42
- Config: session_variables: {app.user_id: "42"}
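For the CLI form, repeated `--session-var key=value` arguments have to become a name-to-value map at some point. The helper below is a hypothetical sketch of that step, not the PR's parser. It assumes the split happens at the first `=` so that values may themselves contain `=`, and it leaves GUC name validation to PostgreSQL.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSessionVars turns raw "key=value" flag values into a map.
// Hypothetical helper: the name and the error wording are illustrative only.
func parseSessionVars(flags []string) (map[string]string, error) {
	vars := make(map[string]string, len(flags))
	for _, f := range flags {
		// Cut at the first '=' only, so "app.filter=a=b" keeps "a=b" as its value.
		key, value, ok := strings.Cut(f, "=")
		if !ok || key == "" {
			return nil, fmt.Errorf("invalid session var %q: want key=value", f)
		}
		vars[key] = value
	}
	return vars, nil
}

func main() {
	vars, err := parseSessionVars([]string{"app.user_id=42", "app.filter=a=b"})
	fmt.Println(vars, err)
}
```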
Tested with -race: concurrent isolation (20 goroutines, shared pool,
RLS enforced), rollback on failure, invalid GUC recovery, SET LOCAL
non-leaking, context override, import operations.
SessionVars is now a struct with sorted keys computed once at construction via NewSessionVars(). Eliminates per-query sort allocation in applySessionVars — keys are iterated via Range() in pre-sorted order with zero allocation.
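One way the pre-sorted struct could look, as a minimal sketch. The names `SessionVars`, `NewSessionVars`, and `Range` come from the commit message, but the field layout and signatures shown here are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// SessionVars holds variables plus their keys in sorted order.
// Assumed layout: the PR only says keys are sorted once at construction.
type SessionVars struct {
	keys   []string
	values map[string]string
}

// NewSessionVars copies m and sorts the keys once, up front.
func NewSessionVars(m map[string]string) SessionVars {
	keys := make([]string, 0, len(m))
	values := make(map[string]string, len(m))
	for k, v := range m {
		keys = append(keys, k)
		values[k] = v
	}
	sort.Strings(keys)
	return SessionVars{keys: keys, values: values}
}

// Range calls fn for each variable in sorted key order and stops early if
// fn returns false. It walks the precomputed slice, so it does not sort or
// allocate per call.
func (s SessionVars) Range(fn func(key, value string) bool) {
	for _, k := range s.keys {
		if !fn(k, s.values[k]) {
			return
		}
	}
}

func main() {
	sv := NewSessionVars(map[string]string{"app.user_id": "42", "app.tenant": "t1"})
	sv.Range(func(k, v string) bool {
		fmt.Println(k, v)
		return true
	})
}
```

A deterministic order also keeps the emitted statements stable across calls, which makes query logs easier to compare.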
Add session_variables config field to spec.md Configuration section, --session-var flag to CLI Mount Options, and changelog entry under Unreleased.
Thank you for this PR, @rlindgren. One direction that I've been focusing tigerfs on is the ability to support fairly arbitrary file/operational "undo" via its history; see this new (mega) PR: #29. This relies pretty heavily on TimescaleDB to store history and logs; unfortunately, TimescaleDB does not support RLS. I'm therefore not sure this change is aligned with the direction I'm taking tigerfs. Can you say more about the use case?
Hey @mfreed, thanks for the response. #29 looks like a massive effort (I knew you must be cooking up something big!). It's a great direction for the project and is something that has been top-of-mind for me recently as well.
The biggest issue I face is multi-tenancy and ACL. My use case involves many concurrent per-user mounts. Users have owned folders and files and possess varying degrees of access (r/w) to shared tenant directories and folders based on their application-level permissions and roles.
The DBTX refactor was intentionally minimal and standalone, just type widening that I think is generally useful. That work, however, was meant to facilitate the session-vars implementation. I understand that TimescaleDB doesn't support RLS, but I still think the change has value beyond RLS. For instance, I am using session vars in triggers for audit logging, ACL checks, and computed column defaults. None of it depends on RLS specifically. Applying them via SET LOCAL lets PgBouncer or RDS Proxy maintain a reasonable connection pool, because the settings are not pinned to the connection itself.
Do you expect to run into other issues as a result of allowing them? I would think that session vars could be useful for recording user_id, for example, in the operation log even though the hypertable itself can't enforce RLS. Do you have a different approach in mind for supporting multi-tenancy or a similar use case?
I totally understand if this is outside the direction you want to take the project. Happy to maintain it on a fork if so. Just wanted to offer it upstream first since I think it composes well with your existing architecture without affecting the TimescaleDB path.
EDIT: I also noticed --user-id in #29 for the operation log. Session vars could complement that nicely. The same user identity that the Go code writes to the log could be visible to PostgreSQL via current_setting('app.user_id') (or 'tiger.user_id'?), making it available to triggers, defaults, and views without extra plumbing. One flag feeding both paths.
# Conflicts:
#   internal/tigerfs/db/query.go
# Conflicts:
#   internal/tigerfs/cmd/mount.go
#   internal/tigerfs/config/config.go
#   internal/tigerfs/db/query.go
Summary
Add per-query PostgreSQL session variable scoping using SET LOCAL inside transactions. Enables multi-tenant RLS from a shared connection pool, compatible with all PG deployment topologies: direct PG, PgBouncer (session + transaction mode), and RDS Proxy (no connection pinning).
Builds on #26 which introduced the DBTX interface.
Motivation
Currently, RLS session variables can only be set via the connection string (options=-c app.user_id=42), forcing one pool per distinct identity. With N users each holding pool_size connections, the total grows as N × pool_size and quickly reaches the hundreds. PgBouncer and RDS Proxy can't help because session state is set at connection time.
Design
- acquireDBTX(ctx) transparently wraps queries in BEGIN; SET LOCAL; query; COMMIT when session vars are present, and returns the pool directly (zero overhead) when they're not
- SessionVars struct with pre-sorted keys: sort once at construction, zero allocation at query time
- context.Background() for commit/rollback to prevent silent write loss on request context cancellation
API