feat(db): dedicated postgres schema + goose migrations (PR1/3)#64
feat(db): dedicated postgres schema + goose migrations (PR1/3)#64gavinelder wants to merge 3 commits into
Conversation
StaticReg's postgres tables previously landed in the public schema, and the schema was bootstrapped by re-executing an embedded SQL blob on every startup with no version tracking. That blocks enterprise deployments where the database is shared and schema evolution needs to be versioned and reversible. - Add STATICREG_DB_SCHEMA (default: staticreg), validated as a postgres identifier before interpolation. - Set search_path on every pooled connection so unqualified queries — including the existing webhook batch INSERT — resolve to the dedicated schema with no hot-path code change. - Replace the embed-and-Exec bootstrap with pressly/goose v3 migrations embedded via embed.FS. goose_db_version lives inside the dedicated schema. - Drop the post-DDL existence checks; goose tracks state authoritatively. - Document STATICREG_DB_SCHEMA in CONFIGURATION.md and surface it as an optional ConfigMap key in the deployment manifest. - Add integration tests (build tag: integration) that verify schema creation, search_path, migration version, idempotent re-runs, invalid-schema rejection, and a batch insert landing in the dedicated schema. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Introduces a dedicated PostgreSQL schema (configurable via STATICREG_DB_SCHEMA) and switches DB bootstrapping to embedded pressly/goose migrations, ensuring tables no longer land in public and schema state is tracked via migrations.
Changes:
- Add
STATICREG_DB_SCHEMAwith strict identifier validation and setsearch_pathon pooled connections. - Replace direct schema SQL execution with embedded goose migrations (
pkg/sql/migrations/*.sql). - Add integration tests verifying schema creation, migration application, and webhook batch inserts landing in the dedicated schema.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/db/postgres.go | Adds schema resolution/validation, sets search_path, and runs embedded goose migrations at startup. |
| pkg/sql/schema.go | Switches from embedded SQL string to embedded migrations filesystem (embed.FS). |
| pkg/sql/migrations/00001_create_container_pull_metrics.sql | Adds goose Up/Down migration for metrics table + indexes. |
| pkg/db/postgres_integration_test.go | Integration tests for dedicated schema creation, migrations, and invalid schema rejection. |
| pkg/webhook/batchserviceadapter_integration_test.go | Integration test asserting unqualified INSERTs resolve into the dedicated schema via search_path. |
| manifests/deployment.yml | Surfaces STATICREG_DB_SCHEMA as an optional ConfigMap-provided env var. |
| docs/CONFIGURATION.md | Documents STATICREG_DB_SCHEMA defaults and validation requirements. |
| go.mod / go.sum | Adds goose v3.24.3 and transitive dependencies. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -77,79 +115,58 @@ func InitPool() *pgxpool.Pool { | |||
| // Attempt to create the connection pool | |||
| pool, err := pgxpool.NewWithConfig(ctx, config) | |||
There was a problem hiding this comment.
InitPool uses a single 10s timeout context for pool creation (and later passes the same ctx into initSchema/goose.UpContext). If migrations take longer than the remaining time on that ctx, startup can fail due to deadline exceeded even when the DB is healthy. Consider creating a separate (and likely longer / configurable) context for initSchema after the pool is established.
There was a problem hiding this comment.
Fixed in 3eae76d — initSchema now runs under its own 1-minute context.WithTimeout from context.Background(). The 10s budget is scoped to pool construction + initial ping. Verified locally that schema bootstrap still completes well inside the new budget.
| t.Cleanup(func() { | ||
| ctx := context.Background() | ||
| _, _ = pool.Exec(ctx, "DROP SCHEMA IF EXISTS staticreg_test CASCADE") | ||
| pool.Close() | ||
| }) |
There was a problem hiding this comment.
The test closes pool explicitly later (pool.Close() before the second InitPool), but the t.Cleanup closure still tries to use that same pool to Exec the DROP SCHEMA. On a closed pool the DROP likely won’t run (error is ignored), leaving the schema behind between runs. Consider moving the first pool.Close() into cleanup (and using a different variable for the second pool), or have cleanup drop the schema using whichever pool is still open.
There was a problem hiding this comment.
Fixed in 3eae76d — the test no longer closes the first pool before creating the second. Both pools coexist (pgxpool tolerates multiple instances against the same DB/schema), and t.Cleanup runs the DROP via the still-open first pool. Verified end-to-end: \dn staticreg_test returns 0 rows after the test exits.
Two integration test files in this PR (pkg/db and pkg/webhook) duplicate the same STATICREG_DB_* env-var bootstrap. PR3 will add a third caller (pkg/server). Extract the shared setup into a dedicated package. - pkg/db/dbtest/dbtest.go (new, build tag integration): exports SetEnv(t, schema) which configures the docker-compose postgres defaults and a per-test schema name. Compiled only when the integration tag is active so it does not enter normal builds. - pkg/db/postgres_integration_test.go: drop the local setTestDBEnv helper, call dbtest.SetEnv instead. - pkg/webhook/batchserviceadapter_integration_test.go: replace the inline env map with dbtest.SetEnv. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address two review comments on PR #64: - InitPool reused a single 10s context for pool construction, ping, and goose migrations. A migration that runs late in that budget could fail with "context deadline exceeded" even on a healthy database. Use the 10s budget only for pool create + ping; give initSchema a separate 1m context. - TestInitPool_CreatesDedicatedSchemaAndAppliesMigrations closed the first pool before constructing the second, but t.Cleanup still used the (now closed) pool to DROP SCHEMA — silently failing and leaving staticreg_test behind between runs. Keep the first pool open through cleanup; pgxpool tolerates a second pool against the same database/schema for the idempotency check. Verified: after the test runs, "\dn staticreg_test" returns 0 rows. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
This is PR 1 of 3 in the postgres enterprise hardening plan (see internal plan doc). It addresses the two critical findings from the postgres audit: tables landing in the
publicschema and schema initialization with no migration system.STATICREG_DB_SCHEMA(default:staticreg), validated as a postgres identifier before interpolation into DDL.search_pathon every pooled connection so the existing webhook batch INSERT (and any future unqualified query) resolves to the dedicated schema with no hot-path code change.embed.FS.goose_db_versionlives inside the dedicated schema.STATICREG_DB_SCHEMAindocs/CONFIGURATION.mdand inmanifests/deployment.yml(optional ConfigMap key).Why goose v3.24 (not 3.27)
The latest goose tags require Go 1.25.7. Pinned to
v3.24.3to avoid forcing a Go toolchain bump and a transitive pgx upgrade in this PR.What's not in this PR (queued for PR 2/3)
sslmode=require+ warn ondisable).MaxConns,MinConns, lifetimes via env).Behavior change for operators
Existing deployments will get a new
staticregschema on next startup. Tables inpublic.container_pull_metricsare not auto-migrated — historical data inpublicis left in place. Operators with existing data need to either:STATICREG_DB_SCHEMA=publicto keep current behavior (not recommended long-term), orINSERT INTO staticreg.container_pull_metrics SELECT * FROM public.container_pull_metrics;after the new schema is created.Happy to add a one-shot data migration to PR 1 if reviewers want it; left out for now to keep this PR focused.
Test plan
go build ./... && go vet ./...cleango test ./...(unit) passesgo test -tags=integration ./pkg/db/... ./pkg/webhook/...passes againstdocker compose up postgrespsqlinspection:\dnshows the dedicated schema,\dt staticreg.*showscontainer_pull_metrics+goose_db_version, migration version1recordedstaticreg serveis a goose no-op (idempotent)Bad-Schema; DROP TABLE) is rejected before any SQL runsSTATICREG_DB_SCHEMAoverridden via ConfigMap🤖 Generated with Claude Code