Skip to content

Conversation

@pyramation
Copy link
Contributor

Fix concurrent role creation with exception handling only (no advisory locks)

Summary

This PR addresses the intermittent CI error ERROR: duplicate key value violates unique constraint "pg_authid_rolname_index" by catching both duplicate_object AND unique_violation exceptions in all role creation code paths.

Key difference from PR #325: This PR does NOT use advisory locks - it relies solely on exception handling to deal with concurrent role creation attempts.

Root Cause: Under concurrent execution, CREATE ROLE can hit the pg_authid_rolname_index unique constraint before PostgreSQL's duplicate object detection runs, resulting in unique_violation (error code 23505) instead of duplicate_object (error code 42710).

This Solution: Updated exception handlers to catch both error codes across all 4 role creation code paths:

  • bootstrap-roles.sql (anonymous, authenticated, administrator)
  • bootstrap-test-roles.sql (app_user, app_admin)
  • LaunchQLInit.bootstrapDbRoles() (custom users)
  • DbAdmin.createUserRole() (pgsql-test users)

Review & Testing Checklist for Human

  • Compare with PR Fix concurrent role creation race condition #325: Review the advisory lock approach in PR Fix concurrent role creation race condition #325 and decide which solution is preferable (locks provide stronger guarantees but add overhead; exception-only is simpler but may have more retry attempts under high concurrency)
  • Test under actual concurrent load: Run multiple parallel processes that try to create the same role simultaneously (e.g., lql admin-users add --test --yes in 5+ parallel shells) to verify this approach handles the race condition
  • Verify exception scope: Confirm that catching unique_violation here doesn't accidentally mask other unrelated unique constraint violations
  • Wait for PR Envs #3: The third PR will include a test suite that reproduces the concurrent error on main branch, which can be used to validate both this PR and PR Fix concurrent role creation race condition #325

Recommended Test Plan

  1. Run existing CI tests to ensure no regressions
  2. Manually test concurrent role creation with multiple parallel processes
  3. Compare error rates and performance between this PR and PR Fix concurrent role creation race condition #325 under load
  4. Use the test suite from PR Envs #3 (when available) to validate the fix

Notes

  • This is PR readme #2 of 3 PRs created for comparison purposes
  • PR Fix concurrent role creation race condition #325 uses advisory locks + exception handling (more robust)
  • This PR uses exception handling only (simpler, potentially less reliable under extreme concurrency)
  • PR Envs #3 will provide a test suite to reproduce the error and benchmark both approaches
  • The exception handling approach may result in more failed CREATE ROLE attempts under high concurrency (which get caught and ignored), whereas advisory locks prevent the attempts entirely
  • Compatible with PostgreSQL 13.3+ (the version used in CI)

Link to Devin run: https://app.devin.ai/sessions/2c33cab4511f4a7897e29001895487df
Requested by: Dan Lynch (@pyramation)

- Catch both duplicate_object and unique_violation exceptions
- No advisory locks - relies on exception handling alone
- Applied to all role creation code paths:
  - bootstrap-roles.sql (anonymous, authenticated, administrator)
  - bootstrap-test-roles.sql (app_user, app_admin)
  - LaunchQLInit.bootstrapDbRoles()
  - DbAdmin.createUserRole()

This approach handles the unique_violation error that occurs under
concurrent CREATE ROLE execution without using advisory locks. This
allows comparison of exception-only vs lock-based approaches.

Co-Authored-By: Dan Lynch <[email protected]>
@devin-ai-integration
Copy link
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

…locks)

- Add IF NOT EXISTS checks before every CREATE ROLE statement
- Add membership pre-checks before every GRANT statement
- NO advisory locks (exceptions-only variant)
- Keep both exception handlers (duplicate_object OR unique_violation)
- Defense-in-depth: pre-check + exceptions (no locks)

This pedantic approach ensures safety against concurrent role creation
and membership grant races without using advisory locks.

Co-Authored-By: Dan Lynch <[email protected]>
devin-ai-integration bot added a commit that referenced this pull request Nov 18, 2025
This test demonstrates the race condition that occurs when multiple
processes attempt to create the same PostgreSQL role simultaneously.

On the main branch, this test FAILS with unique_violation error when
multiple concurrent CREATE ROLE statements hit the pg_authid_rolname_index.

The test includes:
- Concurrent role creation test (reproduces the bug)
- Performance benchmarks at concurrency levels 2, 4, 8
- Concurrent GRANT operations test
- High concurrency stress test with retries

This PR intentionally fails on main to demonstrate the problem.
Compare against PR #325 (with locks) and PR #326 (no locks) to see fixes.

Co-Authored-By: Dan Lynch <[email protected]>
@devin-ai-integration
Copy link
Contributor

Latest Updates

Added comprehensive pre-existence checks to all CREATE ROLE and GRANT statements for defense-in-depth:

  • ✅ Pre-checks using IF NOT EXISTS and membership queries
  • ✅ NO advisory locks (exceptions-only variant)
  • ✅ Both exception handlers (duplicate_object OR unique_violation)

Related PRs

Testing

The concurrent test in PR #327 should PASS with this fix applied.


Link to Devin run: https://app.devin.ai/sessions/2c33cab4511f4a7897e29001895487df
Requested by: Dan Lynch ([email protected]) / @pyramation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants