Skip to content

feat(pam): add MongoDB support#162

Merged
saifsmailbox98 merged 31 commits into
mainfrom
saif/pam-146-support-mongodb-resources-in-pam-2
Apr 8, 2026
Merged

feat(pam): add MongoDB support#162
saifsmailbox98 merged 31 commits into
mainfrom
saif/pam-146-support-mongodb-resources-in-pam-2

Conversation

@saifsmailbox98
Copy link
Copy Markdown
Contributor

@saifsmailbox98 saifsmailbox98 commented Mar 27, 2026

Description 📣

Adds MongoDB support to PAM

Related Infisical/infisical#5817

Type ✨

  • Bug fix
  • New feature
  • Improvement
  • Breaking change
  • Documentation

Tests 🛠️

# Here's some code block to paste some code snippets

Implement MongoDB handler using the mongo.Client bridge approach (Approach B).
The proxy parses client OP_MSG commands, executes them via RunCommand on a
real mongo.Client connection, and encodes responses back as OP_MSG.

- Wire protocol helpers for reading/writing OP_MSG with Kind 0/1 sections
- mongo.Client setup with SetDirect(true), SetMaxPoolSize(1), SRV support
- Command bridge with full BSON session recording
- Auth interception (saslStart/saslContinue faked as success)
- Hello response sanitization (strips compression/auth fields)
- Connection string display for MongoDB resources
mongosh sends OP_QUERY for the initial isMaster/hello handshake even on
modern servers. Add OP_QUERY parsing and OP_REPLY response building so
the proxy can complete the handshake before switching to OP_MSG.
- Verify raw TCP connectivity before creating mongo.Client
- Reduce timeouts (5s connect, 10s server selection, 2s heartbeat)
- Add ServerMonitor to log actual heartbeat failures (auth/TLS errors)
The mongo.Client already sent its own hello with client metadata during
connection setup. Strip client, compression, saslSupportedMechs, and
speculativeAuthenticate from hello/isMaster commands before forwarding,
since MongoDB only allows these in the first hello on a connection.
- Strip lsid, $clusterTime, $readPreference, txnNumber, startTransaction,
  autocommit from all commands before RunCommand to prevent duplicate BSON
  field errors (fixes 'ping.lsid is a duplicate field')
- Consolidate all field stripping into executeAndLog
- Handle client EOF as graceful disconnect instead of error
- Use fresh context for client.Disconnect to ensure clean shutdown
- Suppress expected 'context canceled' heartbeat errors during shutdown
When the client sets moreToCome (bit 1), it will send more messages
before expecting a response. Execute the command for side-effects but
don't reply — responding to moreToCome messages injects unexpected bytes
into the TCP stream, corrupting subsequent commands.

Also add named constants for OP_MSG flag bits.
Increase connect timeout to 10s, server selection to 15s, and reduce
heartbeat frequency to 30s. Prevents connection storms when mongosh
opens multiple connections to high-latency remote servers.
Three fixes for high-latency remote MongoDB servers:

1. Synthetic initial handshake: respond to mongosh's first hello
   immediately before connectToTarget, preventing 2s timeout when
   the target takes 1-5s to connect (TCP + TLS + SCRAM).

2. Strip monitoring long-poll fields: remove topologyVersion and
   maxAwaitTimeMS from hello/isMaster commands so the server responds
   immediately instead of blocking the bridge for 10+ seconds.

3. Clean up debug logging to Debug level.
Strip driver/protocol fields ($clusterTime, lsid, $readPreference, etc.)
from logged input so the actual command data is visible. Skip logging
internal commands (ismaster, hello, ping) that are not user activity.
@linear
Copy link
Copy Markdown

linear Bot commented Mar 27, 2026

@saifsmailbox98 saifsmailbox98 changed the title Saif/pam 146 support mongodb resources in pam 2 feat(pam): add MongoDB support Mar 27, 2026
mongosh opens 4+ connections and cycles them every 10-20s. Each
connection previously created a full mongo.Connect() with pooling,
heartbeats, and topology monitoring — adding 2-5s per reconnect.

Replace the Go mongo driver with direct TCP/TLS connections and raw
wire protocol forwarding, matching the pattern used by the Postgres,
MySQL, MSSQL, and Redis handlers. Implement SCRAM-SHA-256/SHA-1 auth
using the xdg-go/scram library (already a transitive dep). Add SRV
resolution via net.LookupSRV/LookupTXT for mongodb+srv:// URLs.

Client commands are now forwarded as raw wire messages instead of being
parsed, field-stripped, and re-executed via RunCommand. Session logging
still inspects BSON for recording.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 27, 2026

Greptile Summary

This PR adds MongoDB support to the PAM proxy, implementing a custom MongoDB wire protocol bridge (bridge.go, protocol.go, proxy.go) that intercepts client connections, injects credentials, proxies commands to the real MongoDB server, and logs all user-initiated commands to the session log. The design follows the existing PAM handler pattern (PostgreSQL, MySQL, MSSQL, Redis) and fits cleanly into the existing resource type / session machinery.

Key areas to address before merging:

  • moreToCome errors silently dropped (bridge.go:261) — when the client sets the moreToCome flag (used in write batches), errors from command execution are fully discarded. Per the MongoDB wire protocol spec, a server-side failure during a moreToCome batch should close the connection; the current code reads the next message instead, potentially masking failures or leaving sessions in an inconsistent state.
  • Auth source hardcoded to \"admin\" (proxy.go:1029) — many production deployments (Atlas, self-managed) require authenticating against a specific database, not admin. This should be made configurable.
  • SRV host interpolated without validation (proxy.go:997) — the Host field is embedded directly into a mongodb+srv:// URI. Special characters in the host value could manipulate the URI structure; a hostname validation step would close this gap.
  • Regex not using re2 (session/uploader.go:61) — the modified regexp.MustCompile call should be migrated to github.com/wasilibs/go-re2 per project convention to guard against ReDoS on the (.+) capture group.
  • Logging missing username (proxy.go:953)log.* calls in the MongoDB handler omit the username field required by the project's logging convention ([username=${username}]).
  • No user-facing documentation — no docs were found in the /docs folder for this feature. How will customers discover MongoDB PAM support?

Confidence Score: 4/5

Safe to merge after addressing the moreToCome error-handling bug and the hardcoded auth source limitation

One clear P1 logic bug (moreToCome errors silently dropped) could cause silent write failures or inconsistent state during batch operations. The remaining findings are P2 quality/convention issues that don't block the happy path but should be resolved soon.

packages/pam/handlers/mongodb/bridge.go (moreToCome error handling), packages/pam/handlers/mongodb/proxy.go (hardcoded auth source, SRV URI construction)

Important Files Changed

Filename Overview
packages/pam/handlers/mongodb/bridge.go New file implementing the MongoDB wire protocol bridge; silently drops executeAndLog errors when moreToCome flag is set (P1)
packages/pam/handlers/mongodb/protocol.go New file with solid OP_MSG/OP_QUERY/OP_REPLY wire protocol parsing; bounds checks and size limits are in place
packages/pam/handlers/mongodb/proxy.go New MongoDB proxy connecting to target; auth source is hardcoded to "admin" and SRV host is interpolated directly into a URI without validation
packages/pam/pam-proxy.go Adds MongoDB to the supported resource types list and wires up the new proxy; follows existing patterns consistently
packages/pam/session/uploader.go Adds ResourceTypeMongodb constant and extends session filename regex; regex should use the re2 package per project convention
packages/pam/local/database-proxy.go Minimal addition printing the MongoDB connection string for the local proxy; straightforward and correct

Comments Outside Diff (4)

  1. packages/pam/session/uploader.go, line 1189 (link)

    P2 Regex should use the re2 package

    This line modifies an existing regexp.MustCompile call. Per the project's convention, all regex usage — including modifications to existing patterns — should use the github.com/wasilibs/go-re2 package to prevent potential ReDoS attacks. The (.+) capture group in particular can behave catastrophically on adversarially crafted filenames with standard Go regexp.

    Please verify this pattern at https://devina.io/redos-checker and migrate to re2.MustCompile. The same applies to the unchanged legacyFormatRegex on line 83.

  2. packages/pam/handlers/mongodb/proxy.go, line 953-956 (link)

    P2 Log entries missing username field

    Per the project's logging convention, log messages should include the username in the format [username=${username}] for easier querying in cloud services. The connection log here (and other log.* calls in the MongoDB handler) omits this field.

    The same pattern should be applied in pam-proxy.go at the "Starting MongoDB PAM proxy" log line and in the error-path logs inside bridge.go.

    Rule Used: Follow a specific logging pattern for easier query... (source)

    Learnt From
    Infisical/infisical#3651

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

  3. packages/pam/handlers/mongodb/proxy.go, line 1026-1031 (link)

    P2 AuthSource hardcoded to "admin"

    The AuthSource is hardcoded to "admin", which is the default for SCRAM-SHA-256 but won't match all MongoDB deployments. Many production configurations authenticate users against the database they are granted access to, and some Atlas or self-managed clusters require a different auth database entirely.

    Consider adding an AuthSource field to MongoDBProxyConfig (falling back to "admin" if empty), or at minimum defaulting to p.config.InjectDatabase when it is non-empty, so administrators can target deployments with non-standard auth databases.

  4. packages/pam/handlers/mongodb/proxy.go, line 997-998 (link)

    P2 SRV URI built via string interpolation with unvalidated host

    p.config.Host is embedded directly into a connection URI string. If Host ever contains @, /, or ? characters — even due to a misconfigured or maliciously crafted credential entry — the resulting URI could inject a username, override a path, or append query parameters, redirecting the connection to an unintended target.

    Although Host comes from system credentials rather than direct end-user input, consider validating that the value is a plain hostname (no scheme, no @, no ?) before building the URI, or use a URL builder to safely encode each component.

Reviews (1): Last reviewed commit: "perf: replace mongo driver with direct T..." | Re-trigger Greptile

Comment thread packages/pam/handlers/mongodb/bridge.go Outdated
Copy link
Copy Markdown
Member

@x032205 x032205 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready to approve, just want to test with Atlas first

…g and correctness

The old MongoDB proxy reimplemented SRV resolution, SCRAM auth, and TLS
from scratch, causing per-connection overhead and a fabricated server hello.
Replace with the official driver's topology which handles all of this
internally, pools connections per session, and returns real server responses.

- Use topology.Topology with driver-managed SRV, TLS, auth, and pooling
- Share topology per session in the gateway (first connection creates, others reuse)
- Send warmup connection on session start so topology is ready before first client
- Replace custom wire protocol parsing with driver's wiremessage/bsoncore packages
- Block auth commands with proper wire protocol errors instead of fake SCRAM responses
- Add message size validation from server hello
- Accept MongoDB URIs in host field (authSource and other options in the URI)
- Fix TLS ServerName for MongoDB (let driver set per replica set member)
… pool

The warmup connection's bridge was holding a pooled Atlas connection
hostage, forcing every real client to create a new TCP+TLS+auth
connection (~1-2s), exceeding mongosh's 2-second timeout.

- Close warmup connection immediately after triggering topology creation
- Set minPoolSize=2 so the pool proactively maintains warm connections
v1.17.x is no longer actively developed (bug fixes only until early 2026).
Migrates to go.mongodb.org/mongo-driver/v2 and removes the unnecessary
bsoncore dependency in favor of bson.Raw native methods.

- Update all imports to v2 paths
- Replace driver.Connection with *mnet.Connection (Read/Write API)
- Replace description.ReadPrefSelector with local primarySelector
- Remove x/bsonx/bsoncore, use bson.Raw.IndexErr/LookupErr directly
Remove buildOpReply (never called), opReplyOpCode (only used by it),
flagChecksumPresent (redundant with wiremessage.ChecksumPresent),
and flagExhaustAllowed (proxy doesn't handle exhaust cursors).
MongoDB 6.1+ rejects hello commands with client metadata on connections
that already completed a handshake. Since the proxy forwards through
driver-pooled connections that are already handshaked, strip the `client`
and `compression` fields from hello requests before forwarding.
mongosh sends the initial isMaster as OP_QUERY (not OP_MSG). The
previous fix only sanitized OP_MSG hello requests, missing this path.
…sanitization

Two bugs fixed:

1. sanitizeHelloRequest used bson.M (map, random key order) which could
   reorder BSON fields so the command name was no longer first, causing
   "no such command: topologyVersion" errors. Switched to bson.D.

2. MongoDB 8.0+ removed OP_QUERY support entirely. The proxy now converts
   OP_QUERY hello/isMaster to OP_MSG before forwarding, and converts the
   OP_MSG response back to OP_REPLY for the client.
…meout

The caller passes a non-nil TLSConfig even when EnableTLS=false, which
caused the driver to attempt TLS on non-TLS servers (standalone Docker
MongoDB). Now only applies TLSConfig when EnableTLS is true.

Also adds directConnection for non-SRV hosts (needed for standalone
servers) and a 10s server selection timeout to surface connection
errors instead of blocking forever.
…ection corruption

- Strip topologyVersion and maxAwaitTimeMS from hello requests to prevent
  the server from entering streaming mode (moreToCome responses)
- Clear moreToCome flag from all responses sent to the client
- Clear exhaustAllowed flag from client requests to prevent exhaust cursors
- Fix bson.M → bson.D in sanitizeHelloWireMessage (response key ordering)
- Remove minPoolSize to avoid warm connection interference
…nnection

The root cause of "Server ended moreToCome unexpectedly" was that hello
responses included topologyVersion, causing mongosh's SDAM monitor to use
exhaustCommand() for streaming hello. The proxy strips awaitable fields
from requests but the client still attempted exhaust mode because it saw
topologyVersion in the response. Stripping it forces polling mode, which
is fully compatible with a request/response proxy.

Additional fixes:
- Sanitize hello responses in the OP_QUERY path (was missing)
- Add clearMoreToCome to forwardRaw and handleOpQuery non-hello paths
- Drain server-side moreToCome continuations as a safety net
- Pre-warm a pool connection during NewMongoDBProxy
- Make warmup synchronous: send a real hello through the full chain
  (local → relay → gateway → MongoDB) and block until the response
  confirms the topology is ready, so mongosh connects on first try
Move the MongoDB wire protocol hello/response logic from database-proxy.go
into mongodb.Warmup(), keeping the local proxy thin and protocol-agnostic.
Instead of sending MongoDB wire protocol data to detect when the gateway's
topology is ready, simply include serverSelectionTimeoutMS=15000 in the
displayed connection string. This gives the first connection enough time
for topology creation while keeping the local proxy protocol-agnostic.

The async warmup is kept to trigger topology creation early on the gateway.
- Remove unused opQuery fields (Flags, Skip, Return)
- Replace client-specific references (mongosh) with generic language
- Simplify overly verbose comments to focus on intent over mechanics
- Fix misleading moreToCome comment (was describing wrong direction)
The serverSelectionTimeoutMS=15000 in the connection string is sufficient
for the first connection to succeed while the gateway creates the topology.
The warmup optimization is unnecessary complexity.
@saifsmailbox98 saifsmailbox98 requested review from x032205 and removed request for sheensantoscapadngan April 6, 2026 22:33
Comment thread packages/pam/handlers/mongodb/bridge.go Outdated
Comment thread packages/pam/handlers/mongodb/bridge.go Outdated
Comment thread packages/pam/handlers/mongodb/protocol.go
Comment thread packages/pam/local/database-proxy.go
When the backend sends a full MongoDB URI as the connection string,
buildURI only injected credentials but silently dropped the database
field. Also fixed credential escaping in the plain-host-spec branch
to use url.UserPassword instead of url.PathEscape.
- Extract packOpMsg helper to deduplicate OP_MSG rebuild logic
- Add max iteration cap on moreToCome drain loop
- Add comment explaining request ID offset at 1000
@saifsmailbox98 saifsmailbox98 requested a review from x032205 April 7, 2026 14:56
Comment thread packages/gateway-v2/gateway.go
Comment thread packages/pam/handlers/mongodb/proxy.go
…stead

The fallback path on url.Parse failure silently produced a URI missing
the database argument. Since url.Parse practically never fails on valid
URIs, replace the fallback with a proper error return and propagate it
through buildURI → NewMongoDBProxy.
@saifsmailbox98 saifsmailbox98 merged commit ec49ef5 into main Apr 8, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants