feat(pam): add MongoDB support#162
Conversation
Implement MongoDB handler using the mongo.Client bridge approach (Approach B). The proxy parses client OP_MSG commands, executes them via RunCommand on a real mongo.Client connection, and encodes responses back as OP_MSG. - Wire protocol helpers for reading/writing OP_MSG with Kind 0/1 sections - mongo.Client setup with SetDirect(true), SetMaxPoolSize(1), SRV support - Command bridge with full BSON session recording - Auth interception (saslStart/saslContinue faked as success) - Hello response sanitization (strips compression/auth fields) - Connection string display for MongoDB resources
mongosh sends OP_QUERY for the initial isMaster/hello handshake even on modern servers. Add OP_QUERY parsing and OP_REPLY response building so the proxy can complete the handshake before switching to OP_MSG.
- Verify raw TCP connectivity before creating mongo.Client - Reduce timeouts (5s connect, 10s server selection, 2s heartbeat) - Add ServerMonitor to log actual heartbeat failures (auth/TLS errors)
The mongo.Client already sent its own hello with client metadata during connection setup. Strip client, compression, saslSupportedMechs, and speculativeAuthenticate from hello/isMaster commands before forwarding, since MongoDB only allows these in the first hello on a connection.
- Strip lsid, $clusterTime, $readPreference, txnNumber, startTransaction, autocommit from all commands before RunCommand to prevent duplicate BSON field errors (fixes 'ping.lsid is a duplicate field') - Consolidate all field stripping into executeAndLog - Handle client EOF as graceful disconnect instead of error - Use fresh context for client.Disconnect to ensure clean shutdown - Suppress expected 'context canceled' heartbeat errors during shutdown
When the client sets moreToCome (bit 1), it will send more messages before expecting a response. Execute the command for side-effects but don't reply — responding to moreToCome messages injects unexpected bytes into the TCP stream, corrupting subsequent commands. Also add named constants for OP_MSG flag bits.
Increase connect timeout to 10s, server selection to 15s, and reduce heartbeat frequency to 30s. Prevents connection storms when mongosh opens multiple connections to high-latency remote servers.
Three fixes for high-latency remote MongoDB servers: 1. Synthetic initial handshake: respond to mongosh's first hello immediately before connectToTarget, preventing 2s timeout when the target takes 1-5s to connect (TCP + TLS + SCRAM). 2. Strip monitoring long-poll fields: remove topologyVersion and maxAwaitTimeMS from hello/isMaster commands so the server responds immediately instead of blocking the bridge for 10+ seconds. 3. Clean up debug logging to Debug level.
Strip driver/protocol fields ($clusterTime, lsid, $readPreference, etc.) from logged input so the actual command data is visible. Skip logging internal commands (ismaster, hello, ping) that are not user activity.
mongosh opens 4+ connections and cycles them every 10-20s. Each connection previously created a full mongo.Connect() with pooling, heartbeats, and topology monitoring — adding 2-5s per reconnect. Replace the Go mongo driver with direct TCP/TLS connections and raw wire protocol forwarding, matching the pattern used by the Postgres, MySQL, MSSQL, and Redis handlers. Implement SCRAM-SHA-256/SHA-1 auth using the xdg-go/scram library (already a transitive dep). Add SRV resolution via net.LookupSRV/LookupTXT for mongodb+srv:// URLs. Client commands are now forwarded as raw wire messages instead of being parsed, field-stripped, and re-executed via RunCommand. Session logging still inspects BSON for recording.
Greptile SummaryThis PR adds MongoDB support to the PAM proxy, implementing a custom MongoDB wire protocol bridge ( Key areas to address before merging:
Confidence Score: 4/5Safe to merge after addressing the moreToCome error-handling bug and the hardcoded auth source limitation One clear P1 logic bug (moreToCome errors silently dropped) could cause silent write failures or inconsistent state during batch operations. The remaining findings are P2 quality/convention issues that don't block the happy path but should be resolved soon. packages/pam/handlers/mongodb/bridge.go (moreToCome error handling), packages/pam/handlers/mongodb/proxy.go (hardcoded auth source, SRV URI construction) Important Files Changed
|
x032205
left a comment
There was a problem hiding this comment.
Ready to approve, just want to test with Atlas first
…g and correctness The old MongoDB proxy reimplemented SRV resolution, SCRAM auth, and TLS from scratch, causing per-connection overhead and a fabricated server hello. Replace with the official driver's topology which handles all of this internally, pools connections per session, and returns real server responses. - Use topology.Topology with driver-managed SRV, TLS, auth, and pooling - Share topology per session in the gateway (first connection creates, others reuse) - Send warmup connection on session start so topology is ready before first client - Replace custom wire protocol parsing with driver's wiremessage/bsoncore packages - Block auth commands with proper wire protocol errors instead of fake SCRAM responses - Add message size validation from server hello - Accept MongoDB URIs in host field (authSource and other options in the URI) - Fix TLS ServerName for MongoDB (let driver set per replica set member)
… pool The warmup connection's bridge was holding a pooled Atlas connection hostage, forcing every real client to create a new TCP+TLS+auth connection (~1-2s), exceeding mongosh's 2-second timeout. - Close warmup connection immediately after triggering topology creation - Set minPoolSize=2 so the pool proactively maintains warm connections
v1.17.x is no longer actively developed (bug fixes only until early 2026). Migrates to go.mongodb.org/mongo-driver/v2 and removes the unnecessary bsoncore dependency in favor of bson.Raw native methods. - Update all imports to v2 paths - Replace driver.Connection with *mnet.Connection (Read/Write API) - Replace description.ReadPrefSelector with local primarySelector - Remove x/bsonx/bsoncore, use bson.Raw.IndexErr/LookupErr directly
Remove buildOpReply (never called), opReplyOpCode (only used by it), flagChecksumPresent (redundant with wiremessage.ChecksumPresent), and flagExhaustAllowed (proxy doesn't handle exhaust cursors).
MongoDB 6.1+ rejects hello commands with client metadata on connections that already completed a handshake. Since the proxy forwards through driver-pooled connections that are already handshaked, strip the `client` and `compression` fields from hello requests before forwarding.
mongosh sends the initial isMaster as OP_QUERY (not OP_MSG). The previous fix only sanitized OP_MSG hello requests, missing this path.
…sanitization Two bugs fixed: 1. sanitizeHelloRequest used bson.M (map, random key order) which could reorder BSON fields so the command name was no longer first, causing "no such command: topologyVersion" errors. Switched to bson.D. 2. MongoDB 8.0+ removed OP_QUERY support entirely. The proxy now converts OP_QUERY hello/isMaster to OP_MSG before forwarding, and converts the OP_MSG response back to OP_REPLY for the client.
…meout The caller passes a non-nil TLSConfig even when EnableTLS=false, which caused the driver to attempt TLS on non-TLS servers (standalone Docker MongoDB). Now only applies TLSConfig when EnableTLS is true. Also adds directConnection for non-SRV hosts (needed for standalone servers) and a 10s server selection timeout to surface connection errors instead of blocking forever.
…ection corruption - Strip topologyVersion and maxAwaitTimeMS from hello requests to prevent the server from entering streaming mode (moreToCome responses) - Clear moreToCome flag from all responses sent to the client - Clear exhaustAllowed flag from client requests to prevent exhaust cursors - Fix bson.M → bson.D in sanitizeHelloWireMessage (response key ordering) - Remove minPoolSize to avoid warm connection interference
…nnection The root cause of "Server ended moreToCome unexpectedly" was that hello responses included topologyVersion, causing mongosh's SDAM monitor to use exhaustCommand() for streaming hello. The proxy strips awaitable fields from requests but the client still attempted exhaust mode because it saw topologyVersion in the response. Stripping it forces polling mode, which is fully compatible with a request/response proxy. Additional fixes: - Sanitize hello responses in the OP_QUERY path (was missing) - Add clearMoreToCome to forwardRaw and handleOpQuery non-hello paths - Drain server-side moreToCome continuations as a safety net - Pre-warm a pool connection during NewMongoDBProxy - Make warmup synchronous: send a real hello through the full chain (local → relay → gateway → MongoDB) and block until the response confirms the topology is ready, so mongosh connects on first try
Move the MongoDB wire protocol hello/response logic from database-proxy.go into mongodb.Warmup(), keeping the local proxy thin and protocol-agnostic.
Instead of sending MongoDB wire protocol data to detect when the gateway's topology is ready, simply include serverSelectionTimeoutMS=15000 in the displayed connection string. This gives the first connection enough time for topology creation while keeping the local proxy protocol-agnostic. The async warmup is kept to trigger topology creation early on the gateway.
- Remove unused opQuery fields (Flags, Skip, Return) - Replace client-specific references (mongosh) with generic language - Simplify overly verbose comments to focus on intent over mechanics - Fix misleading moreToCome comment (was describing wrong direction)
The serverSelectionTimeoutMS=15000 in the connection string is sufficient for the first connection to succeed while the gateway creates the topology. The warmup optimization is unnecessary complexity.
When the backend sends a full MongoDB URI as the connection string, buildURI only injected credentials but silently dropped the database field. Also fixed credential escaping in the plain-host-spec branch to use url.UserPassword instead of url.PathEscape.
- Extract packOpMsg helper to deduplicate OP_MSG rebuild logic - Add max iteration cap on moreToCome drain loop - Add comment explaining request ID offset at 1000
…stead The fallback path on url.Parse failure silently produced a URI missing the database argument. Since url.Parse practically never fails on valid URIs, replace the fallback with a proper error return and propagate it through buildURI → NewMongoDBProxy.
Description 📣
Adds MongoDB support to PAM
Related Infisical/infisical#5817
Type ✨
Tests 🛠️
# Here's some code block to paste some code snippets