feat: enrich PostHog person records for machine identities with Redis-based dedup #5689
…-based dedup Co-Authored-By: arsh <arshsb1998@gmail.com>
✅ Snyk checks have passed. No issues have been found so far.
Greptile Summary
This PR adds PostHog person-record enrichment for machine identities by introducing
Key observations:
Confidence Score: 4/5
Important Files Changed
Last reviewed commit: 909af55
…, add identity context to error logs, document intentional catch fall-through Co-Authored-By: arsh <arshsb1998@gmail.com>
Context
Machine identity PostHog person records (those with `identity-{uuid}` distinctIds) are created automatically when `postHog.capture()` or `postHog.groupIdentify()` fires during secret pulls, but they never receive a `postHog.identify()` call, leaving them with zero useful properties (no name, no auth method). This makes it difficult to understand machine identity activity in the PostHog UI.

This PR adds an `identifyIdentity()` function that enriches these person records with `name` and `authMethod` properties. It fires on every authenticated machine identity request (the `IDENTITY_ACCESS_TOKEN` auth path), deduped via an atomic Redis `SET NX EX` operation with a 10-minute TTL. This ensures the dedup is global across all horizontally scaled instances (30+), unlike an in-memory-only approach.

Key design decisions:
- `identifyIdentity()` is only called from the `IDENTITY_ACCESS_TOKEN` auth case and internally prefixes the distinctId with `identity-`. Real users (JWT/API key auth) are never affected.
- `setItemWithExpiryNX` (Redis `SET key value EX ttl NX`) provides atomic, cross-instance dedup. Matches the pattern from "feat: add PostHog identifyUser in auth hook with Redis dedup cache" #5643.
- An in-memory `Set` with a matching TTL limits the blast radius and prevents flooding PostHog during Redis outages. The catch block intentionally falls through so the first caller during an outage still fires `postHog.identify()`.
- The dedup key is `${identityId}-${authMethod}`, so auth method changes are reflected immediately in PostHog.
- `.catch()`: The call is not awaited, and errors are logged (with identityId context) without affecting the request.
- `KeyStorePrefixes`: The `TelemetryIdentifyIdentity` prefix and `TelemetryIdentifyIdentityInSeconds` TTL are registered in the central keystore catalog to prevent future namespace collisions.

Updates since last revision
- Added `KeyStorePrefixes.TelemetryIdentifyIdentity` and `KeyStoreTtls.TelemetryIdentifyIdentityInSeconds` in `keystore.ts` (addresses key-namespace collision risk)
- Added `[identityId=...]` structured context to the error log in the `inject-identity.ts` call site
- Added a `// falls through intentionally` comment documenting the catch-block control flow

Steps to verify the change
- Person records with `identity-{uuid}` distinctIds should now have `name` and `authMethod` properties
- No duplicate `$identify` events within the 10-minute TTL (Redis dedup working)

Human review checklist
- `setItemWithExpiryNX` return type handling: `null` means key already existed (skip identify), `"OK"` means first caller wins
- `identity.identityName` is the correct field to send (not `identity.name`, which is the nullable token label)
- `Set` accumulation is acceptable for expected concurrent identity count during Redis outages
- `.catch()` at call site in `inject-identity.ts` properly prevents unhandled promise rejections from breaking auth flow
- `setItemWithExpiryNX` implementations match NX semantics (TTL intentionally ignored, consistent with existing `setItemWithExpiry`)
- `setItemWithExpiryNX`

Type
Checklist
Link to Devin Session: https://app.devin.ai/sessions/f60882fcc61c491ab00e59ce7ee3f584
Requested by: @0xArshdeep
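
For reviewers, the dedup flow described under "Key design decisions" can be sketched roughly as below. This is a minimal illustration, not the code in this PR: the `KeyStore` and `IdentifyClient` types, the `createIdentifyIdentity` factory, the argument order of `setItemWithExpiryNX`, and the `telemetry-identify-identity:` key prefix are all assumptions made for the example. The in-memory fallback here uses a `Map` of expiry timestamps instead of the `Set` the PR describes, purely so the sketch needs no timers.

```typescript
// Hypothetical minimal interfaces; the real keystore and PostHog client types may differ.
type KeyStore = {
  // Assumed to mirror Redis `SET key value EX ttl NX`:
  // resolves to "OK" when the key was newly set, null when it already existed.
  setItemWithExpiryNX: (key: string, ttlSeconds: number, value: string) => Promise<"OK" | null>;
};

type IdentifyClient = {
  identify: (payload: { distinctId: string; properties: Record<string, string> }) => void;
};

const TTL_SECONDS = 600; // 10-minute dedup window, as described in the PR

const createIdentifyIdentity = (
  keyStore: KeyStore,
  postHog: IdentifyClient,
  now: () => number = () => Date.now()
) => {
  // Local fallback used only when the keystore call throws (e.g. Redis outage).
  // Maps dedup key -> expiry timestamp in milliseconds.
  const localSeen = new Map<string, number>();

  // Resolves to true when identify() was sent, false when the call was deduped.
  return async (identityId: string, identityName: string, authMethod: string): Promise<boolean> => {
    const dedupKey = `${identityId}-${authMethod}`;
    try {
      // Hypothetical key prefix; the PR registers its own prefix in the keystore catalog.
      const result = await keyStore.setItemWithExpiryNX(
        `telemetry-identify-identity:${dedupKey}`,
        TTL_SECONDS,
        "1"
      );
      if (result === null) return false; // another caller or instance already identified
    } catch {
      const expiresAt = localSeen.get(dedupKey);
      if (expiresAt !== undefined && expiresAt > now()) return false;
      localSeen.set(dedupKey, now() + TTL_SECONDS * 1000);
      // falls through intentionally: the first caller during an outage still identifies
    }
    postHog.identify({
      distinctId: `identity-${identityId}`,
      properties: { name: identityName, authMethod }
    });
    return true;
  };
};
```

At a call site, the returned function would be invoked without `await` and with a `.catch()` that logs the error together with the identity ID, so a telemetry failure never affects the authenticated request.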