Skip to content

feat: enrich PostHog person records for machine identities with Redis-based dedup#5689

Merged
0xArshdeep merged 2 commits into
mainfrom
devin/1773440051-enrich-machine-identity-redis-dedup
Mar 14, 2026
Merged

feat: enrich PostHog person records for machine identities with Redis-based dedup#5689
0xArshdeep merged 2 commits into
mainfrom
devin/1773440051-enrich-machine-identity-redis-dedup

Conversation

@devin-ai-integration
Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot commented Mar 13, 2026

Context

Machine identity PostHog person records (those with identity-{uuid} distinctIds) are created automatically when postHog.capture() or postHog.groupIdentify() fires during secret pulls, but they never receive a postHog.identify() call — leaving them with zero useful properties (no name, no auth method). This makes it difficult to understand machine identity activity in the PostHog UI.

This PR adds an identifyIdentity() function that enriches these person records with name and authMethod properties. It fires on every authenticated machine identity request (the IDENTITY_ACCESS_TOKEN auth path), deduped via an atomic Redis SET NX EX operation with a 10-minute TTL. This ensures the dedup is global across all horizontally-scaled instances (30+), unlike an in-memory-only approach.

Key design decisions:

  • Only targets machine identities by construction: identifyIdentity() is only called from the IDENTITY_ACCESS_TOKEN auth case and internally prefixes the distinctId with identity-. Real users (JWT/API key auth) are never affected.
  • Redis-based dedup: Uses setItemWithExpiryNX (Redis SET key value EX ttl NX) for atomic, cross-instance dedup. Matches the pattern from feat: add PostHog identifyUser in auth hook with Redis dedup cache #5643.
  • In-memory fallback: If Redis is unreachable, an in-memory Set with matching TTL limits blast radius — prevents flooding PostHog during outages. The catch block intentionally falls through so the first caller during an outage still fires postHog.identify().
  • Dedup key includes authMethod: ${identityId}-${authMethod} so auth method changes are reflected immediately in PostHog.
  • Fire-and-forget with .catch(): The call is not awaited and errors are logged (with identityId context) without affecting the request.
  • Cache key registered in KeyStorePrefixes: TelemetryIdentifyIdentity prefix and TelemetryIdentifyIdentityInSeconds TTL are registered in the central keystore catalog to prevent future namespace collisions.

Updates since last revision

  • Moved cache key prefix and TTL from local constants to KeyStorePrefixes.TelemetryIdentifyIdentity and KeyStoreTtls.TelemetryIdentifyIdentityInSeconds in keystore.ts (addresses key-namespace collision risk)
  • Added [identityId=...] structured context to error log in inject-identity.ts call site
  • Added explicit // falls through intentionally comment documenting the catch-block control flow

Steps to verify the change

  1. Deploy to a Cloud instance
  2. Authenticate as a machine identity (e.g., via Universal Auth access token)
  3. Check PostHog — the person record for identity-{uuid} should now have name and authMethod properties
  4. Make additional requests within 10 minutes — verify no duplicate $identify events (Redis dedup working)
  5. Verify real user requests (JWT auth) do not trigger any identify calls for machine identities

Human review checklist

  • Verify setItemWithExpiryNX return type handling: null means key already existed (skip identify), "OK" means first caller wins
  • Confirm identity.identityName is the correct field to send (not identity.name, which is the nullable token label)
  • Check that in-memory fallback Set accumulation is acceptable for expected concurrent identity count during Redis outages
  • Verify .catch() at call site in inject-identity.ts properly prevents unhandled promise rejections from breaking auth flow
  • Confirm mock and in-memory keystore setItemWithExpiryNX implementations match NX semantics (TTL intentionally ignored, consistent with existing setItemWithExpiry)
  • Verify no conflicts with feat: add PostHog identifyUser in auth hook with Redis dedup cache #5643 if that PR also introduces setItemWithExpiryNX

Type

  • Fix
  • Feature
  • Improvement
  • Breaking
  • Docs
  • Chore

Checklist


Link to Devin Session: https://app.devin.ai/sessions/f60882fcc61c491ab00e59ce7ee3f584
Requested by: @0xArshdeep


Open with Devin

…-based dedup

Co-Authored-By: arsh <arshsb1998@gmail.com>
@devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@maidul98
Copy link
Copy Markdown
Collaborator

maidul98 commented Mar 13, 2026

Snyk checks have passed. No issues have been found so far.

Status Scan Engine Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

Copy link
Copy Markdown
Contributor Author

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

Open in Devin Review

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 13, 2026

Greptile Summary

This PR adds PostHog person-record enrichment for machine identities by introducing identifyIdentity() in the telemetry service, backed by an atomic Redis SET NX EX for cross-instance deduplication and an in-memory Set fallback during Redis outages. The feature is correctly scoped to the IDENTITY_ACCESS_TOKEN auth path and uses the right identity field (identity.identityName, not the nullable token label identity.name).

Key observations:

  • Dedup logic is correct: setItemWithExpiryNX returns "OK" | null; the !wasSet guard correctly skips postHog.identify() when the key already existed in Redis. A minor suggestion to use wasSet !== "OK" would make the Redis contract more explicit (see inline comment).
  • In-memory fallback TTL consistency: The setTimeout in the catch block uses KeyStoreTtls.TelemetryIdentifyIdentityInSeconds * 1000 — matching the Redis TTL exactly, which is correct.
  • No unit tests added: The identifyIdentity() function and the new setItemWithExpiryNX keystore method have no dedicated unit tests. Consider adding tests covering the dedup path (first call fires, second call within TTL skips, Redis-outage path falls through to in-memory).
  • Mock/in-memory stores intentionally ignore expiry: Both memory.ts and e2e-test/mocks/keystore.ts treat _expiryInSeconds as a no-op, consistent with the existing setItemWithExpiry mock. This means dedup entries in test environments persist for the process lifetime — acceptable for integration tests but worth documenting.
  • KeyStorePrefixes registration: Correctly centralised in keystore.ts, preventing future namespace collisions.

Confidence Score: 4/5

  • Safe to merge; purely additive telemetry enrichment with no impact on request handling or data integrity.
  • The core dedup logic is correct, the fire-and-forget pattern prevents auth-flow interference, and the Redis key prefix is properly registered. The only notable gap is the absence of unit tests for the new identifyIdentity() function and the setItemWithExpiryNX dedup paths.
  • backend/src/services/telemetry/telemetry-service.ts — the new identifyIdentity() function warrants unit test coverage for the dedup and outage-fallback paths.

Important Files Changed

Filename Overview
backend/src/services/telemetry/telemetry-service.ts Adds identifyIdentity() with Redis NX dedup and in-memory fallback. Logic is sound: correct "OK" vs null return-value semantics, TTL matches between Redis and in-memory timer, identityName (not the nullable token label) is sent to PostHog.
backend/src/keystore/keystore.ts Adds setItemWithExpiryNX (Redis SET NX EX) and registers TelemetryIdentifyIdentity prefix and TelemetryIdentifyIdentityInSeconds TTL; consistent with existing patterns.
backend/src/server/plugins/auth/inject-identity.ts Fire-and-forget identifyIdentity() call correctly uses identity.identityName (not the nullable identity.name token label) and identity.authMethod from the validated token; scoped only to IDENTITY_ACCESS_TOKEN path.
backend/src/keystore/memory.ts Adds setItemWithExpiryNX with correct NX semantics (null if key exists, "OK" on insert); ignores expiry consistently with existing setItemWithExpiry.
backend/e2e-test/mocks/keystore.ts Adds setItemWithExpiryNX mock with correct parameter order and NX semantics, consistent with the production interface.

Last reviewed commit: 909af55

Comment thread backend/src/server/plugins/auth/inject-identity.ts
Comment thread backend/src/services/telemetry/telemetry-service.ts Outdated
Comment thread backend/src/services/telemetry/telemetry-service.ts Outdated
…, add identity context to error logs, document intentional catch fall-through

Co-Authored-By: arsh <arshsb1998@gmail.com>
@0xArshdeep
Copy link
Copy Markdown
Contributor

@greptileai

@0xArshdeep 0xArshdeep merged commit 755c712 into main Mar 14, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants