
Agentic (User FIC) token flow bypasses MSAL cache when ClaimsPrincipal is null #3840

@rido-min

Summary

When using the agentic token flow (User FIC via AcquireTokenByUsernamePassword) from a bot/service context where there is no authenticated HTTP user, CreateAuthorizationHeaderAsync() is called with a null ClaimsPrincipal. This causes MSAL to skip the silent token flow and make a network round-trip to Entra on every single call, even when a valid cached token exists.

Reproduction

  1. Register an Agent Identity (FIC) app
  2. Call IAuthorizationHeaderProvider.CreateAuthorizationHeaderAsync() with WithAgentUserIdentity() options and claimsPrincipal: null
  3. Observe MSAL logs: every call shows source: IdentityProvider (network), never source: Cache
  4. Call the same method again within the token lifetime: it still hits the network

Root Cause

In TokenAcquisition.TryGetAuthenticationResultForConfidentialClientUsingRopcAsync(), the silent flow guard at line ~440 requires:

if (!forceRefresh && user != null && user.GetMsalAccountId() != null)
{
    // AcquireTokenSilent — cache hit path
}

When user is null (common in bot/service scenarios where the agent identity comes from request parameters, not HTTP context), this check always fails. The code falls through to AcquireTokenByUsernamePassword, which goes to the network.

After the ROPC call succeeds, the account ID is written back to the ClaimsPrincipal (line ~520):

if (user != null && user.GetMsalAccountId() == null)
{
    user.AddIdentity(...);
}

But since user is null, this is also skipped. Even if the caller passes a non-null empty ClaimsPrincipal (our current workaround), the account information is only persisted on that specific object instance. If the caller creates a new ClaimsPrincipal per request, caching still doesn't work.

Impact

In our Teams bot SDK, a single incoming message triggers 2-4 outbound API calls, each of which needs an agentic token. That means 2-4 unnecessary HTTP round-trips to login.microsoftonline.com per message (~250ms each), adding roughly 500ms-1s of latency. At scale, it also increases the risk of Entra throttling.

Current Workaround

We cache a ClaimsPrincipal instance per (agenticAppId, agenticUserId) in a ConcurrentDictionary and reuse it across calls. After the first ROPC call populates the account ID on the ClaimsPrincipal, subsequent calls hit the silent flow.
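
For anyone who needs the same workaround, here is a minimal sketch of the idea. The class name `AgenticPrincipalCache` and its `GetOrAdd` method are hypothetical names chosen for illustration, not part of Microsoft.Identity.Web; only `ConcurrentDictionary` and `ClaimsPrincipal` come from the base class library.

```csharp
using System.Collections.Concurrent;
using System.Security.Claims;

// Hypothetical helper (not a Microsoft.Identity.Web type): keeps one long-lived
// ClaimsPrincipal per (agenticAppId, agenticUserId) pair so that the account ID
// added after the first ROPC call is still present on later calls.
public static class AgenticPrincipalCache
{
    // Tuple keys compare strings ordinally, so callers are assumed to pass IDs
    // in a consistent casing (for example, always lower-case GUID strings).
    private static readonly ConcurrentDictionary<(string AppId, string UserId), ClaimsPrincipal> Principals = new();

    public static ClaimsPrincipal GetOrAdd(string agenticAppId, string agenticUserId) =>
        // Starts as an empty principal; the library adds an identity to it later.
        Principals.GetOrAdd((agenticAppId, agenticUserId), _ => new ClaimsPrincipal());
}
```

The returned instance is then passed as the `claimsPrincipal` argument of CreateAuthorizationHeaderAsync() instead of null. Two caveats with this sketch: the dictionary grows without bound unless entries are evicted, and `ClaimsPrincipal` is not documented as thread-safe, so concurrent first calls for the same key may need extra synchronization.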

Suggested Fix

For the agentic identity flow, TryGetAuthenticationResultForConfidentialClientUsingRopcAsync() has enough information from ExtraParameters (IDWEB_AGENT_IDENTITY + IDWEB_USER_ID) to construct a cache lookup key and attempt AcquireTokenSilent without requiring a ClaimsPrincipal. The method could:

  1. Extract the agent app ID and user OID from ExtraParameters
  2. Use IConfidentialClientApplication.GetAccountsAsync() to find a matching account
  3. Attempt AcquireTokenSilent with that account
  4. Fall back to ROPC only on cache miss

This would make the agentic flow cache-friendly without requiring callers to manage ClaimsPrincipal persistence.
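
To make the proposal concrete, here is a rough sketch of the cache-first logic against the public MSAL.NET API. It is not the actual Microsoft.Identity.Web implementation: the `AgenticSilentFirst` class, the `AcquireAsync` method, and the `ropcFallback` delegate are hypothetical. It assumes that `app` is already the confidential client instance used for the given agent identity, that `userObjectId` is the user OID taken from ExtraParameters, and that the account created by the earlier ROPC call can be matched on that OID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Identity.Client;

// Hypothetical helper illustrating steps 1-4; not part of Microsoft.Identity.Web.
public static class AgenticSilentFirst
{
    public static async Task<AuthenticationResult> AcquireAsync(
        IConfidentialClientApplication app,                 // assumed: the app for this agent identity
        IEnumerable<string> scopes,
        string userObjectId,                                // assumed: user OID from ExtraParameters
        Func<Task<AuthenticationResult>> ropcFallback)      // assumed: wraps the existing ROPC call
    {
        // Step 2: look for a cached account whose object ID matches the user OID.
        var accounts = await app.GetAccountsAsync().ConfigureAwait(false);
        var account = accounts.FirstOrDefault(a =>
            string.Equals(a.HomeAccountId?.ObjectId, userObjectId, StringComparison.OrdinalIgnoreCase));

        if (account != null)
        {
            try
            {
                // Step 3: served from the token cache when a usable token exists.
                return await app.AcquireTokenSilent(scopes, account)
                    .ExecuteAsync()
                    .ConfigureAwait(false);
            }
            catch (MsalUiRequiredException)
            {
                // Silent acquisition was not possible; fall through to ROPC.
            }
        }

        // Step 4: no matching account or silent failure, so go to the network.
        return await ropcFallback().ConfigureAwait(false);
    }
}
```

Enumerating every account with GetAccountsAsync() may not scale well when the cache is large or distributed. If the account identifier can be derived directly from the agent and user IDs, GetAccountAsync(accountId) would likely be the better lookup.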

Environment

  • Microsoft.Identity.Web: 3.x (latest)
  • MSAL.NET: 4.83.1.0
  • .NET 10
  • Scenario: Teams Bot SDK using Agent365 User FIC tokens
