
quic: add TLS key log support (NSS Key Log Format) #44821

Open
bellatoris wants to merge 3 commits into envoyproxy:main from bellatoris:doogie/quic-keylog


@bellatoris
Contributor

bellatoris commented May 2, 2026

Commit Message: quic: add TLS key log support so QUIC TLS secrets can be exported in NSS Key Log Format
Additional Description:

Summary

When debugging QUIC TLS behavior (session ticket resumption, certificate compression, or other handshake-level questions), the typical workflow is to open a packet capture in Wireshark and supply per-connection TLS secrets via an NSS Key Log file. Today this only works when the client can emit the key log itself via SSLKEYLOGFILE. Some client stacks, for example Apple's Network framework on macOS and iOS and SDKs built on top of it, don't expose that hook, so operators are left without an easy way to inspect those handshakes.

For TCP TLS this is already covered: Envoy's DownstreamTlsContext has a key_log field that hooks BoringSSL's key log callback and writes NSS-format lines to a file, with optional local/remote address filters. The same proto field exists for QUIC but is not yet plumbed (see #25418, under CommonTlsContext: key_log).

This PR bridges that gap so the same key_log configuration that already works for TCP TLS produces NSS Key Log lines for QUIC connections too, giving operators a server-side way to decrypt QUIC captures when the client can't help.
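For readers unfamiliar with the format: each NSS Key Log line carries three space-separated fields (a label, the hex-encoded client random, and the hex-encoded secret), and lines starting with `#` are comments. The following is a minimal sketch of a line parser under those assumptions; `KeyLogLine`, `isHex`, and `parseKeyLogLine` are hypothetical names for illustration and are not part of Envoy or BoringSSL.

```cpp
#include <cctype>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper type for illustration; not part of Envoy or BoringSSL.
struct KeyLogLine {
  std::string label;         // e.g. "CLIENT_HANDSHAKE_TRAFFIC_SECRET"
  std::string client_random; // hex-encoded ClientHello random
  std::string secret;        // hex-encoded secret
};

// Returns true if `s` is a non-empty, even-length string of hex digits.
inline bool isHex(const std::string& s) {
  if (s.empty() || s.size() % 2 != 0) {
    return false;
  }
  for (unsigned char c : s) {
    if (!std::isxdigit(c)) {
      return false;
    }
  }
  return true;
}

// Parses "<label> <client_random_hex> <secret_hex>". Returns nullopt for
// comments, blank lines, and malformed input.
inline std::optional<KeyLogLine> parseKeyLogLine(const std::string& line) {
  if (line.empty() || line[0] == '#') {
    return std::nullopt;
  }
  std::istringstream in(line);
  KeyLogLine out;
  if (!(in >> out.label >> out.client_random >> out.secret)) {
    return std::nullopt;
  }
  std::string extra;
  if (in >> extra) {
    return std::nullopt; // more than three fields
  }
  // The TLS client random is 32 bytes, i.e. 64 hex characters.
  if (out.client_random.size() != 64 || !isHex(out.client_random)) {
    return std::nullopt;
  }
  if (!isHex(out.secret)) {
    return std::nullopt;
  }
  return out;
}
```

A parser like this is only useful for sanity-checking the generated file by hand; Wireshark consumes the file directly.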

Implementation

We piggy-back on the per-connection EnvoyTlsServerHandshaker introduced by #42734 and add a sibling static callback EnvoyTlsServerHandshaker::keylogCallback next to the existing ticketKeyCallback. The callback retrieves the handshaker from SSL ex_data, downcasts the QUIC session to its Network::Connection facet (static_cast<EnvoyQuicServerSession*>(handshaker->session())), reads cached local/peer addresses from its connection info provider, and forwards the line to the pinned ServerContextImpl's maybeWriteKeyLog(...).

To keep the address-filter + file-write logic from being duplicated, the existing inline body of ContextImpl::keylogCallback is factored into a new void ContextImpl::maybeWriteKeyLog(line, local_addr, remote_addr) that both the TCP and QUIC paths call. The TCP callback is now a thin shim: fetch the per-connection Network::TransportSocketCallbacks from ex_data, pull addresses off the connection, and call maybeWriteKeyLog. No behavior change for TCP.
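The shape of the shared filter-and-write step can be sketched roughly as below. This is a simplified illustration, not the code in this PR: `KeyLogContext` is a hypothetical stand-in, addresses are plain strings, filters are exact-match sets (the real filters are IP lists), and the sink is any `std::ostream` rather than Envoy's file abstraction. The sketch also assumes an empty filter matches every address.

```cpp
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for the context class that owns the key log sink
// and the address filters.
class KeyLogContext {
public:
  KeyLogContext(std::ostream* sink, std::set<std::string> local_filter,
                std::set<std::string> remote_filter)
      : sink_(sink), local_filter_(std::move(local_filter)),
        remote_filter_(std::move(remote_filter)) {}

  // Shared filter + write step. Returns true if the line was written.
  bool maybeWriteKeyLog(const std::string& line, const std::string& local_addr,
                        const std::string& remote_addr) const {
    if (sink_ == nullptr) {
      return false; // key log not configured: cheap no-op
    }
    if (!matches(local_filter_, local_addr) || !matches(remote_filter_, remote_addr)) {
      return false; // connection filtered out
    }
    *sink_ << line << '\n';
    return true;
  }

private:
  // Assumption of this sketch: an empty filter matches every address.
  static bool matches(const std::set<std::string>& filter, const std::string& addr) {
    return filter.empty() || filter.count(addr) > 0;
  }

  std::ostream* sink_;
  std::set<std::string> local_filter_;
  std::set<std::string> remote_filter_;
};
```

With this shape, each transport's callback only has to resolve the context and the two addresses, then delegate.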

The SSL_CTX_set_keylog_callback install lives in EnvoyQuicProofSource::OnNewSslCtx, paired with the existing SSL_CTX_set_tlsext_ticket_key_cb install under the same runtime flag. That's where the QUICHE SSL_CTX is constructed, before any handshake on it exists, so installation is race-free and there is no per-connection bookkeeping. When no chain on the listener has key_log configured, BoringSSL still invokes the callback per secret derivation, but maybeWriteKeyLog short-circuits on a null key log file with a single load + compare; the cost is dominated by BoringSSL's own dispatch overhead and is indistinguishable from "not registered" in any production-relevant measurement.

A small preparatory cleanup is included: source/common/quic/envoy_quic_server_session.cc was including envoy_quic_proof_source.h without referencing any proof-source symbol, and the matching BUILD dep on envoy_quic_proof_source_lib existed only to satisfy strict-headers for that vestigial include. Removing the include + dep (and adding the direct quic_server_transport_socket_factory.h include the proof-source header was transitively providing) breaks a cycle that would otherwise prevent envoy_tls_server_handshaker from depending on envoy_quic_server_session_lib to do the cast.

Key design decisions:

  1. Reuse the pinned ServerContextImpl from #42734 (quic: add TLS session ticket resumption support). The handshaker already pins a ServerContextSharedPtr for session-ticket purposes. The key log callback uses the same pin, so an SDS rotation that swaps the factory's active context does not affect in-flight connections: every line a connection emits goes to the key log file the connection was created with, matching TCP TLS behavior.

  2. maybeWriteKeyLog lives on ContextImpl, shared by TCP and QUIC. The key log file and IP-list filter members already live on ContextImpl, so factoring the filter+write step into a member of that class is the natural home. Refactoring ContextImpl::keylogCallback to delegate makes the new method genuinely shared rather than parallel-copied logic.

  3. Addresses come from EnvoyQuicServerSession's Network::Connection facet. Inside keylogCallback, static_cast<EnvoyQuicServerSession*>(handshaker->session())->connectionInfoProvider() returns the same cached envoy address objects the connection was created with — no per-line allocation.

  4. Graceful fallback on null handshaker — same shape as ticketKeyCallback. If the runtime guard is off and the vanilla quic::TlsServerHandshaker is in use, no handshaker is in ex_data, the callback finds null, and silently returns. No ENVOY_BUG, no crash.
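The pinning behavior in decision 1 can be illustrated with a small, self-contained sketch. All names here (`ServerContext`, `ContextFactory`, `Handshaker`, `rotate`) are hypothetical stand-ins, not Envoy classes; the only point is that copying a `std::shared_ptr` at construction keeps the original context alive and in use after the factory swaps its active one.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; these names do not correspond to Envoy classes.
struct ServerContext {
  std::string key_log_path;
};
using ServerContextSharedPtr = std::shared_ptr<const ServerContext>;

// Owns the currently active context and replaces it on secret rotation.
class ContextFactory {
public:
  explicit ContextFactory(std::string path)
      : active_(std::make_shared<const ServerContext>(ServerContext{std::move(path)})) {}

  ServerContextSharedPtr active() const { return active_; }

  void rotate(std::string path) {
    active_ = std::make_shared<const ServerContext>(ServerContext{std::move(path)});
  }

private:
  ServerContextSharedPtr active_;
};

// Per-connection object: copies the shared_ptr once, at construction.
class Handshaker {
public:
  explicit Handshaker(const ContextFactory& factory) : pinned_(factory.active()) {}

  // Always reports the context this connection was created with.
  const std::string& keyLogPath() const { return pinned_->key_log_path; }

private:
  ServerContextSharedPtr pinned_;
};
```

A handshaker constructed before `rotate()` keeps reporting the old path, while one constructed afterwards sees the new path.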

Flow

Server startup (once per SSL_CTX):
  EnvoyQuicProofSource::OnNewSslCtx()
    └─ if quic_session_ticket_support runtime flag on:
         ├─ SSL_CTX_set_tlsext_ticket_key_cb(ssl_ctx, EnvoyTlsServerHandshaker::ticketKeyCallback)
         └─ SSL_CTX_set_keylog_callback     (ssl_ctx, EnvoyTlsServerHandshaker::keylogCallback)

Per connection (only when the runtime flag is on — see "Limitations" below):
  EnvoyQuicCryptoServerStreamFactoryImpl::createEnvoyQuicCryptoServerStream()
    └─ constructs EnvoyTlsServerHandshaker(session, crypto_config, factory.sslCtx(), ...)
         ├─ pins ServerContextSharedPtr
         └─ SSL_set_ex_data(ssl, handshakerExDataIndex(), this)

During handshake (BoringSSL-driven, every secret derivation):
  keylogCallback(ssl, line)
    ├─ handshaker = SSL_get_ex_data(ssl, handshakerExDataIndex())
    ├─ if null or no pinned context: return  // graceful fallback
    ├─ info = static_cast<EnvoyQuicServerSession*>(handshaker->session())->connectionInfoProvider()
    └─ handshaker->pinnedServerContext()->maybeWriteKeyLog(
                                              line,
                                              info.localAddress().get(),
                                              info.remoteAddress().get())
         ├─ if no key log file is configured: return  // shared with TCP TLS
         └─ apply tls_keylog_local_/remote_ filters; write the line

TCP TLS key log (refactored, behavior unchanged):
  ContextImpl::keylogCallback(ssl, line)
    ├─ callbacks = SSL_get_ex_data(ssl, sslSocketIndex())
    ├─ ctx       = SSL_CTX_get_app_data(SSL_get_SSL_CTX(ssl))
    └─ ctx->maybeWriteKeyLog(line,
                             callbacks->connection().connectionInfoProvider().localAddress().get(),
                             callbacks->connection().connectionInfoProvider().remoteAddress().get())
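
The null-handling branch of the "During handshake" flow above can be sketched without any TLS library. `FakeSsl`, `FakeHandshaker`, and `FakeContext` are hypothetical stand-ins for the SSL object, the handshaker stored in ex_data, and the pinned context; the callback only mirrors the `(ssl, line)` shape of a key log callback and is not the code in this PR.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the pinned context; records lines instead of
// writing them to a file.
struct FakeContext {
  std::vector<std::string> written;
  void maybeWriteKeyLog(const char* line) { written.emplace_back(line); }
};

// Hypothetical stand-in for the per-connection handshaker.
struct FakeHandshaker {
  FakeContext* pinned_context = nullptr;
};

// Hypothetical stand-in for SSL; `ex_data` plays the role of the slot that
// would be read back with SSL_get_ex_data.
struct FakeSsl {
  void* ex_data = nullptr;
};

// Mirrors the shape of a key log callback: (ssl, line) -> void.
inline void keylogCallback(const FakeSsl* ssl, const char* line) {
  auto* handshaker = static_cast<FakeHandshaker*>(ssl->ex_data);
  if (handshaker == nullptr || handshaker->pinned_context == nullptr) {
    return; // graceful fallback: nothing is registered for this connection
  }
  handshaker->pinned_context->maybeWriteKeyLog(line);
}
```

Both the "no handshaker in ex_data" case and the "handshaker without a pinned context" case return silently, so the callback is safe to install unconditionally.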

Limitations / Follow-ups

EnvoyTlsServerHandshaker is currently constructed only when the envoy.reloadable_features.quic_session_ticket_support runtime guard from #42734 is on. As a consequence, QUIC key log only fires for listeners on instances where that guard is enabled. This is fine for our debugging use case (we are happy to flip both at the same time), and the key log callback is harmless when not backed by a handshaker. A small follow-up could broaden handshaker creation to also trigger when key_log is configured, decoupling the two features entirely; we left that out of this PR to keep the diff minimal.

Risk Level: Low (no behavior change unless key_log is configured; the TCP TLS path is refactored but observably unchanged)

Testing:

  • Unit: EnvoyTlsServerHandshakerTest.KeylogCallbackNullHandshaker (graceful fallback when no handshaker is in ex_data).
  • E2E: 7 new tests in quic_http_integration_test (KeylogNoFilter, KeylogLocalFilterMatches, KeylogRemoteFilterMatches, KeylogLocalAndRemoteFilterMatch, KeylogLocalFilterNoMatch, KeylogRemoteFilterNoMatch, KeylogNotConfigured), each parameterized on IPv4/IPv6. Positive cases assert the file contains all five TLS 1.3 secrets (CLIENT_HANDSHAKE_TRAFFIC_SECRET, SERVER_HANDSHAKE_TRAFFIC_SECRET, CLIENT_TRAFFIC_SECRET, SERVER_TRAFFIC_SECRET, EXPORTER_SECRET); address-filter-no-match cases assert the file is created but empty; KeylogNotConfigured asserts no file is created when key_log is absent from the proto config.
  • Existing TCP TLS key log tests in test/common/tls/integration/ssl_integration_test continue to pass after the keylogCallback -> maybeWriteKeyLog refactor.
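
As a rough illustration of what the positive E2E cases check, the helper below scans key log contents for a line starting with each of the five labels listed above. It is a sketch, not the assertion used in the integration tests: `containsAllTls13Secrets` is a hypothetical name, and prefix matching is an assumption chosen so that labels carrying a numeric suffix (such as `CLIENT_TRAFFIC_SECRET_0`) still match.

```cpp
#include <array>
#include <sstream>
#include <string>

// Returns true if, for each expected label, some line of `key_log_contents`
// starts with that label. Hypothetical helper for illustration only.
inline bool containsAllTls13Secrets(const std::string& key_log_contents) {
  static const std::array<const char*, 5> kLabels = {{
      "CLIENT_HANDSHAKE_TRAFFIC_SECRET",
      "SERVER_HANDSHAKE_TRAFFIC_SECRET",
      "CLIENT_TRAFFIC_SECRET",
      "SERVER_TRAFFIC_SECRET",
      "EXPORTER_SECRET",
  }};
  for (const char* label : kLabels) {
    bool found = false;
    std::istringstream in(key_log_contents);
    std::string line;
    while (std::getline(in, line)) {
      // rfind(label, 0) == 0 is a "starts with" check.
      if (line.rfind(label, 0) == 0) {
        found = true;
        break;
      }
    }
    if (!found) {
      return false;
    }
  }
  return true;
}
```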

Docs Changes: N/A — uses the existing key_log proto already documented for TCP TLS.

Release Notes: N/A.

Platform Specific Features: N/A

Fixes #35339. Also addresses the CommonTlsContext: key_log bullet of #25418.

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@bellatoris
Contributor Author

Hi @danzh2010 and @RyanTheOptimist — when you have a moment, would you mind taking a look? This is a small follow-up to #42734 that addresses the CommonTlsContext: key_log bullet of #25418, and it reuses the EnvoyTlsServerHandshaker infrastructure you helped shape. Thanks in advance for your time!

@bellatoris
Contributor Author

/retest

Comment thread source/common/tls/context_impl.cc Outdated
Comment thread source/common/quic/envoy_tls_server_handshaker.cc Outdated
Comment thread test/integration/quic_http_integration_test.cc Outdated
- Rename ContextImpl::writeKeyLog to maybeWriteKeyLog to make the
  no-op-when-not-configured contract obvious at the call site.
- Read connection addresses from the EnvoyQuicServerSession's
  Network::Connection facet (passed in at handshaker construction)
  instead of re-converting QUICHE addresses on every key log line.
- Inline the runtime-flag literal in the integration test fixture
  instead of holding a single-use constant.

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@bellatoris
Contributor Author

bellatoris commented May 6, 2026

@danzh2010 addressed all three review comments in the latest commit:

  • ContextImpl::writeKeyLog is renamed to maybeWriteKeyLog.
  • The QUIC key log callback now does static_cast<EnvoyQuicServerSession*>(handshaker->session())->connectionInfoProvider() inline as you suggested.
  • The kQuicSessionTicketRuntimeFlag constant is inlined.

PR description updated to match. PTAL when you have a moment, thanks!

Match danzh2010's review comment literally by doing the
static_cast<EnvoyQuicServerSession*>(session()) inside the keylog
callback rather than at handshaker construction time. The original
deviation existed because the literal pattern triggered a Bazel
cycle (envoy_tls_server_handshaker -> envoy_quic_server_session_lib
-> envoy_quic_proof_source_lib -> envoy_tls_server_handshaker) - but
the cycle had a vestigial edge: envoy_quic_server_session.cc included
envoy_quic_proof_source.h while referencing zero proof source symbols.
Removing that stale include and adding the direct
quic_server_transport_socket_factory.h include it was transitively
providing dissolves the cycle. The envoy_connection_ member, its
constructor parameter, and the construction-time cast in
envoy_quic_crypto_server_stream.cc are no longer needed.

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@bellatoris
Contributor Author

/retest

bellatoris requested a review from danzh2010 on May 6, 2026 05:45


Development

Successfully merging this pull request may close these issues.

Support SSL keylog with QUIC
