Changes to make tests work with docker, and to REVERSE_CONNECTION cluster#4

Open
basundhara-c wants to merge 513 commits into agrawroh:feat-rev-conns-new from basundhara-c:feat-rev-conns-new

Conversation

@basundhara-c
Collaborator

This PR does the following:

  1. Minor nits and test changes to get the tests to work with docker containers.
  2. Add logic to fetch the node ID for a request in the REVERSE_CONNECTION cluster. The REVERSE_CONNECTION cluster therefore always calls the SocketManager's getConnectionSocket() with the node ID and always creates a host for the node. This ensures that requests carrying the cluster ID are load balanced across the nodes of the cluster.
  3. Add logic in the REVERSE_CONNECTION cluster to extract the node_uuid/cluster_uuid from the Host header. The request can now be sent with the Host set to "<cluster_uuid/node_uuid>.tcpproxy.envoy.remote" (the suffix is configurable).
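
The Host-header convention in item 3 can be illustrated with a small standalone sketch. This is not the code from this PR: `extractUuidFromHost` is a hypothetical helper, and stripping a trailing `:port` is an assumption made here for convenience. It only shows one way to split a `<uuid>.<suffix>` value when the suffix is configurable.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper, not the PR's actual code: returns the UUID portion of a
// Host value shaped like "<uuid>.<suffix>", or std::nullopt if the value does
// not end with ".<suffix>" or the UUID portion is empty.
std::optional<std::string> extractUuidFromHost(std::string_view host,
                                               std::string_view suffix) {
  // Assumption: drop an optional ":port" so "abc.<suffix>:8080" also matches.
  if (const auto colon = host.rfind(':'); colon != std::string_view::npos) {
    host = host.substr(0, colon);
  }
  // The host must be strictly longer than "." + suffix to leave room for a UUID.
  if (host.size() <= suffix.size() + 1) {
    return std::nullopt;
  }
  const std::size_t uuid_len = host.size() - suffix.size() - 1;
  if (host[uuid_len] != '.' || host.substr(uuid_len + 1) != suffix) {
    return std::nullopt;
  }
  return std::string(host.substr(0, uuid_len));
}
```

Whether the extracted value names a node or a cluster is a separate lookup that this sketch does not attempt.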

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

@basundhara-c basundhara-c requested a review from agrawroh as a code owner July 23, 2025 10:24
abeyad and others added 29 commits August 20, 2025 12:07
…nvoyproxy#40802)

Adds two runtime guards to experiment with filtering reserved IP ranges
in the IPv6 probing check that were introduced in
envoyproxy#40345:

  1. `envoy_reloadable_features_mobile_ipv6_probe_simple_filtering`
  2. `envoy_reloadable_features_mobile_ipv6_probe_advanced_filtering`

Signed-off-by: Ali Beyad <abeyad@google.com>
This PR updates the documentation to inform new developers how to
troubleshoot crashes when building.

Signed-off-by: Matthew Leon <matthew.leon.tech@gmail.com>
Commit Message: Fix crash in UDP proxy during ENVOY_SIGTERM with active
tunneling sessions
Additional Description:
The UDP proxy crashes during ENVOY_SIGTERM if there are active tunneling
sessions. This issue arises during the destruction of `UdpProxyFilter`,
which attempts to clean up all active sessions. When a
`TunnelingActionSession` is removed, it triggers `resetEncoder` in the
`HttpUpstreamImpl` destructor. This, in turn, calls
`upstream_callbacks_.onUpstreamEvent(event);`, which tries to remove the
session again—leading to a double removal and ultimately a crash.
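
The double-removal pattern described above can be reproduced and guarded against in a few lines. The sketch below is illustrative only and is not the fix from this commit: `Session` and `SessionOwner` are hypothetical stand-ins, and detaching the entry from the container before destroying it is just one common way to make a re-entrant removal harmless.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical stand-in for a tunneling session: its destructor fires a
// callback, mirroring how tearing down the upstream can call back into the owner.
class Session {
public:
  explicit Session(std::function<void()> on_destroy) : on_destroy_(std::move(on_destroy)) {}
  ~Session() {
    if (on_destroy_) {
      on_destroy_();
    }
  }

private:
  std::function<void()> on_destroy_;
};

class SessionOwner {
public:
  // Remove sessions one by one while the map is still valid, so callbacks
  // never observe a half-destroyed container.
  ~SessionOwner() {
    while (!sessions_.empty()) {
      remove(sessions_.begin()->first);
    }
  }

  void add(int id) {
    sessions_.emplace(id, std::make_unique<Session>([this, id] { remove(id); }));
  }

  // Detach the entry from the map before destroying it. The re-entrant
  // remove(id) issued from ~Session then finds nothing and returns early.
  void remove(int id) {
    auto it = sessions_.find(id);
    if (it == sessions_.end()) {
      return;
    }
    std::unique_ptr<Session> doomed = std::move(it->second);
    sessions_.erase(it);
    ++removals_;
    doomed.reset(); // ~Session runs here and calls remove(id) again.
  }

  std::size_t size() const { return sessions_.size(); }
  int removals() const { return removals_; }

private:
  std::map<int, std::unique_ptr<Session>> sessions_;
  int removals_ = 0;
};
```

With this ordering, each session is counted as removed exactly once even though its destructor asks for removal a second time.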

<img width="1533" height="777" alt="image"
src="https://github.com/user-attachments/assets/74333e9c-3510-4c97-bd6f-2424c02b26eb"
/>

Risk Level: low
Testing: integration test
Docs Changes: N/A
Release Notes:
Platform Specific Features: N/A

---------

Signed-off-by: Issa Abu Kalbein <iabukalbein@microsoft.com>
Co-authored-by: Issa Abu Kalbein <iabukalbein@microsoft.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
…ide (envoyproxy#40763)

Add detailed step-by-step guide for setting up Envoy local development
environment using VSCode/Cursor dev containers. The guide includes:

- Complete prerequisites and system requirements
- Troubleshooting section for Docker Desktop issues  
- Debug configuration with LLDB setup for macOS compatibility
- Visual examples with image placeholders for key setup steps
- Verification steps for successful Envoy startup
- Key file locations and development workflow

This addresses the gap in comprehensive documentation for developers
who are not C++ experts and provides a complete walkthrough from
initial setup to debugging Envoy source code locally.

Tested on Apple M1 Max and Intel Core i9 machines running macOS 
Sequoia (version 15.5).

Additional Description: This PR significantly enhances the existing dev
container documentation by adding a comprehensive setup guide that walks
through the entire process from repository cloning to successful
debugging. The original documentation was brief and referenced external
guides, but this enhancement provides a complete, self-contained
walkthrough with troubleshooting solutions for common issues encountered
during setup.

The guide is particularly valuable for developers who are not C++
experts and may be unfamiliar with Bazel build systems. It includes
visual documentation with 16 screenshot images showing key steps and
verification points.

Risk Level: Low
Testing: 
- Manually tested complete setup process on Apple M1 Max and Intel Core
i9 machines
- Verified on macOS Sequoia (version 15.5)
- Tested with both VSCode and Cursor IDEs
- Validated all Docker Desktop configuration steps
- Confirmed LLDB debugger setup and functionality
- Verified all admin endpoints and proxy functionality
- All pre-push validation checks passed

Docs Changes: Enhanced .devcontainer/README.md with comprehensive setup
guide including:
- Detailed step-by-step instructions
- Troubleshooting section for Docker Desktop issues
- Debug configuration with both GDB and LLDB options
- Visual documentation with 16 supporting images
- Complete verification steps
- Development workflow guidelines

Release Notes: N/A
Platform Specific Features: N/A - Documentation changes only, though the
guide specifically addresses macOS compatibility issues with debugger
configuration.

---------

Signed-off-by: prashanth.chaitanya <prashanth.chaitanya@salesforce.com>
Co-authored-by: prashanth.chaitanya <prashanth.chaitanya@salesforce.com>
…hare state (envoyproxy#40497)

Commit Message:
Persist RLQS client state & bucket assignment+usage caches across filter
config updates (e.g. via LDS). This is done by aggregating
rate_limit_quota filters to share RLQS global clients & bucket caches
based on their configured RLQS server destination & domain, instead of
creating a new global client + cache per filter factory.

Now, the TlsStores (global client + tls slots) are referenced via
weak_ptrs in a static map & owned by filter factories. If all filter
factories drop and stop owning a TlsStore, its shared_ptr will trigger
destruction of the global resources & index.
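
The ownership model in the paragraph above (a static index of `weak_ptr`s, with the owning `shared_ptr`s held by filter factories) can be sketched independently of Envoy. `Store` and `StoreIndex` below are hypothetical names, the key format is made up, and the explicit `collectGarbage()` call stands in for whatever cleanup mechanism the real code uses.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the shared per-destination state (global client
// plus caches); the real TlsStore holds far more.
struct Store {
  explicit Store(std::string k) : key(std::move(k)) {}
  std::string key;
};

// Index of non-owning references. Not thread-safe: callers are assumed to
// touch it from a single (main) thread only.
class StoreIndex {
public:
  // Returns the live store for `key`, or creates one if none is alive.
  std::shared_ptr<Store> getOrCreate(const std::string& key) {
    if (auto it = index_.find(key); it != index_.end()) {
      if (auto existing = it->second.lock()) {
        return existing;
      }
    }
    auto created = std::make_shared<Store>(key);
    index_[key] = created;
    return created;
  }

  // Drops entries whose stores are no longer owned by anyone and returns the
  // number of entries removed.
  std::size_t collectGarbage() {
    std::size_t removed = 0;
    for (auto it = index_.begin(); it != index_.end();) {
      if (it->second.expired()) {
        it = index_.erase(it);
        ++removed;
      } else {
        ++it;
      }
    }
    return removed;
  }

  std::size_t size() const { return index_.size(); }

private:
  std::map<std::string, std::weak_ptr<Store>> index_;
};
```

Two callers asking for the same key share one `Store`; once both release it, the index entry is stale and the next `collectGarbage()` pass removes it.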

This persistence has a positive side-effect of centralizing usage
reporting + assignment generation from the RLQS server's perspective,
while still allowing for separation between filter configs via the
domain field if needed, e.g. if 2 filter chains send traffic to
different upstreams & want separate rate limit assignments for each.

Additional Description:
- The design for the global, static map came from the registration model
for filter factory cbs (//envoy/registry/registry.h, FactoryRegistry).
- Updates to the global map (removals or additions of indices) are not
thread-safe, and so are always done from the main thread.
- Additions occur during filter factory creation, if needed, which is
handled by the main thread (e.g. during startup or when triggered by an
LDS update).
- A garbage collection timer (every 10s) on the main thread handles
removal of any map indices that are no longer referenced by any active
filter factories.
- Map indexing does not account for differences in gRPC client
configurations (excluding RLQS server destination), such as differences
in timeouts. The global RLQS client will be created according to the
first-seen configuration.

Risk Level: Minimal to moderate (thread-safety warrants scrutiny but
changes are to a WIP filter)
Testing: integration & manual testing (config changes + filter
replacements shown to not interrupt rate limiting).
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional [API
Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):]

---------

Signed-off-by: Brian Surber <bsurber@google.com>
…rTimeout (envoyproxy#40822)


The comments on ContinueOnListenerTimeout don't make sense:
1. The max length has changed to 16K.
2. The stated reason is also incorrect.

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional [API
Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):]

Signed-off-by: Boteng Yao <boteng@google.com>
)

Commit Message: aws: preserve plus sign inside query parameters

Additional Description:
Ambiguity in the way query parameters containing literal plus signs
could have been encoded leads to a signature verification failure.
We now correctly preserve the intent behind plus signs when they arrive
in the canonicalizer logic. A raw plus becomes a space, an encoded plus
(%2B) stays as a %2B and an encoded space (%20) stays as a %20.
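
The three rules above can be expressed as a tiny function over a still-encoded query value. This is a sketch under assumptions rather than the canonicalizer from this commit: `normalizePlusSigns` is a hypothetical name, and emitting the space in its percent-encoded form `%20` is a choice made here for illustration.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: rewrites a still-encoded query value so that a raw '+'
// is treated as a space (emitted here as "%20", an assumption of this sketch),
// while existing escapes such as "%2B" and "%20" pass through unchanged.
std::string normalizePlusSigns(std::string_view encoded_value) {
  std::string out;
  out.reserve(encoded_value.size());
  for (const char c : encoded_value) {
    if (c == '+') {
      out += "%20";
    } else {
      out += c;
    }
  }
  return out;
}
```

Because percent-escapes are never decoded here, an encoded plus and an encoded space keep their original spelling.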

Risk Level: Low
Testing: Unit
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue] envoyproxy#40523
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional [API
Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):]

---------

Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Commit Message: aws: assumerole missing protobuf validations
Additional Description: Adds missing protobuf validations that were
unintentionally skipped, plus test cases to match
Risk Level: Low
Testing: Unit
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional [API
Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):]

---------

Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <108375408+nbaws@users.noreply.github.com>
This is needed to support the new `request_body_buffer_limit` route
configuration.

WatermarkBufferTest.OverflowWatermarkDisabledOnVeryHighValue had to be
refactored to not use allocations, as that is no longer practical with a
64-bit limit and a 32-bit multiplier.

Risk Level: none
Testing: unit tests
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

---------

Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
…#40169)

## Description

This PR adds support for per-route gRPC service override in the
`ext_authz` HTTP filter, allowing different routes to use different
external authorization backends. Routes can now specify a
different authorization service by configuring `grpc_service` in the
per-route `check_settings`.

---

Commit Message: ext_authz: add grpc_service field on the per-route
filter
Additional Description: Add a new `grpc_service` field on the per-route
ExtAuthZ filter to be able to override the AuthService backend on a
per-route basis.
Risk Level: Low
Testing: Added Unit & Integration Tests
Docs Changes: Added
Release Notes: Added

---------

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>
…nvoyproxy#40747)

Commit Message: generic proxy: add onDownstreamConnected() callback to
ServerCodec
Additional Description:
This adds a new method ServerCodec::onDownstreamConnected() that is
called when the downstream connection is established. This can be used
for initialization steps that require the connection returned by
ServerCodecCallbacks::connection() to have a value and be in a connected
state.

Risk Level: Low; the new method has an empty default implementation, so
as not to break existing ServerCodec implementations.
Testing: Updated unit tests and mocks
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional [API
Considerations](https://github.com/envoyproxy/envoy/blob/main/api/review_checklist.md):]

Signed-off-by: Joe Kralicky <joekralicky@gmail.com>
… for ratelimit filter (envoyproxy#40760)

Signed-off-by: pcrao <pcrao@google.com>
…ections (envoyproxy#40801)

Commit Message: Introduces new enum value
`ConnectionPool::DrainBehavior::DrainExistingNonMigratableConnections`.
When this behavior is used, HTTP/3 connection pools will only drain
existing connections if QUIC connection migration is not enabled.

The mobile engine is updated to use this new drain behavior when DNS
refreshing is disabled on network changes, allowing migratable QUIC
connections to persist across network transitions.

Additional Description: also add QUICHE migration code into build target
Risk Level: low, the new behavior is disabled
Testing: new unit test pass
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A

---------

Signed-off-by: Dan Zhang <danzh@google.com>
Co-authored-by: Dan Zhang <danzh@google.com>
…rawroh#4… (envoyproxy#40836)

…0169)"

This reverts commit 8c03b2a.

Signed-off-by: Ryan Northey <ryan@synca.io>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
dependabot bot and others added 26 commits September 23, 2025 16:58
Bumps the contrib-golang group in /contrib/golang/router/cluster_specifier/test/test_data/simple with 1 update: google.golang.org/protobuf.


Updates `google.golang.org/protobuf` from 1.36.6 to 1.36.7

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-version: 1.36.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: contrib-golang
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps the contrib-golang group in /contrib/golang/upstreams/http/tcp/test/test_data with 1 update: google.golang.org/protobuf.


Updates `google.golang.org/protobuf` from 1.36.6 to 1.36.7

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-version: 1.36.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: contrib-golang
...

Signed-off-by: dependabot[bot] <support@github.com>
…netns (envoyproxy#40933)

When creating a listener socket in another network namespace, a nested
`absl::StatusOr<>` is returned. The outer status shows the result of the
attempt to switch network namespaces and the inner status is the result
of the creation of a listener socket.

This patch fixes a bug where we checked the inner status when we should
have been checking the outer status. This results in a segfault because
the outer status is dereferenced without first checking if it was OK.

We now check the correct results and additional comments have been added
to improve clarity of this code.
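
The check order described above (outer status first, inner status second) can be shown without any Envoy or Abseil dependency. In the sketch below, `Result<T>` is a minimal hypothetical stand-in for `absl::StatusOr<T>`, and `SocketFd` and `describe` are illustrative names, not code from this patch.

```cpp
#include <optional>
#include <string>
#include <utility>

// Minimal stand-in for absl::StatusOr<T>, used only to keep this sketch free of
// external dependencies: it holds either a value or an error message.
template <typename T> class Result {
public:
  static Result ok(T value) {
    Result r;
    r.value_ = std::move(value);
    return r;
  }
  static Result error(std::string message) {
    Result r;
    r.error_ = std::move(message);
    return r;
  }
  bool isOk() const { return value_.has_value(); }
  const T& value() const { return *value_; }
  const std::string& message() const { return error_; }

private:
  Result() = default;
  std::optional<T> value_;
  std::string error_;
};

using SocketFd = int; // Hypothetical placeholder for a listener socket.

// The outer result reports the namespace switch; the inner one reports socket
// creation. The inner result may only be touched after the outer one is OK.
std::string describe(const Result<Result<SocketFd>>& outer) {
  if (!outer.isOk()) {
    return "netns switch failed: " + outer.message();
  }
  const Result<SocketFd>& inner = outer.value();
  if (!inner.isOk()) {
    return "socket creation failed: " + inner.message();
  }
  return "socket fd " + std::to_string(inner.value());
}
```

Reading `outer.value()` before confirming `outer.isOk()` is the kind of mistake the commit message describes.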


Risk Level: Low
Testing: New unit test and fuzz test case
Docs Changes: n/a
Release Notes: Added
Platform Specific Features: Linux only

---------

Signed-off-by: Tony Allen <txallen@google.com>
Created by Envoy dependency bot for @phlax

Fix envoyproxy#41183



Signed-off-by: dependency-envoy[bot]
<148525496+dependency-envoy[bot]@users.noreply.github.com>

Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Co-authored-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Signed-off-by: Ryan Northey <ryan@synca.io>
Created by Envoy dependency bot for @phlax

Fix envoyproxy#41184



Signed-off-by: dependency-envoy[bot]
<148525496+dependency-envoy[bot]@users.noreply.github.com>

Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Co-authored-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
…oyproxy#41174)

Prior to this PR there were issues when using OD-CDS without cds_config.
This PR converts OD-CDS over ADS to use the new XdstpOdCdsApiImpl (that
is also used for xDS-federation based OD-CDS subscriptions).


Signed-off-by: Adi Suissa-Peleg <adip@google.com>
…y#40958)

To close envoyproxy#40892. In the
previous implementation, at the stream-complete phase, the substitution
formatter couldn't work as expected for the limit request. This is because
no related context was provided to the gRPC client, to avoid potential
lifetime problems.

This new implementation adds a detach() method to `Grpc::AsyncRequest`
which cleans up the context (parent span, stream info, and so on).
The rate limit filter now provides the stream info to the gRPC client at
the stream-complete phase, and detach() then cleans it up to avoid a
potential dangling reference.

Risk Level: low.
Testing: unit.
Docs Changes: n/a.
Release Notes: added.

---------

Signed-off-by: WangBaiping <wbphub@gmail.com>
Signed-off-by: code <wbphub@gmail.com>
…oyproxy#41199)

## Description

This PR enables the use of `NetworkNamespaceInput` in the RBAC filter to
be able to match on the incoming network namespace and do RBAC
enforcement based on that.

---

**Commit Message:** rbac: enable use of NetworkNamespaceInput in network
RBAC filter
**Additional Description:** Enable `NetworkNamespaceInput` to be used in
the Network RBAC filter.
**Risk Level:** Low
**Testing:** Added Unit Tests
**Docs Changes:** Added
**Release Notes:** Added

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>
## Description

This PR adds a new terminal network filter which accepts or rejects the
incoming reverse connection requests sent from the reverse tunnels
downstream connection interface. It can optionally validate the Node ID
and Cluster ID values that come as part of the handshake protocol using
the Envoy filter state metadata. Filter state could be populated by
other network filters like "Set-Filter State", etc.

This new filter could be configured with the type URL
`type.googleapis.com/envoy.extensions.filters.network.reverse_tunnel.v3.ReverseTunnel`.

---

**Commit Message:** filter: added reverse tunnel terminal network filter
**Additional Description:** Adds a new terminal network filter for
accepting the incoming reverse tunnel handshake requests from the
downstream instances.
**Risk Level:** Low
**Testing:** Added Unit & Integration Tests
**Docs Changes:** Added
**Release Notes:** Added

---------

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Co-authored-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
This PR has some nits and styling fixes to make the GeoIP filter docs
consistent with the rest of the codebase.

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>
…nvoyproxy#41158)

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Ryan Northey <ryan@synca.io>
…oyproxy#40403)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Commit Message: formatter: removed legacy router formatter support
Additional Description:

Three years ago, in envoyproxy#21932, we unified all the formatters into the
substitution formatter and added a warning log for the legacy
UPSTREAM_METADATA and DYNAMIC_METADATA.

Now, I think it's time to finally remove it. This PR removes legacy
header formatter support for `%DYNAMIC_METADATA(["namespace", "key",
...])%` and `%UPSTREAM_METADATA(["namespace", "key", ...])%`. Please use
`%DYNAMIC_METADATA(namespace:key:...)%` and
`%UPSTREAM_METADATA(namespace:key:...)%` as alternatives.
This change can be reverted temporarily by setting the runtime guard
`envoy.reloadable_features.remove_legacy_route_formatter` to `false`.

Risk Level: low.
Testing: unit.
Docs Changes: n/a.
Release Notes: added.

---------

Signed-off-by: WangBaiping <wbphub@gmail.com>
…oyproxy#41207)

A TCP read can be partial; improve the integration tests to account for this and eliminate flakiness.
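
The general remedy for partial reads is to loop until the expected number of bytes has arrived. The sketch below illustrates that idea only and is not the test change from this commit: `ReadFn` and `readExactly` are hypothetical names, and the reader is modeled as a plain callback so the example stays self-contained.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical reader signature: returns up to `max_bytes` of newly arrived
// data, or an empty string once the peer has closed the connection.
using ReadFn = std::function<std::string(std::size_t max_bytes)>;

// Keeps reading until `expected_size` bytes have been collected or the stream
// ends, instead of assuming one read returns the whole payload.
std::string readExactly(const ReadFn& read, std::size_t expected_size) {
  std::string data;
  while (data.size() < expected_size) {
    const std::string chunk = read(expected_size - data.size());
    if (chunk.empty()) {
      break; // Connection closed early; return what was received.
    }
    data += chunk;
  }
  return data;
}
```

A test built on this pattern compares the accumulated data against the expected payload rather than the result of a single read.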

Risk Level: low

---------

Signed-off-by: Boteng Yao <boteng@google.com>
…ks (envoyproxy#41114)

Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Ryan Northey <ryan@synca.io>
Fix envoyproxy#41162

Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Signed-off-by: Ryan Northey <ryan@synca.io>
Replace deprecated absl::MutexLock::MutexLock(Mutex*) constructor with
absl::MutexLock::MutexLock(Mutex&)

Risk Level: none
Testing: unit tests
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Yan Avlasov <yavlasov@google.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
… I/O

Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>
…unit tests

Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>

Signed-off-by: Basundhara Chakrabarty <basundhara.c@nutanix.com>
agrawroh added a commit that referenced this pull request Dec 16, 2025
…voyproxy#42554)

## Description

Today, when a filesystem watch callback returns a non-OK status or
throws an exception, the error gets propagated to `FileEventImpl` which
uses `THROW_IF_NOT_OK`.

Since there's no exception handler in the `libevent` loop, this causes
`std::terminate` to be called, which crashes Envoy.

**Stack Trace:**
```
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.119][234999][warning][misc] [source/common/protobuf/message_validator_impl.cc:23] Deprecated field: type envoy.config.core.v3.HeaderValueOption Using deprecated option 'envoy.config.core.v3.HeaderValueOption.append' from file base.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/version_history/version_history for details. If continued use of this field is absolutely necessary, see https://www.envoyproxy.io/docs/envoy/latest/configuration/operations/runtime#using-runtime-overrides-for-deprecated-features for how to apply a temporary and highly discouraged override.
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.120][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener '0_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.123][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener '1_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.126][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener '2_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.127][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener '3_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.128][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener '4_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.130][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener '5_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.132][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener '6_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.134][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener 'mtls_untrusted_regional_transparent_tunnel_listener'
Dec 11 00:11:26 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:26.135][234999][info][upstream] [source/common/listener_manager/lds_api.cc:109] lds: add/update listener 'mtls_app_trusted_regional_transparent_tunnel_listener'
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.097][234999][critical][main] [source/exe/terminate_handler.cc:36] std::terminate called! Uncaught unknown exception, see trace.
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.097][234999][critical][backtrace] [./source/server/backtrace.h:113] Backtrace (use tools/stack_decode.py to get line numbers):
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.097][234999][critical][backtrace] [./source/server/backtrace.h:114] Envoy version: 5eaabe0bbaad4612cb85473cd151039d8f1a2760/1.34.2-dev/Clean/RELEASE/BoringSSL
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.097][234999][critical][backtrace] [./source/server/backtrace.h:116] Address mapping: 558d8afcc000-558d8ee2f000 /usr/local/bin/envoy
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.100][234999][critical][backtrace] [./source/server/backtrace.h:123] #0: [0x558d8da5784f]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.102][234999][critical][backtrace] [./source/server/backtrace.h:123] #1: [0x558d8edd8673]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.104][234999][critical][backtrace] [./source/server/backtrace.h:123] #2: [0x558d8e3b120b]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.106][234999][critical][backtrace] [./source/server/backtrace.h:121] #3: Envoy::Filesystem::WatcherImpl::onInotifyEvent() [0x558d8e3990c3]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.108][234999][critical][backtrace] [./source/server/backtrace.h:123] #4: [0x558d8e3998d2]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.109][234999][critical][backtrace] [./source/server/backtrace.h:123] #5: [0x558d8e393de6]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.111][234999][critical][backtrace] [./source/server/backtrace.h:121] #6: Envoy::Event::FileEventImpl::mergeInjectedEventsAndRunCb() [0x558d8e394eb5]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.113][234999][critical][backtrace] [./source/server/backtrace.h:123] #7: [0x558d8e710823]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.115][234999][critical][backtrace] [./source/server/backtrace.h:121] #8: event_base_loop [0x558d8e70d4a1]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.117][234999][critical][backtrace] [./source/server/backtrace.h:121] #9: Envoy::Server::InstanceBase::run() [0x558d8daa2b99]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.119][234999][critical][backtrace] [./source/server/backtrace.h:121] #10: Envoy::MainCommonBase::run() [0x558d8da4327a]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.121][234999][critical][backtrace] [./source/server/backtrace.h:121] #11: Envoy::MainCommon::main() [0x558d8da44234]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.123][234999][critical][backtrace] [./source/server/backtrace.h:121] #12: main [0x558d8afcc11c]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.123][234999][critical][backtrace] [./source/server/backtrace.h:123] #13: [0x7f1d54073efb]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.123][234999][critical][backtrace] [./source/server/backtrace.h:121] #14: __libc_start_main [0x7f1d54073fbb]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:121] #15: _start [0x558d8afcc02e]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:129] Caught Aborted, suspect faulting address 0x395f7
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:113] Backtrace (use tools/stack_decode.py to get line numbers):
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:114] Envoy version: 5eaabe0bbaad4612cb85473cd151039d8f1a2760/1.34.2-dev/Clean/RELEASE/BoringSSL
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:116] Address mapping: 558d8afcc000-558d8ee2f000 /usr/local/bin/envoy
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:123] #0: [0x7f1d54089c90]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:121] #1: gsignal [0x7f1d54089bde]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.124][234999][critical][backtrace] [./source/server/backtrace.h:121] #2: abort [0x7f1d54072832]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.126][234999][critical][backtrace] [./source/server/backtrace.h:123] #3: [0x558d8da5785c]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.128][234999][critical][backtrace] [./source/server/backtrace.h:123] #4: [0x558d8edd8673]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.129][234999][critical][backtrace] [./source/server/backtrace.h:123] #5: [0x558d8e3b120b]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.129][234999][critical][backtrace] [./source/server/backtrace.h:121] #6: Envoy::Filesystem::WatcherImpl::onInotifyEvent() [0x558d8e3990c3]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.131][234999][critical][backtrace] [./source/server/backtrace.h:123] #7: [0x558d8e3998d2]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.133][234999][critical][backtrace] [./source/server/backtrace.h:123] #8: [0x558d8e393de6]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.133][234999][critical][backtrace] [./source/server/backtrace.h:121] #9: Envoy::Event::FileEventImpl::mergeInjectedEventsAndRunCb() [0x558d8e394eb5]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:123] #10: [0x558d8e710823]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:121] #11: event_base_loop [0x558d8e70d4a1]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:121] #12: Envoy::Server::InstanceBase::run() [0x558d8daa2b99]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:121] #13: Envoy::MainCommonBase::run() [0x558d8da4327a]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:121] #14: Envoy::MainCommon::main() [0x558d8da44234]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:121] #15: main [0x558d8afcc11c]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:123] #16: [0x7f1d54073efb]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:121] #17: __libc_start_main [0x7f1d54073fbb]
Dec 11 00:11:30 dbletE9433T node-envoy[234999]: [2025-12-11 00:11:30.135][234999][critical][backtrace] [./source/server/backtrace.h:121] #18: _start [0x558d8afcc02e]
```

In this change, we are making the `inotify` and `kqueue` watchers handle
callback errors gracefully by catching any exceptions using
`TRY_ASSERT_MAIN_THREAD`, logging errors instead of propagating them and
always returning the `OkStatus` to the event loop.
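
The behavior described above (catch everything, log, and always report success to the event loop) can be sketched in plain C++. This is not the change itself: it does not use `TRY_ASSERT_MAIN_THREAD` or `absl::Status`, and `Status` and `runWatchCallbackSafely` are hypothetical stand-ins chosen to keep the example dependency-free.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <string>

// Hypothetical, dependency-free stand-in for a status value: an empty string
// means OK, anything else is an error message.
using Status = std::string;

// Runs a watch callback and never lets a failure escape to the caller.
// Errors are logged and swallowed, and the function always reports OK.
Status runWatchCallbackSafely(const std::function<Status()>& callback) {
  try {
    const Status status = callback();
    if (!status.empty()) {
      std::cerr << "watch callback failed: " << status << '\n';
    }
  } catch (const std::exception& e) {
    std::cerr << "watch callback threw: " << e.what() << '\n';
  } catch (...) {
    std::cerr << "watch callback threw an unknown exception\n";
  }
  return Status{};
}
```

The trade-off is that a failing callback no longer stops the process, so the log line becomes the only signal that a watch update was dropped.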

---

**Commit Message:** filesystem: Fix crash when watch callback returns
error or throws
**Additional Description:** Make `inotify` and `kqueue` watchers handle
callback errors gracefully.
**Risk Level:** Low
**Testing:** CI
**Docs Changes:** N/A
**Release Notes:** N/A

---------

Signed-off-by: Rohit Agrawal <rohit.agrawal@salesforce.com>
Signed-off-by: Rohit Agrawal <rohit.agrawal@databricks.com>