Commit 554e4ea

Blog: Subnet labels and external traffic (#29)
* Subnet labels external traffic
* Mention decentralized subnet labels config
* Text updates
* Use EXT:, do not mention editing flowmetrics as that's done directly in the examples
* refresh date
* update EXT/slice
* no format in admonition blocks
* address feedback
* Update content/posts/2026-02-20-subnet-labels/index.md
  Co-authored-by: Leandro Beretta <lea.beretta@gmail.com>
* Update content/posts/2026-02-20-subnet-labels/index.md
  Co-authored-by: Leandro Beretta <lea.beretta@gmail.com>
* Update content/posts/2026-02-20-subnet-labels/index.md
  Co-authored-by: Leandro Beretta <lea.beretta@gmail.com>
* add reviewers

---------

Co-authored-by: Leandro Beretta <lea.beretta@gmail.com>
1 parent 8998ca8 commit 554e4ea


10 files changed

+201
-0
lines changed

Lines changed: 201 additions & 0 deletions
@@ -0,0 +1,201 @@
---
layout: :theme/post
title: "Identifying cluster external traffic with subnet labels"
description: "Check how NetObserv can help you understand the traffic to cluster-external workloads and services"
tags: subnet,labels,cardinality,external,metrics,flowmetrics
authors: [jotak]
---

_Thanks to Mike Fiedler, Leandro Beretta and Mehul Modi for reviewing._

People who install NetObserv are often looking primarily for a solution that monitors the traffic to and from the cluster. For them, in-cluster traffic monitoring comes only as a secondary consideration. By default, however, NetObserv does not process external traffic in any particular way: internal and external traffic are both just regular network traffic.

The eBPF agents know nothing about the network topology. They see packets, extract the IP addresses and metadata, do some aggregation, and forward the result to `flowlogs-pipeline`. The agents can operate in multiple contexts: even though we're mostly interested in Kubernetes here, you can run them on your Linux PC if you wish. They are context-agnostic for everything above the host networking stack.

`flowlogs-pipeline` is context-aware, depending on its configuration. It's the one that knows about Kubernetes, and it uses that knowledge to enrich the network flows with pod names, namespaces, and so on.

But again, it doesn't know with absolute certainty which IPs should be considered cluster-internal and which should be considered cluster-external. It's actually the NetObserv operator that can provide this information, based on the `FlowCollector` configuration, via `spec.processor.subnetLabels`.

## What are subnet labels?

As per [the doc](https://github.com/netobserv/network-observability-operator/blob/main/docs/FlowCollector.md#flowcollectorspecprocessorsubnetlabels), subnet labels "allow to define custom labels on subnets and IPs or to enable automatic labelling of recognized subnets in OpenShift, which is used to identify cluster external traffic. When a subnet matches the source or destination IP of a flow, a corresponding field is added: `SrcSubnetLabel` or `DstSubnetLabel`."

In OpenShift, NetObserv checks the Cluster Network Operator configuration to find out which CIDRs are configured for Pods, Services and Nodes, and then configures `flowlogs-pipeline` accordingly. You can verify this in the generated ConfigMap:

```bash
$ kubectl get cm -n netobserv flowlogs-pipeline-config -ojsonpath='{.data.config\.json}' | jq '.parameters[1].transform.network.subnetLabels'
[
  {
    "cidrs": [
      "10.128.0.0/14"
    ],
    "name": "Pods"
  },
  {
    "cidrs": [
      "172.30.0.0/16"
    ],
    "name": "Services"
  },
  {
    "cidrs": [
      "10.0.0.0/16"
    ],
    "name": "Machines"
  }
]
```

Those are the same values that you can find in the `cluster` resource from `networks.config.openshift.io`.

When you open the Console plugin and configure columns to show the subnet labels, this is what you get:

![Subnet labels by default](./subnet-labels-default.png)

Every time `flowlogs-pipeline` has to process a network flow, it checks whether the source and destination IPs belong to any of the defined subnets, and if so, it adds the related label to the flow.

This is not just for OpenShift. If you're not running on OpenShift, or if you want to customize the default OpenShift setup, you can configure different CIDRs. For instance, to add more machine networks, you can write in `FlowCollector`:

```yaml
spec:
  processor:
    subnetLabels:
      openShiftAutoDetect: true # (this is ignored when not running on OpenShift)
      customLabels:
      - cidrs:
        - 10.0.0.0/16
        - 10.1.0.0/16
        - 10.2.0.0/16
        name: "Machines"
```
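
To make the matching step more concrete, here is a minimal Python sketch of how an IP could be checked against a table of labelled CIDRs. It is an illustration only, not the `flowlogs-pipeline` implementation: the `SUBNET_LABELS` table, the `subnet_label` and `enrich` helpers, and the `SrcAddr`/`DstAddr` flow keys are assumptions made for this example, and the "first match wins" rule for overlapping CIDRs is a simplification.

```python
import ipaddress

# Hypothetical label table, mirroring the CIDRs used in the examples above.
SUBNET_LABELS = [
    {"name": "Pods", "cidrs": ["10.128.0.0/14"]},
    {"name": "Services", "cidrs": ["172.30.0.0/16"]},
    {"name": "Machines", "cidrs": ["10.0.0.0/16", "10.1.0.0/16", "10.2.0.0/16"]},
]


def subnet_label(ip, labels=SUBNET_LABELS):
    """Return the name of the first label whose CIDRs contain ip, or None."""
    addr = ipaddress.ip_address(ip)
    for entry in labels:
        for cidr in entry["cidrs"]:
            if addr in ipaddress.ip_network(cidr):
                return entry["name"]
    return None


def enrich(flow):
    """Return a copy of flow with SrcSubnetLabel / DstSubnetLabel added on match."""
    enriched = dict(flow)
    # Assumed field names for the flow's source and destination addresses.
    for addr_key, label_key in (("SrcAddr", "SrcSubnetLabel"),
                                ("DstAddr", "DstSubnetLabel")):
        label = subnet_label(flow[addr_key])
        if label is not None:
            enriched[label_key] = label
    return enriched
```

With this table, a flow from `10.128.3.4` to `52.95.156.10` would get a `SrcSubnetLabel` of `Pods` and no `DstSubnetLabel` at all, which is exactly the situation the next section builds on.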

## How does that help with external traffic?

You can identify external traffic based on subnet labels, or the absence thereof. In this default configuration, all the cluster network entities are expected to be covered by these three subnets: Pods, Services and Machines. So everything else is external.

In the Console plugin, you can for example filter for an empty Destination Subnet Label by setting an empty string in the filter:

![Filtering for empty destination subnet label](./filter-empty-subnet.png)

This gives you all the traffic to external workloads or services.

You can also create `FlowMetrics` resources dedicated to external traffic. We provide some examples that should work out of the box with the default subnet labels:

```bash
kubectl apply -n netobserv -f https://raw.githubusercontent.com/netobserv/network-observability-operator/refs/heads/main/config/samples/flowmetrics/cluster_external_egress_traffic.yaml
kubectl apply -n netobserv -f https://raw.githubusercontent.com/netobserv/network-observability-operator/refs/heads/main/config/samples/flowmetrics/cluster_external_ingress_traffic.yaml
```

(More examples are available [here](https://github.com/netobserv/network-observability-operator/tree/main/config/samples/flowmetrics), including for external traffic latency.)

These metrics rely on the absence of subnet labels to track external traffic. They also treat subnet labels prefixed with `EXT:` as external traffic. If you look at their definitions, you'll see these rules expressed as follows:

```yaml
filters:
- field: DstSubnetLabel
  matchType: Absence
- field: DstSubnetLabel
  matchType: MatchRegex
  value: "^EXT:.*"
```

{#admon title="Info"}
In FlowMetrics, when there are several filters for the same key, those filters are OR'ed, i.e. the match is satisfied if at least one is satisfied. Filters on different keys are AND'ed.
{/}
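
The combination rule can be sketched in a few lines of Python. This is a simplified model written for this post, not the operator's code: `filter_matches` and `flow_matches` are hypothetical helpers, only the two match types shown above are handled, and treating an empty string like a missing field for `Absence` is an assumption.

```python
import re
from collections import defaultdict

# The two filters from the sample definition above, as plain dicts.
FILTERS = [
    {"field": "DstSubnetLabel", "matchType": "Absence"},
    {"field": "DstSubnetLabel", "matchType": "MatchRegex", "value": "^EXT:.*"},
]


def filter_matches(flt, flow):
    """Evaluate a single filter against a flow (a dict of field -> value)."""
    value = flow.get(flt["field"])
    if flt["matchType"] == "Absence":
        # Assumption: a missing field and an empty string both count as absent.
        return value is None or value == ""
    if flt["matchType"] == "MatchRegex":
        return value is not None and re.search(flt["value"], value) is not None
    raise ValueError(f"match type not covered by this sketch: {flt['matchType']}")


def flow_matches(filters, flow):
    """OR the filters that share a field, then AND the per-field results."""
    by_field = defaultdict(list)
    for flt in filters:
        by_field[flt["field"]].append(flt)
    return all(
        any(filter_matches(flt, flow) for flt in group)
        for group in by_field.values()
    )
```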

In Prometheus, you can query them with the following PromQL:

```
topk(10, sum(rate(netobserv_cluster_external_egress_bytes_total{ SrcK8S_Namespace!="" }[2m])) by (SrcK8S_Namespace, SrcK8S_OwnerName))
```

![Prometheus/promql for external egress traffic](./external-promql.png)

Or in the OpenShift Console, navigate to Observe > Dashboards > NetObserv / Main:

![Dashboard external traffic](./dashboard-external-traffic.png)

## Going further: identifying the external workloads

All good so far. However, this doesn't answer the question: where is this traffic flowing to (or from)?

At this point, unless we dig into the per-flow details, we don't know. With the `FlowMetrics` API, we _could_ add the destination IPs as a metric label, but this is not recommended, because it results in very high metrics cardinality, causing your Prometheus index to balloon. If you try it, the `FlowMetrics` webhook will warn you about it. Let's try something different...

Let's take an example. The picture above shows that the OpenShift image registry has a regular ~500 KBps traffic rate to external IPs.

If we go back to the Console plugin and look at the image registry topology, aggregated per owner, here's what we get:

![Image registry topology unknown](./topology-unknown.png)

There are connections to several other cluster components, and to this enigmatic "Unknown" element. Clicking on it suggests two things that we can do:

![Image registry topology unknown with details](./topology-unknown-details.png)

1. Decrease scope aggregation
2. Configure subnet labels

Let's start with option 1, clicking on the "Resource" scope, on the left:

![Image registry topology per resource](./topology-resources.png)

Wow, that's plenty of different IPs! OK, that helps a bit, but it's certainly not the best possible visualization.

A `whois` on any of these IPs tells us that it's Amazon S3 under the hood. So let's ask Amazon which CIDRs are used in our region:

```bash
curl https://ip-ranges.amazonaws.com/ip-ranges.json | jq -r '.prefixes[] | select(.region=="eu-west-3") | select(.service=="S3") | .ip_prefix'
16.12.20.0/24
52.95.156.0/24
3.5.204.0/22
52.95.154.0/23
16.12.18.0/23
3.5.224.0/22
13.36.84.48/28
13.36.84.64/28
```

We can inject them into our `subnetLabels` config:

```yaml
subnetLabels:
  openShiftAutoDetect: true
  customLabels:
  - cidrs:
    - 16.12.20.0/24
    - 52.95.156.0/24
    - 3.5.204.0/22
    - 52.95.154.0/23
    - 16.12.18.0/23
    - 3.5.224.0/22
    - 13.36.84.48/28
    - 13.36.84.64/28
    name: EXT:AWS_S3_eu-west-3
```
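
If you need to repeat this for several regions or services, the filtering and the YAML entry can be generated together. The following is a rough Python sketch under stated assumptions: `SAMPLE` is a small hand-written stand-in for the downloaded `ip-ranges.json` (the first two prefixes come from the output above, the `192.0.2.0/24` and `198.51.100.0/24` entries are made up for the example), and `custom_label` and `to_yaml` are hypothetical helpers, not part of NetObserv.

```python
import json

# Hand-written stand-in for ip-ranges.json, using the same field names as the
# jq query above. The last two entries are invented to exercise the filtering.
SAMPLE = json.loads("""
{"prefixes": [
  {"ip_prefix": "16.12.20.0/24",   "region": "eu-west-3", "service": "S3"},
  {"ip_prefix": "52.95.156.0/24",  "region": "eu-west-3", "service": "S3"},
  {"ip_prefix": "192.0.2.0/24",    "region": "us-east-1", "service": "S3"},
  {"ip_prefix": "198.51.100.0/24", "region": "eu-west-3", "service": "EC2"}
]}
""")


def custom_label(data, region, service):
    """Build one customLabels entry for the given region and service."""
    cidrs = [
        p["ip_prefix"]
        for p in data["prefixes"]
        if p["region"] == region and p["service"] == service
    ]
    # Follow the EXT: naming convention used in this post.
    return {"cidrs": cidrs, "name": f"EXT:AWS_{service}_{region}"}


def to_yaml(entry):
    """Render the entry as a YAML list item, ready to paste under customLabels."""
    lines = ["- cidrs:"]
    lines += [f"  - {cidr}" for cidr in entry["cidrs"]]
    lines.append(f"  name: {entry['name']}")
    return "\n".join(lines)


print(to_yaml(custom_label(SAMPLE, "eu-west-3", "S3")))
```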

It is recommended to use the `EXT:` prefix for all labels on external traffic, in order to distinguish external from internal subnet labels. As we've seen before, this pattern is used in the sample metrics definitions to match external traffic. It's also used in Traffic Health for external traffic trends, and in the Quick Filters of the web console.

You can go ahead and mark all the known external traffic in a similar way: databases, VMs, web services, etc.

{#admon title="Info"}
Granted, in past releases of NetObserv, going through every Subnet Labels configuration could be cumbersome. FlowCollector is a centralized API, typically managed by cluster admins, whereas knowing the various subnet dependencies is more likely to be the responsibility of application teams. In 1.11, there is a new API called FlowCollectorSlice that allows delegating that kind of configuration: non-admin users can now own a FlowCollectorSlice and add their specific subnet labels.
{/}

With this setup, we are finally able to understand where the traffic is flowing to:

![Prometheus/promql for external egress traffic, labelled](./external-promql-labelled.png)
_Our destination label appears in the Prometheus metrics._

As well as in the topology view:

![Image registry topology labelled](./topology-labelled.png)
_Our destination label is visible as a topology element._

## Wrapping it up

We've seen:
- How NetObserv monitors all the traffic, internal and external.
- How we can use the subnet labels to mark both the internal and the external traffic.
- How to leverage it in metrics with the `FlowMetrics` API.
- And finally how to visualize that with a Prometheus console or with the NetObserv Console plugin.

As always, you can reach out to the development team on Slack ([#netobserv-project](https://cloud-native.slack.com/archives/C08HHHDA9ND) on https://slack.cncf.io/) or via our [discussion pages](https://github.com/orgs/netobserv/discussions).