BMP sending withdrawal messages both when route added or removed #18070

Open
2 tasks done
slashdoom opened this issue Feb 9, 2025 · 5 comments
@slashdoom

Description

Say router1 has BMP configured to export to OpenBMP. It is also connected to router2 and has an eBGP session with it. When I shut an interface on router2, I see a BMP UPDATE with withdrawal data, as expected. When I do a no shut on the same interface on router2, the route is re-added on router1, but router1 sends another BMP UPDATE with withdrawal data rather than an UPDATE advertising the route.

The first update is the shut/remove route; the second, highlighted, is the no shut/add route.

Image

Version

FRRouting 10.2.1_git (router1) on Linux(6.6.14-0-virt).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--sbindir=/usr/lib/frr' '--libdir=/usr/lib' '--enable-rpki' '--enable-vtysh' '--enable-multipath=64' '--enable-vty-group=frrvty' '--enable-user=frr' '--enable-group=frr' '--enable-pcre2posix' '--enable-scripting' 'CC=gcc' 'CXX=g++'

How to reproduce

ROUTER1

router1# sh int bri
Interface       Status  VRF             Addresses
---------       ------  ---             ---------
eth0            up      default         172.20.0.200/24
lo              up      default

router1# sh run
Building configuration...

Current configuration:
!
frr version 10.2.1_git
frr defaults traditional
hostname 5f7a497036cb
no ipv6 forwarding
hostname router1
service integrated-vtysh-config
!
interface eth0
 description Mgmt
exit
!
router bgp 65100
 no bgp ebgp-requires-policy
 neighbor 172.20.0.201 remote-as 65200
 neighbor 172.20.0.201 password PASS4BGP
 !
 address-family ipv4 unicast
  network 172.20.0.0/24
  redistribute connected
 exit-address-family
 !
 bmp mirror buffer-limit 512000000
 !
 bmp targets openbmp
  bmp stats interval 60000
  bmp monitor ipv4 unicast pre-policy
  bmp connect 172.20.0.199 port 5000 min-retry 1000 max-retry 5000
 exit
exit
!
end

ROUTER2

router2# sh int bri
Interface       Status  VRF             Addresses
---------       ------  ---             ---------
eth0            up      default         172.20.0.201/24
eth1            up      default         192.168.101.128/24
eth2            up      default         192.168.102.128/24
eth3            up      default         192.168.103.128/24
lo              up      default

router2# sh run
Building configuration...

Current configuration:
!
frr version 10.2.1_git
frr defaults traditional
hostname 078ec9b74749
no ipv6 forwarding
hostname router2
service integrated-vtysh-config
!
interface eth0
 description Mgmt
exit
!
interface eth1
 description Test1
exit
!
interface eth2
 description Test2
exit
!
interface eth3
 description Test3
exit
!
router bgp 65200
 no bgp ebgp-requires-policy
 neighbor 172.20.0.200 remote-as 65100
 neighbor 172.20.0.200 password PASS4BGP
 !
 address-family ipv4 unicast
  network 172.20.10.0/24
  network 192.168.101.0/24
  network 192.168.102.0/24
  network 192.168.103.0/24
  redistribute connected
 exit-address-family
exit
!
end

STEPS

Shut and no shut eth1 through eth3 on router2 while capturing BMP traffic on eth0 of router1.
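The flap in the step above can be scripted so that the timing is repeatable. This is only a sketch: it assumes it runs on router2 with `vtysh` on the PATH, and the helper names (`vtysh_interface_cmd`, `flap`) and the 10 second hold time are illustrative choices, not anything taken from FRR.

```python
import subprocess
import time


def vtysh_interface_cmd(interface, shutdown):
    """Build the vtysh argv that shuts or un-shuts one interface."""
    action = "shutdown" if shutdown else "no shutdown"
    return [
        "vtysh",
        "-c", "configure terminal",
        "-c", f"interface {interface}",
        "-c", action,
    ]


def flap(interfaces, hold_seconds=10.0, run=subprocess.run):
    """Shut every interface, wait, then bring them all back up.

    `run` is injectable so the sequence can be checked without vtysh.
    """
    for ifname in interfaces:
        run(vtysh_interface_cmd(ifname, shutdown=True), check=True)
    time.sleep(hold_seconds)  # leave a visible gap in the packet capture
    for ifname in interfaces:
        run(vtysh_interface_cmd(ifname, shutdown=False), check=True)
```

Calling `flap(["eth1", "eth2", "eth3"])` on router2 while the capture runs on router1 should produce one withdraw burst and, if BMP behaves as expected, one advertise burst roughly `hold_seconds` later.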

Expected behavior

Expected to see a BMP UPDATE advertising the route.

Actual behavior

Received a second BMP UPDATE withdrawing the route.
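To check the capture bytes independently of Wireshark, a BGP UPDATE can be split into its withdrawn routes and its advertised NLRI. The sketch below follows the RFC 4271 UPDATE layout and assumes plain IPv4 unicast without ADD-PATH. It takes the BGP PDU only, so the BMP common and per-peer headers have to be stripped first. The function names are made up for illustration.

```python
import ipaddress
import struct


def _parse_prefixes(data):
    """Decode a run of (length-in-bits, prefix bytes) IPv4 entries."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8
        raw = data[i + 1:i + 1 + nbytes].ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{bits}")
        i += 1 + nbytes
    return prefixes


def split_update(msg):
    """Return (withdrawn, announced) prefix lists for one BGP UPDATE."""
    _marker, length, msg_type = struct.unpack("!16sHB", msg[:19])
    if msg_type != 2:  # 2 = UPDATE
        raise ValueError("not a BGP UPDATE")
    body = msg[19:length]
    (wlen,) = struct.unpack("!H", body[:2])
    withdrawn = _parse_prefixes(body[2:2 + wlen])
    (alen,) = struct.unpack("!H", body[2 + wlen:4 + wlen])
    announced = _parse_prefixes(body[4 + wlen + alen:])
    return withdrawn, announced


def build_withdraw(prefix_bytes, bits):
    """Build a minimal withdraw-only UPDATE, used here as sample input."""
    routes = bytes([bits]) + prefix_bytes
    body = struct.pack("!H", len(routes)) + routes + struct.pack("!H", 0)
    return b"\xff" * 16 + struct.pack("!HB", 19 + len(body), 2) + body
```

For the no shut case, the expectation in this report is an empty withdrawn list and `192.168.103.0/24` in the announced list; the bug shows the opposite.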

Additional context

Beyond the packet capture, this was first noticed while observing the OpenBMP messages. We saw action del both when the route was added and when it was removed.

logstash-1      | [2025-02-09T01:01:27,000][ERROR][logstash.codecs.json     ][main][c04837a832c1abedebb88398830e9e96b3d38fffce310856e2500f3e1351439c] JSON parse error, original data now in message field {:message=>"Unrecognized token 'V': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')\n at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 2]", :exception=>LogStash::Json::ParserError, :data=>"V: 1.7\nC_HASH_ID: 91e3a7ff9f5676ed6ae6fcd8a6b455ec\nT: unicast_prefix\nL: 206\nR: 1\n\ndel\t3\t990507a54143c2f34f76063aab3be0e1\t155e88eed6af4a70c7cb14d744a733ad\t172.20.0.200\t\t12b76ba6f98718a875e55c6e2e7f7660\t172.20.0.201\t65200\t2025-02-09 01:01:26.739395\t192.168.103.0\t24\t1\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t0\t\t1\t1\t\n"}
logstash-1      | {
logstash-1      |            "is_atomic_agg" => nil,
logstash-1      |                "router_ip" => "172.20.0.200",
logstash-1      |                   "prefix" => "192.168.103.0",
logstash-1      |               "prefix_len" => 24,
logstash-1      |                     "TYPE" => "unicast_prefix",
logstash-1      |           "base_attr_hash" => nil,
logstash-1      |                   "action" => "del",
logstash-1      |              "router_hash" => "155e88eed6af4a70c7cb14d744a733ad",
logstash-1      |            "is_pre_policy" => "1",
logstash-1      |                  "path_id" => 0,
logstash-1      |                 "peer_asn" => 65200,
logstash-1      |                "origin_as" => nil,
logstash-1      |       "ext_community_list" => nil,
logstash-1      |                   "labels" => nil,
logstash-1      |                   "origin" => nil,
logstash-1      |               "@timestamp" => 2025-02-09T01:01:26.739Z,
logstash-1      |               "local_pref" => nil,
logstash-1      |                     "tags" => [
logstash-1      |         [0] "_jsonparsefailure"
logstash-1      |     ],
logstash-1      |                "is_adj_in" => "1",
logstash-1      |            "originator_id" => nil,
logstash-1      |               "aggregator" => nil,
logstash-1      |         "is_next_hop_IPv4" => nil,
logstash-1      |                     "hash" => "990507a54143c2f34f76063aab3be0e1",
logstash-1      |                 "sequence" => 3,
logstash-1      |            "as_path_count" => nil,
logstash-1      |                 "@version" => "1",
logstash-1      |           "community_list" => nil,
logstash-1      |                    "TOPIC" => "openbmp.parsed.unicast_prefix",
logstash-1      |                  "is_IPv4" => "1",
logstash-1      |     "large_community_list" => nil,
logstash-1      |                      "MED" => nil,
logstash-1      |             "cluster_list" => nil,
logstash-1      |                  "peer_ip" => "172.20.0.201",
logstash-1      |                "peer_hash" => "12b76ba6f98718a875e55c6e2e7f7660",
logstash-1      |                  "as_path" => nil,
logstash-1      |                 "next_hop" => nil
logstash-1      | }
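The raw OpenBMP payload in the error above is a small header block, a blank line, then tab-separated records. A minimal sketch for pulling the action and prefix out of it is below. The field names come from the keys in the Logstash output; their positional order is inferred from this one sample record and only the first twelve columns are mapped, so treat the mapping as an assumption.

```python
# Column names follow the Logstash keys shown above; the order is inferred
# from the sample record and is an assumption, not the OpenBMP spec.
FIELDS = [
    "action", "sequence", "hash", "router_hash", "router_ip",
    "base_attr_hash", "peer_hash", "peer_ip", "peer_asn",
    "timestamp", "prefix", "prefix_len",
]

# Sample taken from the log line above, cut off after the prefix length.
SAMPLE = (
    "V: 1.7\nC_HASH_ID: 91e3a7ff9f5676ed6ae6fcd8a6b455ec\n"
    "T: unicast_prefix\nL: 206\nR: 1\n\n"
    "del\t3\t990507a54143c2f34f76063aab3be0e1\t"
    "155e88eed6af4a70c7cb14d744a733ad\t172.20.0.200\t\t"
    "12b76ba6f98718a875e55c6e2e7f7660\t172.20.0.201\t65200\t"
    "2025-02-09 01:01:26.739395\t192.168.103.0\t24\n"
)


def parse_openbmp_message(raw):
    """Split one message into (headers dict, list of record dicts)."""
    header_text, _, body = raw.partition("\n\n")
    headers = dict(line.split(": ", 1) for line in header_text.splitlines())
    records = []
    for row in body.splitlines():
        if not row:
            continue
        record = dict(zip(FIELDS, row.split("\t")))
        record["peer_asn"] = int(record["peer_asn"])
        record["prefix_len"] = int(record["prefix_len"])
        records.append(record)
    return headers, records
```

Logging `record["action"]` next to the time of each shut and no shut makes it easy to see whether a `del` arrives where an add was expected.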

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@slashdoom slashdoom added the triage Needs further investigation label Feb 9, 2025
@pguibert6WIND pguibert6WIND self-assigned this Feb 11, 2025
@pguibert6WIND
Member

@slashdoom, can you retry the procedure, making sure that nexthop tracking has completely resolved?

When a BGP update arrives whose nexthop is unresolved, BMP sends a post-policy withdraw message.
Then, a few milliseconds later, once the nexthop has been validated (check with show bgp nexthop), a post-policy update message is sent for the same prefix.

Is it related to what you observe?
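Whether a given Route Monitoring message is pre-policy or post-policy can be read from the L flag in the BMP per-peer header (RFC 7854). The sketch below decodes the 6-byte common header and the 42-byte per-peer header. The function name and the returned keys are illustrative, and the flag interpretation assumes an ordinary peer type rather than a Loc-RIB peer.

```python
import ipaddress
import struct

BMP_COMMON_HDR = struct.Struct("!BIB")         # version, length, type
BMP_PEER_HDR = struct.Struct("!BB8s16sI4sII")  # per-peer header, 42 bytes

FLAG_V = 0x80  # peer address is IPv6
FLAG_L = 0x40  # message reflects the post-policy Adj-RIB-In


def parse_route_monitoring_header(data):
    """Decode the BMP headers that precede the BGP UPDATE PDU."""
    version, length, msg_type = BMP_COMMON_HDR.unpack_from(data, 0)
    if msg_type != 0:  # 0 = Route Monitoring
        raise ValueError(f"not a Route Monitoring message (type {msg_type})")
    (peer_type, flags, _dist, addr, asn,
     _bgp_id, ts_sec, ts_usec) = BMP_PEER_HDR.unpack_from(data, BMP_COMMON_HDR.size)
    if flags & FLAG_V:
        peer_ip = str(ipaddress.IPv6Address(addr))
    else:
        # IPv4 peers occupy the last 4 bytes of the 16-byte address field.
        peer_ip = str(ipaddress.IPv4Address(addr[12:]))
    return {
        "version": version,
        "length": length,
        "peer_type": peer_type,
        "policy": "post-policy" if flags & FLAG_L else "pre-policy",
        "peer_ip": peer_ip,
        "peer_asn": asn,
        "timestamp": ts_sec + ts_usec / 1_000_000,
        "bgp_pdu_offset": BMP_COMMON_HDR.size + BMP_PEER_HDR.size,
    }
```

Running this over both captured messages would show directly whether the second withdraw is a post-policy message, as in the nexthop scenario described above, or a pre-policy one.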

@slashdoom
Author

Hi @pguibert6WIND, thanks for taking a look here.

I don't believe this is what I'm observing. Since I only have bmp monitor ipv4 unicast pre-policy set, I wouldn't expect to see post-policy withdraw messages. Also, you'll see in my Wireshark screen grab that the two update messages were ~10 sec apart (the time it took me to switch windows and issue the interface no shut command).

Here is the procedure again, with timestamps...

1. router1 - BGP nexthop starting state

router1# exit
62195d3dd159:/# date
Tue Feb 11 19:53:08 UTC 2025
62195d3dd159:/# vtysh

Hello, this is FRRouting (version 10.2.1_git).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

router1# show ip bgp nexthop 172.20.0.129
 172.20.0.129 valid [IGP metric 0], #paths 4, peer 172.20.0.129
  Resolved prefix 172.20.0.0/24
  if eth0
  Last update: Tue Feb 11 19:40:32 2025
  Paths:
    1/1 192.168.103.0/24 VRF default flags 0x418
    1/1 192.168.102.0/24 VRF default flags 0x418
    1/1 192.168.101.0/24 VRF default flags 0x418
    1/1 172.20.0.0/24 VRF default flags 0x410
router1# exit

2. router2 - Shutting interface

9aa72f7bda00:/# date
Tue Feb 11 19:53:17 UTC 2025
9aa72f7bda00:/# vtysh

Hello, this is FRRouting (version 10.2.1_git).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

router2# conf t
router2(config)# int eth3
router2(config-if)# shut
router2(config-if)# exit
router2(config)# exit
router2# exit

3. router1 - Re-check BGP nexthop

62195d3dd159:/# date
Tue Feb 11 19:53:35 UTC 2025
62195d3dd159:/# vtysh

Hello, this is FRRouting (version 10.2.1_git).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

router1# show ip bgp nexthop 172.20.0.129
 172.20.0.129 valid [IGP metric 0], #paths 3, peer 172.20.0.129
  Resolved prefix 172.20.0.0/24
  if eth0
  Last update: Tue Feb 11 19:40:32 2025
  Paths:
    1/1 192.168.102.0/24 VRF default flags 0x418
    1/1 192.168.101.0/24 VRF default flags 0x418
    1/1 172.20.0.0/24 VRF default flags 0x410

First BMP Update PCAP

Image

4. router2 - No shutting interface

9aa72f7bda00:/# date
Tue Feb 11 19:53:53 UTC 2025
9aa72f7bda00:/# vtysh

Hello, this is FRRouting (version 10.2.1_git).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

router2# conf t
router2(config)# ith eth3
% Unknown command: ith eth3
router2(config)# int eth3
router2(config-if)# no shut
router2(config-if)# exit
router2(config)# exit
router2#

5. router1 - Re-check BGP nexthop

62195d3dd159:/# date
Tue Feb 11 19:54:02 UTC 2025
62195d3dd159:/# vtysh

Hello, this is FRRouting (version 10.2.1_git).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

router1# show ip bgp nexthop 172.20.0.129
 172.20.0.129 valid [IGP metric 0], #paths 4, peer 172.20.0.129
  Resolved prefix 172.20.0.0/24
  if eth0
  Last update: Tue Feb 11 19:40:32 2025
  Paths:
    1/1 192.168.103.0/24 VRF default flags 0x418
    1/1 192.168.102.0/24 VRF default flags 0x418
    1/1 192.168.101.0/24 VRF default flags 0x418
    1/1 172.20.0.0/24 VRF default flags 0x410
router1#
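The before/after comparison in steps 1, 3 and 5 can also be done mechanically. This sketch parses `show ip bgp nexthop` text into a small dict. It assumes the output layout matches the transcripts above exactly, and `parse_nexthop` is a made-up helper name, not an FRR API.

```python
import re

# Patterns are based only on the output format shown in the transcripts above.
_HEAD = re.compile(r"^\s*(\S+) (\w+) .*#paths (\d+)")
_PATH = re.compile(r"^\s+\d+/\d+ (\S+) VRF (\S+)")

AFTER_SHUT = """\
 172.20.0.129 valid [IGP metric 0], #paths 3, peer 172.20.0.129
  Resolved prefix 172.20.0.0/24
  if eth0
  Last update: Tue Feb 11 19:40:32 2025
  Paths:
    1/1 192.168.102.0/24 VRF default flags 0x418
    1/1 192.168.101.0/24 VRF default flags 0x418
    1/1 172.20.0.0/24 VRF default flags 0x410
"""


def parse_nexthop(output):
    """Extract nexthop state, path count and path prefixes from CLI text."""
    info = {"nexthop": None, "valid": None, "path_count": None, "paths": []}
    for line in output.splitlines():
        head = _HEAD.match(line)
        if head:
            info["nexthop"] = head.group(1)
            info["valid"] = head.group(2) == "valid"
            info["path_count"] = int(head.group(3))
            continue
        path = _PATH.match(line)
        if path:
            info["paths"].append(path.group(1))
    return info
```

Comparing `parse_nexthop(...)["paths"]` across the three snapshots confirms that the nexthop stays valid throughout and that only 192.168.103.0/24 leaves and returns.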

Second BMP Update PCAP

Image

router1.txt
router2.txt

@pguibert6WIND
Member

I used a similar config with the BMP collector provided in the topotest, and I could not observe the issue.
Which branch are you using, please?

@slashdoom
Author

I've been using the docker images. I've tried...
quay.io/frrouting/frr:10.2.1
...and...
quay.io/frrouting/frr:8.5.7

@pguibert6WIND
Member

pguibert6WIND commented Feb 12, 2025

Not reproduced using FRR 10.2.1 (I rebuilt it outside of Docker).
Is it possible for you to use topotests to reproduce the issue?
I slightly modified the current test.
By using --pause, you can manually run shutdown and no shutdown and watch the incoming messages on the collector side.

From a31407cc21ecd9f215ff3a0a9961c1a02aab2555 Mon Sep 17 00:00:00 2001
From: Philippe Guibert <[email protected]>
Date: Tue, 11 Feb 2025 22:23:20 +0100
Subject: [PATCH] trial

---
 tests/topotests/bgp_bmp/r2/zebra.conf   |  5 ++++
 tests/topotests/bgp_bmp/test_bgp_bmp.py | 35 +++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/tests/topotests/bgp_bmp/r2/zebra.conf b/tests/topotests/bgp_bmp/r2/zebra.conf
index 9d82bfe2df5c..34bf800daee5 100644
--- a/tests/topotests/bgp_bmp/r2/zebra.conf
+++ b/tests/topotests/bgp_bmp/r2/zebra.conf
@@ -6,3 +6,8 @@ interface r2-eth1
  ip address 172.31.0.2/24
  ipv6 address 172:31::2/64
 !
+interface r2-eth2
+ ip address 172.31.10.2/24
+ ipv6 address 172:31:10::2/64
+ shutdown
+!
diff --git a/tests/topotests/bgp_bmp/test_bgp_bmp.py b/tests/topotests/bgp_bmp/test_bgp_bmp.py
index 80e291b2bdb6..8deb3b4f5007 100644
--- a/tests/topotests/bgp_bmp/test_bgp_bmp.py
+++ b/tests/topotests/bgp_bmp/test_bgp_bmp.py
@@ -54,6 +54,7 @@ LOC_RIB = "loc-rib"
 def build_topo(tgen):
     tgen.add_router("r1")
     tgen.add_router("r2")
+    tgen.add_router("r3")
     tgen.add_bmp_server("bmp1", ip="192.0.2.10", defaultRoute="via 192.0.2.1")
 
     switch = tgen.add_switch("s1")
@@ -61,6 +62,8 @@ def build_topo(tgen):
     switch.add_link(tgen.gears["bmp1"])
 
     tgen.add_link(tgen.gears["r1"], tgen.gears["r2"], "r1-eth1", "r2-eth0")
+    tgen.add_link(tgen.gears["r3"], tgen.gears["r2"], "r3-eth0", "r2-eth1")
+    tgen.add_link(tgen.gears["r3"], tgen.gears["r2"], "r3-eth1", "r2-eth2")
 
 
 def setup_module(mod):
@@ -302,6 +305,38 @@ def test_bmp_bgp_vpn():
     vpn_prefixes(LOC_RIB)
 
 
+def test_bgp_new_redistribute():
+    """
+    Add no shutdown on r2-eth2 interface
+    """
+    tgen = get_topogen()
+
+    tgen.gears["r2"].vtysh_cmd(
+        """
+        configure terminal\n
+        interface r2-eth2\n
+        no shutdown
+        """
+    )
+    logger.info("checking for BMP update messages with 172.31.10.2/24")
+
+
+def test_bgp_del_redistribute():
+    """
+    Add shutdown on r2-eth2 interface
+    """
+    tgen = get_topogen()
+
+    tgen.gears["r2"].vtysh_cmd(
+        """
+        configure terminal\n
+        interface r2-eth2\n
+        shutdown
+        """
+    )
+    logger.info("checking for BMP withdraw messages with 172.31.10.2/24")
+
+
 if __name__ == "__main__":
     args = ["-s"] + sys.argv[1:]
     sys.exit(pytest.main(args))
-- 
2.34.1
