ospf6d / OSPFv3: Post graceful-restart grace lsa flush failed on the GR-Restarter #18061

Open
2 tasks done
rchandrans opened this issue Feb 7, 2025 · 0 comments
Labels
triage Needs further investigation


Description

The OSPFv3 graceful-restart Grace-LSA flush fails on the GR restarter. After the restart, once all adjacencies are re-established on the restarting node, flushing the Grace-LSA (sending it with MaxAge) fails.

Version

FRRouting 8.2.2 (node1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--localstatedir=/var/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--disable-rpki' '--disable-scripting' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--enable-iptrackd' '--disable-protobuf' '--disable-zeromq' '--disable-isisd' '--disable-eigrpd' '--disable-ldpd' '--disable-ripd' '--disable-ripngd' '--disable-babeld' '--disable-vrrpd' '--disable-fabricd' '--enable-ospfapi' '--disable-ospfclient' '--disable-bgp-vnc' '--disable-pbrd' '--disable-nhrpd' '--disable-watchfrr' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3

How to reproduce

GR configuration

  1. Two nodes (node1---node2) are connected back-to-back over a single link, and OSPFv3 is enabled on the interface.
  2. node1 is configured with "graceful-restart grace period 600".
  3. node2 is configured with "graceful-restart helper enable".
  4. On node1, ospf6d is restarted, which originates a Grace-LSA on the link.
  5. On node2, the OSPFv3 neighbor receives the Grace-LSA and becomes a GR helper.
  6. After the restart, ospf6d on node1 runs the graceful-restart procedures:
    6a. It receives the pre-restart LSAs from the helper and installs them (both self-originated and non-self-originated).
    6b. After all adjacencies are re-established, it exits GR and re-originates all self-originated LSAs.
    6c. It flushes the Grace-LSAs by sending MaxAge LSAs to the helper so that the helper exits GR helper mode.

Expected behavior

The Grace-LSA flush on the GR restarter (node1) should not fail. The MaxAge Grace-LSA should be sent out after the restarter exits GR, and the GR helper (node2) should then exit GR successfully.

Actual behavior

  • The Grace-LSA flush on the GR restarter (node1) fails:
    1. The OSPFv3 interface ifindex differs before and after the restart.
    2. Graceful restart succeeds on the GR restarter (node1), but the Grace-LSA flush fails because the Grace-LSA lookup uses a different ifindex.
    3. The GR helper keeps waiting for the Grace-LSA flush (a MaxAge Grace-LSA from the GR restarter) before it exits helper mode.
    4. The GR restarter exits GR successfully and re-originates all its LSAs.
    5. The re-originated Router-LSA causes a topology change on the GR helper (if strict LSA checking is enabled), or the grace timer expires (if strict LSA checking is disabled). In both cases the graceful restart is unsuccessful on the GR helper.
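
The helper-side outcomes described above can be summarised in a small model. This is only an illustrative sketch, not ospf6d code: the enum, the function name, and its three boolean inputs are hypothetical, and the model ignores event ordering and timing.

```python
from enum import Enum


class HelperExit(Enum):
    """Hypothetical labels for how a GR helper can leave helper mode."""
    SUCCESS = "MaxAge Grace-LSA received from the restarter"
    TOPOLOGY_CHANGE = "changed LSA received while strict LSA checking is on"
    GRACE_TIMER_EXPIRED = "grace period ran out without a flush"


def helper_exit_reason(maxage_grace_lsa_received: bool,
                       strict_lsa_check: bool,
                       changed_lsa_received: bool) -> HelperExit:
    """Simplified decision: which exit the helper ends up taking."""
    if maxage_grace_lsa_received:
        return HelperExit.SUCCESS
    if strict_lsa_check and changed_lsa_received:
        return HelperExit.TOPOLOGY_CHANGE
    # Nothing ended helper mode earlier, so the grace timer eventually fires.
    return HelperExit.GRACE_TIMER_EXPIRED


# The scenario in this report: the flush never arrives, but a
# re-originated Router-LSA does.
with_strict = helper_exit_reason(False, True, True)
without_strict = helper_exit_reason(False, False, True)
print(with_strict.name, without_strict.name)
```

In this model, only a received MaxAge Grace-LSA leads to a successful exit, which is why the failed flush on the restarter matters even though the restarter itself exits GR cleanly.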

Additional context

I added some debug logs in my environment to understand the problem better. The OSPFv3 Link State ID for the Link-LSA and the Grace-LSA is derived from the interface ifindex, and the assumption is that the ifindex stays the same across a process restart (which is incorrect in this case?).
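
To illustrate why a changed ifindex matters, here is a minimal sketch that assumes the Link State ID is simply the 32-bit ifindex rendered in dotted-quad form. The helper name `ifindex_to_ls_id` is hypothetical, and the values 74 and 73 are the pre- and post-restart ifindex values from the debug output in this report.

```python
import ipaddress


def ifindex_to_ls_id(ifindex: int) -> str:
    """Render a 32-bit interface index as a dotted-quad Link State ID.

    Assumption for illustration: the ID is the ifindex with no other mapping.
    """
    if not 0 <= ifindex <= 0xFFFFFFFF:
        raise ValueError("ifindex must fit in 32 bits")
    return str(ipaddress.IPv4Address(ifindex))


pre_restart_id = ifindex_to_ls_id(74)   # ifindex seen before the restart
post_restart_id = ifindex_to_ls_id(73)  # ifindex seen after the restart

# Different ifindex values give different Link State IDs, so an LSA
# originated before the restart cannot be addressed with the new ID.
print(pre_restart_id != post_restart_id)
```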

node-1(rtrid: 192.x.x.x) ----[Ethernet1]----- node-2(rtrid: 193.x.x.x)

/pre-restart/
ospf6d[301]: [VG0KG-PBTHZ] Originate Grace-LSA for Interface Ethernet1
ospf6d[301]: [TQB4M-JQ72Z] LSA Originate:
ospf6d[301]: [GNRNM-HX7W3] [Grace Id:0.0.1.218 Adv:192.x.x.x]
ospf6d[301]: [HA4BY-46K3Y] Age: 0 SeqNum: 0x80000001 Cksum: e0fa Len: 36
ospf6d[301]: [ZGT86-07C95] ospf6_install_lsa Install LSA: [Grace Id:0.0.1.218 Adv:192.x.x.x] age 0 seqnum 80000001 in LSDB.
/*pre-restart interface ifindex is 74 */
ospf6d[301]: [KAY2X-SAB58] ospf6_interface_lsdb_hook_add LSA: [Grace Id:0.0.1.218 Adv:192.x.x.x] age 0 seqnum 80000001 to oi(Ethernet1,74) LSDB.

/post-restart/
ospf6d[368]: [M3K4C-7HG7V] LSA Receive from 193.x.x.x%Ethernet1
ospf6d[368]: [GNRNM-HX7W3] [Link Id:0.0.1.218 Adv:192.x.x.x]
ospf6d[368]: [HA4BY-46K3Y] Age: 329 SeqNum: 0x80000001 Cksum: 3d50 Len: 56
ospf6d[368]: [Z8A48-ZTKMJ] Install, Flood, Possibly acknowledge the received LSA
ospf6d[368]: [ZGT86-07C95] ospf6_install_lsa Install LSA: [Link Id:0.0.1.218 Adv:192.x.x.x] age 329 seqnum 80000001 in LSDB.
/*post-restart interface ifindex is 73 */
ospf6d[368]: [KAY2X-SAB58] ospf6_interface_lsdb_hook_add LSA: [Link Id:0.0.1.218 Adv:192.x.x.x] age 329 seqnum 80000001 to oi(Ethernet1,73) LSDB.
ospf6d[368]: [RTGCS-FBH35] GR: exiting graceful restart[vrf default]: all adjacencies were reestablished
/* The Grace-LSA is looked up with the current ifindex/Link State ID 73, but it is actually present in the interface LSDB under the pre-restart ifindex/Link State ID 74, as received from the GR helper. */
ospf6d[368]: [Y052T-0BR52] ospf6_gr_flush_grace_lsas: GR: Grace-LSA not found [intf Ethernet1 ifdx 73 rtrid 192.x.x.x] [area 0.0.0.0] [vrf :default]
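
The failed lookup in the last log line can be reproduced with a toy interface LSDB. This is a sketch under assumptions, not the ospf6d data structures: `LsaKey`, `lookup_exact`, and `find_self_originated` are hypothetical names, and the scan by type and advertising router is just one possible way to find the stale Grace-LSA, not a confirmed fix.

```python
from typing import Dict, List, NamedTuple, Optional


class LsaKey(NamedTuple):
    """Toy LSDB key: LSA type, Link State ID (as ifindex), advertising router."""
    lsa_type: str
    ls_id: int
    adv_router: str


def lookup_exact(lsdb: Dict[LsaKey, str], lsa_type: str, ls_id: int,
                 adv_router: str) -> Optional[str]:
    """Exact-key lookup, analogous to searching with the current ifindex."""
    return lsdb.get(LsaKey(lsa_type, ls_id, adv_router))


def find_self_originated(lsdb: Dict[LsaKey, str], lsa_type: str,
                         adv_router: str) -> List[LsaKey]:
    """Scan by type and advertising router, ignoring the Link State ID."""
    return [key for key in lsdb
            if key.lsa_type == lsa_type and key.adv_router == adv_router]


# The Grace-LSA learned back from the helper still carries the
# pre-restart ifindex (74) as its Link State ID.
intf_lsdb = {LsaKey("Grace", 74, "192.x.x.x"): "pre-restart Grace-LSA"}

# Looking it up with the post-restart ifindex (73) finds nothing.
missing = lookup_exact(intf_lsdb, "Grace", 73, "192.x.x.x")

# A scan that does not depend on the Link State ID still locates it.
stale = find_self_originated(intf_lsdb, "Grace", "192.x.x.x")
print(missing is None, [key.ls_id for key in stale])
```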

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@rchandrans rchandrans added the triage Needs further investigation label Feb 7, 2025