[frr] Default route installation failure during LAG member flap #17345

@nazariig

Description

Before LAG member add/remove:

root@r-tigon-11:/home/admin# ip -4 route show default
default nhid 309 proto bgp src 10.1.0.32 metric 20
        nexthop via 10.0.0.9 dev PortChannel103 weight 1
        nexthop via 10.0.0.13 dev PortChannel104 weight 1
        nexthop via 10.0.0.1 dev PortChannel101 weight 1
        nexthop via 10.0.0.5 dev PortChannel102 weight 1

root@r-tigon-11:/home/admin# show ip route 0.0.0.0 json
{
    "0.0.0.0/0": [
        {
            "destSelected": true,
            "distance": 20,
            "installed": true,
            "installedNexthopGroupId": 309,
            "internalFlags": 264,
            "internalNextHopActiveNum": 4,
            "internalNextHopNum": 4,
            "internalStatus": 16,
            "metric": 0,
            "nexthopGroupId": 309,
            "nexthops": [
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 419,
                    "interfaceName": "PortChannel101",
                    "ip": "10.0.0.1",
                    "weight": 1
                },
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 420,
                    "interfaceName": "PortChannel102",
                    "ip": "10.0.0.5",
                    "weight": 1
                },
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 421,
                    "interfaceName": "PortChannel103",
                    "ip": "10.0.0.9",
                    "weight": 1
                },
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 422,
                    "interfaceName": "PortChannel104",
                    "ip": "10.0.0.13",
                    "weight": 1
                }
            ],
            "offloaded": true,
            "prefix": "0.0.0.0/0",
            "prefixLen": 0,
            "protocol": "bgp",
            "selected": true,
            "table": 254,
            "uptime": "00:07:31",
            "vrfId": 0,
            "vrfName": "default"
        }
    ]
}

root@r-tigon-11:/home/admin# show interfaces portchannel
Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available,
       S - selected, D - deselected, * - not synced
  No.  Team Dev        Protocol     Ports
-----  --------------  -----------  ---------------------------
  101  PortChannel101  LACP(A)(Up)  Ethernet0(S) Ethernet4(S)
  102  PortChannel102  LACP(A)(Up)  Ethernet20(S) Ethernet16(S)
  103  PortChannel103  LACP(A)(Up)  Ethernet64(S) Ethernet68(S)
  104  PortChannel104  LACP(A)(Up)  Ethernet80(S) Ethernet84(S)

root@r-tigon-11:/home/admin# show ip interfaces
Interface       Master    IPv4 address/mask    Admin/Oper    BGP Neighbor    Neighbor IP
--------------  --------  -------------------  ------------  --------------  -------------
Loopback0                 10.1.0.32/32         up/up         N/A             N/A
PortChannel101            10.0.0.0/31          up/up         ARISTA01T1      10.0.0.1
PortChannel102            10.0.0.4/31          up/up         ARISTA02T1      10.0.0.5
PortChannel103            10.0.0.8/31          up/up         ARISTA03T1      10.0.0.9
PortChannel104            10.0.0.12/31         up/up         ARISTA04T1      10.0.0.13
Vlan1000                  192.168.0.1/21       up/up         N/A             N/A
docker0                   240.127.1.1/24       up/down       N/A             N/A
eth0                      10.210.25.10/22      up/up         N/A             N/A
lo                        127.0.0.1/16         up/up         N/A             N/A

root@r-tigon-11:/home/admin# show ip bgp summary

IPv4 Unicast Summary:
BGP router identifier 10.1.0.32, local AS number 64601 vrf-id 0
BGP table version 9555
RIB entries 12807, using 2458944 bytes of memory
Peers 4, using 2967904 KiB of memory
Peer groups 4, using 256 bytes of memory


Neighbhor      V     AS    MsgRcvd    MsgSent    TblVer    InQ    OutQ  Up/Down      State/PfxRcd  NeighborName
-----------  ---  -----  ---------  ---------  --------  -----  ------  ---------  --------------  --------------
10.0.0.1       4  64802       3462       3463         0      0       0  00:12:54             6400  ARISTA01T1
10.0.0.5       4  64802       3460       3460         0      0       0  00:12:47             6400  ARISTA02T1
10.0.0.9       4  64802       3463       3467         0      0       0  00:12:57             6400  ARISTA03T1
10.0.0.13      4  64802       3463       3467         0      0       0  00:12:57             6400  ARISTA04T1

Total number of neighbors 4

root@r-tigon-11:/home/admin# tail -F  /var/log/syslog | grep '0.0.0.0/0'

Nov  7 18:02:19.840060 r-tigon-11 INFO hostcfgd[213526]: TCPMSS  tcp opt -- in * out *  0.0.0.0/0  -> 10.1.0.32   tcp flags:0x02/0x02 TCPMSS set 1460
Nov  7 18:02:19.848076 r-tigon-11 INFO hostcfgd[213531]: TCPMSS  tcp opt -- in * out *  10.1.0.32  -> 0.0.0.0/0   tcp flags:0x02/0x02 TCPMSS set 1460
Nov  7 18:04:29.663742 r-tigon-11 INFO swss#orchagent: :- addRoutePost: Post set route 0.0.0.0/0 with next hop(s) 10.0.0.9@PortChannel103,10.0.0.13@PortChannel104
Nov  7 18:04:45.986055 r-tigon-11 INFO swss#orchagent: :- addRoutePost: Post set route 0.0.0.0/0 with next hop(s) 10.0.0.1@PortChannel101,10.0.0.9@PortChannel103,10.0.0.13@PortChannel104
Nov  7 18:04:59.260038 r-tigon-11 INFO swss#orchagent: :- addRoutePost: Post set route 0.0.0.0/0 with next hop(s) 10.0.0.1@PortChannel101,10.0.0.5@PortChannel102,10.0.0.9@PortChannel103,10.0.0.13@PortChannel104

After LAG member add/remove:

root@r-tigon-11:/home/admin# ./member.sh
=> remove Ethernet20 from PortChannel102
=> add Ethernet20 from PortChannel102
=> remove Ethernet64 from PortChannel103
=> add Ethernet64 from PortChannel103
=> remove Ethernet4 from PortChannel101
=> add Ethernet4 from PortChannel101
=> remove Ethernet80 from PortChannel104
=> add Ethernet80 from PortChannel104
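
The contents of member.sh are not included in this report. As a rough sketch only, an equivalent flap sequence could be driven as follows, assuming the standard SONiC `config portchannel member del` / `add` commands. The function names, the dry-run switch, and the delay value are hypothetical and not taken from the original script; the real timing between steps is unknown.

```python
import subprocess
import time

# Member/PortChannel pairs in the order printed by member.sh above.
FLAP_ORDER = [
    ("Ethernet20", "PortChannel102"),
    ("Ethernet64", "PortChannel103"),
    ("Ethernet4", "PortChannel101"),
    ("Ethernet80", "PortChannel104"),
]


def build_flap_commands(order):
    """Return a del command followed by an add command for each pair."""
    commands = []
    for member, portchannel in order:
        for action in ("del", "add"):
            commands.append(
                ["config", "portchannel", "member", action, portchannel, member]
            )
    return commands


def run_flap(order, delay_s=1.0, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_flap_commands(order):
        print("=> " + " ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
            time.sleep(delay_s)  # hypothetical pause; original timing unknown


if __name__ == "__main__":
    # Dry run by default so the sketch is safe to execute anywhere.
    run_flap(FLAP_ORDER, dry_run=True)
```

With `dry_run=True` the sketch only prints the eight commands, which makes it possible to review the sequence before running it on a device.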

root@r-tigon-11:/home/admin# tail -F /var/log/frr/zebra.log | grep '0.0.0.0/0'
Nov  7 18:22:35.567882 r-tigon-11 WARNING bgp#zebra[35]: [VYKYC-709DP] default(0:254):0.0.0.0/0: Route install failed

root@r-tigon-11:/home/admin# tail -F  /var/log/syslog | grep '0.0.0.0/0'
Nov  7 18:22:40.853675 r-tigon-11 INFO swss#orchagent: :- addRoutePost: Post set route 0.0.0.0/0 with next hop(s) 10.0.0.1@PortChannel101,10.0.0.5@PortChannel102,10.0.0.13@PortChannel104
Nov  7 18:22:54.432973 r-tigon-11 INFO swss#orchagent: :- addRoutePost: Post set route 0.0.0.0/0 with next hop(s) 10.0.0.1@PortChannel101,10.0.0.5@PortChannel102,10.0.0.9@PortChannel103
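
Before the flap, the last orchagent line listed all four next hops; after the flap, the last line lists only three. A small sketch for spotting this in the syslog is shown below. It assumes the `addRoutePost` message format seen in the lines above; the helper names and the expected set are illustrative, not part of SONiC.

```python
import re

# Matches the orchagent message format shown in the syslog excerpts above.
NEXT_HOPS_RE = re.compile(r"Post set route (\S+) with next hop\(s\) (\S+)")

# The four next hops present before the flap (from the earlier output).
EXPECTED_NEXT_HOPS = {
    "10.0.0.1@PortChannel101",
    "10.0.0.5@PortChannel102",
    "10.0.0.9@PortChannel103",
    "10.0.0.13@PortChannel104",
}


def parse_next_hops(line):
    """Return (prefix, set of next hops), or None if the line does not match."""
    match = NEXT_HOPS_RE.search(line)
    if match is None:
        return None
    return match.group(1), set(match.group(2).split(","))


def missing_next_hops(line, expected=EXPECTED_NEXT_HOPS):
    """Return the expected next hops absent from the line, sorted."""
    parsed = parse_next_hops(line)
    if parsed is None:
        return None
    return sorted(expected - parsed[1])
```

Applied to the 18:22:54 line above, `missing_next_hops` reports `10.0.0.13@PortChannel104` as the next hop that orchagent never re-added.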

root@r-tigon-11:/home/admin# show ip bgp summary

IPv4 Unicast Summary:
BGP router identifier 10.1.0.32, local AS number 64601 vrf-id 0
BGP table version 21953
RIB entries 12807, using 2458944 bytes of memory
Peers 4, using 2967904 KiB of memory
Peer groups 4, using 256 bytes of memory


Neighbhor      V     AS    MsgRcvd    MsgSent    TblVer    InQ    OutQ  Up/Down      State/PfxRcd  NeighborName
-----------  ---  -----  ---------  ---------  --------  -----  ------  ---------  --------------  --------------
10.0.0.1       4  64802       6860       7251         0      0       0  00:04:37             6400  ARISTA01T1
10.0.0.5       4  64802       6858      10088         0      0       0  00:04:40             6400  ARISTA02T1
10.0.0.9       4  64802       6861      10096         0      0       0  00:04:38             6400  ARISTA03T1
10.0.0.13      4  64802       6861       6873         0      0       0  00:04:34             6400  ARISTA04T1

Total number of neighbors 4

root@r-tigon-11:/home/admin# show ip interfaces
Interface       Master    IPv4 address/mask    Admin/Oper    BGP Neighbor    Neighbor IP
--------------  --------  -------------------  ------------  --------------  -------------
Loopback0                 10.1.0.32/32         up/up         N/A             N/A
PortChannel101            10.0.0.0/31          up/up         ARISTA01T1      10.0.0.1
PortChannel102            10.0.0.4/31          up/up         ARISTA02T1      10.0.0.5
PortChannel103            10.0.0.8/31          up/up         ARISTA03T1      10.0.0.9
PortChannel104            10.0.0.12/31         up/up         ARISTA04T1      10.0.0.13
Vlan1000                  192.168.0.1/21       up/up         N/A             N/A
docker0                   240.127.1.1/24       up/down       N/A             N/A
eth0                      10.210.25.10/22      up/up         N/A             N/A
lo                        127.0.0.1/16         up/up         N/A             N/A

root@r-tigon-11:/home/admin# show interfaces portchannel
Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available,
       S - selected, D - deselected, * - not synced
  No.  Team Dev        Protocol     Ports
-----  --------------  -----------  ---------------------------
  101  PortChannel101  LACP(A)(Up)  Ethernet0(S) Ethernet4(S)
  102  PortChannel102  LACP(A)(Up)  Ethernet20(S) Ethernet16(S)
  103  PortChannel103  LACP(A)(Up)  Ethernet64(S) Ethernet68(S)
  104  PortChannel104  LACP(A)(Up)  Ethernet80(S) Ethernet84(S)

root@r-tigon-11:/home/admin# show ip route 0.0.0.0 json
{
    "0.0.0.0/0": [
        {
            "destSelected": true,
            "distance": 20,
            "failed": true,
            "installedNexthopGroupId": 309,
            "internalFlags": 264,
            "internalNextHopActiveNum": 4,
            "internalNextHopNum": 4,
            "internalStatus": 32,
            "metric": 0,
            "nexthopGroupId": 309,
            "nexthops": [
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 419,
                    "interfaceName": "PortChannel101",
                    "ip": "10.0.0.1",
                    "weight": 1
                },
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 420,
                    "interfaceName": "PortChannel102",
                    "ip": "10.0.0.5",
                    "weight": 1
                },
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 421,
                    "interfaceName": "PortChannel103",
                    "ip": "10.0.0.9",
                    "weight": 1
                },
                {
                    "active": true,
                    "afi": "ipv4",
                    "fib": true,
                    "flags": 3,
                    "interfaceIndex": 422,
                    "interfaceName": "PortChannel104",
                    "ip": "10.0.0.13",
                    "weight": 1
                }
            ],
            "offloaded": true,
            "prefix": "0.0.0.0/0",
            "prefixLen": 0,
            "protocol": "bgp",
            "selected": true,
            "table": 254,
            "uptime": "00:06:07",
            "vrfId": 0,
            "vrfName": "default"
        }
    ]
}
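
Compared with the dump taken before the flap, this one reports `"failed": true` instead of `"installed": true`, and `internalStatus` 32 instead of 16, while all four next hops are still shown as active. A minimal sketch for checking this automatically is given below. It assumes the JSON layout shown above and only looks at the `installed` and `failed` keys; the function name is hypothetical.

```python
import json


def route_install_state(show_route_json, prefix="0.0.0.0/0"):
    """Classify each route entry for prefix as installed, failed, or unknown."""
    data = json.loads(show_route_json)
    states = []
    for entry in data.get(prefix, []):
        if entry.get("failed"):
            states.append("failed")
        elif entry.get("installed"):
            states.append("installed")
        else:
            states.append("unknown")
    return states
```

The input is the text printed by `show ip route 0.0.0.0 json`. An empty list means the prefix is not present in the output at all.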

root@r-tigon-11:/home/admin# ip -4 route show default
(no output: the default route is absent from the kernel routing table)

Steps to reproduce the issue:

  1. Flap the LAG members: remove and re-add one member port on each PortChannel (see the member.sh output above)

Describe the results you received:

root@r-tigon-11:/home/admin# tail -F /var/log/frr/zebra.log | grep '0.0.0.0/0'
Nov  7 18:22:35.567882 r-tigon-11 WARNING bgp#zebra[35]: [VYKYC-709DP] default(0:254):0.0.0.0/0: Route install failed

Describe the results you expected:

No errors are expected: the default route should remain installed with all four next hops after the flap.

Output of show version:

root@r-tigon-11:/host/bak# show version

SONiC Software Version: SONiC.master.402751-c0b0f2a69
SONiC OS Version: 11
Distribution: Debian 11.7
Kernel: 5.10.0-23-2-amd64
Build commit: c0b0f2a69
Build date: Sat Nov  4 11:41:31 UTC 2023
Built by: AzDevOps@vmss-soni002DIS

Platform: x86_64-mlnx_msn4600c-r0
HwSKU: Mellanox-SN4600C-C64
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2023X22082
Model Number: MSN4600-CS2FO
Hardware Revision: A1
Uptime: 17:24:59 up 40 min,  4 users,  load average: 0.96, 0.88, 0.74
Date: Tue 07 Nov 2023 17:24:59

Docker images:
REPOSITORY                    TAG                       IMAGE ID       SIZE
docker-syncd-mlnx             latest                    dd53c89560c2   823MB
docker-syncd-mlnx             master.402751-c0b0f2a69   dd53c89560c2   823MB
docker-platform-monitor       latest                    65ede652e825   814MB
docker-platform-monitor       master.402751-c0b0f2a69   65ede652e825   814MB
docker-macsec                 latest                    620949b0c112   326MB
docker-dhcp-relay             latest                    f0b8ee99d063   307MB
docker-eventd                 latest                    53f49f35074e   299MB
docker-eventd                 master.402751-c0b0f2a69   53f49f35074e   299MB
docker-orchagent              latest                    f44c166f619c   336MB
docker-orchagent              master.402751-c0b0f2a69   f44c166f619c   336MB
docker-fpm-frr                latest                    8e3837c07e3e   356MB
docker-fpm-frr                master.402751-c0b0f2a69   8e3837c07e3e   356MB
docker-nat                    latest                    277b29a729c5   327MB
docker-nat                    master.402751-c0b0f2a69   277b29a729c5   327MB
docker-sflow                  latest                    a4afc7e8a2e7   326MB
docker-sflow                  master.402751-c0b0f2a69   a4afc7e8a2e7   326MB
docker-teamd                  latest                    6bdc0761134b   324MB
docker-teamd                  master.402751-c0b0f2a69   6bdc0761134b   324MB
docker-snmp                   latest                    c917c27c7836   338MB
docker-snmp                   master.402751-c0b0f2a69   c917c27c7836   338MB
docker-sonic-telemetry        latest                    a7bd888fd016   387MB
docker-sonic-telemetry        master.402751-c0b0f2a69   a7bd888fd016   387MB
docker-lldp                   latest                    c71d3ce2fcfc   341MB
docker-lldp                   master.402751-c0b0f2a69   c71d3ce2fcfc   341MB
docker-database               latest                    89c95bd0af0b   299MB
docker-database               master.402751-c0b0f2a69   89c95bd0af0b   299MB
docker-mux                    latest                    a6ec91023c3d   348MB
docker-mux                    master.402751-c0b0f2a69   a6ec91023c3d   348MB
docker-router-advertiser      latest                    4dabbcfd95e8   299MB
docker-router-advertiser      master.402751-c0b0f2a69   4dabbcfd95e8   299MB
docker-sonic-mgmt-framework   latest                    0d1f658824f2   416MB
docker-sonic-mgmt-framework   master.402751-c0b0f2a69   0d1f658824f2   416MB

Output of show techsupport:

  • N/A

Additional information you deem important (e.g. issue happens only occasionally):
