Skip to content

ARP entry is out of sync between kernel and APPL_DB after warm reboot if the ARP entry is updated more than once during warm reboot in PFC watchdog warm reboot test #13341

@stephenxs

Description

@stephenxs

Description

There is a logic to send lossless traffic to the DUT to verify whether the traffic is discarded while the egress queue is under a PFC storm in the PFCWD warm reboot test.
The flow is like this:

  1. It randomly chooses a port set to use in the test.
  2. It chooses the first port in the port set to test and configures an IP address, eg. 192.168.0.2 on the corresponding port in PTF docker to send the traffic. The DUT learns the first port's MAC address as the MAC of 192.168.0.2 in ARP entry.
  3. It warm-reboots the DUT and chooses the second port in the port set to test and configures the same IP address on the corresponding port in PTF docker (which is a different PTF port from the one chozen in step 2).
  4. DUT's kernel learns the second port's MAC address as the MAC of 192.168.0.2 in the ARP entry after warm-reboot, which means 192.168.0.2's ARP entry is updated.

But after warm reboot, what I observed is that

  • the output of ip neigh indicates the MAC in 192.168.0.2 is the second port's MAC address, which is correct
  • in APPL_DB (and below, ASIC_DB, SAI, SDK, and ASIC), the first port's MAC address is still used as the MAC of 192.168.0.2, which is wrong.
  • the MAC of the first port isn't learned since it doesn't send any traffic anymore
  • as a result, the traffic sent from the second port gets flooded

The log when the issue is observed

Jan 11 01:30:40.206914 mtbc-sonic-01-2410 INFO swss#supervisord 2023-01-11 01:30:40,206 INFO exited: restore_neighbors (exit status 0; expected)
Jan 11 01:30:40.213342 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- isNeighRestoreDone: neighbor table restore to kernel is done
Jan 11 01:30:40.213342 mtbc-sonic-01-2410 INFO swss#supervisord: neighsyncd Listens to neigh messages...
Jan 11 01:30:40.216894 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value 
Jan 11 01:30:45.212413 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- setWarmStartState: neighsyncd warm start state changed to reconciled

An example log when the issue is NOT observed

Jan 12 03:42:33.517318 mtbc-sonic-01-2410 INFO swss#restore_neighbor: restore_neighbor service is done for system warmreboot
Jan 12 03:42:33.689348 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- isNeighRestoreDone: neighbor table restore to kernel is done
Jan 12 03:42:33.689842 mtbc-sonic-01-2410 INFO swss#supervisord: neighsyncd Listens to neigh messages...
Jan 12 03:42:33.762000 mtbc-sonic-01-2410 INFO swss#supervisord 2023-01-12 03:42:33,761 INFO exited: restore_neighbors (exit status 0; expected)
Jan 12 03:42:34.610528 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value 
Jan 12 03:42:38.689387 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- reconcile: NEIGH_TABLE NEW, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, cache-state:NEW, 
Jan 12 03:42:38.690414 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- setWarmStartState: neighsyncd warm start state changed to reconciled

There is a logic to hold all the ARP entries learned during warm reboot and apply them to APPL_DB only after it is reconciled.
I do see the following message in the log insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value per my understanding, it will push any keys whose state is NEW into APPL_DB in AppRestartAssist::reconcile() but I don't see messages like %s NEW, key: %s, %s in the else if (state == NEW) branch of the function, which is probably the cause.

Steps to reproduce the issue:

  1. run test script pfcwd/test_pfcwd_warm_reboot.py

the issue won't be caught by the script by default. but if we capture traffic on other ports within the same VLAN, the flooded traffic can benn seen.

Describe the results you received:

ARP entry is out of sync.

Describe the results you expected:

ARP should always be aligned between kernel and SONiC/SDK/ASIC.

Output of show version:

(paste your output here)
admin@mtbc-sonic-01-2410:~$ show version

SONiC Software Version: SONiC.202205_1_rc.12-d60ed3f21_Internal
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Build commit: d60ed3f21
Build date: Sat Dec 31 19:22:49 UTC 2022
Built by: sw-r2d2-bot@r-build-sonic-ci03-242

Platform: x86_64-mlnx_msn2410-r0
HwSKU: ACS-MSN2410
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1639X02523
Model Number: MSN2410-CB2F
Hardware Revision: "A1"
Uptime: 01:37:11 up 8 min,  1 user,  load average: 1.57, 2.15, 1.31
Date: Thu 12 Jan 2023 01:37:11

Docker images:
REPOSITORY                                         TAG                                 IMAGE ID       SIZE
docker-syncd-mlnx                                  202205_1_rc.12-d60ed3f21_Internal   f818576f9b9d   869MB
docker-syncd-mlnx                                  latest                              f818576f9b9d   869MB
docker-fpm-frr                                     202205_1_rc.12-d60ed3f21_Internal   ee75a1c85813   496MB
docker-fpm-frr                                     latest                              ee75a1c85813   496MB
docker-orchagent                                   202205_1_rc.12-d60ed3f21_Internal   2314af1587bb   479MB
docker-orchagent                                   latest                              2314af1587bb   479MB
docker-teamd                                       202205_1_rc.12-d60ed3f21_Internal   6b04a311afb0   460MB
docker-teamd                                       latest                              6b04a311afb0   460MB
docker-macsec                                      202205_1_rc.11-d60ed3f21_Internal   4f7b3fb10f13   462MB
docker-platform-monitor                            202205_1_rc.12-d60ed3f21_Internal   478948a84604   868MB
docker-platform-monitor                            latest                              478948a84604   868MB
docker-snmp                                        202205_1_rc.12-d60ed3f21_Internal   033b50b59f59   489MB
docker-snmp                                        latest                              033b50b59f59   489MB
docker-dhcp-relay                                  202205_1_rc.11-d60ed3f21_Internal   d5f88b8a24d4   454MB
docker-sonic-telemetry                             202205_1_rc.12-d60ed3f21_Internal   e54856ce90c3   525MB
docker-sonic-telemetry                             latest                              e54856ce90c3   525MB
docker-lldp                                        202205_1_rc.12-d60ed3f21_Internal   68b2c3afdfa7   487MB
docker-lldp                                        latest                              68b2c3afdfa7   487MB
docker-database                                    202205_1_rc.12-d60ed3f21_Internal   c56dcb2f8f78   444MB
docker-database                                    latest                              c56dcb2f8f78   444MB
docker-mux                                         202205_1_rc.12-d60ed3f21_Internal   6ed113eec122   493MB
docker-mux                                         latest                              6ed113eec122   493MB
docker-router-advertiser                           202205_1_rc.12-d60ed3f21_Internal   e05b851d83e3   444MB
docker-router-advertiser                           latest                              e05b851d83e3   444MB
docker-sflow                                       202205_1_rc.12-d60ed3f21_Internal   da1869e43e5d   429MB
docker-sflow                                       latest                              da1869e43e5d   429MB
docker-nat                                         202205_1_rc.12-d60ed3f21_Internal   39694c3ecbf6   431MB
docker-nat                                         latest                              39694c3ecbf6   431MB
docker-sonic-mgmt-framework                        202205_1_rc.12-d60ed3f21_Internal   d57b752bacb5   557MB
docker-sonic-mgmt-framework                        latest                              d57b752bacb5   557MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/sonic-wjh   1.3.1                               410975934531   310MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/doroce      1.0.0-202205-internal-7             2aa171345b8f   200MB

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

Root cause analysis

The STATE of an entry in AppRestartAssist is maintained on a per-entry basis.
What happens if a NEW entry is updated with a value that existed in the cache? in that case, NEW state will be overridden to SAME and the new information will be lost.

  1. first port's MAC is in the ARP entry.
  2. kernel notifies ARP entry updated with the second's MAC. cache state => NEW
  3. kernel notifies ARP entry updated with whatever the first or second's MAC. cache state => SAME. the information updated in step 2 is lost.
  4. reconcile. SAME entry won't be programmed to the APPL_DB. the new information becomes SAME and won't be updated to APPL_DB, which means the new information is lost.

Details can be seen if set the severity to INFO

Jan 12 10:50:58.801935 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value 
                                                                                                                                ^<<<<<<<============ first updated to new
Jan 12 10:50:58.807019 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel104:fc00::7e, neigh:12:03:fe:c8:9a:c7, family:IPv6, , delete = 0
Jan 12 10:50:58.807019 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel104:fc00::7e, same value
Jan 12 10:50:58.808143 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel102:fc00::76, neigh:1e:20:2b:4d:e7:79, family:IPv6, , delete = 0
Jan 12 10:50:58.808143 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel102:fc00::76, same value
Jan 12 10:50:58.808936 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel103:fc00::7a, neigh:62:34:26:6d:0f:d8, family:IPv6, , delete = 0
Jan 12 10:50:58.808936 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel103:fc00::7a, same value
Jan 12 10:50:58.809360 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- isLinkLocalEnabled: IPv6 Link local is not supported for lo 
Jan 12 10:50:58.809776 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel101:fc00::72, neigh:1a:d2:50:3e:9e:4e, family:IPv6, , delete = 0
Jan 12 10:50:58.809776 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel101:fc00::72, same value
Jan 12 10:50:58.811172 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- isLinkLocalEnabled: IPv6 Link local is not enabled on Vlan1000
Jan 12 10:51:01.432447 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, , delete = 0
Jan 12 10:51:01.432447 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, same value
                                                                                                        ^<<<<<<<<<<<========= second updated to the same (new) value but state is set to SAME
Jan 12 10:51:01.434169 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, , delete = 0
Jan 12 10:51:01.434169 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, same value
                                                                                                        ^<<<<<<<<<<<========= third updated to the same (new) value but state is set to SAME
Jan 12 10:51:03.796715 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- checkReconcileTimer: warmstart timer expired
Jan 12 10:51:03.796991 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel102:fc00::76, neigh:1e:20:2b:4d:e7:79, family:IPv6, cache-state:SAME, 
Jan 12 10:51:03.797170 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel102:10.0.0.59, neigh:1e:20:2b:4d:e7:79, family:IPv4, cache-state:SAME, 
Jan 12 10:51:03.797373 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel104:fc00::7e, neigh:12:03:fe:c8:9a:c7, family:IPv6, cache-state:SAME, 
Jan 12 10:51:03.797525 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: eth0:10.75.71.1, neigh:00:00:5e:00:01:04, family:IPv4, cache-state:SAME, 
Jan 12 10:51:03.797670 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel103:10.0.0.61, neigh:62:34:26:6d:0f:d8, family:IPv4, cache-state:SAME, 
Jan 12 10:51:03.797817 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel104:10.0.0.63, neigh:12:03:fe:c8:9a:c7, family:IPv4, cache-state:SAME, 
Jan 12 10:51:03.797952 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, cache-state:SAME,
                                                                                                     ^<<<<<<<<<<<=========information that the entry has been updated is lost
Jan 12 10:51:03.798101 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: eth0:10.75.71.2, neigh:04:3f:72:64:36:48, family:IPv4, cache-state:SAME, 
Jan 12 10:51:03.798233 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel103:fc00::7a, neigh:62:34:26:6d:0f:d8, family:IPv6, cache-state:SAME, 
Jan 12 10:51:03.798379 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel101:fc00::72, neigh:1a:d2:50:3e:9e:4e, family:IPv6, cache-state:SAME, 
Jan 12 10:51:03.798511 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel101:10.0.0.57, neigh:1a:d2:50:3e:9e:4e, family:IPv4, cache-state:SAME, 
Jan 12 10:51:03.805636 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- setWarmStartState: neighsyncd warm start state changed to reconciled
Possible fix

Do not change an entry's state to SAME if it has been NEW.

Metadata

Metadata

Assignees

Labels

NVIDIATriagedthis issue has been triaged

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions