Description
There is a logic to send lossless traffic to the DUT to verify whether the traffic is discarded while the egress queue is under a PFC storm in the PFCWD warm reboot test.
The flow is like this:
- It randomly chooses a port set to use in the test.
- It chooses the first port in the port set to test and configures an IP address, eg.
192.168.0.2 on the corresponding port in PTF docker to send the traffic. The DUT learns the first port's MAC address as the MAC of 192.168.0.2 in ARP entry.
- It warm-reboots the DUT and chooses the second port in the port set to test and configures the same IP address on the corresponding port in PTF docker (which is a different PTF port from the one chozen in step 2).
- DUT's kernel learns the second port's MAC address as the MAC of 192.168.0.2 in the ARP entry after warm-reboot, which means 192.168.0.2's ARP entry is updated.
But after warm reboot, what I observed is that
- the output of
ip neigh indicates the MAC in 192.168.0.2 is the second port's MAC address, which is correct
- in APPL_DB (and below, ASIC_DB, SAI, SDK, and ASIC), the first port's MAC address is still used as the MAC of
192.168.0.2, which is wrong.
- the MAC of the first port isn't learned since it doesn't send any traffic anymore
- as a result, the traffic sent from the second port gets flooded
The log when the issue is observed
Jan 11 01:30:40.206914 mtbc-sonic-01-2410 INFO swss#supervisord 2023-01-11 01:30:40,206 INFO exited: restore_neighbors (exit status 0; expected)
Jan 11 01:30:40.213342 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- isNeighRestoreDone: neighbor table restore to kernel is done
Jan 11 01:30:40.213342 mtbc-sonic-01-2410 INFO swss#supervisord: neighsyncd Listens to neigh messages...
Jan 11 01:30:40.216894 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value
Jan 11 01:30:45.212413 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- setWarmStartState: neighsyncd warm start state changed to reconciled
An example log when the issue is NOT observed
Jan 12 03:42:33.517318 mtbc-sonic-01-2410 INFO swss#restore_neighbor: restore_neighbor service is done for system warmreboot
Jan 12 03:42:33.689348 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- isNeighRestoreDone: neighbor table restore to kernel is done
Jan 12 03:42:33.689842 mtbc-sonic-01-2410 INFO swss#supervisord: neighsyncd Listens to neigh messages...
Jan 12 03:42:33.762000 mtbc-sonic-01-2410 INFO swss#supervisord 2023-01-12 03:42:33,761 INFO exited: restore_neighbors (exit status 0; expected)
Jan 12 03:42:34.610528 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value
Jan 12 03:42:38.689387 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- reconcile: NEIGH_TABLE NEW, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, cache-state:NEW,
Jan 12 03:42:38.690414 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- setWarmStartState: neighsyncd warm start state changed to reconciled
There is a logic to hold all the ARP entries learned during warm reboot and apply them to APPL_DB only after it is reconciled.
I do see the following message in the log insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value per my understanding, it will push any keys whose state is NEW into APPL_DB in AppRestartAssist::reconcile() but I don't see messages like %s NEW, key: %s, %s in the else if (state == NEW) branch of the function, which is probably the cause.
Steps to reproduce the issue:
- run test script
pfcwd/test_pfcwd_warm_reboot.py
the issue won't be caught by the script by default. but if we capture traffic on other ports within the same VLAN, the flooded traffic can benn seen.
Describe the results you received:
ARP entry is out of sync.
Describe the results you expected:
ARP should always be aligned between kernel and SONiC/SDK/ASIC.
Output of show version:
(paste your output here)
admin@mtbc-sonic-01-2410:~$ show version
SONiC Software Version: SONiC.202205_1_rc.12-d60ed3f21_Internal
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Build commit: d60ed3f21
Build date: Sat Dec 31 19:22:49 UTC 2022
Built by: sw-r2d2-bot@r-build-sonic-ci03-242
Platform: x86_64-mlnx_msn2410-r0
HwSKU: ACS-MSN2410
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1639X02523
Model Number: MSN2410-CB2F
Hardware Revision: "A1"
Uptime: 01:37:11 up 8 min, 1 user, load average: 1.57, 2.15, 1.31
Date: Thu 12 Jan 2023 01:37:11
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-mlnx 202205_1_rc.12-d60ed3f21_Internal f818576f9b9d 869MB
docker-syncd-mlnx latest f818576f9b9d 869MB
docker-fpm-frr 202205_1_rc.12-d60ed3f21_Internal ee75a1c85813 496MB
docker-fpm-frr latest ee75a1c85813 496MB
docker-orchagent 202205_1_rc.12-d60ed3f21_Internal 2314af1587bb 479MB
docker-orchagent latest 2314af1587bb 479MB
docker-teamd 202205_1_rc.12-d60ed3f21_Internal 6b04a311afb0 460MB
docker-teamd latest 6b04a311afb0 460MB
docker-macsec 202205_1_rc.11-d60ed3f21_Internal 4f7b3fb10f13 462MB
docker-platform-monitor 202205_1_rc.12-d60ed3f21_Internal 478948a84604 868MB
docker-platform-monitor latest 478948a84604 868MB
docker-snmp 202205_1_rc.12-d60ed3f21_Internal 033b50b59f59 489MB
docker-snmp latest 033b50b59f59 489MB
docker-dhcp-relay 202205_1_rc.11-d60ed3f21_Internal d5f88b8a24d4 454MB
docker-sonic-telemetry 202205_1_rc.12-d60ed3f21_Internal e54856ce90c3 525MB
docker-sonic-telemetry latest e54856ce90c3 525MB
docker-lldp 202205_1_rc.12-d60ed3f21_Internal 68b2c3afdfa7 487MB
docker-lldp latest 68b2c3afdfa7 487MB
docker-database 202205_1_rc.12-d60ed3f21_Internal c56dcb2f8f78 444MB
docker-database latest c56dcb2f8f78 444MB
docker-mux 202205_1_rc.12-d60ed3f21_Internal 6ed113eec122 493MB
docker-mux latest 6ed113eec122 493MB
docker-router-advertiser 202205_1_rc.12-d60ed3f21_Internal e05b851d83e3 444MB
docker-router-advertiser latest e05b851d83e3 444MB
docker-sflow 202205_1_rc.12-d60ed3f21_Internal da1869e43e5d 429MB
docker-sflow latest da1869e43e5d 429MB
docker-nat 202205_1_rc.12-d60ed3f21_Internal 39694c3ecbf6 431MB
docker-nat latest 39694c3ecbf6 431MB
docker-sonic-mgmt-framework 202205_1_rc.12-d60ed3f21_Internal d57b752bacb5 557MB
docker-sonic-mgmt-framework latest d57b752bacb5 557MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/sonic-wjh 1.3.1 410975934531 310MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/doroce 1.0.0-202205-internal-7 2aa171345b8f 200MB
Output of show techsupport:
(paste your output here or download and attach the file here )
Additional information you deem important (e.g. issue happens only occasionally):
Root cause analysis
The STATE of an entry in AppRestartAssist is maintained on a per-entry basis.
What happens if a NEW entry is updated with a value that existed in the cache? in that case, NEW state will be overridden to SAME and the new information will be lost.
- first port's MAC is in the ARP entry.
- kernel notifies ARP entry updated with the second's MAC. cache state =>
NEW
- kernel notifies ARP entry updated with whatever the first or second's MAC. cache state =>
SAME. the information updated in step 2 is lost.
- reconcile.
SAME entry won't be programmed to the APPL_DB. the new information becomes SAME and won't be updated to APPL_DB, which means the new information is lost.
Details can be seen if set the severity to INFO
Jan 12 10:50:58.801935 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new value
^<<<<<<<============ first updated to new
Jan 12 10:50:58.807019 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel104:fc00::7e, neigh:12:03:fe:c8:9a:c7, family:IPv6, , delete = 0
Jan 12 10:50:58.807019 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel104:fc00::7e, same value
Jan 12 10:50:58.808143 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel102:fc00::76, neigh:1e:20:2b:4d:e7:79, family:IPv6, , delete = 0
Jan 12 10:50:58.808143 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel102:fc00::76, same value
Jan 12 10:50:58.808936 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel103:fc00::7a, neigh:62:34:26:6d:0f:d8, family:IPv6, , delete = 0
Jan 12 10:50:58.808936 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel103:fc00::7a, same value
Jan 12 10:50:58.809360 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- isLinkLocalEnabled: IPv6 Link local is not supported for lo
Jan 12 10:50:58.809776 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: PortChannel101:fc00::72, neigh:1a:d2:50:3e:9e:4e, family:IPv6, , delete = 0
Jan 12 10:50:58.809776 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: PortChannel101:fc00::72, same value
Jan 12 10:50:58.811172 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- isLinkLocalEnabled: IPv6 Link local is not enabled on Vlan1000
Jan 12 10:51:01.432447 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, , delete = 0
Jan 12 10:51:01.432447 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, same value
^<<<<<<<<<<<========= second updated to the same (new) value but state is set to SAME
Jan 12 10:51:01.434169 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: Received message NEIGH_TABLE, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, , delete = 0
Jan 12 10:51:01.434169 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, same value
^<<<<<<<<<<<========= third updated to the same (new) value but state is set to SAME
Jan 12 10:51:03.796715 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- checkReconcileTimer: warmstart timer expired
Jan 12 10:51:03.796991 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel102:fc00::76, neigh:1e:20:2b:4d:e7:79, family:IPv6, cache-state:SAME,
Jan 12 10:51:03.797170 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel102:10.0.0.59, neigh:1e:20:2b:4d:e7:79, family:IPv4, cache-state:SAME,
Jan 12 10:51:03.797373 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel104:fc00::7e, neigh:12:03:fe:c8:9a:c7, family:IPv6, cache-state:SAME,
Jan 12 10:51:03.797525 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: eth0:10.75.71.1, neigh:00:00:5e:00:01:04, family:IPv4, cache-state:SAME,
Jan 12 10:51:03.797670 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel103:10.0.0.61, neigh:62:34:26:6d:0f:d8, family:IPv4, cache-state:SAME,
Jan 12 10:51:03.797817 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel104:10.0.0.63, neigh:12:03:fe:c8:9a:c7, family:IPv4, cache-state:SAME,
Jan 12 10:51:03.797952 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: Vlan1000:192.168.0.2, neigh:24:8a:07:ab:a1:12, family:IPv4, cache-state:SAME,
^<<<<<<<<<<<=========information that the entry has been updated is lost
Jan 12 10:51:03.798101 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: eth0:10.75.71.2, neigh:04:3f:72:64:36:48, family:IPv4, cache-state:SAME,
Jan 12 10:51:03.798233 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel103:fc00::7a, neigh:62:34:26:6d:0f:d8, family:IPv6, cache-state:SAME,
Jan 12 10:51:03.798379 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel101:fc00::72, neigh:1a:d2:50:3e:9e:4e, family:IPv6, cache-state:SAME,
Jan 12 10:51:03.798511 mtbc-sonic-01-2410 INFO swss#neighsyncd: :- reconcile: NEIGH_TABLE SAME, key: PortChannel101:10.0.0.57, neigh:1a:d2:50:3e:9e:4e, family:IPv4, cache-state:SAME,
Jan 12 10:51:03.805636 mtbc-sonic-01-2410 NOTICE swss#neighsyncd: :- setWarmStartState: neighsyncd warm start state changed to reconciled
Possible fix
Do not change an entry's state to SAME if it has been NEW.
Description
There is a logic to send lossless traffic to the DUT to verify whether the traffic is discarded while the egress queue is under a PFC storm in the PFCWD warm reboot test.
The flow is like this:
192.168.0.2on the corresponding port in PTF docker to send the traffic. The DUT learns the first port's MAC address as the MAC of 192.168.0.2 in ARP entry.But after warm reboot, what I observed is that
ip neighindicates the MAC in192.168.0.2is the second port's MAC address, which is correct192.168.0.2, which is wrong.The log when the issue is observed
An example log when the issue is NOT observed
There is a logic to hold all the ARP entries learned during warm reboot and apply them to APPL_DB only after it is reconciled.
I do see the following message in the log
insertToMap: NEIGH_TABLE, found key: Vlan1000:192.168.0.2, new valueper my understanding, it will push any keys whose state isNEWinto APPL_DB inAppRestartAssist::reconcile()but I don't see messages like%s NEW, key: %s, %sin theelse if (state == NEW)branch of the function, which is probably the cause.Steps to reproduce the issue:
pfcwd/test_pfcwd_warm_reboot.pythe issue won't be caught by the script by default. but if we capture traffic on other ports within the same VLAN, the flooded traffic can benn seen.
Describe the results you received:
ARP entry is out of sync.
Describe the results you expected:
ARP should always be aligned between kernel and SONiC/SDK/ASIC.
Output of
show version:Output of
show techsupport:Additional information you deem important (e.g. issue happens only occasionally):
Root cause analysis
The
STATEof an entry in AppRestartAssist is maintained on a per-entry basis.What happens if a
NEWentry is updated with a value that existed in the cache? in that case,NEWstate will be overridden toSAMEand the new information will be lost.NEWSAME. the information updated in step 2 is lost.SAMEentry won't be programmed to the APPL_DB. the new information becomesSAMEand won't be updated to APPL_DB, which means the new information is lost.Details can be seen if set the severity to
INFOPossible fix
Do not change an entry's state to
SAMEif it has beenNEW.