[EVPN@Scale] Poor performance and stability on EVPN L2 scale scenario #15004

@Hedgehog-Guru

Description

Poor performance and stability on EVPN L2 scale scenario

Steps to reproduce the issue:

  1. On two switches, configure L2 EVPN with a single VLAN and a single VNI
  2. On both switches, add two untagged L2 ports to the VLAN
  3. Run unicast L2 traffic between the switches (one port to one port):
    3.a. SW2 -> SW1: NNNN DMAC to 1 SMAC
    3.b. SW1 -> SW2: 1 SMAC to NNNN DMAC
  4. On SW1, check the number of remote MACs and the number of EVPN prefixes
  5. On SW1, check the CPU load
  6. On SW1 and SW2, check the number of unknown unicast frames flooded on the second ports
  7. Keep the traffic running for an extended period (20 minutes, for example) to check stability and the absence of flooding
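
For anyone reproducing step 3, the NNNN addresses need to be distinct unicast MACs. Below is a minimal sketch of one way to generate such a list for a traffic generator. The function name `mac_range` and the base address 02:00:00:00:00:00 (locally administered, unicast) are illustrative assumptions, not what was used in the original test.

```python
def mac_range(count, base=0x020000000000):
    """Yield `count` sequential MAC addresses starting at `base`.

    The default base 02:00:00:00:00:00 is a hypothetical choice: it is
    locally administered and unicast, so it will not collide with
    vendor-assigned addresses.
    """
    for i in range(count):
        value = base + i
        # Format the 48-bit value as six colon-separated hex octets,
        # most significant octet first.
        yield ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


if __name__ == "__main__":
    # Print the first three addresses of a 132,000-entry range.
    for mac in list(mac_range(132_000))[:3]:
        print(mac)
```

The resulting list can be fed to whichever traffic generator drives the streams in 3.a and 3.b.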

Describe the results you received:

Slow convergence: on Spectrum-3 it takes 4 minutes to install 132K remote MACs
High CPU utilization, mainly in redis-server
Every 5 minutes the number of MACs decreases (Linux bridge aging?)
Unknown unicast flooding to the "monitor" ports almost constantly
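
To put the convergence figure in perspective, the install rate can be worked out from the numbers above. This is a rough sketch that assumes "132K" means 132,000 MACs and that the 4 minutes are measured from the start of traffic. The helper name `install_rate` is hypothetical.

```python
def install_rate(mac_count, seconds):
    """Return the average number of MACs installed per second."""
    return mac_count / seconds


# Assumption: 132K is read as 132,000 entries, 4 min as 240 seconds.
rate = install_rate(132_000, 4 * 60)
print(f"{rate:.0f} MACs/s")  # prints "550 MACs/s"
```

The 5-minute period of the MAC count drop is also consistent with the Linux bridge default ageing time of 300 seconds, although that does not by itself confirm the cause.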

Describe the results you expected:

Faster convergence and stability

Output of show version:

SONiC Software Version: SONiC.202211_RC12.1-4ee027200_Internal
SONiC OS Version: 11
Distribution: Debian 11.7
Kernel: 5.10.0-18-2-amd64
Build commit: 4ee027200
Build date: Thu May  4 11:06:45 UTC 2023
Built by: sw-r2d2-bot@r-build-sonic-ci02-242

Platform: x86_64-mlnx_msn3700-r0
HwSKU: ACS-MSN3700
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1932X22252
Model Number: MSN3700-VS2F
Hardware Revision: A1
Uptime: 17:51:26 up  1:27,  1 user,  load average: 0.13, 0.17, 0.25
Date: Wed 10 May 2023 17:51:26

Output of show techsupport:

SW1 - DUT
sonic_dump_qa-eth-vt03-3-4600ca1_20230510_164945 (1).zip
sonic_dump_qa-eth-vt03-3-4600ca1_20230510_164945 (2).zip
sonic_dump_qa-eth-vt03-3-4600ca1_20230510_164945 (3).zip

SW2
sonic_dump_qa-eth-vt03-4-3700v_20230510_164947 (1).zip
sonic_dump_qa-eth-vt03-4-3700v_20230510_164947 (2).zip
sonic_dump_qa-eth-vt03-4-3700v_20230510_164947 (3).zip

Additional information you deem important (e.g. issue happens only occasionally):

Stats collected
stats-and-script.zip
