flap interface after sfp reset #16375

Merged: yejianquan merged 8 commits into sonic-net:master from sdszhang:sfp_reset on Jan 17, 2025

@sdszhang
Contributor

@sdszhang sdszhang commented Jan 7, 2025

Description of PR

Summary:
Fixes an issue where the interface stays down after tests/platform_tests/api/test_sfp.py::sfp_reset(), which in turn causes an error during the shutdown_ebgp fixture teardown.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405
  • 202411

Approach

What is the motivation for this PR?

Keep the interface up after sfp_reset when the device is T2 and the transceiver is QSFP-DD.

How did you do it?

Flap the interface after sfp_reset to restore the interface state.
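
A minimal sketch of the flap step, not the code merged in this PR. It assumes the SONiC CLI commands `config interface shutdown` and `config interface startup`; the `flap_interface` helper, its `run_cmd` callable, and the `settle_seconds` delay are hypothetical names introduced for illustration. In a real test, `run_cmd` would be whatever executes a shell command on the DUT.

```python
import time
from typing import Callable, List


def flap_interface(run_cmd: Callable[[str], object], port: str,
                   settle_seconds: float = 0.0) -> List[str]:
    """Administratively bounce a port: shut it down, wait, bring it back up.

    run_cmd is a hypothetical callable that executes one shell command.
    Returns the commands in the order they were issued.
    """
    commands = [
        f"sudo config interface shutdown {port}",
        f"sudo config interface startup {port}",
    ]
    run_cmd(commands[0])
    # Optional pause so the shutdown takes effect before the port is re-enabled.
    time.sleep(settle_seconds)
    run_cmd(commands[1])
    return commands


# Usage: record the commands in a list instead of running them on a device.
issued: List[str] = []
flap_interface(issued.append, "Ethernet0")
print(issued)
```

Passing the runner in as a callable keeps the sketch independent of any particular test framework; the recorded list makes the command order easy to check.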

How did you verify/test it?

Passed on a physical testbed with the following transceiver:

admin@svcstr2-8800-lc1-1:~$ sudo sfputil show eeprom -d -p Ethernet0
Ethernet0: SFP EEPROM detected
...
        Application Advertisement: 400GAUI-8 C2M (Annex 120E) - Host Assign (0x1) - Active Cable assembly with BER < 5x10^-5 - Media Assign (0x1)
                                   CAUI-4 C2M (Annex 83E) - Host Assign (0x1) - Active Cable assembly with BER < 5x10^-5 - Media Assign (0x1)
        CMIS Revision: 4.0
        Connector: No separable connector
        Encoding: N/A
        Extended Identifier: Power Class 5 (10.0W Max)
        Extended RateSelect Compliance: N/A
        Hardware Revision: 1.0
        Host Electrical Interface: 400GAUI-8 C2M (Annex 120E)
        Host Lane Assignment Options: 1
        Host Lane Count: 8
        Identifier: QSFP-DD Double Density 8X Pluggable Transceiver
        Length Cable Assembly(m): 1.0
......
platform_tests/api/test_sfp.py::TestSfpApi::test_reset[xxx-lc1-1] PASSED [ 73%]
......
=========================== short test summary info ============================
FAILED platform_tests/api/test_sfp.py::TestSfpApi::test_lpmode[svcstr2-8800-lc1-1]                        <<<< this is separate issue, not related to this PR.
========================= 1 failed, 22 passed, 1 warning in 2104.41s (0:35:04) =========================

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@sdszhang sdszhang requested a review from prgeor as a code owner January 7, 2025 06:25
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sdszhang sdszhang requested a review from abdosi January 7, 2025 06:31
@sdszhang
Contributor Author

sdszhang commented Jan 7, 2025

@prgeor @abdosi @yejianquan can you help to review this one?

Comment thread tests/platform_tests/api/test_sfp.py Outdated
@mihirpat1
Contributor

@longhuan-cisco Can you please help in reviewing this?

Comment thread tests/platform_tests/api/test_sfp.py Outdated
Comment thread tests/platform_tests/api/test_sfp.py Outdated
Comment thread tests/common/devices/multi_asic.py
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Comment thread tests/common/devices/multi_asic.py Outdated
Comment thread tests/common/devices/multi_asic.py Outdated
@mihirpat1
Contributor

@prgeor Can you please help in reviewing this PR?

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

prgeor previously requested changes Jan 15, 2025
Contributor

@prgeor prgeor left a comment

@sdszhang please make the following changes:

  1. This SFP test case is not T2-specific. Please remove the T2-specific platform check.
  2. Apply this fix to all CMIS optics, regardless of whether the module is OSFP or QSFP-DD.
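
One way to read the second request is to key the decision on CMIS support rather than on form factor. The sketch below is an assumption, not the check used in the PR: it treats a module as CMIS when its `sfputil show eeprom -d` dump contains a `CMIS Revision` field, as the QSFP-DD dump in the PR description does. The helper names and the OSFP and QSFP28 sample dumps are hypothetical and only illustrate the idea.

```python
from typing import Dict


def parse_eeprom_fields(dump: str) -> Dict[str, str]:
    """Collect 'Key: value' pairs from an EEPROM text dump.

    Lines without ': ' (for example wrapped continuation lines) are skipped.
    """
    fields: Dict[str, str] = {}
    for line in dump.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep and key not in fields:
            fields[key] = value.strip()
    return fields


def is_cmis_module(dump: str) -> bool:
    """Treat any module that reports a CMIS revision as CMIS, whatever its form factor."""
    return "CMIS Revision" in parse_eeprom_fields(dump)


# Excerpt based on the dump in the PR description.
QSFP_DD_DUMP = """Ethernet0: SFP EEPROM detected
        CMIS Revision: 4.0
        Identifier: QSFP-DD Double Density 8X Pluggable Transceiver
"""

# Hypothetical dumps, for illustration only.
OSFP_DUMP = """Ethernet8: SFP EEPROM detected
        CMIS Revision: 5.0
        Identifier: OSFP 8X Pluggable Transceiver
"""
QSFP28_DUMP = """Ethernet16: SFP EEPROM detected
        Identifier: QSFP28 or later
"""

for name, dump in [("QSFP-DD", QSFP_DD_DUMP), ("OSFP", OSFP_DUMP), ("QSFP28", QSFP28_DUMP)]:
    print(name, is_cmis_module(dump))
```

Under these assumptions both the QSFP-DD and the OSFP samples qualify for the flap, while the non-CMIS sample does not.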

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@sdszhang
Contributor Author

sdszhang commented Jan 15, 2025

> @sdszhang please make the following changes:
>
>   1. This SFP test case is not T2-specific. Please remove the T2-specific platform check.
>   2. Apply this fix to all CMIS optics, regardless of whether the module is OSFP or QSFP-DD.

@prgeor Updated as suggested.
Internal test results:
T1: PASSED
T2: PASSED

@sdszhang sdszhang requested a review from prgeor January 15, 2025 22:17
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yejianquan yejianquan dismissed prgeor’s stale review January 17, 2025 06:03

The comment has been resolved

@yejianquan yejianquan merged commit 6814013 into sonic-net:master Jan 17, 2025
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jan 17, 2025
@mssonicbld
Collaborator

Cherry-pick PR to 202411: #16570

yejianquan pushed a commit to yejianquan/sonic-mgmt that referenced this pull request Jan 17, 2025
mssonicbld pushed a commit that referenced this pull request Jan 17, 2025
@longhuan-cisco
Contributor

@yejianquan
We saw the same issue on fixed (non-chassis) platforms.
Would it be possible to add back the 202405 backport request so this can be cherry-picked to 202405 (not msft-202405)? I noticed the tag was removed, and I'm not sure why.

@yejianquan
Collaborator

> @yejianquan We saw the same issue on fixed (non-chassis) platforms. Would it be possible to add back the 202405 backport request so this can be cherry-picked to 202405 (not msft-202405)? I noticed the tag was removed, and I'm not sure why.

@bingwang-ms regarding the request for the 202405 branch.

wangxin pushed a commit to wangxin/sonic-mgmt that referenced this pull request Feb 21, 2025
wangxin pushed a commit to wangxin/sonic-mgmt that referenced this pull request Feb 21, 2025
wangxin pushed a commit to wangxin/sonic-mgmt that referenced this pull request Feb 21, 2025
Merge 202405 branch in as of 12:23pm 20/01/2025 AEST

b118611 (HEAD -> merge/202405) Use alternate check for reboot for T2 after reboot with REBOOT_TYPE_POWEROFF (sonic-net#16348)
0e0e898 flap interface after sfp reset (sonic-net#16375)
41e2b2f Temporarily skip lpmode test for some transceivers with known issue (sonic-net#16547)
de60273 [Snappi] Infra changes for new PFC-ECN testcases. (sonic-net#13864)
7b357f5 [Snappi] New testcases for PFC-ECN. (sonic-net#13865)
3523a7f [Snappi]: PFC - Mixed Speed testcases (sonic-net#14122)
3754f2a sonic-mgmt: Fix namespace issues for qos tests on T2 single ASIC (sonic-net#15708)
21f6526 [sonic-net#16015 Fix]: Cleaning up unused code from snappi_fixtures (sonic-net#16026)
d8f23be Correcting client arguments to dynamically_compensate_leakout (sonic-net#16169)
3c47107 [sanity_check][bgp] Enhance sanity check recover for bgp default route missing (sonic-net#16357)
37352b8 Eliminate cross-feature dependency from macsec module (sonic-net#15617)
4f33b0d (pub_upstream/202405) [202405][dhcp_relay] Add test case to verify dhcp6relay LLA waiting logic (sonic-net#16494) (sonic-net#16567)
nnelluri-cisco pushed a commit to nnelluri-cisco/sonic-mgmt that referenced this pull request Mar 15, 2025
@sdszhang sdszhang deleted the sfp_reset branch March 27, 2025 12:26