[action] [PR:13611] [Mellanox] set select timeout to no more than 1 sec to make sure fast shutdown #16449

Merged
yxieca merged 1 commit into sonic-net:202205 from mssonicbld:cherry/202205/13611 on Sep 6, 2023

@mssonicbld
Collaborator

Why I did it

Commit sonic-net/sonic-platform-daemons@153ea47 changed SfpStateUpdateTask from a Process to a Thread. To make the shutdown flow fast, that commit raises an exception in SfpStateUpdateTask. This does not work on the Nvidia platform, because the Nvidia platform passes the timeout parameter of get_change_event directly to select, and the Linux select function cannot be interrupted by a Python exception. The Nvidia platform had no such issue before that commit. However, to comply with the commit and keep the shutdown flow fast, we decided to change the Nvidia platform API implementation.

To fix issue #13591.

How I did it

  1. The select call in get_change_event now uses a timeout of no more than 1 second.
  2. A while loop around the select call makes sure the timeout parameter of get_change_event still works as expected.
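
The two steps above can be sketched roughly as follows. This is a minimal illustration and not the actual Nvidia platform code: the function name `wait_readable`, the constant `MAX_SELECT_TIMEOUT_SEC`, the millisecond unit, and the convention that a timeout of 0 means "wait until an event arrives" are all assumptions made for the example.

```python
import os
import select
import time

# Assumed upper bound for a single select() call, per step 1 above.
MAX_SELECT_TIMEOUT_SEC = 1.0


def wait_readable(fds, timeout_ms=0):
    """Wait until any fd in fds is readable or the overall timeout expires.

    Hypothetical helper for illustration only.
    Assumption: timeout_ms == 0 means "block until an event arrives".
    Returns the list of readable fds, or an empty list on timeout.
    """
    deadline = None if timeout_ms == 0 else time.monotonic() + timeout_ms / 1000.0
    while True:
        if deadline is None:
            chunk = MAX_SELECT_TIMEOUT_SEC
        else:
            # Never wait longer than the time left, and never longer than 1 s.
            chunk = min(max(deadline - time.monotonic(), 0.0), MAX_SELECT_TIMEOUT_SEC)
        # Each select() returns within about one second, so control comes
        # back to Python code regularly and a pending exception can surface.
        readable, _, _ = select.select(fds, [], [], chunk)
        if readable:
            return readable
        if deadline is not None and time.monotonic() >= deadline:
            return []


if __name__ == "__main__":
    r, w = os.pipe()
    print(wait_readable([r], 200))      # nothing written yet: times out
    os.write(w, b"x")
    print(wait_readable([r], 200) == [r])
    os.close(r)
    os.close(w)
```

The outer loop keeps the caller-visible timeout semantics unchanged, while the cap on each individual select call bounds how long the thread can stay blocked inside the system call.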

How to verify it

Manual test

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211

Description for the changelog

Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld
Collaborator Author

Original PR: #13611

@yxieca merged commit 89f091e into sonic-net:202205 on Sep 6, 2023