Update syncd stop script to collect saisdkdump during orch abort#38

Closed
vivekrnv wants to merge 12 commits into master from orch_abort

@vivekrnv vivekrnv commented Oct 28, 2022

Signed-off-by: Vivek Reddy Karri [email protected]

Why I did it

  • Update the syncd stop flow to collect a saisdkdump and save it under /var/log/orch_abrt_status/ when orchagent hits a SAI programming failure.
    1. Limit the number of such archives to 10.
    2. Create a tmp file as a sync mechanism with the auto-techsupport process.
  • Update the swss stop flow to remove PortInitDone. This gives other daemons a reliable way to check whether switch_init succeeded after the services are deemed started. (E.g., the switch should be initialized before saisdkdump runs; techsupport will use this info to determine that.)
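
The two bullets above could be sketched roughly as the helpers below. This is only an illustration, not the code in syncd.sh: the function names, the `DB_CLI` override, and the marker path passed to `begin_collection` are all hypothetical, and the `APPL_DB` key `PORT_TABLE:PortInitDone` is an assumption about where PortInitDone lives.

```shell
#!/usr/bin/env bash
# Illustrative helpers only: function names, DB_CLI and marker paths are assumptions,
# not taken from files/scripts/syncd.sh.

# Overridable so the check can be exercised off-switch with a stub.
DB_CLI="${DB_CLI:-sonic-db-cli}"

# Keep only the newest $2 archives named sai_sdk_dump_<epoch>.tar.gz in directory $1.
prune_archives() {
    local dir="$1" max="$2"
    # The epoch is the 4th "_"-separated field; sort newest first, drop everything past $max.
    ls -1 "$dir" 2>/dev/null \
        | grep -E '^sai_sdk_dump_[0-9]+\.tar\.gz$' \
        | sort -t_ -k4,4nr \
        | tail -n +"$((max + 1))" \
        | while read -r name; do
              rm -f -- "$dir/$name"
          done
}

# Create a marker file that another process (e.g. auto-techsupport) can poll
# to learn that a dump collection is in progress.
begin_collection() {
    mkdir -p "$(dirname "$1")" && : > "$1"
}

# Remove the marker once collection has finished.
end_collection() {
    rm -f -- "$1"
}

# Succeeds only if the PortInitDone key exists (assumed to be in APPL_DB),
# i.e. the switch finished initializing and has not been torn down by the swss stop flow.
switch_init_done() {
    [ "$($DB_CLI APPL_DB EXISTS 'PORT_TABLE:PortInitDone' 2>/dev/null)" = "1" ]
}
```

With helpers like these, a stop script might call `switch_init_done` before attempting a dump, wrap the dump between `begin_collection` and `end_collection`, and finish with `prune_archives /var/log/orch_abrt_status 10`.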

How I did it

How to verify it


root@r-bulldog-03:/home/admin# sonic-db-cli STATE_DB SET ORCH_ABRT_STATUS 1

root@r-bulldog-03:/home/admin# systemctl restart swss

Oct 28 21:10:52.892667 r-bulldog-03 INFO syncd.sh[26913]: sai_sdk_dump_1666991448.tar.gz collected before taking stopping syncd

root@r-bulldog-03:/home/admin# tar -tvf /var/log/orch_abrt_status/sai_sdk_dump_1666991448.tar.gz
drwxr-xr-x root/root         0 2022-10-28 21:10 ./
-rw-r--r-- root/root   4883498 2022-10-28 21:10 ./sdkdump_ext_cr_001-28_10_2022-21_10_50-980950.udmp
-rw-r--r-- root/root   3964382 2022-10-28 21:10 ./sai_sdk_dump_10_28_2022_09_10_PM
-rw-r--r-- root/root       288 2022-10-28 21:10 ./sdkdump_ext_meta_001-28_10_2022-21_10_50-407583
-rw-r--r-- root/root       288 2022-10-28 21:10 ./sdkdump_ext_meta_001-28_10_2022-21_10_50-980950
-rw-r--r-- root/root   2656360 2022-10-28 21:10 ./sai_sdk_dump_10_28_2022_09_10_PM.json
-rw-r--r-- root/root   4883498 2022-10-28 21:10 ./sdkdump_ext_cr_001-28_10_2022-21_10_50-407583.udmp
-rw-r--r-- root/root   4883498 2022-10-28 21:10 ./sdkdump_ext_cr_001-28_10_2022-21_10_51-573167.udmp
-rw-r--r-- root/root       288 2022-10-28 21:10 ./sdkdump_ext_meta_001-28_10_2022-21_10_51-573167
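
The manual `tar -tvf` check above could also be scripted. The helper below is a minimal sketch (the function name is hypothetical) that succeeds only if an archive lists at least one member whose name starts with `sai_sdk_dump_`, as in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if archive $1 contains a sai_sdk_dump_* member.
archive_has_sdk_dump() {
    # tar -tf prints one member path per line, e.g. "./sai_sdk_dump_10_28_2022_09_10_PM".
    tar -tf "$1" 2>/dev/null | grep -Eq '(^|/)sai_sdk_dump_'
}
```

For example, `archive_has_sdk_dump /var/log/orch_abrt_status/sai_sdk_dump_1666991448.tar.gz && echo ok` would be expected to print `ok` for the archive shown above.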

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205

Description for the changelog

Be sure to add a label/tag for the feature raised. Example: PR#2174 in the sonic-utilities repo, where the Generic Config and Update feature is labelled GCU.

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@vivekrnv vivekrnv marked this pull request as ready for review October 28, 2022 21:21
Signed-off-by: Vivek Reddy Karri <[email protected]>
Signed-off-by: Vivek Reddy Karri <[email protected]>
Signed-off-by: Vivek Reddy Karri <[email protected]>
Signed-off-by: Vivek Reddy Karri <[email protected]>
Signed-off-by: Vivek Reddy Karri <[email protected]>
Signed-off-by: Vivek Reddy Karri <[email protected]>
Signed-off-by: Vivek Reddy Karri <[email protected]>
Comment thread files/scripts/syncd.sh
Comment thread files/scripts/syncd.sh
Signed-off-by: Vivek Reddy Karri <[email protected]>
@vivekrnv vivekrnv requested a review from dgsudharsan November 23, 2022 22:20
Comment thread files/scripts/syncd.sh Outdated
/usr/bin/docker exec syncd$DEV rm -f /tmp/${sai_dump_filename_epoch}.tar
/usr/bin/docker exec syncd$DEV mkdir -p ${TMP_DMP_DIR}
/usr/bin/docker exec syncd$DEV saisdkdump -f ${TMP_DMP_DIR}/${sai_dump_filename} > /dev/null
timeout 1m bash -c "/usr/bin/docker exec syncd$DEV saisdkdump -f ${TMP_DMP_DIR}/${sai_dump_filename} > /dev/null"
Collaborator

1 minute seems like a huge timeout. Can you check how long saisdkdump takes on each platform and come up with 2x the maximum time taken? Ideally saisdkdump should finish within 5-6 seconds. If that's the case, we can add a timeout of 15 seconds.

Owner Author

This is the worst-case scenario, right? And we can't predict how long other vendors' saisdkdump would take. Anyway, I'll set it to 30s. Does that seem reasonable?

Owner Author

Updated the timeout to 30s
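
For readers following the timeout discussion, here is a minimal sketch of bounding a command with `timeout` and telling a timeout apart from an ordinary failure. The wrapper name `run_with_timeout` is hypothetical and is not part of syncd.sh; the only behavior relied on is that `timeout` exits with status 124 when the limit is hit.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run "$@" under a time limit given as $1 (e.g. 30s).
run_with_timeout() {
    local limit="$1" rc
    shift
    timeout "$limit" "$@"
    rc=$?
    # timeout exits with 124 when it had to kill the command for exceeding the limit.
    if [ "$rc" -eq 124 ]; then
        echo "command timed out after ${limit}: $*" >&2
    fi
    return "$rc"
}
```

Under these assumptions, the dump step could be written as `run_with_timeout 30s /usr/bin/docker exec syncd$DEV saisdkdump -f ${TMP_DMP_DIR}/${sai_dump_filename} > /dev/null`, which avoids the extra `bash -c` layer while keeping the 30s bound agreed on in the thread.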

@vivekrnv vivekrnv requested a review from dgsudharsan December 1, 2022 02:56
Signed-off-by: Vivek Reddy Karri <[email protected]>
@vivekrnv vivekrnv closed this Dec 5, 2022
vivekrnv pushed a commit that referenced this pull request Mar 24, 2025
…lly (sonic-net#639)

#### Why I did it
src/sonic-swss
```
* 4baf54f - (HEAD -> 202412, origin/202412) SRv6: add dscp_mode configuration for MySID entry (#38) (6 hours ago) [mssonicbld]
* ff491ba - [SRv6] Add support for SRv6 VPN (#37) (9 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vivekrnv pushed a commit that referenced this pull request Mar 24, 2025
… automatically (sonic-net#702)

#### Why I did it
src/sonic-platform-common
```
* c8eac22 - (HEAD -> 202412, origin/202412) [code sync] Merge code from sonic-net/sonic-platform-common:202411 to 202412 (#38) (21 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vivekrnv pushed a commit that referenced this pull request Mar 24, 2025
…tomatically (sonic-net#899)

#### Why I did it
src/sonic-linux-kernel
```
* b1aeb41 - (HEAD -> 202412, origin/HEAD, origin/202412) [code sync] Merge code from sonic-net/sonic-linux-kernel:202411 to 202412 (#38) (20 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vivekrnv pushed a commit that referenced this pull request Mar 24, 2025
…omatically (sonic-net#928)

#### Why I did it
src/sonic-swss-common
```
* 0087183 - (HEAD -> 202412, origin/HEAD, origin/202412) Merge pull request #38 from mssonicbld/sonicbld/202412-merge (35 hours ago) [mssonicbld]
* e32b71e - Merge branch '202411' of https://github.com/sonic-net/sonic-swss-common into 202412 (2 days ago) [Sonic Automation]
* 3bc4141 - [FC] remove FLEX_COUNTER_DELAY_STATUS_FIELD (sonic-net#982) (3 weeks ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vivekrnv pushed a commit that referenced this pull request Mar 24, 2025
…tically (sonic-net#937)

#### Why I did it
src/sonic-sairedis
```
* 8930167 - (HEAD -> 202412, origin/HEAD, origin/202412) [FC] Fix the update failure in switch debug counters (#38) (3 hours ago) [mssonicbld]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vivekrnv pushed a commit that referenced this pull request Apr 10, 2025
…sonic-net#22193)

#### Why I did it
src/dhcpmon
```
* 749c7e5 - (HEAD -> master, origin/master, origin/HEAD) Update DB separator for per-interface counter (#38) (23 hours ago) [Yaqiang Zhu]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vivekrnv pushed a commit that referenced this pull request Jun 6, 2025
…tically (sonic-net#22803)

#### Why I did it
src/sonic-dash-api
```
* 573485d - (HEAD -> master, origin/master, origin/HEAD) Update pipeline to use Bookworm and Ubuntu 24.04 (#38) (6 days ago) [Saikrishna Arcot]
```
#### How I did it
#### How to verify it
#### Description for the changelog
2 participants