log_ssd_health timeouts when executing from shell and during warm-reboot #9114
Closed
sonic-net/sonic-utilities#1904
Description
Issue introduced after sonic-net/sonic-utilities#1850
That change wrapped the smartctl call in a timeout command. However, smartctl on the host is a wrapper that calls docker exec -it pmon smartctl. In interactive mode under timeout, the command never returns, so it runs until the timeout expires. This behavior is documented in moby/moby#28207 (comment); the recommendation there is to pass --foreground to timeout, which solves the issue.
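The recommended workaround can be sketched as a small shell helper. This is only an illustration, not the actual sonic-utilities code: the function name run_with_timeout and the device path in the commented-out example are assumptions. Per the GNU timeout documentation, --foreground lets the command read from the TTY and receive TTY signals, which is what a docker exec -it wrapper needs.

```shell
#!/bin/bash
# Hypothetical helper (not taken from sonic-utilities): run a command under a
# time limit while letting it keep access to the controlling TTY.
run_with_timeout() {
    local seconds="$1"
    shift
    # --foreground lets the command read from the TTY and receive TTY signals,
    # so a TTY-attached command such as `docker exec -it ...` can complete.
    timeout --foreground "${seconds}s" "$@"
}

# Example on a SONiC host (assumed device path; commented out because it
# requires the pmon container):
# run_with_timeout 30 smartctl -a /dev/sda
```

The helper returns the wrapped command's exit status, or 124 if the time limit is reached, so callers can still tell a real timeout from a normal failure.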
In the logs below, log_ssd_health takes 30 seconds between the successive commands and times out:
Oct 20 09:19:23.134719 arc-switch1025 NOTICE admin: Collecting logs to check ssd health before fast-reboot...
Oct 20 09:19:53.154005 arc-switch1025 NOTICE admin: Stopping nat ...
Steps to reproduce the issue:
- Run log_ssd_health in bash. It does not return until the timeout expires (30 seconds).
Describe the results you received:
The command hangs
Describe the results you expected:
The command shouldn't hang
Output of show version:
show version
SONiC Software Version: SONiC.master.209-b0c73d9a7_Internal
Distribution: Debian 10.11
Kernel: 4.19.0-12-2-amd64
Build commit: b0c73d9a7
Build date: Tue Oct 26 16:48:20 UTC 2021
Built by: sw-r2d2-bot@r-build-sonic-ci03-243
Platform: x86_64-mlnx_msn2700-r0
HwSKU: Mellanox-SN2700
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2020T04244
Model Number: MSN2700-CS2FO
Hardware Revision: A2
Uptime: 05:53:42 up 1 day, 21:36, 1 user, load average: 0.70, 0.94, 1.03
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-dhcp-relay latest e2f35a076316 429MB
docker-syncd-mlnx latest 2652a4c657a9 996MB
docker-syncd-mlnx master.209-b0c73d9a7_Internal 2652a4c657a9 996MB
docker-database latest fe2aefdfb8c3 415MB
docker-database master.209-b0c73d9a7_Internal fe2aefdfb8c3 415MB
docker-snmp latest 822131202195 457MB
docker-snmp master.209-b0c73d9a7_Internal 822131202195 457MB
docker-teamd latest ea6bda719ec0 428MB
docker-teamd master.209-b0c73d9a7_Internal ea6bda719ec0 428MB
docker-nat latest 1cbc8173f2e3 430MB
docker-nat master.209-b0c73d9a7_Internal 1cbc8173f2e3 430MB
docker-router-advertiser latest ea7bb1f6d3f0 415MB
docker-router-advertiser master.209-b0c73d9a7_Internal ea7bb1f6d3f0 415MB
docker-platform-monitor latest 03027e633d3f 746MB
docker-platform-monitor master.209-b0c73d9a7_Internal 03027e633d3f 746MB
docker-macsec latest de1ea43df7cb 431MB
docker-macsec master.209-b0c73d9a7_Internal de1ea43df7cb 431MB
docker-lldp latest f13c1b763180 455MB
docker-lldp master.209-b0c73d9a7_Internal f13c1b763180 455MB
docker-orchagent latest 6849170aa3cd 446MB
docker-orchagent master.209-b0c73d9a7_Internal 6849170aa3cd 446MB
docker-sonic-telemetry latest 22a2702c6b34 504MB
docker-sonic-telemetry master.209-b0c73d9a7_Internal 22a2702c6b34 504MB
docker-sonic-mgmt-framework latest 84e0d03cae25 570MB
docker-sonic-mgmt-framework master.209-b0c73d9a7_Internal 84e0d03cae25 570MB
docker-mux latest 6e6637e4ea92 468MB
docker-mux master.209-b0c73d9a7_Internal 6e6637e4ea92 468MB
docker-fpm-frr latest a48280de2de7 446MB
docker-fpm-frr master.209-b0c73d9a7_Internal a48280de2de7 446MB
docker-sflow latest 0b2be7d286b9 428MB
docker-sflow master.209-b0c73d9a7_Internal 0b2be7d286b9 428MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/sonic-wjh 1.1.0-202106-internal-13 a1808db49408 462MB
harbor.mellanox.com/sonic/cpu-report 10.0.0 5314b41a2a5e 413MB