log_ssd_health times out when executed from shell and during warm-reboot  #9114

@dgsudharsan

Description

This issue was introduced by sonic-net/sonic-utilities#1850, which added a timeout command around smartctl. However, smartctl on the host is a wrapper that calls docker exec -it pmon smartctl. In interactive mode the command does not return, so it runs until the timeout expires. This behavior is documented in moby/moby#28207 (comment); the recommendation there is to pass --foreground to the timeout command, which resolves the issue.
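
As a minimal sketch of the recommended change, the snippet below wraps a command with timeout --foreground. The helper name run_with_timeout and the echo stand-in are hypothetical and used only so the sketch runs without Docker; on a switch the wrapped command would be the smartctl wrapper, and the real invocation in sonic-utilities may differ.

```shell
#!/bin/bash
# Hypothetical helper: run a command under GNU timeout in foreground mode.
# --foreground lets the command read from the TTY and receive TTY signals,
# which is what an interactive wrapper such as "docker exec -it" needs.
run_with_timeout() {
    timeout --foreground "$@"
}

# Stand-in for the smartctl wrapper so the sketch is runnable anywhere.
out=$(run_with_timeout 5 echo "placeholder")
echo "$out"
```

With --foreground the command still ends with exit status 124 if the limit is reached, so callers that check for a timeout do not need to change.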

In the logs below, log_ssd_health takes 30 seconds between the two successive messages and then times out:
Oct 20 09:19:23.134719 arc-switch1025 NOTICE admin: Collecting logs to check ssd health before fast-reboot...
Oct 20 09:19:53.154005 arc-switch1025 NOTICE admin: Stopping nat ...
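
The gap between the two messages can be checked with a short sketch, assuming GNU date is available. The timestamps are copied from the log lines above and truncated to whole seconds; the variable names are illustrative only.

```shell
#!/bin/bash
# Convert both log timestamps to epoch seconds (UTC avoids DST effects).
start=$(date -u -d "Oct 20 09:19:23" +%s)
end=$(date -u -d "Oct 20 09:19:53" +%s)

# Whole-second difference between the two messages.
gap=$((end - start))
echo "gap: ${gap} seconds"
```

The result matches the 30-second timeout; the sub-second parts of the timestamps add roughly another 0.02 seconds.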

Steps to reproduce the issue:

  1. Run log_ssd_health in bash. It does not return until the 30-second timeout expires.

Describe the results you received:

The command hangs

Describe the results you expected:

The command shouldn't hang

Output of show version:

show version

SONiC Software Version: SONiC.master.209-b0c73d9a7_Internal
Distribution: Debian 10.11
Kernel: 4.19.0-12-2-amd64
Build commit: b0c73d9a7
Build date: Tue Oct 26 16:48:20 UTC 2021
Built by: sw-r2d2-bot@r-build-sonic-ci03-243

Platform: x86_64-mlnx_msn2700-r0
HwSKU: Mellanox-SN2700
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2020T04244
Model Number: MSN2700-CS2FO
Hardware Revision: A2
Uptime: 05:53:42 up 1 day, 21:36,  1 user,  load average: 0.70, 0.94, 1.03

Docker images:
REPOSITORY                                         TAG                             IMAGE ID            SIZE
docker-dhcp-relay                                  latest                          e2f35a076316        429MB
docker-syncd-mlnx                                  latest                          2652a4c657a9        996MB
docker-syncd-mlnx                                  master.209-b0c73d9a7_Internal   2652a4c657a9        996MB
docker-database                                    latest                          fe2aefdfb8c3        415MB
docker-database                                    master.209-b0c73d9a7_Internal   fe2aefdfb8c3        415MB
docker-snmp                                        latest                          822131202195        457MB
docker-snmp                                        master.209-b0c73d9a7_Internal   822131202195        457MB
docker-teamd                                       latest                          ea6bda719ec0        428MB
docker-teamd                                       master.209-b0c73d9a7_Internal   ea6bda719ec0        428MB
docker-nat                                         latest                          1cbc8173f2e3        430MB
docker-nat                                         master.209-b0c73d9a7_Internal   1cbc8173f2e3        430MB
docker-router-advertiser                           latest                          ea7bb1f6d3f0        415MB
docker-router-advertiser                           master.209-b0c73d9a7_Internal   ea7bb1f6d3f0        415MB
docker-platform-monitor                            latest                          03027e633d3f        746MB
docker-platform-monitor                            master.209-b0c73d9a7_Internal   03027e633d3f        746MB
docker-macsec                                      latest                          de1ea43df7cb        431MB
docker-macsec                                      master.209-b0c73d9a7_Internal   de1ea43df7cb        431MB
docker-lldp                                        latest                          f13c1b763180        455MB
docker-lldp                                        master.209-b0c73d9a7_Internal   f13c1b763180        455MB
docker-orchagent                                   latest                          6849170aa3cd        446MB
docker-orchagent                                   master.209-b0c73d9a7_Internal   6849170aa3cd        446MB
docker-sonic-telemetry                             latest                          22a2702c6b34        504MB
docker-sonic-telemetry                             master.209-b0c73d9a7_Internal   22a2702c6b34        504MB
docker-sonic-mgmt-framework                        latest                          84e0d03cae25        570MB
docker-sonic-mgmt-framework                        master.209-b0c73d9a7_Internal   84e0d03cae25        570MB
docker-mux                                         latest                          6e6637e4ea92        468MB
docker-mux                                         master.209-b0c73d9a7_Internal   6e6637e4ea92        468MB
docker-fpm-frr                                     latest                          a48280de2de7        446MB
docker-fpm-frr                                     master.209-b0c73d9a7_Internal   a48280de2de7        446MB
docker-sflow                                       latest                          0b2be7d286b9        428MB
docker-sflow                                       master.209-b0c73d9a7_Internal   0b2be7d286b9        428MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/sonic-wjh   1.1.0-202106-internal-13        a1808db49408        462MB
harbor.mellanox.com/sonic/cpu-report               10.0.0                          5314b41a2a5e        413MB

