[DellEmc S5232]: Orchagent crash is seen with 201911 images #4331

@chitra-raghavan

Description

Upon loading a T0 or T1 config on the S5232 platform, orchagent crashes.

Syslog:

Mar 17 05:40:02.321023 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait: No response for msg 0#015
Mar 17 05:40:02.321254 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_link_status: Led msg id 0x0 send failed, Error Code -3#015
Mar 17 05:40:03.581098 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait: No response for msg 4#015
Mar 17 05:40:03.581371 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_speed: Led msg id 0x4 send failed, Error Code -3#015
Mar 17 05:40:04.136778 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:04.250397 sonic-s5232-01 INFO lldp#supervisord: lldpd 2020-03-17T05:40:04 [WARN/lldp] too large management address received on eth0: Value too large for defined data type
Mar 17 05:40:04.847556 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait: No response for msg 0#015
Mar 17 05:40:04.847798 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_link_status: Led msg id 0x0 send failed, Error Code -3#015
Mar 17 05:40:04.883349 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:55,585 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Mar 17 05:40:04.883509 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:55,608 INFO exited: supervisor-proc-exit-listener (exit status 3; not expected)
Mar 17 05:40:04.883593 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:56,611 INFO spawned: 'supervisor-proc-exit-listener' with pid 238
Mar 17 05:40:04.883675 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:57,719 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Mar 17 05:40:04.883755 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:57,740 INFO exited: supervisor-proc-exit-listener (exit status 3; not expected)
Mar 17 05:40:04.883835 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:58,743 INFO spawned: 'supervisor-proc-exit-listener' with pid 240
Mar 17 05:40:04.883915 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:59,853 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Mar 17 05:40:04.883994 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:39:59,879 INFO exited: supervisor-proc-exit-listener (exit status 3; not expected)
Mar 17 05:40:04.884073 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:40:00,882 INFO spawned: 'supervisor-proc-exit-listener' with pid 242
Mar 17 05:40:04.884152 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:40:01,990 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Mar 17 05:40:04.884232 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:40:02,015 INFO exited: supervisor-proc-exit-listener (exit status 3; not expected)
Mar 17 05:40:04.884310 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:40:03,018 INFO spawned: 'supervisor-proc-exit-listener' with pid 244
Mar 17 05:40:04.884393 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:40:04,128 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Mar 17 05:40:04.884474 sonic-s5232-01 INFO nat#supervisord 2020-03-17 05:40:04,150 INFO exited: supervisor-proc-exit-listener (exit status 3; not expected)
Mar 17 05:40:06.113434 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait: No response for msg 4#015
Mar 17 05:40:06.113715 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_speed: Led msg id 0x4 send failed, Error Code -3#015
Mar 17 05:40:06.270813 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:07.379209 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait: No response for msg 0#015
Mar 17 05:40:07.379470 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_link_status: Led msg id 0x0 send failed, Error Code -3#015
Mar 17 05:40:08.405273 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:08.641344 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait:
Mar 17 05:40:08.641703 sonic INFO syncd#supervisord: syncd No response for msg 4
Mar 17 05:40:08.642012 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:08.642330 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_speed:
Mar 17 05:40:08.642641 sonic INFO syncd#supervisord: syncd Led msg id 0x4 send failed, Error Code -3
Mar 17 05:40:08.642945 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:09.905710 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait:
Mar 17 05:40:09.905930 sonic INFO syncd#supervisord: syncd No response for msg 0
Mar 17 05:40:09.906067 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:09.906338 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_link_status:
Mar 17 05:40:09.906589 sonic INFO syncd#supervisord: syncd Led msg id 0x0 send failed, Error Code -3
Mar 17 05:40:09.906907 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:10.538647 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:11.166551 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait:
Mar 17 05:40:11.166756 sonic INFO syncd#supervisord: syncd No response for msg 4
Mar 17 05:40:11.167019 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:11.167273 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_speed:
Mar 17 05:40:11.167524 sonic INFO syncd#supervisord: syncd Led msg id 0x4 send failed, Error Code -3                                             
Mar 17 05:40:12.358639 sonic-s5232-01 ERR monit[549]: 'telemetry' process is not running
Mar 17 05:40:12.430572 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait:
Mar 17 05:40:12.430787 sonic INFO syncd#supervisord: syncd No response for msg 0
Mar 17 05:40:12.431000 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:12.431254 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_link_status:
Mar 17 05:40:12.431504 sonic INFO syncd#supervisord: syncd Led msg id 0x0 send failed, Error Code -3
Mar 17 05:40:12.431746 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:12.675248 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:13.694066 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait:
Mar 17 05:40:13.694275 sonic INFO syncd#supervisord: syncd No response for msg 4
Mar 17 05:40:13.694541 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:13.694797 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_speed:
Mar 17 05:40:13.695042 sonic INFO syncd#supervisord: syncd Led msg id 0x4 send failed, Error Code -3
Mar 17 05:40:13.695284 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:14.810732 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for contain
Mar 17 05:40:14.958671 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:16.218397 sonic INFO syncd#supervisord: syncd 0:soc_iproc_data_send_wait:
Mar 17 05:40:16.218604 sonic INFO syncd#supervisord: syncd No response for msg 4
Mar 17 05:40:16.218859 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:16.219111 sonic INFO syncd#supervisord: syncd 0:soc_cmicx_led_speed:
Mar 17 05:40:16.219360 sonic INFO syncd#supervisord: syncd Led msg id 0x4 send failed, Error Code -3
Mar 17 05:40:16.219601 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:16.943689 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:17.160159 sonic ERR swss#orchagent: :- transfer_attributes: src vs dst attr id don't match GET mismatch
Mar 17 05:40:17.160328 sonic INFO swss#supervisord: orchagent terminate called after throwing an instance of 'std::runtime_error'
Mar 17 05:40:17.160496 sonic INFO swss#supervisord: orchagent   what():  :- transfer_attributes: src vs dst attr id don't match GET mismatch
Mar 17 05:40:19.080793 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:23.351490 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: message repeated 2 times: [ Unable to retrieve features for container 'nat'. Exiting...]
Mar 17 05:40:24.262483 sonic INFO swss#supervisord 2020-03-17 05:40:17,363 INFO exited: orchagent (terminated by SIGABRT (core dumped); not expected)
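Most of the excerpt above is syncd LED messages and nat container restarts; only a handful of lines concern orchagent. The following is a minimal sketch, not part of the original report, of how those lines could be isolated from a saved syslog. The function name `orchagent_lines` and the two patterns are assumptions chosen for illustration, and `SAMPLE` simply reuses a few lines from the log above.

```python
import re

# Assumed patterns: lines logged by orchagent itself, and supervisord lines
# in the swss container that mention orchagent.
CRASH_PATTERNS = (
    re.compile(r"swss#orchagent"),
    re.compile(r"swss#supervisord.*orchagent"),
)

# A few lines copied from the syslog excerpt in this report.
SAMPLE = """\
Mar 17 05:40:16.219601 sonic INFO syncd#supervisord: syncd #015
Mar 17 05:40:16.943689 sonic-s5232-01 ERR nat#supervisor-proc-exit-listener: Unable to retrieve features for container 'nat'. Exiting...
Mar 17 05:40:17.160159 sonic ERR swss#orchagent: :- transfer_attributes: src vs dst attr id don't match GET mismatch
Mar 17 05:40:17.160328 sonic INFO swss#supervisord: orchagent terminate called after throwing an instance of 'std::runtime_error'
Mar 17 05:40:17.160496 sonic INFO swss#supervisord: orchagent   what():  :- transfer_attributes: src vs dst attr id don't match GET mismatch
Mar 17 05:40:24.262483 sonic INFO swss#supervisord 2020-03-17 05:40:17,363 INFO exited: orchagent (terminated by SIGABRT (core dumped); not expected)
"""


def orchagent_lines(syslog_text):
    """Return the syslog lines emitted by, or about, orchagent."""
    return [
        line
        for line in syslog_text.splitlines()
        if any(pattern.search(line) for pattern in CRASH_PATTERNS)
    ]


if __name__ == "__main__":
    for line in orchagent_lines(SAMPLE):
        print(line)
```

Run against the full log, this should leave the `transfer_attributes` error, the `std::runtime_error` termination, and the SIGABRT exit, which are the lines relevant to the crash.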

Steps to reproduce the issue:

  1. Load a T0 or T1 config on a DellEmc S5232 running a 201911 or latest master image and observe the orchagent crash.

Describe the results you received:

Describe the results you expected:

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**

 
SONiC Software Version: SONiC.HEAD.31-055d7cd1
Distribution: Debian 9.12
Kernel: 4.9.0-11-2-amd64
Build commit: 055d7cd1
Build date: Mon Mar 16 20:46:35 UTC 2020
Built by: johnar@jenkins-worker-7
 
Platform: x86_64-dellemc_s5232f_c3538-r0
HwSKU: DellEMC-S5232f-C32
ASIC: broadcom
Serial Number: CN01WJVTCES0094Q0019
Uptime: 07:01:03 up  1:31,  1 user,  load average: 0.46, 0.51, 0.59
 
Docker images:
REPOSITORY                    TAG                 IMAGE ID            SIZE
docker-syncd-brcm             HEAD.31-055d7cd1    31fb4b7a906b        430MB
docker-syncd-brcm             latest              31fb4b7a906b        430MB
docker-platform-monitor       HEAD.31-055d7cd1    035a869d4325        334MB
docker-platform-monitor       latest              035a869d4325        334MB
docker-router-advertiser      HEAD.31-055d7cd1    6d0a964549c5        283MB
docker-router-advertiser      latest              6d0a964549c5        283MB
docker-fpm-frr                HEAD.31-055d7cd1    579a525fa8a3        327MB
docker-fpm-frr                latest              579a525fa8a3        327MB
docker-sflow                  HEAD.31-055d7cd1    8dfcf4fddee0        307MB
docker-sflow                  latest              8dfcf4fddee0        307MB
docker-lldp-sv2               HEAD.31-055d7cd1    8d09bae23cb9        304MB
docker-lldp-sv2               latest              8d09bae23cb9        304MB
docker-orchagent              HEAD.31-055d7cd1    e630365d8d1f        325MB
docker-orchagent              latest              e630365d8d1f        325MB
docker-dhcp-relay             HEAD.31-055d7cd1    6972cb6f2aa4        293MB
docker-dhcp-relay             latest              6972cb6f2aa4        293MB
docker-database               HEAD.31-055d7cd1    91b6facd8e22        283MB
docker-database               latest              91b6facd8e22        283MB
docker-snmp-sv2               HEAD.31-055d7cd1    727833344fd0        340MB
docker-snmp-sv2               latest              727833344fd0        340MB
docker-teamd                  HEAD.31-055d7cd1    7c049bb252d2        307MB
docker-teamd                  latest              7c049bb252d2        307MB
docker-nat                    HEAD.31-055d7cd1    c28ac72de986        309MB
docker-sonic-mgmt-framework   HEAD.31-055d7cd1    c352fed38869        420MB
docker-sonic-mgmt-framework   latest              c352fed38869        420MB
docker-sonic-telemetry        HEAD.31-055d7cd1    633c6375f4e8        344MB
docker-sonic-telemetry        latest              633c6375f4e8        3

Coredump:

root@sonic:/# gdb /usr/bin/orchagent orchagent.1584445346.126.core
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/orchagent...(no debugging symbols found)...done.
[New LWP 126]
[New LWP 133]
[New LWP 132]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/orchagent -d /var/log/swss -b 8192 -m 3c:2c:30:6d:7e:80'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fb7852bdfff in raise () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fb78766b6c0 (LWP 126))]
(gdb) bt
#0 0x00007fb7852bdfff in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fb7852bf42a in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fb785bd60ad in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3 0x00007fb785bd4066 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007fb785bd40b1 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007fb785bd42c9 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fb7868d7906 in swss::Logger::wthrow(swss::Logger::Priority, char const*, ...) () from /usr/lib/x86_64-linux-gnu/libswsscommon.so.0
#7 0x00007fb78631bbe0 in void sai_deserialize_number<int>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int&, bool) ()
from /usr/lib/x86_64-linux-gnu/libsaimetadata.so.0
#8 0x00007fb78630d1c4 in sai_deserialize_enum(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, _sai_enum_metadata_t const*, int&) ()
from /usr/lib/x86_64-linux-gnu/libsaimetadata.so.0
#9 0x00007fb78631645f in sai_deserialize_attr_value(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, _sai_attr_metadata_t const&, _sai_attribute_t&, bool) ()
from /usr/lib/x86_64-linux-gnu/libsaimetadata.so.0
#10 0x00007fb786307887 in SaiAttributeList::SaiAttributeList(_sai_object_type_t, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, bool) () from /usr/lib/x86_64-linux-gnu/libsaimetadata.so.0
#11 0x00007fb786b6fdd0 in internal_redis_get_process(_sai_object_type_t, unsigned int, _sai_attribute_t*, std::tuple<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > >&) () from /usr/lib/x86_64-linux-gnu/libsairedis.so.0
#12 0x00007fb786b70a79 in internal_redis_generic_get(_sai_object_type_t, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, _sai_attribute_t*) ()
from /usr/lib/x86_64-linux-gnu/libsairedis.so.0
#13 0x00007fb786b710f0 in redis_generic_get(_sai_object_type_t, unsigned long, unsigned int, _sai_attribute_t*) () from /usr/lib/x86_64-linux-gnu/libsairedis.so.0
#14 0x00007fb786304cee in meta_sai_get_oid(_sai_object_type_t, unsigned long, unsigned int, _sai_attribute_t*, int (*)(_sai_object_type_t, unsigned long, unsigned int, _sai_attribute_t*)) ()
from /usr/lib/x86_64-linux-gnu/libsaimetadata.so.0
#15 0x00007fb786b5ce92 in redis_get_port_attribute(unsigned long, unsigned int, _sai_attribute_t*) () from /usr/lib/x86_64-linux-gnu/libsairedis.so.0
#16 0x000055cacc19a9eb in ?? ()
#17 0x000055cacc19afd7 in ?? ()
#18 0x000055cacc19c859 in ?? ()
#19 0x000055cacc19b337 in ?? ()
#20 0x000055cacc1976df in ?? ()
#21 0x000055cacc137b6a in ?? ()
#22 0x000055cacc124e5f in ?? ()
#23 0x00007fb7852ab2e1 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#24 0x000055cacc1353ea in ?? ()
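To see at a glance which library each frame belongs to, the backtrace can be reduced to frame number and shared object. The sketch below is an illustration only and not part of the original report: `frame_libraries`, the two regular expressions, and `SAMPLE_BT` (a few frames copied from the backtrace above) are assumptions. It also handles the case seen here where gdb wraps the `from <library>` part onto the following line.

```python
import re

# Assumed frame format: "#<n> 0x<addr> in <symbol> ... [from <library>]".
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+ in ")
LIB_RE = re.compile(r"from (\S+)$")

# A few frames copied from the backtrace in this report.
SAMPLE_BT = """\
#5 0x00007fb785bd42c9 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00007fb7868d7906 in swss::Logger::wthrow(swss::Logger::Priority, char const*, ...) () from /usr/lib/x86_64-linux-gnu/libswsscommon.so.0
#7 0x00007fb78631bbe0 in void sai_deserialize_number<int>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int&, bool) ()
from /usr/lib/x86_64-linux-gnu/libsaimetadata.so.0
#13 0x00007fb786b710f0 in redis_generic_get(_sai_object_type_t, unsigned long, unsigned int, _sai_attribute_t*) () from /usr/lib/x86_64-linux-gnu/libsairedis.so.0
#15 0x00007fb786b5ce92 in redis_get_port_attribute(unsigned long, unsigned int, _sai_attribute_t*) () from /usr/lib/x86_64-linux-gnu/libsairedis.so.0
#16 0x000055cacc19a9eb in ?? ()
"""


def frame_libraries(backtrace_text):
    """Return (frame_number, library_basename) pairs; "??" if gdb named none."""
    frames = []  # each entry is [frame_number, library_path_or_None]
    for raw in backtrace_text.splitlines():
        line = raw.strip()
        match = FRAME_RE.match(line)
        if match:
            lib = LIB_RE.search(line)
            frames.append([int(match.group(1)), lib.group(1) if lib else None])
        elif line.startswith("from ") and frames and frames[-1][1] is None:
            # Continuation line: the library for the previous frame.
            frames[-1][1] = line[len("from "):].strip()
    return [(n, lib.rsplit("/", 1)[-1] if lib else "??") for n, lib in frames]


if __name__ == "__main__":
    for number, library in frame_libraries(SAMPLE_BT):
        print(f"#{number:<3} {library}")
```

Applied to the full backtrace, this shows the exception thrown from libswsscommon while libsaimetadata deserializes the reply to a port attribute GET issued through libsairedis. The frames reported as `??` are the ones gdb could not symbolize, since the orchagent binary was loaded without debugging symbols.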
