-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[warm-reboot] APPLY_VIEW failure in set operation "SAI_API_FDB:brcm_sai_set_fdb_entry_attribute" #8367
Copy link
Copy link
Closed
sonic-net/sonic-swss
#1866Labels
Description
Description
SYNCD and OA failures seen during APPLY_VIEW. Two issues that are brought up here:
- As part of APPLY_VIEW, SET operation
brcm_sai_set_fdb_entry_attributefailed duringasic_process_event: field: SAI_FDB_ENTRY_ATTR_ALLOW_MAC_MOVE, value: true - As a result, syncd sent
sendShutdownRequestto OA, OA also crashed because of kernel GP infreecall with errorcorrupted unsorted chunks
Steps to reproduce the issue:
- Run test_advanced_reboot::test_warm_reboot
- The test would fail with LAG flap errors.
- Check syslog to find that there were syncd and Orchagent failures.
Describe the results you received:
Describe the results you expected:
SAI REDIS:
2021-08-06.13:25:47.038362|a|APPLY_VIEW
2021-08-06.13:25:49.669961|A|SAI_STATUS_FAILURE
2021-08-06.13:25:49.670077|n|switch_shutdown_request|{"switch_id":"oid:0x21000000000000"}|
Sylog:
Aug 6 13:25:49.431633 str2-7050cx3-acs-01 NOTICE syncd#syncd: :- asicGetWithOptimizedRemoveOperations: moved 12 REMOVE operations upper in stack from total 545 operations
Aug 6 13:25:49.431670 str2-7050cx3-acs-01 NOTICE syncd#syncd: :- asicGetWithOptimizedRemoveOperations: optimizing asic remove operations took 0.000348 sec
Aug 6 13:25:49.431670 str2-7050cx3-acs-01 WARNING syncd#syncd: [none] SAI_API_HOSTIF:_brcm_sai_set_cpu_queue_shaper:13005 Set CPU Queue 0 shaping: cir 0, cbs 0
Aug 6 13:25:49.431670 str2-7050cx3-acs-01 WARNING syncd#syncd: [none] SAI_API_HOSTIF:_brcm_sai_set_cpu_queue_shaper:13005 Set CPU Queue 1 shaping: cir 0, cbs 0
Aug 6 13:25:49.434215 str2-7050cx3-acs-01 INFO syncd#syncd: [none] SAI_API_FDB:brcm_sai_set_fdb_entry_attribute:965 FDB MAC:00-06-07-08-09-0A port_tid:0xb0000006 port_type:Port vlan_id:1000 vid:0x73e8
>>>> Aug 6 13:25:49.434215 str2-7050cx3-acs-01 ERR syncd#syncd: :- asic_process_event: field: SAI_FDB_ENTRY_ATTR_ALLOW_MAC_MOVE, value: true
>>>> Aug 6 13:25:49.434215 str2-7050cx3-acs-01 ERR syncd#syncd: :- asic_process_event: failed to execute api: set, key: SAI_OBJECT_TYPE_FDB_ENTRY:{"bvid":"oid:0x2600000000063d","mac":"00:06:07:08:09:0A","switch_id":"oid:0x21000000000000"}, status: SAI_STATUS_INVALID_ATTRIBUTE_0
Aug 6 13:25:49.434344 str2-7050cx3-acs-01 NOTICE syncd#syncd: :- executeOperationsOnAsic: asic apply took 0.034392 sec
Aug 6 13:25:49.434344 str2-7050cx3-acs-01 ERR syncd#syncd: :- executeOperationsOnAsic: Error while executing asic operations, ASIC is in inconsistent state: :- asic_process_event: failed to execute api: set, key: SAI_OBJECT_TYPE_FDB_ENTRY:{"bvid":"oid:0x2600000000063d","mac":"00:06:07:08:09:0A","switch_id":"oid:0x21000000000000"}, status: SAI_STATUS_INVALID_ATTRIBUTE_0
Aug 6 13:25:49.669511 str2-7050cx3-acs-01 NOTICE syncd#syncd: :- applyView: apply took 2.625379 sec
Aug 6 13:25:49.669801 str2-7050cx3-acs-01 ERR syncd#syncd: :- run: Runtime error: :- asic_process_event: failed to execute api: set, key: SAI_OBJECT_TYPE_FDB_ENTRY:{"bvid":"oid:0x2600000000063d","mac":"00:06:07:08:09:0A","switch_id":"oid:0x21000000000000"}, status: SAI_STATUS_INVALID_ATTRIBUTE_0
Aug 6 13:25:49.669933 str2-7050cx3-acs-01 NOTICE syncd#syncd: :- sendShutdownRequest: sending switch_shutdown_request notification to OA for switch: oid:0x21000000000000
Aug 6 13:25:49.670140 str2-7050cx3-acs-01 ERR swss#orchagent: :- syncd_apply_view: Failed to notify syncd APPLY_VIEW -1
Aug 6 13:25:49.670347 str2-7050cx3-acs-01 ERR swss#orchagent: :- on_switch_shutdown_request: Syncd stopped
Aug 6 13:25:49.670537 str2-7050cx3-acs-01 NOTICE syncd#syncd: :- sendShutdownRequestAfterException: notification send successfully
Aug 6 13:25:49.670653 str2-7050cx3-acs-01 INFO swss#/supervisord: orchagent free(): corrupted unsorted chunks
>>>> Aug 6 13:25:49.672697 str2-7050cx3-acs-01 INFO kernel: [ 56.028221] traps: orchagent[3039] general protection ip:7fbf45905d57 sp:7fbf44d18710 error:0 in libc-2.28.so[7fbf458ee000+148000]
Aug 6 13:25:50.773457 str2-7050cx3-acs-01 INFO swss#supervisord 2021-08-06 13:25:50,771 INFO exited: orchagent (terminated by SIGSEGV (core dumped); not expected)
Aug 6 13:25:51.786777 str2-7050cx3-acs-01 INFO swss#/supervisor-proc-exit-listener: Process 'orchagent' exited unexpectedly. Terminating supervisor 'swss'
Aug 6 13:25:56.326707 str2-7050cx3-acs-01 NOTICE root: Stopping swss service...
BT from core:
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7fbf44d19700 (LWP 409) 0x00007fbf45905d57 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
2 Thread 0x7fbf4551dbc0 (LWP 40) 0x00007fbf459453e1 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3 Thread 0x7fbf3ffff700 (LWP 57) 0x00007fbf459c57ef in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
4 Thread 0x7fbf3f7fe700 (LWP 58) 0x00007fbf459c57ef in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
5 Thread 0x7fbf3e7fc700 (LWP 60) 0x00007fbf459c57ef in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
6 Thread 0x7fbf4551a700 (LWP 55) 0x00007fbf459c57ef in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
7 Thread 0x7fbf3effd700 (LWP 59) 0x00007fbf459ba819 in poll () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) t 1
[Switching to thread 1 (Thread 0x7fbf44d19700 (LWP 409))]
#0 0x00007fbf45905d57 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007fbf45905d57 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fbf45905eba in exit () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00005636d6bae709 in on_switch_shutdown_request () at notifications.cpp:27
#3 0x00007fbf462d92d5 in sairedis::NotificationSwitchShutdownRequest::executeCallback (this=0x7fbf38000ea0, switchNotifications=...) at NotificationSwitchShutdownRequest.cpp:49
#4 0x00007fbf462b77cf in sairedis::RedisRemoteSaiInterface::handleNotification (this=0x5636d707a560, name=..., serializedNotification="{\"switch_id\":\"oid:0x21", '0' <repeats 12 times>, "\"}",
values=std::vector of length 0, capacity 1) at /usr/include/c++/8/bits/shared_ptr_base.h:1018
#5 0x00007fbf462d7a21 in std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&)>::operator()(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::vector<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&) const (__args#2=std::vector of length 0, capacity 1, __args#1="{\"switch_id\":\"oid:0x21", '0' <repeats 12 times>, "\"}", __args#0="switch_shutdown_request", this=0x5636d7078128)
at /usr/include/c++/8/bits/std_function.h:682
#6 sairedis::RedisChannel::notificationThreadFunction (this=0x5636d7078120) at RedisChannel.cpp:118
#7 0x00007fbf45ce7b2f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007fbf4632ffa3 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#9 0x00007fbf459c54cf in clone () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) t 2
[Switching to thread 2 (Thread 0x7fbf4551dbc0 (LWP 40))]
#0 0x00007fbf459453e1 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007fbf459453e1 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fbf4594bc1a in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fbf4594d8a2 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fbf45db5beb in redisFree (c=0x5636d7063ef0) at hiredis.c:610
#4 0x00007fbf45f8a7ad in swss::Table::del (this=0x5636d7069740, key=...) at /usr/include/c++/8/bits/char_traits.h:352
#5 0x00007fbf45905eba in exit () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00005636d6b90f6b in syncd_apply_view () at main.cpp:117
#7 0x00005636d6b9bae1 in OrchDaemon::warmRestoreAndSyncUp (this=0x5636d708ed70) at orchdaemon.cpp:707
#8 0x00005636d6b9f244 in OrchDaemon::init (this=<optimized out>) at orchdaemon.cpp:529
#9 0x00005636d6b76b11 in main (argc=<optimized out>, argv=<optimized out>) at main.cpp:646
(gdb)
Output of show version:
admin@str2-7050cx3-acs-01:~$ show ver
SONiC Software Version: SONiC.master.27379-1bc3fdbf2
Distribution: Debian 10.10
Kernel: 4.19.0-12-2-amd64
Build commit: 1bc3fdbf2
Build date: Thu Aug 5 13:52:30 UTC 2021
Built by: AzDevOps@sonic-build-workers-000KJ9
Platform: x86_64-arista_7050cx3_32s
HwSKU: Arista-7050CX3-32S-C32
ASIC: broadcom
ASIC Count: 1
Serial Number: JPE20432334
Model Number: DCS-7050CX3-32S-SSD
Hardware Revision: N/A
Uptime: 22:41:09 up 4:57, 1 user, load average: 0.37, 0.27, 0.27
SAI version:
$ docker exec -it syncd dpkg -s libsaibcm | head
Package: libsaibcm
Status: install ok installed
Priority: extra
Section: libs
Installed-Size: 245525
Maintainer: Guohan Lu <[email protected]>
Architecture: amd64
Source: saibcm
Version: 5.0.0.6-1
Provides: libsai
Output of show techsupport:
(paste your output here or download and attach the file here )
Additional information you deem important (e.g. issue happens only occasionally):
Reactions are currently unavailable