syncd crashed due to db memory limit? #176

@lguohan

I think the cause is that orchagent generates too much route churn during the ECMP convergence period, which causes the database to reach its limit.

We need to solve two issues:

  1. Raise the database limit.
  2. Decrease the route churn in orchagent to reduce the number of route updates during ECMP convergence.
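
One way to picture the second item is to coalesce pending updates so that only the latest operation per prefix reaches the database. The sketch below is illustrative only and is not orchagent's code: the `coalesce` function, the `(op, prefix, nexthops)` tuple layout, and the `SET`/`DEL` labels are all assumptions made for the example.

```python
from collections import OrderedDict


def coalesce(updates):
    """Collapse a stream of (op, prefix, nexthops) updates to the last one per prefix.

    Hypothetical helper: intermediate states of a prefix that changes several
    times within one batch are dropped, so fewer entries are queued.
    """
    pending = OrderedDict()
    for op, prefix, nexthops in updates:
        # Remove any older entry so the newest update takes the latest position.
        pending.pop(prefix, None)
        pending[prefix] = (op, nexthops)
    return [(op, prefix, nexthops) for prefix, (op, nexthops) in pending.items()]


# Example batch: one prefix flaps three times while ECMP members converge.
batch = [
    ("SET", "10.0.0.0/24", ("nh1", "nh2")),
    ("SET", "10.0.0.0/24", ("nh1",)),
    ("SET", "10.0.1.0/24", ("nh1", "nh2")),
    ("DEL", "10.0.0.0/24", ()),
]
result = coalesce(batch)
```

Here four queued updates shrink to two. A real implementation would also have to decide how long to hold a batch before flushing it, which this sketch does not address.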
Apr 22 10:31:02 str-msn2700-05 INFO database.sh[1124]: 1:M 22 Apr 10:31:02.156 # Client id=14 addr=/var/run/redis/redis.sock:0 fd=10 name= age=101 idle=85 flags=U db=0 sub=0 psub=1 multi=-1 qbuf=0 qbuf-free=0 obl=16327 oll=1536 omem=28700960 events=rw cmd=psubscribe scheduled to be closed ASAP for overcoming of output buffer limits.
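
To see which limit this client may have hit, the log line can be parsed and its `omem` compared against the pubsub output buffer limits. This is a rough sketch that assumes Redis's stock default `client-output-buffer-limit pubsub 32mb 8mb 60`; the redis.conf on this box may differ. The `parse_client_fields` and `diagnose` helpers are hypothetical names for the example, and the log line is shown with its wrapped pieces joined.

```python
import re

# Assumed limits: Redis's stock pubsub defaults (32mb hard, 8mb soft for 60s).
# The deployed redis.conf may use different values.
HARD_LIMIT = 32 * 1024 * 1024
SOFT_LIMIT = 8 * 1024 * 1024

LOG_LINE = (
    "Client id=14 addr=/var/run/redis/redis.sock:0 fd=10 name= age=101 idle=85 "
    "flags=U db=0 sub=0 psub=1 multi=-1 qbuf=0 qbuf-free=0 obl=16327 oll=1536 "
    "omem=28700960 events=rw cmd=psubscribe scheduled to be closed ASAP "
    "for overcoming of output buffer limits."
)


def parse_client_fields(line):
    """Return the key=value fields of a Redis client log line as a dict of strings."""
    return dict(re.findall(r"(\w[\w-]*)=(\S*)", line))


def diagnose(line):
    """Classify the client's output buffer usage against the assumed limits."""
    omem = int(parse_client_fields(line)["omem"])
    if omem >= HARD_LIMIT:
        return "hard limit exceeded"
    if omem >= SOFT_LIMIT:
        return "soft limit exceeded"
    return "within limits"


fields = parse_client_fields(LOG_LINE)
verdict = diagnose(LOG_LINE)
```

Under those assumed defaults, about 27 MiB of pending output is above the soft limit but below the hard limit, which would point at the soft limit being exceeded for the full 60 seconds. The log itself does not say which limit triggered the disconnect.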
127.0.0.1:6379[1]> KEYS ASIC_STATE_*
1) "ASIC_STATE_VALUE_QUEUE"
2) "ASIC_STATE_OP_QUEUE"
3) "ASIC_STATE_KEY_QUEUE"
127.0.0.1:6379[1]> LLEN "ASIC_STATE_VALUE_QUEUE"
(integer) 195587
127.0.0.1:6379[1]> LLEN "ASIC_STATE_OP_QUEUE"
(integer) 195587
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/syncd -p /tmp/sai.profile'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f7aa06ab067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007f7aa06ab067 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f7aa06ac448 in __GI_abort () at abort.c:89
#2  0x00007f7aa0f98b3d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007f7aa0f96bb6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007f7aa0f96c01 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007f7aa0f96e19 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007f7aa1ac90d7 in swss::RedisSelect::readMe() () from /usr/lib/x86_64-linux-gnu/libswsscommon.so.0
#7  0x00007f7aa1ac5327 in swss::Select::select(swss::Selectable**, int*, unsigned int) ()
   from /usr/lib/x86_64-linux-gnu/libswsscommon.so.0
#8  0x0000000000407abf in ?? ()
#9  0x00007f7aa0697b45 in __libc_start_main (main=0x407370, argc=3, argv=0x7fff39375ba8, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff39375b98) at libc-start.c:287
#10 0x0000000000408d7d in ?? ()
