Skip to content

100% CPU Lockup with subkey expiry on v6.0.13 #233

@markabey

Description

@markabey

Describe the bug
Same issue as #187 - still being seen on v6.0.13. Possibly the reproduce by @benschermel was a different issue.

Log Files
Nothing out of the ordinary in the log files:

1698:1698:C 14 Sep 2020 15:07:11.946 # oO0OoO0OoO0Oo KeyDB is starting oO0OoO0OoO0Oo
1698:1698:C 14 Sep 2020 15:07:11.946 # KeyDB version=6.0.13, bits=64, commit=00000000, modified=0, pid=1698, just started
1698:1698:C 14 Sep 2020 15:07:11.946 # Configuration loaded
1699:1698:M 14 Sep 2020 15:07:11.950 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
1699:1698:M 14 Sep 2020 15:07:11.950 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
1699:1698:M 14 Sep 2020 15:07:11.950 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
1699:1698:M 14 Sep 2020 15:07:11.951 * Running mode=standalone, port=6388.
1699:1698:M 14 Sep 2020 15:07:11.951 # Server initialized
1699:1698:M 14 Sep 2020 15:07:11.951 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1699:1698:M 14 Sep 2020 15:07:11.951 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with KeyDB. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. KeyDB must be restarted after THP is disabled.
1699:1698:M 14 Sep 2020 15:07:16.671 * DB loaded from disk: 4.720 seconds
1699:1870:M 14 Sep 2020 15:07:16.671   Thread 0 alive.
1699:1871:M 14 Sep 2020 15:07:16.671   Thread 1 alive.
1699:1870:M 14 Sep 2020 15:07:16.673 # NOTICE: Detuning locks due to high load per core: 123.09%

THP is set to madvise.
This log was loading a DB from disk from a redis instance with no EXPIREMEMBER.

To Reproduce
I think this is when a very high volume of EXPIREMEMBER calls arrive. Our pattern of writing is like this (output of monitor, redacted):

1600096565.995802 [6 127.0.0.1:37220] "hSET" "keyF_keyG_49346_keyB_80_1" "keyC" "1600096564.599853_keyH_(896:)"
1600096565.995828 [6 127.0.0.1:37220] "EXPIREMEMBER" "keyF_keyG_49346_keyB_80_1" "keyC" "14400"
1600096565.996765 [6 127.0.0.1:37220] "HSeT" "160009656_keyG" "keyB_49346_80_6_1600096564_FS_keyC" "1600096564.599853_1600096564.599853_2_0_0_0_0"                                                                                                                                          1600096565.996800 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyG" "14400"
1600096565.996811 [6 127.0.0.1:37220] "HINCRBY" "keyD" "0" "120"
1600096565.996820 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyI" "14400"
1600096565.996828 [6 127.0.0.1:37220] "SADD" "160009656_keyI" "keyG"
1600096565.996837 [6 127.0.0.1:37220] "ZADD" "keyE" "160009656" "160009656"
1600096565.996850 [6 127.0.0.1:37220] "hSET" "keyF_keyG_49346_keyB_80_1" "keyC" "1600096564.605171_keyH"
1600096565.996872 [6 127.0.0.1:37220] "EXPIREMEMBER" "keyF_keyG_49346_keyB_80_1" "keyC" "14400"
1600096565.997643 [6 127.0.0.1:37220] "HSeT" "160009656_keyG" "keyB_49346_80_6_1600096564_FS_keyC" "1600096564.605170_1600096564.605173_2_0_0_0_0"                                                                                                                                          1600096565.997674 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyG" "14400"
1600096565.997684 [6 127.0.0.1:37220] "HINCRBY" "keyD" "0" "120"
1600096565.997692 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyI" "14400"
1600096565.997700 [6 127.0.0.1:37220] "SADD" "160009656_keyI" "keyG"
1600096565.997709 [6 127.0.0.1:37220] "ZADD" "keyE" "160009656" "160009656"
1600096565.997722 [6 127.0.0.1:37220] "hSET" "keyF_keyG_49346_keyB_80_1" "keyC" "1600096564.605179_keyH_(896:)"
1600096565.997746 [6 127.0.0.1:37220] "EXPIREMEMBER" "keyF_keyG_49346_keyB_80_1" "keyC" "14400"                                               1600096565.998456 [6 127.0.0.1:37220] "HSeT" "160009656_keyG" "keyB_49346_80_6_1600096564_FS_keyC" "1600096564.605179_1600096564.605180_2_0_0_0_0"
1600096565.998491 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyG" "14400"                                                                       1600096565.998502 [6 127.0.0.1:37220] "HINCRBY" "keyD" "0" "120"
1600096565.998511 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyI" "14400"
1600096565.998518 [6 127.0.0.1:37220] "SADD" "160009656_keyI" "keyG"
1600096565.998526 [6 127.0.0.1:37220] "ZADD" "keyE" "160009656" "160009656"
1600096565.998539 [6 127.0.0.1:37220] "hSET" "keyF_keyG_49346_keyB_80_1" "keyC" "1600096564.605198_keyH"
1600096565.998568 [6 127.0.0.1:37220] "EXPIREMEMBER" "keyF_keyG_49346_keyB_80_1" "keyC" "14400"
1600096565.999289 [6 127.0.0.1:37220] "HSeT" "160009656_keyG" "keyB_49346_80_6_1600096564_FS_keyC" "1600096564.605197_1600096564.605199_2_0_0_0_0"
1600096565.999322 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyG" "14400"                                                                       1600096565.999333 [6 127.0.0.1:37220] "HINCRBY" "keyD" "0" "120"
1600096565.999341 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyI" "14400"
1600096565.999350 [6 127.0.0.1:37220] "SADD" "160009656_keyI" "keyG"
1600096565.999358 [6 127.0.0.1:37220] "ZADD" "keyE" "160009656" "160009656"
1600096565.999371 [6 127.0.0.1:37220] "hSET" "keyF_keyG_49346_keyB_80_1" "keyC" "1600096564.605386_keyH_(896:)"
1600096565.999391 [6 127.0.0.1:37220] "EXPIREMEMBER" "keyF_keyG_49346_keyB_80_1" "keyC" "14400"
1600096566.000089 [6 127.0.0.1:37220] "HSeT" "160009656_keyG" "keyB_49346_80_6_1600096564_FS_keyC" "1600096564.605386_1600096564.605387_2_0_0_0_0"
1600096566.000121 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyG" "14400"
1600096566.000132 [6 127.0.0.1:37220] "HINCRBY" "keyD" "0" "120"                                                                              1600096566.000140 [6 127.0.0.1:37220] "EXPIRE" "160009656_keyI" "14400"
1600096566.000151 [6 127.0.0.1:37220] "SADD" "160009656_keyI" "keyG"
1600096566.000160 [6 127.0.0.1:37220] "ZADD" "keyE" "160009656" "160009656"
1600096566.000173 [6 127.0.0.1:37220] "hSET" "keyF_keyG_49346_keyB_80_1" "keyC" "1600096564.611449_keyH"
1600096566.000195 [6 127.0.0.1:37220] "EXPIREMEMBER" "keyF_keyG_49346_keyB_80_1" "keyC" "14400"

There will be many different keys like keyF_keyG_49346_keyB_80_1.
In "normal mode" - when a key like keyF_keyG_49346_keyB_80_1 is written, an EXPIRE 14400 is set. When it is read by another app, an EXPIRE keyF_keyG_49346_keyB_80_1 120 is sent by the other application. It is then often refreshed by the EXPIRE 14400 before the 120 expiry triggers and so it doesn't actually expire and the number of sub keys keeps growing, using up more memory. This is what we hoped EXPIREMEMBER would solve.

In "EXPIREMEMBER mode" - when a key like keyF_keyG_49346_keyB_80_1 is written, an EXPIREMEMBER 14400 is set on the subkey. When it is read by another app, an EXPIRE keyF_keyG_49346_keyB_80_1 120 on the main key is still sent by the other application.

We find that when we don't send the EXPIREMEMBER commands then the CPU stays down at 7-12%, but the memory slowly fills up and the lru algorithm evicts keys. but with EXPIREMEMBER enabled, the whole instance locks up under high load. The difference is the first instance keydb is always available to serve commands (even if there is some data loss).

This is what the flamegraph looked like for a 1 minute call
image

Flame graph SVG here, and the stacks list, for viewing in flamescope
keydb-ticket-expiremember-fully-locked-stacks.gz
keydb-ticket-expiremember-fully-locked.svg.gz

For another host, which wasn't under lockout, but was using high CPU compared to redis or keydb with no EXPIREMEMBER, here are the flame graphs.

image

ticket-perf-stacks-busy-not-locked.svg.gz
ticket-perf-stacks-busy-not-locked-stacks.gz

Is this just a usecase for which EXPIREMEMBER is not suitable for?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions