-
Notifications
You must be signed in to change notification settings - Fork 24.6k
[NEW] Optimize expire #9480
Description
The problem/use-case that the feature addresses
Ideas for optimizing expireIfNeeded() to require fewer cold memory accesses per key.
Description of the feature
There are many memory accesses involved in a key lookup and a number of these are for expireIfNeeded().
Recent optimizations have thrown out the parallel structure for slot-to-key mapping using dictEntry metadata. Another optimization is embedding the key in dictEntry. Something could be done for expire too, but it needs design. There some ideas below.
First, let's look at the current situation. Whenever a key is looked up in the database with a non-empty expire dict, this is what happens, with memory accesses listed:
expireIfNeeded()- The key is hashed.
- The expire dict ht is accessed by index.
- The expire dictEntry is accessed.
- The key is accessed and compared to the requested key. (This key is shared with the db dict.)
- The TTL value is embedded in the dictEntry, so no extra memory lookup for this one.
lookupKey()- The key is hashed again.
- The db dict ht is accessed by index.
- The db dictEntry is accessed. (If the key is embedded in the dictEntry, the allocation might already be in cache.)
- The key is accessed and compared to the requested key. (This allocation is already in cache since the access for expire.)
- The value is accessed.
Alternatives you've considered
-
Idea 1: Store an 'expire' bit in the main db dict.
Use one bit in the value robj or in the db dictEntry to flag the existence of TTL for a key. Only if this bit is set, the expire dict needs to be looked up. This means that main db is looked up before the expire dict. Eviction works as before.
Benefit: Lookup keys without TTL will use 1 memory access less. Fairly simple to implement.
-
Idea 2: Get rid of the 'expire' dict altogether.
Store the TTL in the dictEntry of the main db entry. To be able to sample the keys with TTL, we form a linked list of dictEntries with a TTL using dictEntry metadata (similar to the new slot-to-key structure). The list contains all keys with a TTL and is not sorted. For maxmemory evictions, instead of sampling random entries in the expire dict, we do a random walk along this list. One bit in the dictEntry is needed to flag the existence of the extra fields in the dictEntry metadata (TTL and
nextWithTTL/prevWithTTLpointers).Benefit: Lookup of keys with TTL will use 2 memory accesses less. Keys without TTL will use 1 memory access less. Fewer allocations in total (e.g. the large expire dict ht is not needed anymore).
This idea was proposed by @ShooterIT. The idea came up in a comment in Slot-to-keys using dict entry metadata #9356 (comment).
Additional information
Prototyping and benchmarks TBD.