Enhance dict to support arbitrary metadata carried in dictEntry by JimB123 · Pull Request #8210 · redis/redis

JimB123 · 2020-12-17T23:37:04Z

The Redis dictionary, as defined in dict.h/dict.c, is a general purpose dictionary utility that should not be modified for custom changes in Redis. The dictionary uses a dictEntry structure to carry each key/value pair.

This update allows the embedding of arbitrary metadata at the end of the dictEntry struct by allowing the allocation of a fixed amount of space at the end of each dictEntry in a given dictionary.

This is useful for modules that might like to maintain meta data along with a Redis defined structure which they can't modify. This allows the storage of the metadata without the overhead of another dictionary.

This is an optional capability specified through the dictType and will not affect performance of the dictionary. It will only affect space (memory) for dictionaries where this is explicitly enabled.

Rebase to latest OSS

rebase from redis

rebase

pull from main branch

rebase

oranagra · 2020-12-21T16:03:48Z

@JimB123 thanks for submitting this PR.

I wonder if this is a preparation to set the ground for a followup PR? i don't think we wanna merge this one on it's own (add capabilities that we're not using).
Note that the "dict" capability that redis exposes for modules is based on rax.c, not dict.c

P.S. I actually made a somewhat similar change to dict.c in the past, adding optional callbacks for allocation and release of the dictEntry. i did this for the purpose of keeping some allocation reuse pull for some case of rapid / repeated malloc and frees, but when i think of it such a change (the one i did) could serve your purpose too, and even allow the flexibility of allocating a different size of payload per entry rather than a constant tone.

madolson · 2020-12-21T22:54:31Z

@oranagra There was two things we were considering open sourcing:

A secret we haven't decided to tell the community yet. ;)
Removing the slots-keys table, and instead replacing it with a linked list of all the entries. This is specifically a memory enhancement when the size of the keys < 16 bytes (which is usually the case), since the rax can use a lot of extra memory for large keys.

oranagra · 2020-12-22T10:24:57Z

@madolson you mean when key names are > 16 bytes?
IIRC rax is in most cases more memory efficient than a dict / and certainly a list.
maybe i don't understand the case well enough, can you add more info?

anyway, how is that related to the current PR?

madolson · 2020-12-22T20:26:04Z

@oranagra Yes I meant > 16 bytes. More precisely, a linked list is more efficient when the keys (stored on average within the radix tree) consume more than 16 bytes each, since a linked list can get away with 16 bytes (2 8 byte allocations for the previous and next entry). We've had several people who use GUIDs to store data, which are stored very inefficiently in the radix tree, complain about memory. The linked list pointers can be stored in the dict-entry metadata that is outlined here, which is how it relates.

This wouldn't replace the RAX, since it will actually be more memory intensive for keys with lots of shared prefixes. It seems like a relatively small change that can optimize some workloads though.

To your point, we can just leave this PR here until we have a real implementation.

Also, the dict implementation is always less efficient than a linked list, since we need the key pointer, the entry pointer, the pointer to the entry, and the extra overhead from the buckets.

JimB123 · 2021-01-04T19:00:41Z

@oranagra @madolson Memory-wise, a doubly linked list would always be more efficient than a RAX for this case. In most cases it would be significantly more efficient.

The issue is that the RAX duplicates the key, which is already linked in the dictEntry. If there are commonalities in the keys, it will duplicate somewhat efficiently, but still, it is duplicated. In practice, this means (at minimum) an 8-byte end marker on each key and a 16-byte data node for the portion of the key which is different. This is true unless the key is the leading substring of a larger key. But even in the leading substring case, memory is wasted as each node must be rounded up to the allocation size of 8 bytes and a pointer must be added. I don't believe that it's possible to add entries to a Redis RAX using less than 16 bytes.

Simple example - "home" and "homesick":
"home" (by itself) gets stored as: "home" -> endMark.

"home" takes 16 bytes (including the pointer to the end mark)
the end marker is 8 bytes. Total = 24 bytes

Now add "homesick"
This gets stored as "home" -> "sick" -> endMark.

"home" takes 16 bytes (including the pointer to "sick")
"sick" takes 16 bytes (including the pointer to the end mark)
the end marker is 8 bytes. Total = 40 bytes for 2 overlapping entries.

This assumes 8-byte pointers. Of course if we were using 4-byte pointers, the DLL would be correspondingly smaller also. It also assumes that the RAX only carries keys (and no data) which is the case for slots_to_keys.

oranagra · 2021-01-04T19:13:25Z

There's something I'm missing here, or context not provided?
Why are we comparing rax and linked-list?
Is there any use case where these are both applicable?

I mean:
Hash and rax have sane lookup complexity, and list is just a dumb O(N) search.
Hash and rax have key names, list doesn't necessarily.

IIRC the only places here redis uses a list are when it doesn't have a key, and doesn't perform searches.

What use case are we discussing and how is it related to this PR?

madolson · 2021-01-04T20:25:34Z

@oranagra The slot->key mapping table could be re-implemented as a linked list instead of a radix tree. The linked list is built from the dictEntrys.

We only need to loop over the entire structure, so we don't need to maintain ordering. We also don't need O(1) lookups because it's being attached to the dictionary, which already provides that. It's an optimization that, as Jim points out, can save a lot of memory for use cases where the keys have very little shared prefixes.

zuiderkwast · 2021-08-11T02:08:08Z

@JimB123 @madolson did anyone start the rewrite of clusterState slots_to_keys to a linked list based on this? If not, I'll start working on that now.

madolson · 2021-08-11T03:32:11Z

@zuiderkwast go for it!

zuiderkwast · 2021-08-11T08:10:21Z

@madolson I had a late night coding session. :-) PR is up.

@JimB123

The issue is that the RAX duplicates the key, which is already linked in the dictEntry. If there are commonalities in the keys, it will duplicate somewhat efficiently, but still, it is duplicated.

Actually, this rax concatenates the slot number (two bytes) before the key, so keys with a common prefix will only share space if they also belong to the same slot.

madolson · 2021-09-20T16:28:07Z

Yeah, we can close this now that it's been merged through the entry-metadata PR.

JimB123 added 6 commits August 5, 2020 15:19

Merge pull request #1 from redis/unstable

0ee3061

Rebase to latest OSS

Merge pull request #2 from redis/unstable

77789a2

rebase from redis

Merge pull request #3 from redis/unstable

09368ce

rebase

Merge pull request #4 from redis/unstable

2e6cacb

pull from main branch

Merge pull request #5 from redis/unstable

12d9913

rebase

Enhance dict to support arbitrary metadata carried in dictEntry

6250781

madolson mentioned this pull request May 14, 2021

[NEW] Proposal for deterministic LRU in Redis #8947

Open

zuiderkwast mentioned this pull request Aug 11, 2021

Slot-to-keys using dict entry metadata #9356

Merged

soloestoy added the state:to-be-closed requesting the core team to close the issue label Sep 16, 2021

madolson closed this Sep 20, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhance dict to support arbitrary metadata carried in dictEntry#8210

Enhance dict to support arbitrary metadata carried in dictEntry#8210
JimB123 wants to merge 6 commits intoredis:unstablefrom
JimB123:dict

JimB123 commented Dec 17, 2020

Uh oh!

oranagra commented Dec 21, 2020

Uh oh!

madolson commented Dec 21, 2020 •

edited

Loading

Uh oh!

oranagra commented Dec 22, 2020

Uh oh!

madolson commented Dec 22, 2020

Uh oh!

JimB123 commented Jan 4, 2021 •

edited by madolson

Loading

Uh oh!

oranagra commented Jan 4, 2021

Uh oh!

madolson commented Jan 4, 2021 •

edited

Loading

Uh oh!

zuiderkwast commented Aug 11, 2021

Uh oh!

madolson commented Aug 11, 2021

Uh oh!

zuiderkwast commented Aug 11, 2021

Uh oh!

madolson commented Sep 20, 2021 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

JimB123 commented Dec 17, 2020

Uh oh!

oranagra commented Dec 21, 2020

Uh oh!

madolson commented Dec 21, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oranagra commented Dec 22, 2020

Uh oh!

madolson commented Dec 22, 2020

Uh oh!

JimB123 commented Jan 4, 2021 • edited by madolson Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oranagra commented Jan 4, 2021

Uh oh!

madolson commented Jan 4, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zuiderkwast commented Aug 11, 2021

Uh oh!

madolson commented Aug 11, 2021

Uh oh!

zuiderkwast commented Aug 11, 2021

Uh oh!

madolson commented Sep 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

madolson commented Dec 21, 2020 •

edited

Loading

JimB123 commented Jan 4, 2021 •

edited by madolson

Loading

madolson commented Jan 4, 2021 •

edited

Loading

madolson commented Sep 20, 2021 •

edited

Loading