Skip to content

Improved memory efficiency for small sets #11155

@madolson

Description

@madolson

Right now sets only have two encodings, intset and hashmap. The the intset is a fairly memory dense structure, but only supports integers. The hashmap alternative however has 32 bytes of overhead per entry, as well as more overhead for the dictionary itself. This means that for small sets, you are paying a lot of memory for overhead. Compared to other collections, like hash or sorted sets, the set has no compact memory representation for when the set is small.

The proposed solution here is to support a listpack encoding for sets. When a set is below some number of items or value size, the entries will just be stored as a lispack of sequential items. Once the threshold is breached, the listpack will be expanded into either an intset or a hashmap depending on the content.

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions