Skip to content

Replace all usage of ziplist with listpack for t_zset#9366

Merged
oranagra merged 16 commits intoredis:unstablefrom
sundb:listpack-migration-zset
Sep 9, 2021
Merged

Replace all usage of ziplist with listpack for t_zset#9366
oranagra merged 16 commits intoredis:unstablefrom
sundb:listpack-migration-zset

Conversation

@sundb
Copy link
Collaborator

@sundb sundb commented Aug 12, 2021

Part two of implementing #8702 (zset), after #8887.

Description of the feature

Replaced all uses of ziplist with listpack in t_zset, and optimized some of the code to optimize performance.

Rdb format changes

New RDB_TYPE_ZSET_LISTPACK rdb type.

Rdb loading improvements:

  1. Pre-expansion of dict for validation of duplicate data for listpack and ziplist.
  2. Simplifying the release of empty key objects when RDB loading.
  3. Unify ziplist and listpack data verify methods for zset and hash, and move code to rdb.c.

Interface changes

  1. New zset-max-listpack-entries config is an alias for zset-max-ziplist-entries (same with zset-max-listpack-value).
  2. OBJECT ENCODING will return listpack instead of ziplist.

Listpack improvements:

  1. Add lpDeleteRange and lpDeleteRangeWithEntry functions to delete a range of entries from listpack.
  2. Improve the performance of lpCompare, converting from string to integer is faster than converting from integer to string.
  3. Replace snprintf with ll2string to improve performance in converting numbers to strings in lpGet().

Zset improvements:

  1. Improve the performance of zzlFind method, use lpFind instead of lpCompare in a loop.
  2. Use lpDeleteRangeWithEntry instead of lpDelete twice to delete a element of zset.

Tests

  1. Add some unittests for lpDeleteRange and lpDeleteRangeWithEntry function.
  2. Add zset RDB loading test.
  3. Add benchmark test for lpCompare and ziplsitCompare.
  4. Add empty listpack zset corrupt dump test.

@sundb sundb force-pushed the listpack-migration-zset branch from d33a9c8 to c77f56e Compare August 12, 2021 12:29
@sundb sundb force-pushed the listpack-migration-zset branch from c77f56e to 91213f5 Compare August 12, 2021 12:54
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few minor comments and suggestions..

src/listpack.c Outdated
Comment on lines 1007 to 1010
while (i--) {
tail = lpNext(lp, tail);
assert(p != NULL);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

despite the fact we have a previous check if num is too big, i think we still need to handle a case we reached to the end of the listpack in the middle of that loop (lpLength may be unreliable).
in which case we'll also need to update num since it's used below.
also, why change long to int ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, I forgot to corrupt data.
long to int was my mistake.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, This piece of code is obsolete.

src/rdb.c Outdated
if (deep_integrity_validation) server.stat_dump_payload_sanitizations++;
if (!zsetZiplistValidateIntegrity(encoded, encoded_len, deep_integrity_validation)) {
rdbReportCorruptRDB("Zset ziplist integrity check failed.");
if (!listpackValidateIntegrity(encoded, encoded_len, deep_integrity_validation)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't use the plain listpackValidateIntegrity, that one doesn't check for duplicate records.

Copy link
Collaborator Author

@sundb sundb Aug 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my naming caused your misunderstanding(lpValidateIntegrity is in listpack.c), perhaps it should be changed to hashAndZsetZiplistValidateIntegrity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i've been mixing lpValidateIntegrity and listpackValidateIntegrity. you must admit that it's confusing 8-)
so you renamed and moved hashListpackValidateIntegrity to serve as common code.
in the past, with ziplists, i've let hash and ziplist have separate integrity validation, just to let them be independent.

our options:

  1. rename your new listpackValidateIntegrity, either to lpPaisValidateIntegrityAndDups or something alike (it's not a generic listpack validation)
  2. move it back to hash and zset to be independently (like before).

i'm ok with option 1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer 1, because I hate repetitive code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, change to lpPairsValidateIntegrityAndDups.

src/t_zset.c Outdated
Comment on lines 1028 to 1030
/* TODO: add function to ziplist API to delete N elements from offset. */
zl = ziplistDelete(zl,&p);
zl = ziplistDelete(zl,&p);
zl = lpDelete(zl,p,&p);
zl = lpDelete(zl,p,&p);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can now use the new method you created to implement that TODO

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about it, but lpDeleteRange uses index for deletion, and I actually wanted to write lpDeleteEntryRange, but I didn't like the name, need help.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, Change all similar code for deleting 2 elements.

Comment on lines 175 to +176
verify_log_message 0 "*skipping empty key: zset_ziplist*" 0
verify_log_message 0 "*empty keys skipped: 8*" 0
verify_log_message 0 "*skipping empty key: zset_listpack*" 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there really both types of encoded zsets in the same rdb file? how did you generate it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used debug populate to generate it, and modified the saved code.

        // zset ziplist
        robj *o = createObject(OBJ_ZSET,ziplistNew());
        o->encoding = OBJ_ENCODING_LISTPACK;
        dbAdd(c->db,createStringObject("zset_ziplist",12),o);

        // zset ziplist
        dbAdd(c->db,createStringObject("zset_listpack",13),createZsetListpackObject());

@sundb sundb marked this pull request as draft August 13, 2021 17:27
@sundb sundb force-pushed the listpack-migration-zset branch from 3184d20 to 34a9508 Compare August 16, 2021 03:16
@sundb sundb force-pushed the listpack-migration-zset branch from 34a9508 to 6a45efa Compare August 16, 2021 03:20
@sundb
Copy link
Collaborator Author

sundb commented Aug 16, 2021

Following are the benchmarking results for loading rdb under different scenarios.
The result is the same as #8887 (comment), i.e. the loading speed depends only on the number of ziplists and is independent of the size of the ziplist entry.

key num entries num of one element rdb loading time without convert rdb loading time with convert
500000 128 6.316s 18.164s
500000 64 3.745s 9.572s
500000 32 2.057s 5.293s
1000000 128 12.498s 32.786s
1000000 64 6.535s 17.399s
1000000 32 3.533s 8.814s

@oranagra
Copy link
Member

IIRC for hashes the factor was about 1.5x, and here the factor seems like 3x.
am i right? have a clue why such a difference?

@sundb
Copy link
Collaborator Author

sundb commented Aug 16, 2021

IIRC for hashes the factor was about 1.5x, and here the factor seems like 3x.
am i right? have a clue why such a difference?

Ohh, My mistake, `entries num of one element` is actually the entry number of zset,
not the ziplist, so I'll have to double check.

@sundb
Copy link
Collaborator Author

sundb commented Aug 16, 2021

zset benchmark test. config: `zset-max-ziplist-entries 999999` command: `./src/redis-benchmark -t zadd,zpopmin -n 50000 -r 100000000`

@oranagra I was sceptical about the test results, but I checked the code several times and found no problems, the improvement in the zpopmin test was mainly due to the zzlDelete change, I modified your suggestion to add lpDeleteRangeWithEntry, I wonder if I made a mistake in the code.

  1. listack

zadd:

Summary:
  throughput summary: 2568.58 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
       19.347     0.136    19.167    37.055    44.895    47.871

zpopmin:

Summary:
  throughput summary: 54406.96 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.749     0.064     0.255     1.983     2.263     2.815
  1. ziplist

zadd:

Summary:
  throughput summary: 1201.63 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
       41.463     0.184    34.879    88.575   103.615   167.423

zpopmin:

Summary:
  throughput summary: 5523.03 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        8.943     0.088     2.183    26.751    37.023    90.431

There is indeed a bug, back to my words.

@sundb sundb marked this pull request as ready for review August 16, 2021 11:16
@sundb
Copy link
Collaborator Author

sundb commented Aug 16, 2021

The reason for the error in the previous benchmark (#9366 (comment)) is that when the listpack length exceeds UINT16_MAX, lpDeleteRangeWithEntry method call lpSetNumElements ignores the maximum length of listpack.

zset benchmark test.
config: zset-max-ziplist-entries 999999
command: ./src/redis-benchmark -t zadd,zpopmin -n 30000 -r 100000000

Note: The reason for setting -n 30000 is to avoid exceeding the maximum listpack length(65535), which will result in unrealistic benchmark results.

  1. listack

zadd:

Summary:
  throughput summary: 4033.34 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
       12.276     0.176    12.087    23.535    24.815    26.703

zpopmin:

Summary:
  throughput summary: 58708.42 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        0.748     0.056     0.751     1.295     1.575     2.439
  1. ziplist

zadd:

Summary:
  throughput summary: 3012.96 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
       16.470     0.152    16.295    30.847    33.983   111.295

zpopmin:

Summary:
  throughput summary: 37593.98 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
        1.229     0.072     1.231     2.223     2.431     3.127

@sundb
Copy link
Collaborator Author

sundb commented Aug 17, 2021

@oranagra In #9366 (comment).
The reason for the slow conversion is that we have deep sanitization enabled by default, which causes the hash to be rehashed several times during check dup ziplist.
I modified the ziplist and listpack validation callbacks to use head count for dict expand.
The loading speed dropped from 18s seconds to 14s.

@sundb sundb force-pushed the listpack-migration-zset branch from 089ce2d to 05d6358 Compare August 17, 2021 11:36
@sundb sundb force-pushed the listpack-migration-zset branch from 05d6358 to 6bac662 Compare August 17, 2021 13:23
src/listpack.c Outdated
Comment on lines 1029 to 1032
if (numele == LP_HDR_NUMELE_UNKNOWN) {
/* If the listpack length cannot be obtained in constant time,
* using lpDeleteRangeWithEntry will be much faster. */
lp = lpDeleteRangeWithEntry(lp, &p, num);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't understand this. it looks like the alternative is doing the same (also using lpDeleteRangeWithEntry)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but if we do that, we can't use lpGetNumElements, we need to use lpLength, which will traverse the entire listpack when listpack length > UINT16_MAX. Instead, we use lpDeleteRangeWithEntry to delete the length, so it only traverses to the index.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i tihnk i see what you mean.
the code below has an "optimization" to just move the EOF marker, and that optimization can't be used without knowing the real length (which in this case would require calling lpLength).
we rather not use lpLength, so instead in this case we skip the EOF "optimization" and just fall back to the normal path.

the current code and comment is confusing because it mentions we can do something faster, but the thing it's faster from, isn't at all there..
i suggest to refactor that code and improve the comment.

it can be something like:

/* If we know we're gonna delete beyond the end of the listpack, we can just move
 * the EOF marker, and there's no need to iterate though the entries.
 * but if we can't be sure how many entries there are, we rather avoid calling lpLength
 * since that means an additional iteration on all elements. */
if (we know the real length, and we know we're deleting beyond the range) {
    do the optimal thing and move the EOF
} else {
    lpDeleteRangeWithEntry
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@oranagra
Copy link
Member

@sundb so you're saying that now a loading that used to take 6 seconds takes 14 (and before that fix it used to take 18s)?
so that's about 2.0x?

did that fix also affect (improve) the conversion benchmark we did for hashes?

besides that, i see you also showed an improved performance of zsets that use listpack vs ziplists (not related to any of the commits of recent days, but rather just that listpack is better than ziplist), right?

anything else left before we can merge it? please go over the recent comment that are not yet marked as resolve and make sure they're handled, and please also update the top comment to list all the changes of this PR.

thank you!

@oranagra
Copy link
Member

ohh, sorry, i see i missed one commit (1c72ad5), so the improvement are a result of a recent change.

@sundb sundb force-pushed the listpack-migration-zset branch from 467ac3a to ea0ed69 Compare August 18, 2021 01:57
@sundb
Copy link
Collaborator Author

sundb commented Aug 18, 2021

@sundb so you're saying that now a loading that used to take 6 seconds takes 14 (and before that fix it used to take 18s)?
so that's about 2.0x?

Yeah.

did that fix also affect (improve) the conversion benchmark we did for hashes?

Yes, hashes also pay off, and I'll be benchmarking hash and zset again.

@sundb sundb force-pushed the listpack-migration-zset branch from ee2956a to 8873fd8 Compare August 18, 2021 06:06
@sundb sundb force-pushed the listpack-migration-zset branch from 1f391b9 to b22020d Compare August 18, 2021 07:41
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so are we done?

anything else left before we can merge it?
please go over the recent comments that are not yet marked as resolve and make sure they're handled, and please also update the top comment to list all the changes of this PR.

src/listpack.c Outdated
Comment on lines 1035 to 1036
/* Note that index could overflow, but we use the value
* after seek, so when we use it no overflow happens. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may still wanna keep this comment if you think it's useful (personally i didn't dive into it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought no one would care about this comment, I think it's necessary (I often confusing).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to do the benchmark.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also need to run "corrupt-dump-fuzzer " for several hours.

@sundb
Copy link
Collaborator Author

sundb commented Aug 19, 2021

Second Rdb loading zset benchmark test after #9366 (comment).
Early expansion of the dict used to validate ziplsit duplicates in lpPairsValidateIntegrityAndDups method(6bac662).

key num zset length loading without convert loading with convert(after optimize) loading with convert(before optimize)
500000 128 5.352s 13.821s 15.648s
500000 64 2.983s 7.225s 8.443s
500000 32 1.629s 3.835s 4.217s
1000000 128 10.977s 27.026s 31.965s
1000000 64 6.022s 14.280s 16.477s
1000000 32 3.317s 7.522s 8.708s

@sundb
Copy link
Collaborator Author

sundb commented Aug 19, 2021

6bac662 also optimises the loading of the listpack with sanitation.

Following is a benchmark test for loading hash listpack.
Half of each hash is strings, and half is numbers.

key num hash length with sanitation(before optimize) with sanitation(after optimize)
1000000 256 39.783s 32.496s
1000000 128 18.372s 15.125s
1000000 64 9.452s 8.029s
1000000 32 5.004s 4.460s

@oranagra oranagra added the state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten label Sep 9, 2021
@oranagra
Copy link
Member

oranagra commented Sep 9, 2021

@redis/core-team technically, this is a major decision, but since it follows the footsteps of the same thing we did for hashes, I'll merge this one normally.
So FYI: rdb changes, loading time conversion, etc.

@sundb
Copy link
Collaborator Author

sundb commented Sep 9, 2021

@oranagra It's ready to go.
The 'corrupt-dump-fuzzer' test has been run a few times before.

@oranagra oranagra merged commit 3ca6972 into redis:unstable Sep 9, 2021
@sundb sundb deleted the listpack-migration-zset branch September 10, 2021 01:55
@oranagra oranagra added the state:major-decision Requires core team consensus label Sep 12, 2021
oranagra added a commit that referenced this pull request Nov 24, 2021
Part three of implementing #8702, following #8887 and #9366 .

## Description of the feature
1. Replace the ziplist container of quicklist with listpack.
2. Convert existing quicklist ziplists on RDB loading time. an O(n) operation.

## Interface changes
1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`.
2. Replace `debug ziplist` command with `debug listpack`.

## Internal changes
1. Add `lpMerge` to merge two listpacks . (same as `ziplistMerge`)
2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`)
3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED`(following #9357 ).
    It represent that a quicklistNode is a packed node, as opposed to a plain node.
4. Remove `createZiplistObject` method, which is never used.
5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`.
    We prefer an overestimation, which would at worse lead to a few bytes below the lowest limit of 4k.

## Improvements
1. Calling `lpShrinkToFit` after converting Ziplist to listpack, which was missed at #9366.
2. Optimize `quicklistAppendPlainNode` to avoid memcpy data.

## Bugfix
1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced from #9366.

## Test
1. Add unittest for `lpMerge`.
2. Modify the old quicklist ziplist corrupt dump test.

Co-authored-by: Oran Agra <[email protected]>
hwware pushed a commit to hwware/redis that referenced this pull request Dec 20, 2021
Part three of implementing redis#8702, following redis#8887 and redis#9366 .

## Description of the feature
1. Replace the ziplist container of quicklist with listpack.
2. Convert existing quicklist ziplists on RDB loading time. an O(n) operation.

## Interface changes
1. New `list-max-listpack-size` config is an alias for `list-max-ziplist-size`.
2. Replace `debug ziplist` command with `debug listpack`.

## Internal changes
1. Add `lpMerge` to merge two listpacks . (same as `ziplistMerge`)
2. Add `lpRepr` to print info of listpack which is used in debugCommand and `quicklistRepr`. (same as `ziplistRepr`)
3. Replace `QUICKLIST_NODE_CONTAINER_ZIPLIST` with `QUICKLIST_NODE_CONTAINER_PACKED`(following redis#9357 ).
    It represent that a quicklistNode is a packed node, as opposed to a plain node.
4. Remove `createZiplistObject` method, which is never used.
5. Calculate listpack entry size using overhead overestimation in `quicklistAllowInsert`.
    We prefer an overestimation, which would at worse lead to a few bytes below the lowest limit of 4k.

## Improvements
1. Calling `lpShrinkToFit` after converting Ziplist to listpack, which was missed at redis#9366.
2. Optimize `quicklistAppendPlainNode` to avoid memcpy data.

## Bugfix
1. Fix crash in `quicklistRepr` when ziplist is compressed, introduced from redis#9366.

## Test
1. Add unittest for `lpMerge`.
2. Modify the old quicklist ziplist corrupt dump test.

Co-authored-by: Oran Agra <[email protected]>
@oranagra oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label Jan 23, 2022
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request May 19, 2022
Remove some dead code in object.c, ziplist is no longer used in 7.0

Some backgrounds:
zipmap - hash: replaced by ziplist in redis#285
ziplist - hash: replaced by listpack in redis#8887
ziplist - zset: replaced by listpack in redis#9366
ziplist - list: replaced by quicklist (listpack) in redis#2143 / redis#9740
oranagra pushed a commit that referenced this pull request May 22, 2022
Remove some dead code in object.c, ziplist is no longer used in 7.0

Some backgrounds:
zipmap - hash: replaced by ziplist in #285
ziplist - hash: replaced by listpack in #8887
ziplist - zset: replaced by listpack in #9366
ziplist - list: replaced by quicklist (listpack) in #2143 / #9740

Moved the location of ziplist.h in the server.c
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
Remove some dead code in object.c, ziplist is no longer used in 7.0

Some backgrounds:
zipmap - hash: replaced by ziplist in redis#285
ziplist - hash: replaced by listpack in redis#8887
ziplist - zset: replaced by listpack in redis#9366
ziplist - list: replaced by quicklist (listpack) in redis#2143 / redis#9740

Moved the location of ziplist.h in the server.c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants