Skip to content

Fix set with duplicate elements causes sdiff to hang#11530

Merged
oranagra merged 4 commits intoredis:unstablefrom
enjoy-binbin:fix_sdiff_hang
Nov 22, 2022
Merged

Fix set with duplicate elements causes sdiff to hang#11530
oranagra merged 4 commits intoredis:unstablefrom
enjoy-binbin:fix_sdiff_hang

Conversation

@enjoy-binbin
Copy link
Contributor

@enjoy-binbin enjoy-binbin commented Nov 21, 2022

This payload produces a set with duplicate elements (listpack encoding):

restore _key 0 "\x14\x25\x25\x00\x00\x00\x0A\x00\x06\x01\x82\x5F\x35\x03\x04\x01\x82\x5F\x31\x03\x82\x5F\x33\x03\x00\x01\x82\x5F\x39\x03\x82\x5F\x33\x03\x08\x01\x02\x01\xFF\x0B\x00\x31\xBE\x7D\x41\x01\x03\x5B\xEC"

smembers key
1) "6"
2) "_5"
3) "4"
4) "_1"
5) "_3"  ---> dup
6) "0"
7) "_9"
8) "_3"  ---> dup
9) "8"
10) "2"

This kind of sets will cause SDIFF to hang, SDIFF generated a broken
protocol and left the client hung. (Expected ten elements, but only
got nine elements due to the duplication.)

If we set sanitize-dump-payload to yes, we will be able to find
the duplicate elements and report "ERR Bad data format".

Discovered and discussed in #11290.

This PR also improve prints when corrupt-dump-fuzzer hangs, it will
print the cmds and the payload, an example like:

Testing integration/corrupt-dump-fuzzer
[TIMEOUT]: clients state report follows.
sock6 => (SPAWNED SERVER) pid:28884
Killing still running Redis server 28884
commands caused test to hang:
SDIFF __key 
payload that caused test to hang: "\x14\balabala"

This payload produces a set with duplicate elements (listpack encoding):
```
restore _key 0 "\x14\x25\x25\x00\x00\x00\x0A\x00\x06\x01\x82\x5F\x35\x03\x04\x01\x82\x5F\x31\x03\x82\x5F\x33\x03\x00\x01\x82\x5F\x39\x03\x82\x5F\x33\x03\x08\x01\x02\x01\xFF\x0B\x00\x31\xBE\x7D\x41\x01\x03\x5B\xEC"

smembers key
1) "6"
2) "_5"
3) "4"
4) "_1"
5) "_3"  ---> dup
6) "0"
7) "_9"
8) "_3"  ---> dup
9) "8"
10) "2"
```

This kind of sets will cause SDIFF to hang, SDIFF generated a broken
protocol and left the client hung. (Expected ten elements, but only
got nine elements due to the duplication.)

If we set `sanitize-dump-payload` to yes, we will be able to find
the duplicate elements and report "ERR Bad data format".

Discovered and discussed in redis#11290.
@enjoy-binbin
Copy link
Contributor Author

will try later to see how to print the payload for this hang thing.
this looks like a fix that can be backported (the sdiff part)

Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

What about intset? Is it possible to create an intset dump with dup elements?

@enjoy-binbin
Copy link
Contributor Author

Normally it should not be possible
But everything is possible in restore, like:

127.0.0.1:6379> restore key 0 "\x0B\x0C\x02\x00\x00\x00\x02\x00\x00\x00\x01\x00\x01\x00\x0B\x00\x67\x90\x99\x0E\x3E\xEA\xB3\x19"
OK
127.0.0.1:6379> smembers key
1) "1"
2) "1"
127.0.0.1:6379> sdiff key
1) "1"
127.0.0.1:6379> object encoding key
"intset"

@enjoy-binbin
Copy link
Contributor Author

enjoy-binbin commented Nov 21, 2022

btw, i see the payload in top comment will carsh the server if we config set set-max-listpack-entries 5
should we handle this? @oranagra

127.0.0.1:6379> config set set-max-listpack-entries 5
OK
127.0.0.1:6379> restore key 0 "\x14%%\x00\x00\x00\n\x00\x06\x01\x82_5\x03\x04\x01\x82_1\x03\x82_3\x03\x00\x01\x82_9\x03\x82_3\x03\b\x01\x02\x01\xff\x0b\x001\xbe}A\x01\x03[\xec" replace
Error: Server closed the connection

log:

=== REDIS BUG REPORT START: Cut & paste starting from here ===
8467:M 21 Nov 2022 22:28:57.729 # === ASSERTION FAILED ===
8467:M 21 Nov 2022 22:28:57.729 # ==> t_set.c:485 'dictAdd(d,element,NULL) == DICT_OK' is not true

------ STACK TRACE ------

Backtrace:
src/redis-server *:6379(setTypeConvert+0xc8)[0x4a5be4]
src/redis-server *:6379(rdbLoadObject+0x1a38)[0x49a063]
src/redis-server *:6379(restoreCommand+0x3f0)[0x4df864]
src/redis-server *:6379(call+0xea)[0x4579db]
src/redis-server *:6379(processCommand+0xd8c)[0x4592bd]
src/redis-server *:6379(processCommandAndResetClient+0x35)[0x478d21]
src/redis-server *:6379(processInputBuffer+0x184)[0x478f79]
src/redis-server *:6379(readQueryFromClient+0x41f)[0x4794ac]
src/redis-server *:6379[0x547397]
src/redis-server *:6379[0x547a7f]
src/redis-server *:6379(aeProcessEvents+0x263)[0x44b840]
src/redis-server *:6379(aeMain+0x2a)[0x44ba3f]
src/redis-server *:6379(main+0xdf3)[0x4618f1]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7ff433c063d5]
src/redis-server *:6379[0x43ef64]

@zuiderkwast
Copy link
Contributor

btw, i see the payload in top comment will carsh the server if we config set set-max-listpack-entries 5

Well spotted! I think we should definitely handle this. The crash on serverAssert(dictAdd(d,element,NULL) == DICT_OK) happens is in setTypeConvert, so it also means that if we have a listpack with dup elements and later adding more elements causing it to be converted to hashtable, it will crash.

I think the solution is to just ignore the return value of dictAdd here.

@oranagra
Copy link
Member

in the spirit of #7807, it's perfectly ok to crash on an assertion in some random command if we got a corrupt RESTORE payload.
it's not ok to crash on a SIGSEGV (if sanitization was disabled), instead we must crash on an assertion.
and it's not ok to crash (even with an assert) in RESTORE when sanitization is enabled (the RESTORE command should be rejected with an error, and no leaks).
if the sanitization is disabled, it's ok that RESTORE crashes on assertion though (unlike when sanitization is enabled).

@oranagra
Copy link
Member

the above text may be a bit intimidating, the guidelines are:

  1. never crash on segfault
  2. if sanitization is enabled, you're guaranteed that you won't get any assertions either.
  3. if sanitization is disabled, you may get assertions
  4. it's always ok to reject a RESTORE command

anyway, from what i can tell, the crash report above is conforming to the rules, so i guess we can merge this, right?
maybe remove the fix to utils.py and re-add it together with the fix that adds the payload in that case?

@enjoy-binbin
Copy link
Contributor Author

maybe remove the fix to utils.py and re-add it together with the fix that adds the payload in that case?

ohh, i just see it, and i think the last commit can print the payload (just a simple test, but i think it can work), so now i think it can be merged

@oranagra oranagra merged commit 3f8756a into redis:unstable Nov 22, 2022
@enjoy-binbin enjoy-binbin deleted the fix_sdiff_hang branch November 22, 2022 09:22
madolson pushed a commit to madolson/redis that referenced this pull request Apr 19, 2023
This payload produces a set with duplicate elements (listpack encoding):
```
restore _key 0 "\x14\x25\x25\x00\x00\x00\x0A\x00\x06\x01\x82\x5F\x35\x03\x04\x01\x82\x5F\x31\x03\x82\x5F\x33\x03\x00\x01\x82\x5F\x39\x03\x82\x5F\x33\x03\x08\x01\x02\x01\xFF\x0B\x00\x31\xBE\x7D\x41\x01\x03\x5B\xEC"

smembers key
1) "6"
2) "_5"
3) "4"
4) "_1"
5) "_3"  ---> dup
6) "0"
7) "_9"
8) "_3"  ---> dup
9) "8"
10) "2"
```

This kind of sets will cause SDIFF to hang, SDIFF generated a broken
protocol and left the client hung. (Expected ten elements, but only
got nine elements due to the duplication.)

If we set `sanitize-dump-payload` to yes, we will be able to find
the duplicate elements and report "ERR Bad data format".

Discovered and discussed in redis#11290.

This PR also improve prints when corrupt-dump-fuzzer hangs, it will
print the cmds and the payload, an example like:
```
Testing integration/corrupt-dump-fuzzer
[TIMEOUT]: clients state report follows.
sock6 => (SPAWNED SERVER) pid:28884
Killing still running Redis server 28884
commands caused test to hang:
SDIFF __key 
payload that caused test to hang: "\x14\balabala"
```

Co-authored-by: Oran Agra <[email protected]>
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
This payload produces a set with duplicate elements (listpack encoding):
```
restore _key 0 "\x14\x25\x25\x00\x00\x00\x0A\x00\x06\x01\x82\x5F\x35\x03\x04\x01\x82\x5F\x31\x03\x82\x5F\x33\x03\x00\x01\x82\x5F\x39\x03\x82\x5F\x33\x03\x08\x01\x02\x01\xFF\x0B\x00\x31\xBE\x7D\x41\x01\x03\x5B\xEC"

smembers key
1) "6"
2) "_5"
3) "4"
4) "_1"
5) "_3"  ---> dup
6) "0"
7) "_9"
8) "_3"  ---> dup
9) "8"
10) "2"
```

This kind of sets will cause SDIFF to hang, SDIFF generated a broken
protocol and left the client hung. (Expected ten elements, but only
got nine elements due to the duplication.)

If we set `sanitize-dump-payload` to yes, we will be able to find
the duplicate elements and report "ERR Bad data format".

Discovered and discussed in redis#11290.

This PR also improve prints when corrupt-dump-fuzzer hangs, it will
print the cmds and the payload, an example like:
```
Testing integration/corrupt-dump-fuzzer
[TIMEOUT]: clients state report follows.
sock6 => (SPAWNED SERVER) pid:28884
Killing still running Redis server 28884
commands caused test to hang:
SDIFF __key 
payload that caused test to hang: "\x14\balabala"
```

Co-authored-by: Oran Agra <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants