Merged
Conversation
This test was nearly always failing on MacOS github actions. This is because of bugs in the test that caused it to nearly always run all 3 attempts and just look at the last one as the pass/fail creteria. i.e. the test was nearly always running all 3 attempts and still sometimes succeed. this is because the break condition was different than the test completion condition. The reason the test succeeded is because the break condition tested the results of all 3 tests (PSETEX/PEXPIRE/PEXPIREAT), but the success check at the end was only testing the result of PSETEX. The reason the PEXPIREAT test nearly always failed is because it was getting the current time wrong: getting the current second and loosing the sub-section time, so the only chance for it to succeed is if it run right when a certain second started. Because i now get the time from redis, adding another round trip, i added another 100ms to the PEXPIRE test to make it less fragile, and also added many more attempts. Adding many more attempts before failure to account for slow platforms, github actions and valgrind
These tests started failing every day on http 404 (not being able to install valgrind)
…ests (#7561) Co-authored-by: MemuraiUser <[email protected]>
Before this commit, following command did not show --tls option: ./runtest-cluster --help ./runtest-sentinel --help
This commit makes stream object returning "stream" as encoding type in OBJECT ENCODING subcommand and DEBUG OBJECT command. Till now, it would return "unknown"
Rather than blindly evicting until maxmemory limit is achieved, this update adds a time limit to eviction. While over the maxmemory limit, eviction will process before each command AND as a timeProc when no commands are running. This will reduce the latency impact on many cases, especially pathological cases like massive used memory increase during dict rehashing. There is a risk that some other edge cases (like massive pipelined use of MGET) could cause Redis memory usage to keep growing despite the eviction attempts, so a new maxmemory-eviction-tenacity config is introduced to let users mitigate that.
If one thread got SIGSEGV, function sigsegvHandler() would be triggered, it would call bioKillThreads(). But call pthread_cancel() to cancel itself would make it block. Also note that if SIGSEGV is caught by bio thread, it should kill the main thread in order to give a positive report.
The fix in error handling of rdbGenericLoadStringObject is an actual bugfix
Redis 6.0 introduces I/O threads, it is so cool and efficient, we use C11 _Atomic to establish inter-thread synchronization without mutex. But the compiler that must supports C11 _Atomic can compile redis code, that brings a lot of inconvenience since some common platforms can't support by default such as CentOS7, so we want to implement redis atomic type to make it more portable. We have implemented our atomic variable for redis that only has 'relaxed' operations in src/atomicvar.h, so we implement some operations with 'sequentially-consistent', just like the default behavior of C11 _Atomic that can establish inter-thread synchronization. And we replace all uses of C11 _Atomic with redis atomic variable. Our implementation of redis atomic variable uses C11 _Atomic, __atomic or __sync macros if available, it supports most common platforms, and we will detect automatically which feature we use. In Makefile we use a dummy file to detect if the compiler supports C11 _Atomic. Now for gcc, we can compile redis code theoretically if your gcc version is not less than 4.1.2(starts to support __sync_xxx operations). Otherwise, we remove use mutex fallback to implement redis atomic variable for performance and test. You will get compiling errors if your compiler doesn't support all features of above. For cover redis atomic variable tests, we add other CI jobs that build redis on CentOS6 and CentOS7 and workflow daily jobs that run the tests on them. For them, we just install gcc by default in order to cover different compiler versions, gcc is 4.4.7 by default installation on CentOS6 and 4.8.5 on CentOS7. We restore the feature that we can test redis with Helgrind to find data race errors. But you need install Valgrind in the default path configuration firstly before running your tests, since we use macros in helgrind.h to tell Helgrind inter-thread happens-before relationship explicitly for avoiding false positives. Please open an issue on github if you find data race errors relate to this commit. Unrelated: - Fix redefinition of typedef 'RedisModuleUserChangedFunc' For some old version compilers, they will report errors or warnings, if we re-define function type.
We're already using bg_unlink in several places to delete the rdb file in the background, and avoid paying the cost of the deletion from our main thread. This commit uses bg_unlink to remove the temporary rdb file in the background too. However, in case we delete that rdb file just before exiting, we don't actually wait for the background thread or the main thread to delete it, and just let the OS clean up after us. i.e. we open the file, unlink it and exit with the fd still open. Furthermore, rdbRemoveTempFile can be called from a thread and was using snprintf which is not async-signal-safe, we now use ll2string instead.
The symbol base address is a const on this system.
Co-authored-by: Oran Agra <[email protected]>
445a4b6 introudced a makefile script that detects if the toolchain supports c11, and it looked that it was passing on MacOS and fails on Ubuntu, looks like Ubuntu's Dash was spawning a background process, deleted foo.c before gcc tried to compile it.
Refine comment of makeThreadKillable(). This commit can be backported to 5.0, only if we also backport 8b70cb0. Co-authored-by: Oran Agra <[email protected]>
This commit can be cherry picked to 6.0 only if we also cherry pick f866023.
jemalloc configure shows this:
configure: WARNING: unrecognized options: --enable-cc-silence
The changelog of jemalloc 4.0 has:
- Replace --enable-cc-silence with --disable-cc-silence to suppress spurious
warnings by default.
Change `val` to `unsigned char` before being tested.
The fix is identical to the one that's been made in upstream jemalloc.
warning is:
src/malloc_io.c: In function ‘malloc_vsnprintf’:
src/malloc_io.c:369:2: warning: case label value exceeds maximum value for type
369 | case '?' | 0x80: \
| ^~~~
src/malloc_io.c:581:5: note: in expansion of macro ‘GET_ARG_NUMERIC’
581 | GET_ARG_NUMERIC(val, 'p');
| ^~~~~~~~~~~~~~~
…op call (#7829) This commit adds streamIteratorStop call in rewriteStreamObject function in some of the return statement. Although currently this will not cause memory leak since stream id is only 16 bytes long.
…nabled (#7819) When all replicas waiting for a bgsave get disconnected (possibly due to output buffer limit), It may be good to kill the bgsave child. in diskless replication it already happens, but in disk-based, the child may still serve some purpose (for persistence). By killing the child, we prevent it from eating COW memory in vain, and we also allow a new child fork sooner for the next full synchronization or bgsave. We do that only if rdb persistence wasn't enabled in the configuration. Btw, now, rdbRemoveTempFile in killRDBChild won't block server, so we can killRDBChild safely.
redis-check-rdb was unable to parse rdb files containing module aux data. Co-authored-by: Oran Agra <[email protected]>
This happens only on diskless replicas when attempting to reconnect after failing to load an RDB file. It is more likely to occur with larger datasets. After reconnection is initiated, replicationEmptyDbCallback() may get called and try to write to an unconnected socket. This triggered another issue where the connection is put into an error state and the connect handler never gets called. The problem is a regression introduced by commit c17e597.
this is very dangerous bug, but it looks like it didn't cause any harm.
mainly backtrace and register dump support.
Co-authored-by: Alex Ronke <[email protected]>
Before this commit, we would have continued to add replies to the reply buffer even if client output buffer limit is reached, so the used memory would keep increasing over the configured limit. What's more, we shouldn’t write any reply to the client if it is set 'CLIENT_CLOSE_ASAP' flag because that doesn't conform to its definition and we will close all clients flagged with 'CLIENT_CLOSE_ASAP' in ‘beforeSleep’. Because of code execution order, before this, we may firstly write to part of the replies to the socket before disconnecting it, but in fact, we may can’t send the full replies to clients since OS socket buffer is limited. But this unexpected behavior makes some commands work well, for instance ACL DELUSER, if the client deletes the current user, we need to send reply to client and close the connection, but before, we close the client firstly and write the reply to reply buffer. secondly, we shouldn't do this despite the fact it works well in most cases. We add a flag 'CLIENT_CLOSE_AFTER_COMMAND' to mark clients, this flag means we will close the client after executing commands and send all entire replies, so that we can write replies to reply buffer during executing commands, send replies to clients, and close them later. We also fix some implicit problems. If client output buffer limit is enforced in 'multi/exec', all commands will be executed completely in redis and clients will not read any reply instead of partial replies. Even more, if the client executes 'ACL deluser' the using user in 'multi/exec', it will not read the replies after 'ACL deluser' just like before executing 'client kill' itself in 'multi/exec'. We added some tests for output buffer limit breach during multi-exec and using a pipeline of many small commands rather than one with big response. Co-authored-by: Oran Agra <[email protected]>
…7307) When fclose would fail, the previous implementation would have attempted to do fclose again this can in theory lead to segfault. other changes: check for non-zero return value as failure rather than a specific error code. this doesn't fix a real bug, just a minor cleanup.
We should sync temp DB file before renaming as rdb_fsync_range does not use
flag `SYNC_FILE_RANGE_WAIT_AFTER`.
Refer to `Linux Programmer's Manual`:
SYNC_FILE_RANGE_WAIT_AFTER
Wait upon write-out of all pages in the range after performing any write.
Make sure we handle short writes correctly, sync to disk after writing and use rename to make sure the replacement is actually atomic. In any case of failure old configuration will remain in place. Also, add some additional logging to make it easier to diagnose rewrite problems.
We may access and modify these two variables in signal handler function, to guarantee them async-signal-safe, so we should set them to volatile sig_atomic_t type. It doesn't look like this could have caused any real issue, and it seems that signals are handled in main thread on most platforms. But we want to follow C and POSIX standard in signal handler function.
There's currently an issue with IO threads and gopher (issuing lookupKey from within the thread). simply fix is to just not support it for now.
when slaveof config is "no one", reset any pre-existing config and resume. also solve a memory leak if slaveof appears twice. and fail loading if port number is out of range or not an integer. Co-authored-by: caozhengbin <[email protected]> Co-authored-by: Oran Agra <[email protected]>
Also stabilize new shutdown tests on slow machines (valgrind)
- The argument `u` in for `ar` is ignored (and generates warnings since `D` became the default. All it does is avoid updating unchanged objects (shouldn't have any impact on our build) - Enable `LUA_USE_MKSTEMP` to force the use of `mkstemp()` instead of `tmpname()` (which is dead code in redis anyway). - Remove unused variable `c` in `f_parser()` - Removed misleadingly indented space in `luaL_loadfile()` and ``addfield()` Co-authored-by: Oran Agra <[email protected]>
The tls-ca-cert or tls-ca-cert-dir configuration parameters are only used when Redis needs to authenticate peer certificates, in one of these scenarios: 1. Incoming clients or replicas, with `tls-auth-clients` enabled. 2. A replica authenticating the master's peer certificate. 3. Cluster nodes authenticating other nodes when establishing the bus protocol connection.
The client pointed to by the module context may in some cases be a fake client. RM_Authenticate*() calls in this case would be ineffective but appear to succeed, and this change fails them to make it easier to catch such cases.
When REDISMODULE_EVENT_CLIENT_CHANGE events are delivered, modules may want to mutate the client state (e.g. perform authentication). This change links the module context with the real client rather than a fake client for these events.
…s” (#7871) PROBLEM: [$rd1 read] reads invalidation messages one by one, so it's never going to see the second invalidation message produced after INCR b, whether or not it exists. Adding another read will block incase no invalidation message is produced. FIX: We switch the order of "INCR a" and "INCR b" - now "INCR b" comes first. We still only read the first invalidation message produces. If an invalidation message is wrongly produces for b - then it will be produced before that of a, since "INCR b" comes before "INCR a". Co-authored-by: Nitai Caro <[email protected]>
The MEMORY command is used for debugging memory usage, so it should include internal fragmentation, same as used_memory
…'s internal frag (#7875) This commit has two aspects: 1) improve memory reporting for all the places that use sdsAllocSize to compute memory used by a string, in this case it'll include the internal fragmentation. 2) reduce the need for realloc calls by making the sds implicitly take over the internal fragmentation of the block it allocated.
Add optional GET parameter to SET command in order to set a new value to a key and retrieve the old key value. With this change we can deprecate `GETSET` command and use only the SET command with the GET parameter.
Adding -D option for redis-cli to control newline between command responses in raw mode. Also removing cleanup code before calling exit, just in order to avoid adding more adding more cleanup code (redis doesn't bother to release allocations before exit anyway) Co-authored-by: Oran Agra <[email protected]>
track and report memory used by clients argv. this is very usaful in case clients started sending a command and didn't complete it. in which case the first args of the command are already trimmed from the query buffer. in an effort to avoid cache misses and overheads while keeping track of these, i avoid calling sdsZmallocSize and instead use the sdslen / bulk-len which can at least give some insight into the problem. This memory is now added to the total clients memory usage, as well as the client list.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.