Implement the new Redis Array type#15162
Conversation
|
Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset. In case there are security findings, they will be communicated to you as a comment inside the PR. Hope you’ll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us. |
|
This pull request is abnormally large and would use a significant amount of tokens to review. If you still wish to review it, comment "augment review" and we will review it. |
|
@augmentcode augment review |
|
Implementing a new Redis data type like this is definitely not trivial, and the design choices around indexing and memory efficiency are very clear. The focus on practical use cases makes it particularly solid. Really valuable contribution. Great work by @antirez. |
| offset = slice_size - winsize; | ||
| } | ||
|
|
||
| arSlice *s = zmalloc(arDenseAllocSize(winsize)); |
There was a problem hiding this comment.
maybe we can use zmalloc_usable to get more memory to store more items
There was a problem hiding this comment.
i mean zmalloc_usable may return a bigger memory than we want, we can use the extra memory to store items. but i am okay now since we will release RC1
| int keyremoved = (arCount(ar) == 0); | ||
| if (server.memory_tracking_enabled && total_deleted > 0 && keyremoved) | ||
| updateSlotAllocSize(c->db, getKeySlot(c->argv[1]->ptr), o, old_alloc, kvobjAllocSize(o)); | ||
| if (total_deleted > 0) { |
There was a problem hiding this comment.
I feel like the finalization logic is somewhat similar to ardelCommand; perhaps we could refactor it into a separate function.
ARSET and ARMSET also have the similar issue.
| * otherwise a double scan would be needed. */ | ||
| void **collected = zmalloc(effective_count * sizeof(void *)); | ||
| uint64_t anchor_idx = | ||
| (ar->insert_idx == AR_INSERT_IDX_NONE) ? ar_len - 1 : ar->insert_idx; |
There was a problem hiding this comment.
if ar->insert_idx is larger than ar_len, should set it to ar_len - 1?
There was a problem hiding this comment.
Ok you mean something like:
DEL a
ARSET a 0 zero one two three four
ARSEEK a 100
ARLASTITEMS a 3
That would produce NULLs? This si indeed a degenerated case. And should instead be anchored to the top of the array. Thanks!
There was a problem hiding this comment.
Wait, you mean to also change the cursor as a side effect?
There was a problem hiding this comment.
No, I just think that the effective_count will be adjusted to the maximum of count. When the insert_idx is invalid, does it also need to be adjusted? For example, in the following case
DEL a
ARSET a 0 zero one two three four
ARSEEK a 10
ARLASTITEMS a 10
Because the value of "count" has been corrected, it will now return
127.0.0.1:6379> ARLASTITEMS a 10
1) (nil)
2) (nil)
3) (nil)
4) (nil)
5) (nil)
But I'm wondering, since insert_idx is valid, why wouldn't it return:
127.0.0.1:6379> ARLASTITEMS a 10
1) "zero"
2) "one"
3) "two"
4) "three"
5) "four"
6) (nil)
7) (nil)
8) (nil)
9) (nil)
10) (nil)
Co-authored-by: Shubham S Taple <[email protected]>
Co-authored-by: Yuan Wang <[email protected]>
Co-authored-by: Yuan Wang <[email protected]>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 74d1fcd. Configure here.
| status = tre_add_tag_right(mem, left, tag_left); | ||
| tnfa->tag_directions[tag_left] = TRE_TAG_MAXIMIZE; | ||
| status = tre_add_tag_right(mem, right, tag_right); | ||
| tnfa->tag_directions[tag_right] = TRE_TAG_MAXIMIZE; |
There was a problem hiding this comment.
First tre_add_tag_right error silently discarded
Medium Severity
In ADDTAGS_AFTER_UNION_RIGHT, status is assigned the result of the first tre_add_tag_right call but immediately overwritten by the second call. If the first call returns REG_ESPACE (memory allocation failure) while the second succeeds, the error is silently lost and compilation continues with a partially-modified AST. The outer while loop's status == REG_OK guard never sees the failure, potentially allowing a corrupt regex to be used for matching.
Reviewed by Cursor Bugbot for commit 74d1fcd. Configure here.
| { | ||
| unsigned int i; | ||
| for (i = 0; i <= tnfa->num_submatches; i++) | ||
| saved_states[i].tag = -1; |
There was a problem hiding this comment.
Allocated saved_states array is never used
Low Severity
saved_states is allocated with xmalloc, initialized with -1 values, and freed at the end of tre_add_tags, but it is never read or used in any computation between allocation and free. The allocation is proportional to tnfa->num_submatches + 1 and occurs on every regex compilation that requires tag setup, making it a needless overhead.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 74d1fcd. Configure here.
|
Observation on Alternatively, maybe remove the |
| } else if (o->type == OBJ_ARRAY) { | ||
| redisArray *ar = o->ptr; | ||
| uint64_t len = arLen(ar); | ||
| for (uint64_t idx = 0; idx < len; idx++) { |
There was a problem hiding this comment.
A very sparse array can cause the debug digest to get stuck.
ARSET key 1000000000 x
debug digest
| assert_equal 0 [r arop myarray 0 2 XOR] | ||
| } | ||
|
|
||
| test {AROP AND/OR/XOR truncates floats toward zero} { |
There was a problem hiding this comment.
opinion (only): this "truncates floats toward zero" is not an obvious behaviour; if I request a bitwise operation, I expect it to operate on the exact bits I gave it, or error. Totally fine for the operation to only be defined on integers, if that is the intent; but since there is no ARINCRBY (unless I missed it), I don't expect any rounding errors, so this scenario seems really unexpected (and: if I was using ARINCRBY for floating-point incrs, I would have zero expectation for a bitwise operation to be meaningful... because I'm talking about FP).
Or rephrased: in what scenario is this behaviour expected and useful?
There was a problem hiding this comment.
(it also gets weird in large values - for example a FP number above 253, where exact integers start to break down, but integer bitwise ops on the entire 64-bit range are normally anticipated to be meaningful)
| "items": { | ||
| "oneOf": [ | ||
| { | ||
| "anyOf": [ |
There was a problem hiding this comment.
oof; not a fan of these arbitrary "sometimes it is jagged, sometimes it is flat" decisions, especially when it doesn't even make use of any RESP3 intrinsics (map, etc)
There was a problem hiding this comment.
I made this modification to keep it in sync with ZRANGE.
Perhaps from the client's perspective, it may not be very good.
@oranagra WDYT?
There was a problem hiding this comment.
It probably impacts the "raw" clients more than the opinionated ones - it makes another pain point for people changing from RESP2 to RESP3. Consistency is not an unreasonable reason. I don't know what the "right" answer is here.
There was a problem hiding this comment.
on the plus side: it reminded me that I should be using my own inbuilt helper types to make my own life easier... StackExchange/StackExchange.Redis@37cf357
There was a problem hiding this comment.
I understand the discussion here is about using a completely different "layout" in RESP2 vs RESP3, and not the fact that it's a different layout when a WITHVALUES argument is passed (that's already well embedded in many other commands).
I understand that that's how it's in ZRANGE, and i don't recall why it was like that, but i assume we tried to "fix" (improve) WITHSCORES, and did that only for RESP3 to avoid breaking RESP2, which was probably a mistake.
and i think that in the case of the new command we can use a nested array in both RESP3 and RESP2.
Co-authored-by: Marc Gravell <[email protected]>
| #define AROP_MAX 3 /* Maximum numeric element in range. */ | ||
| #define AROP_AND 4 /* Bitwise AND of integer elements in range. */ | ||
| #define AROP_OR 5 /* Bitwise OR of integer elements in range. */ | ||
| #define AROP_XOR 6 /* Bitwise XOR of integer elements in range. */ |
There was a problem hiding this comment.
Xor is a weird one; is there a genuine use-case for this, or is it here "because we could"? In particular, in the context of sparse arrays, are nil values skipped? Or treated as zero? The latter is presumably inconsistent with how And behaves; the former means you need to know how many values were skipped to understand the result. But I question whether this even has a valid use case.
There was a problem hiding this comment.
Xor is a weird one; is there a genuine use-case for this, or is it here "because we could"? In particular, in the context of sparse arrays, are nil values skipped? Or treated as zero?
nil values are skipped in all aggregation modes.
|
More on AROP / ARGREP: I guess I'm worried that a: there's a lot of overlap, and b: there's also a lot of gap. AROP is sometimes an open aggregate, sometimes a filter. ARGREP is a text filter with MATCH and others; AROP can only do MATCH, and in MATCH mode only does count. For gap: for examples like sensor data, range filters are far more useful than MATCH filters (especially for continuous/FP data). Suggestion (probably not for RC1):
|
Fixes: 1. After #15096, we pass -flto to jemalloc. On Azure Linux, the resulting jemalloc library cannot be handled at link time and the build fails. Adding -ffat-lto-objects so the compiler also emits regular object code that the linker can fall back to when it cannot handle the LTO-compiled library. 2. Fixed a warning about `path` being NULL in `moduleLoadInternalModules()`. 3. Fixed compile warnings on older GCC versions introduced by #15162 (reported on Ubuntu 20.04) Co-authored-by: debing.sun <[email protected]>


Redis Array
For years, Redis has been missing a real indexed data structure for the use cases where the index and the spatial relationship of elements are semantic. Hashes give you random lookups, but you have to store an index as a key, and have no range visibility. Lists give you appending and trimming, but what is in the middle remains hard to access. Streams give you append-only events, which is another (useful, indeed) beast. None of these is what you want when the position itself has business meaning — slot 37, step 4, row 18552, day from 2934 to 2949, file line 11, 12, 15 and so forth. And, all those types, for different reasons, are all suboptimal when you want a ring buffer able to store the latest N observed samples of something.
Up to now, users found ways (they always do \o/) using the fact that the data structures that are obvious in this universe are also extremely powerful, if well implemented. But this forces compromises. Arrays handle these index-first requirements natively, and usually with much better memory and CPU usage than the workarounds. If the use case is the right one, Arrays often provide much better space, time and usability at the same time.
Internal encoding
NULLstored in the directory.Use cases
Arrays are mostly stateless if not for the fact that each array remembers the index of the latest added item, allowing
ARINSERTandARRINGto work properly. Otherwise it is a set/get at this index game, with solid support for both setting / getting ranges, server-side scanning, returning only populated elements in a time which is proportional not to the range size, but to the population size.A few concrete examples, that may work as mental models for the set of problems that are similar to them (from the POV of the data modeling).
Thermometer. A sensor reporting once per minute, with gaps:
Missing minutes cost little to nothing. Numeric aggregation runs inside Redis. Telemetry, IoT, meter readings, KPI rollups.
Calendar. A clinic with 96 fifteen-minute slots per day:
The slot number is the business key in this case. Room booking, parking spaces, warehouse bins, lockers, ...
Ring buffer. ARRING replaces the classic LPUSH+LTRIM pattern. Imagine remote
dmesg.Faster than LPUSH+LTRIM, keep indexed access to past elements. Last-N alarms, recent fraud scores, access history, remote logs, device events. Ok here the use cases are mainly the ones of the old pattern: it is just a better fit and allows to access random items in the middle, aggregate server-side, and so forth.
Workflow. Step number is the index, value is the status. Gaps are meaningful:
Skills knowledge base for agents. Arrays are good at representing / grepping into Markdown files:
ARGREP has EXACT, MATCH, GLOB, RE, you can have multiple predicates, can select AND or OR behavior.
Bulk import results. Sparse row annotations over millions of rows / CSV / ...:
TLDR
If the position is part of the meaning, use an Array. If you want to aggregate or grep remotely, use an Array. Feedback welcome :)
Note
Medium Risk
Adds a sizable new third-party C regex engine (
deps/tre) and wires it into the dependency build/clean flow, which can impact build portability and introduce new native-code surface area.Overview
Vendors the TRE POSIX regex library (including compiler and matchers) into
deps/treand adds adeps/Makefiletarget to build/clean it, producingdeps/tre/libtre.a.Updates
.gitignoreto ignore the generatedlibtre.aartifact.Reviewed by Cursor Bugbot for commit e73a465. Bugbot is set up for automated code reviews on this repo. Configure here.