Implement the new Redis Array type by antirez · Pull Request #15162 · redis/redis

antirez · 2026-05-04T07:34:58Z

Redis Array

For years, Redis has been missing a real indexed data structure for the use cases where the index and the spatial relationship of elements are semantic. Hashes give you random lookups, but you have to store an index as a key, and have no range visibility. Lists give you appending and trimming, but what is in the middle remains hard to access. Streams give you append-only events, which is another (useful, indeed) beast. None of these is what you want when the position itself has business meaning — slot 37, step 4, row 18552, day from 2934 to 2949, file line 11, 12, 15 and so forth. And, all those types, for different reasons, are all suboptimal when you want a ring buffer able to store the latest N observed samples of something.

Up to now, users found ways (they always do \o/) using the fact that the data structures that are obvious in this universe are also extremely powerful, if well implemented. But this forces compromises. Arrays handle these index-first requirements natively, and usually with much better memory and CPU usage than the workarounds. If the use case is the right one, Arrays often provide much better space, time and usability at the same time.

Internal encoding

When dense, an Array is essentially a more fancy C array. You don't pay anything for storing the index.
Yet, instead of going really flat, arrays are sliced into 4096-element slices, and each slice, when it contains just a few elements, uses a special sparse encoding. When a slice is empty it's just a NULL stored in the directory.
Small ints, floats, and short strings are pointer-tagged, so they cost zero additional memory beyond the pointer slot itself.
When very sparse, a super-directory of windowed directories is used. This allows the data type to be safe, instead of exhibiting pathological space or time behavior. This representation is only triggered when there are more than 8 million elements or very high indexes set.

Use cases

Arrays are mostly stateless if not for the fact that each array remembers the index of the latest added item, allowing ARINSERT and ARRING to work properly. Otherwise it is a set/get at this index game, with solid support for both setting / getting ranges, server-side scanning, returning only populated elements in a time which is proportional not to the range size, but to the population size.

A few concrete examples, that may work as mental models for the set of problems that are similar to them (from the POV of the data modeling).

Thermometer. A sensor reporting once per minute, with gaps:

ARSET       temp:room12:day7 123 22.3
ARGETRANGE  temp:room12:day7 600 660     # the 10:00–11:00 window, with NULLs
ARSCAN      temp:room12:day7 600 660     # only populated elements
AROP        temp:room12:day7 0 1439 MAX  # peak of the day, server-side

Missing minutes cost little to nothing. Numeric aggregation runs inside Redis. Telemetry, IoT, meter readings, KPI rollups.

Calendar. A clinic with 96 fifteen-minute slots per day:

ARSET       sched:room12:day 32 booking:991
ARSCAN      sched:room12:day 0 95      # only occupied slots
ARGETRANGE  sched:room12:day 48 63     # the afternoon full view to render

The slot number is the business key in this case. Room booking, parking spaces, warehouse bins, lockers, ...

Ring buffer. ARRING replaces the classic LPUSH+LTRIM pattern. Imagine remote dmesg.

ARRING       machine:123 200 "[141087.430123]: arm_cpu_init(): cpu 14 online" # Capped to 200 entries
ARLASTITEMS  machine:123 50 REV       # 50 newest first

Faster than LPUSH+LTRIM, keep indexed access to past elements. Last-N alarms, recent fraud scores, access history, remote logs, device events. Ok here the use cases are mainly the ones of the old pattern: it is just a better fit and allows to access random items in the middle, aggregate server-side, and so forth.

Workflow. Step number is the index, value is the status. Gaps are meaningful:

ARSET       claim:99172 0 received
ARSET       claim:99172 3 waiting:reviewer42
ARSET       claim:99172 5 approved
ARGETRANGE  claim:99172 0 5     # full workflow view, with NULLs for missing steps
ARSCAN      claim:99172 0 5     # only steps that have a state
ARCOUNT     claim:99172         # number of recorded steps
ARLEN       claim:99172         # highest reached step + 1

Skills knowledge base for agents. Arrays are good at representing / grepping into Markdown files:

ARSET   skill:metal_gpu 0 "...."
ARSET   skill:metal_gpu 1 "...."
ARSET   skill:metal_gpu 2 "...."
ARGREP  skill:metal_gpu - + RE "M3|M4" WITHVALUES

ARGREP has EXACT, MATCH, GLOB, RE, you can have multiple predicates, can select AND or OR behavior.

Bulk import results. Sparse row annotations over millions of rows / CSV / ...:

ARSET  import:job551 18552 ERR:bad_email
ARSCAN import:job551 0 1000000        # Provides only rows that have something

TLDR

If the position is part of the meaning, use an Array. If you want to aggregate or grep remotely, use an Array. Feedback welcome :)

Note

Medium Risk
Adds a sizable new third-party C regex engine (deps/tre) and wires it into the dependency build/clean flow, which can impact build portability and introduce new native-code surface area.

Overview
Vendors the TRE POSIX regex library (including compiler and matchers) into deps/tre and adds a deps/Makefile target to build/clean it, producing deps/tre/libtre.a.

Updates .gitignore to ignore the generated libtre.a artifact.

^{Reviewed by Cursor Bugbot for commit e73a465. Bugbot is set up for automated code reviews on this repo. Configure here.}

jit-ci · 2026-05-04T07:35:06Z

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

augmentcode · 2026-05-04T07:35:56Z

This pull request is abnormally large and would use a significant amount of tokens to review. If you still wish to review it, comment "augment review" and we will review it.

ad-m · 2026-05-04T16:35:42Z

@augmentcode augment review

salvatorecorvaglia · 2026-05-05T14:02:52Z

Implementing a new Redis data type like this is definitely not trivial, and the design choices around indexing and memory efficiency are very clear. The focus on practical use cases makes it particularly solid.

Really valuable contribution.

Great work by @antirez.

ShooterIT

i read the version without regex lib, great work, just some comments

ShooterIT · 2026-05-06T13:23:15Z

+        offset = slice_size - winsize;
+    }
+
+    arSlice *s = zmalloc(arDenseAllocSize(winsize));


maybe we can use zmalloc_usable to get more memory to store more items

Fixed by 686e024

i mean zmalloc_usable may return a bigger memory than we want, we can use the extra memory to store items. but i am okay now since we will release RC1

ShooterIT · 2026-05-06T21:35:48Z

+    int keyremoved = (arCount(ar) == 0);
+    if (server.memory_tracking_enabled && total_deleted > 0 && keyremoved)
+        updateSlotAllocSize(c->db, getKeySlot(c->argv[1]->ptr), o, old_alloc, kvobjAllocSize(o));
+    if (total_deleted > 0) {


I feel like the finalization logic is somewhat similar to ardelCommand; perhaps we could refactor it into a separate function.

ARSET and ARMSET also have the similar issue.

sundb · 2026-05-07T07:43:35Z

+     * otherwise a double scan would be needed. */
+    void **collected = zmalloc(effective_count * sizeof(void *));
+    uint64_t anchor_idx =
+        (ar->insert_idx == AR_INSERT_IDX_NONE) ? ar_len - 1 : ar->insert_idx;


if ar->insert_idx is larger than ar_len, should set it to ar_len - 1?

Ok you mean something like:

DEL a ARSET a 0 zero one two three four ARSEEK a 100 ARLASTITEMS a 3

That would produce NULLs? This si indeed a degenerated case. And should instead be anchored to the top of the array. Thanks!

Wait, you mean to also change the cursor as a side effect?

No, I just think that the effective_count will be adjusted to the maximum of count. When the insert_idx is invalid, does it also need to be adjusted? For example, in the following case

DEL a ARSET a 0 zero one two three four ARSEEK a 10 ARLASTITEMS a 10

Because the value of "count" has been corrected, it will now return

127.0.0.1:6379> ARLASTITEMS a 10 1) (nil) 2) (nil) 3) (nil) 4) (nil) 5) (nil)

But I'm wondering, since insert_idx is valid, why wouldn't it return:

127.0.0.1:6379> ARLASTITEMS a 10 1) "zero" 2) "one" 3) "two" 4) "three" 5) "four" 6) (nil) 7) (nil) 8) (nil) 9) (nil) 10) (nil)

Co-authored-by: Shubham S Taple <[email protected]>

Co-authored-by: Yuan Wang <[email protected]>

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 3 total unresolved issues (including 1 from previous review).

^{Reviewed by Cursor Bugbot for commit 74d1fcd. Configure here.}

cursor · 2026-05-09T08:31:44Z

+		    status = tre_add_tag_right(mem, left, tag_left);
+		    tnfa->tag_directions[tag_left] = TRE_TAG_MAXIMIZE;
+		    status = tre_add_tag_right(mem, right, tag_right);
+		    tnfa->tag_directions[tag_right] = TRE_TAG_MAXIMIZE;


First tre_add_tag_right error silently discarded

Medium Severity

In ADDTAGS_AFTER_UNION_RIGHT, status is assigned the result of the first tre_add_tag_right call but immediately overwritten by the second call. If the first call returns REG_ESPACE (memory allocation failure) while the second succeeds, the error is silently lost and compilation continues with a partially-modified AST. The outer while loop's status == REG_OK guard never sees the failure, potentially allowing a corrupt regex to be used for matching.

^{Reviewed by Cursor Bugbot for commit 74d1fcd. Configure here.}

cursor · 2026-05-09T08:31:44Z

+    {
+      unsigned int i;
+      for (i = 0; i <= tnfa->num_submatches; i++)
+	saved_states[i].tag = -1;


Allocated saved_states array is never used

Low Severity

saved_states is allocated with xmalloc, initialized with -1 values, and freed at the end of tre_add_tags, but it is never read or used in any computation between allocation and free. The allocation is proportional to tnfa->num_submatches + 1 and occurs on every regex compilation that requires tag setup, making it a needless overhead.

Additional Locations (1)

deps/tre/lib/tre-compile.c#L1000-L1001

^{Reviewed by Cursor Bugbot for commit 74d1fcd. Configure here.}

fixed in the upstream.

Co-authored-by: Yuan Wang <[email protected]>

mgravell · 2026-05-12T10:16:36Z

Observation on AROP ... MATCH pattern; it feels "unbalanced" vs ARGREP; I don't know the complexity involved, but I wonder whether AROP ... GLOB pattern, AROP ... RE pattern etc would work.

Alternatively, maybe remove the MATCH from AROP, and add a COUNT modifier to ARGREP such that it only reports the count of matches, rather than listing them. This would presumably then also allow multiple predicates, and the use of AND / OR etc to work similarly.

sundb · 2026-05-12T12:09:38Z

+    } else if (o->type == OBJ_ARRAY) {
+        redisArray *ar = o->ptr;
+        uint64_t len = arLen(ar);
+        for (uint64_t idx = 0; idx < len; idx++) {


A very sparse array can cause the debug digest to get stuck.

ARSET key 1000000000 x debug digest

mgravell · 2026-05-13T11:35:46Z

+        assert_equal 0 [r arop myarray 0 2 XOR]
+    }
+
+    test {AROP AND/OR/XOR truncates floats toward zero} {


opinion (only): this "truncates floats toward zero" is not an obvious behaviour; if I request a bitwise operation, I expect it to operate on the exact bits I gave it, or error. Totally fine for the operation to only be defined on integers, if that is the intent; but since there is no ARINCRBY (unless I missed it), I don't expect any rounding errors, so this scenario seems really unexpected (and: if I was using ARINCRBY for floating-point incrs, I would have zero expectation for a bitwise operation to be meaningful... because I'm talking about FP).

Or rephrased: in what scenario is this behaviour expected and useful?

(it also gets weird in large values - for example a FP number above 2⁵³, where exact integers start to break down, but integer bitwise ops on the entire 64-bit range are normally anticipated to be meaningful)

sundb · 2026-05-13T13:04:39Z

@antirez in commit e73a465, ARSCAN/ARGREP command with WITHVALUES will reply as a pair array when RESP, and a flat array when RESP2

mgravell · 2026-05-13T13:07:41Z

-            "items": {
-                "oneOf": [
-                    {
+            "anyOf": [


oof; not a fan of these arbitrary "sometimes it is jagged, sometimes it is flat" decisions, especially when it doesn't even make use of any RESP3 intrinsics (map, etc)

I made this modification to keep it in sync with ZRANGE.
Perhaps from the client's perspective, it may not be very good.
@oranagra WDYT?

It probably impacts the "raw" clients more than the opinionated ones - it makes another pain point for people changing from RESP2 to RESP3. Consistency is not an unreasonable reason. I don't know what the "right" answer is here.

on the plus side: it reminded me that I should be using my own inbuilt helper types to make my own life easier... StackExchange/StackExchange.Redis@37cf357

I understand the discussion here is about using a completely different "layout" in RESP2 vs RESP3, and not the fact that it's a different layout when a WITHVALUES argument is passed (that's already well embedded in many other commands).

I understand that that's how it's in ZRANGE, and i don't recall why it was like that, but i assume we tried to "fix" (improve) WITHSCORES, and did that only for RESP3 to avoid breaking RESP2, which was probably a mistake.
and i think that in the case of the new command we can use a nested array in both RESP3 and RESP2.

@mgravell changed to use nested all the time when with WITHVALUES argument.
d740cce

noted, thanks

Co-authored-by: Marc Gravell <[email protected]>

mgravell · 2026-05-13T16:48:03Z

+#define AROP_MAX   3  /* Maximum numeric element in range. */
+#define AROP_AND   4  /* Bitwise AND of integer elements in range. */
+#define AROP_OR    5  /* Bitwise OR of integer elements in range. */
+#define AROP_XOR   6  /* Bitwise XOR of integer elements in range. */


Xor is a weird one; is there a genuine use-case for this, or is it here "because we could"? In particular, in the context of sparse arrays, are nil values skipped? Or treated as zero? The latter is presumably inconsistent with how And behaves; the former means you need to know how many values were skipped to understand the result. But I question whether this even has a valid use case.

Xor is a weird one; is there a genuine use-case for this, or is it here "because we could"? In particular, in the context of sparse arrays, are nil values skipped? Or treated as zero?

nil values are skipped in all aggregation modes.

mgravell · 2026-05-13T16:56:28Z

More on AROP / ARGREP: I guess I'm worried that a: there's a lot of overlap, and b: there's also a lot of gap. AROP is sometimes an open aggregate, sometimes a filter. ARGREP is a text filter with MATCH and others; AROP can only do MATCH, and in MATCH mode only does count.

For gap: for examples like sensor data, range filters are far more useful than MATCH filters (especially for continuous/FP data).

Suggestion (probably not for RC1):

ARQUERY and ARAGGREGATE / ARAGG instead
both take all the MATCH/GREP/etc things as optional predicates, plus AND/OR, and new RANGE (value, not index) predicate (using the usual open/closed syntax), presumably sharing a predicate engine - maybe also LEXRANGE
"query" fetches either indices or WITHVALUES pairs; run the predicate engine and give the rows for matches
"aggregate" requires at least one (possibly multiple?) aggregate options; run the predicate engine and do the aggregation for matches

Fixes: 1. After #15096, we pass -flto to jemalloc. On Azure Linux, the resulting jemalloc library cannot be handled at link time and the build fails. Adding -ffat-lto-objects so the compiler also emits regular object code that the linker can fall back to when it cannot handle the LTO-compiled library. 2. Fixed a warning about `path` being NULL in `moduleLoadInternalModules()`. 3. Fixed compile warnings on older GCC versions introduced by #15162 (reported on Ubuntu 20.04) Co-authored-by: debing.sun <[email protected]>

Implement the new Redis array type

ead90ea

cursor Bot reviewed May 4, 2026

View reviewed changes

Comment thread deps/tre/lib/tre-compile.c Outdated

simonw mentioned this pull request May 4, 2026

Add redis-array.html: in-browser playground for Redis Array (PR #15162) simonw/tools#277

Merged

ShubhamTaple reviewed May 5, 2026

View reviewed changes

Comment thread src/rdb.c Outdated

This comment was marked as off-topic.

Sign in to view

ShooterIT reviewed May 7, 2026

View reviewed changes

sundb reviewed May 7, 2026

View reviewed changes

Comment thread tests/unit/type/array.tcl Outdated

Comment thread src/networking.c Outdated

Comment thread src/object.h Outdated

Comment thread deps/Makefile Outdated

sundb reviewed May 7, 2026

View reviewed changes

sundb added release-notes indication that this issue needs to be mentioned in the release notes state:needs-doc-pr requires a PR to redis-doc repository labels May 7, 2026

sundb added this to Redis 8.8 May 7, 2026

github-project-automation Bot moved this to Todo in Redis 8.8 May 7, 2026

minchopaskal reviewed May 7, 2026

View reviewed changes

Comment thread src/sparsearray.c Outdated

Don't pass ASAN flags to deps, do it in another PR

216a7e8

cursor Bot reviewed May 7, 2026

View reviewed changes

Comment thread deps/tre/lib/tre-compile.c

Update tests/unit/type/array.tcl

2b53e17

cursor Bot reviewed May 8, 2026

View reviewed changes

Comment thread deps/tre/lib/tre-filter.c

Update src/rdb.c

5439334

Co-authored-by: Shubham S Taple <[email protected]>

cursor Bot reviewed May 8, 2026

View reviewed changes

Comment thread deps/tre/lib/tre-internal.h Outdated

Pick from laurikari/tre@13d597c (Issue laurikari/tre#135)

7b5e24b

sundb mentioned this pull request May 8, 2026

-Wuse-after-free build warning laurikari/tre#137

Closed

sundb added 2 commits May 8, 2026 10:56

Update src/object.h

61e845d

Update src/networking.c

ad1204a

cursor Bot reviewed May 8, 2026

View reviewed changes

Comment thread deps/tre/lib/regcomp.c

Reply error if pattern is empty

ae2c6e1

sundb mentioned this pull request May 8, 2026

tre_tnfa_run_parallel is declared identically twice in tre-internal.h laurikari/tre#138

Closed

sundb and others added 2 commits May 9, 2026 15:51

Use z*_usable family to calculate the usable size of ar

686e024

Co-authored-by: Yuan Wang <[email protected]>

Convert line comment to block comment

74d1fcd

Co-authored-by: Yuan Wang <[email protected]>

cursor Bot reviewed May 9, 2026

View reviewed changes

sundb and others added 3 commits May 10, 2026 21:05

Disable the use-after-free warning for hash_table_del()

a24cad8

Merge branch 'unstable' into array

bb47f9e

Add dismissArrayObject()

3d5a104

Co-authored-by: Yuan Wang <[email protected]>

mgravell mentioned this pull request May 12, 2026

Draft support for 8.8 arrays StackExchange/StackExchange.Redis#3076

Merged

sundb reviewed May 12, 2026

View reviewed changes

vladvildanov reviewed May 13, 2026

View reviewed changes

Comment thread src/commands/arscan.json Outdated

Add comment for AROP_* flags

bf55508

mgravell reviewed May 13, 2026

View reviewed changes

Reply ARSCAN/ARGREP WITHVALUES as nested pairs in RESP3

e73a465

mgravell reviewed May 13, 2026

View reviewed changes

Always use nested for WITHVALUES

d740cce

Co-authored-by: Marc Gravell <[email protected]>

sundb approved these changes May 13, 2026

View reviewed changes

mgravell reviewed May 13, 2026

View reviewed changes

sundb merged commit 0d95764 into redis:unstable May 13, 2026
19 checks passed

github-project-automation Bot moved this from Todo to Done in Redis 8.8 May 13, 2026

This was referenced May 14, 2026

AROP : reconsider integer cast in bitwise ops #15208

Open

AROP and ARGREP - both overlapping and incomplete #15211

Open

a-TODO-rov mentioned this pull request May 15, 2026

Support Redis Array data type (Redis 8.8) redis/lettuce#3749

Closed

mrniko mentioned this pull request May 15, 2026

Add RArray object redisson/redisson#7142

Closed

cxljs mentioned this pull request May 16, 2026

feat: impl array data structure redis/go-redis#3813

Merged

tezc mentioned this pull request May 18, 2026

Fix compile on some linux distros #15225

Merged

matteocervelli mentioned this pull request May 20, 2026

[Module5|13] [P2] feat: Redis shared agent memory demo with Array type matteocervelli/llm-ops-v1#43

Closed

aviggiano mentioned this pull request Jun 10, 2026

Roaring native bitmap: plan review - changes for the PR breakdown and design gist aviggiano/redis#7

Closed

24 tasks

Uh oh!

Conversation

antirez commented May 4, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Redis Array

Internal encoding

Use cases

TLDR

Uh oh!

jit-ci Bot commented May 4, 2026

Uh oh!

augmentcode Bot commented May 4, 2026

Uh oh!

Uh oh!

ad-m commented May 4, 2026

Uh oh!

salvatorecorvaglia commented May 5, 2026

Uh oh!

Uh oh!

This comment was marked as off-topic.

Uh oh!

ShooterIT left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 9, 2026

Choose a reason for hiding this comment

First tre_add_tag_right error silently discarded

Uh oh!

cursor Bot May 9, 2026

Choose a reason for hiding this comment

Allocated saved_states array is never used

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mgravell commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mgravell May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mgravell May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

antirez commented May 4, 2026 •

edited by cursor Bot

Loading

ShooterIT left a comment •

edited

Loading

First `tre_add_tag_right` error silently discarded

Allocated `saved_states` array is never used

mgravell commented May 12, 2026 •

edited

Loading

mgravell May 13, 2026 •

edited

Loading

mgravell May 13, 2026 •

edited

Loading

sundb commented May 13, 2026 •

edited

Loading

oranagra May 13, 2026 •

edited

Loading

mgravell commented May 13, 2026 •

edited

Loading