Add child shards filtering by radoslawcybulski · Pull Request #28189 · scylladb/scylladb

radoslawcybulski · 2026-01-15T21:13:04Z

Add a CHILD_SHARDS filter to DescribeStream command.
The user needs to pass a parent stream shard id, Streams will then fetch cdc generations, find children and return them to the caller. By children we understand any stream shard, that have at least one token common with parent stream shard. There might be any number of children, but must be at least one.

Add unit tests.

Fixes: #25160
Fixes SCYLLADB-532

nyh · 2026-01-15T23:34:16Z

Does this "Fixes" #25160?
I think it should, but if it doesn't, please explain why.

Please write a test for this in test/alternator, not just in test/boost. This is not just petty preference of programming language: The tests in test/alternator is what allows us to run to check you really did the API correctly. Did you handle the "ShardFilter" option correctly in the same way as DynamoDB does? Do you handle the correct, and incorrect Type parameter? Is its value case-sensitive or not? What happens if you give this filter a bad shardid? A good sharid?

Looking at your code, it seems like you handled all these edge cases which I just mentioned - but how did you know that you handled it correctly, i.e., same as DynamoDB? How will we know a year from now as more people change this code, that something in this API's support doesn't regress?

scylladbbot · 2026-01-16T09:48:35Z

🟢 CI State: SUCCESS

Mode	Stage	Status	Queue Time	Run Time	Node
	Framework test	✅			spider9.cloudius-systems.com
dev	Build	✅	7h 47m	12m	spider7.cloudius-systems.com
	test.py	✅		2h 34m	spider7.cloudius-systems.com
	Unit Tests Custom 🔹 test/alternator/test_streams.py 🔹 test/boost/alternator_unit_test.cc	✅			spider7.cloudius-systems.com
release	Build	✅	1m	14m	spider12.cloudius-systems.com
	test.py	✅		1h 43m	spider12.cloudius-systems.com
	Unit Tests Custom 🔹 test/alternator/test_streams.py 🔹 test/boost/alternator_unit_test.cc	✅			spider12.cloudius-systems.com
debug	Build	✅	7h 38m	16m	spider5.cloudius-systems.com
	test.py	✅		4h 12m	spider5.cloudius-systems.com
	Unit Tests Custom 🔹 test/alternator/test_streams.py 🔹 test/boost/alternator_unit_test.cc	✅			spider5.cloudius-systems.com
dtest	dtest with tablets	✅
	dtest with consistent topology changes	✅

Build Details:

Total Queue: 15h 26m
Total Run: 10h 51m

scylladbbot · 2026-01-26T16:44:20Z

🟢 CI State: SUCCESS

Mode	Stage	Status	Queue Time	Run Time	Node
	Framework test	✅			spider9.cloudius-systems.com
dev	Build	✅	1h 9m	14m	spider2.cloudius-systems.com
	test.py	✅		1h 39m	spider2.cloudius-systems.com
	Unit Tests Custom 🔹 test/alternator/test_streams.py 🔹 test/boost/alternator_unit_test.cc	✅			spider2.cloudius-systems.com
release	Build	✅	15m	18m	spider4.cloudius-systems.com
	test.py	✅		55m	spider4.cloudius-systems.com
	Unit Tests Custom 🔹 test/alternator/test_streams.py 🔹 test/boost/alternator_unit_test.cc	✅			spider4.cloudius-systems.com
debug	Build	✅	18m	20m	spider1.cloudius-systems.com
	test.py	✅		3h 9m	spider1.cloudius-systems.com
	Unit Tests Custom 🔹 test/alternator/test_streams.py 🔹 test/boost/alternator_unit_test.cc	✅			spider1.cloudius-systems.com
dtest	dtest with tablets	✅
	dtest with consistent topology changes	✅

Build Details:

Total Queue: 1h 42m
Total Run: 8h 20m

radoslawcybulski · 2026-02-05T10:18:50Z

Updated patch: added tests.

scylladbbot · 2026-02-05T14:43:20Z

🟢 CI State: SUCCESS

Mode	Stage	Status	Queue Time	Run Time	Node
	Framework test	✅			spider9.cloudius-systems.com
dev	Build	✅	1h 5m	14m	spider6.cloudius-systems.com
	test.py	✅		1h 31m	spider6.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type	✅	0m	2m	spider6.cloudius-systems.com
release	Build	✅	57m	16m	spider15.cloudius-systems.com
	test.py	✅		46m	spider15.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type	✅	0m	2m	spider15.cloudius-systems.com
debug	Build	✅	5m	17m	spider14.cloudius-systems.com
	test.py	✅		2h 39m	spider14.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type	✅	0m	4m	spider14.cloudius-systems.com
dtest	dtest with tablets	✅
	dtest with consistent topology changes	✅

Build Details:

Total Queue: 2h 7m
Total Run: 4h 24m

github-actions · 2026-02-05T14:50:26Z

@radoslawcybulski, this PR has merge conflicts with the base branch. Please resolve the conflicts so we can merge it.

nyh

If I pretend I understand what's going on in this patch, I think this patch is probably correct and could be merged, but if you'll look at my comments you'll see that I didn't understand it. See my comments for a few more concrete things I had to say. But I'll probably end up merging this patch before I fully understand it.

Something which bothers me is that this patch doesn't just read the CHILD_SHARDS parameter and ignores it - it really has a lot of code handling the actual CHILD_SHARDS. But then, we don't actually get a test for this. I'm assuming (and you also had a comment about it) that your plan is to later add tablets support and create a test in test/cluster which adds tablets and checks the child shards feature in earnest. Maybe you already wrote such a test?

nyh · 2026-02-05T15:04:07Z

 static constexpr auto dynamodb_streams_max_window = 24h;

+// for parent-child stuff we need id:s to be sorted by token
+// (see explanation above) since we want to find closest


I'm having a hard time understanding the theory here (it was @elcallio, not me, who wrote this code originally), but I think I got it - please let me know if I did:

DynamoDB's notion of "parent" shard (shard is the CDC stream in DynamoDB jargon) has to do only with splits: One shard is split into two (or more). If we assume that are shards are tablets (I don't know why this code does anything relevant with vnodes...) and can only split, then indeed the "parent" shard is the one with the closest start token from (inclusive) the bottom to the new shard's start token.

Anyway, I see that this code here isn't actually new code, it's @elcallio's original code which you moved around. I'll have to apply some "suspension of disbelief" and assume most of this code is correct and look only for really suspicious things.

This is rather extensively tested in alternator_unit_test, including brute force checking all possible combinations for some fixed amount of parents and children.

If i understand DynamoDB correctly, you don't have to have a split to get new shards and parent, you might get new generation just after some time.

Child relationship is to make you drain parent completely before starting to process child. To do this you've a guarantee that if children exists, no more entries will be added to parent. We need this relationship for vnodes (here it serves the same purpose), the implementation is slightly different because vnodes token space wraps around, while tablets don't.

DynamoDB does not, as far as I know, have "generations". They don't get new shards if there isn't a split. At least as far as I know.

I'm not asking why the notion of "Child" exists in DynamoDB - what I'm doubting is why what we did in Alternator - especially for vnodes - makes any sense there, where "generations" can, at least in theory, just mix up all the vnodes and there is not necessarily a single parent (whom you need to drain first) to one new "shard".

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html#Streams.Processing

Because shards have a lineage (parent and children), an application must always process a parent shard before it processes a child shard.

yes, generations is our concept, but it's just our implementation.

They also mention:

Shards are ephemeral: They are created and deleted automatically, as needed.

Generations can mix shards as they wish, but they have to keep lineage correct (token space wise - so for stream shard A "children" stream shards B..Z that cover the same token space must have parent-children relationship set), otherwise it's not going to work (order requirement on reading events will be broken). We're actually relaying on Streams clients using CHILD_SHARDS filtering, as parent relationship is not enough.

nyh · 2026-02-05T15:16:17Z

+// #7409 - shards must be returned in lexicographical order,
+// normal bytes compare is string_traits<int8_t>::compare.
+// thus bytes 0x8000 is less than 0x0000. By doing unsigned
+// compare instead we inadvertently will sort in string lexical.


This comment (which I know, was preexisting, so it's not your fault) looks self-constricting - if "by doing unsigned compare we inadvertently ...", then why do it? Maybe the comment meant to say "signed compare"?

My english isn't that good, to me this feels not precise, but decently correct. Updated comment.

nyh · 2026-02-05T15:17:58Z

+    , lo2(items.end())
+    , end2(items.end())
+{
+    assert(end1 <= lo2);


Did you deliberately decide to use assert() instead of SCYLLA_ASSERT() to make this compile out in release mode?
I don't love this, but it's indeed safer than SCYLLA_ASSERT(). The throwing_assert() I'm now proposing would be better than both.

By the way, I'm curious why you decided to assert end1 <= lo2, and not other possible assertions like lo1 <= end1 and lo2 <= end2. Or is it because wrap-around is possible?

This is developing leftover - i'm used to use assert and it breaks, when running under gdb. The algorithm is messy (even my implementation, which is like 4th version and the best i could come up so far), the assert here checks, that after wrap around two ranges don't overlap (the upper layer should merge them together then).

nyh · 2026-02-05T15:42:33Z

+// then we will find previous shard in parent stream - that will determine range
+// then based on the range we will find children shards in current_streams
+// NOTE: function sorts / reorders current_streams
+// NOTE: function assumes parent_streams is sorted by token_cmp and it doesn't modify it


These two notes are very strange. Why make different assumptions on the two lists?
If you call this function several times, it re-sorts the already sorted current list? Why does this make sense?

That's why the comment is there.

This is the original behaviour (where we sorted parents by signed comparison to be able to find a parent for our shard and we sorted children by unsigned, because that's how we want them to be returned). It is just moved to a function.

You could mess around with comparison function and remove sorting i think. Not sure it's worth it.

nyh · 2026-02-05T15:56:44Z

        wait_for_active_stream(dynamodbstreams, table)

+# NOTE: it's hard to add child stream shards on vnodes,
+# filtering with children will be added in streams for tablets patch


I don't know what this comment explains. Better to explain what the test that you did write does.

Added some explanation.

You didn't explain... This "NOTE" explains what was hard to do and what you didn't test yet. But what is this test even about?

Here is what I would have used:

# This test is the most basic check of DescribeStream with ShardFilter set to CHILD_SHARDS. Since there is no cluster changes in this single-node test, there are no shards being closed and child shards being born, so we expect the list of child shards is expected to be empty.

All the other apologetics - why this test isn't more sophisticated - can come afterwards (if at all).

Done. Rewrote the comment on test_stream_shard_filtering_simple_no_children to explain: this tests the CHILD_SHARDS filter in a simple case where there's only one generation, so asking for child shards returns nothing. Commit 0020137.

nyh · 2026-02-05T15:57:42Z

+def test_stream_shard_filtering_simple_no_children(dynamodb, dynamodbstreams):
+    with create_stream_test_table(dynamodb, StreamViewType='KEYS_ONLY') as table:
+        streams = dynamodbstreams.list_streams(TableName=table.name)
+        arn = streams['Streams'][0]['StreamArn']


I thought you didn't like to get the ARN from ListStreams, and prefered to use DescribeTable to do it :-)

I copied from the test above. I tried slightly to find out property name to get arn from table (which should be there), but failed, so copied.

nyh · 2026-02-05T16:00:56Z

+        assert not shards # no children
+
+def test_stream_shard_filtering_wrong_type(dynamodb, dynamodbstreams):
+    with create_stream_test_table(dynamodb, StreamViewType='KEYS_ONLY') as table:


You have here multiple tests which create a table and a stream and then just run a silly do-nothing (even failing) command against it. Would have been less wasteful to create a module-scope fixture with this test table, and then the individual tests just use it.

Unfortunately this test file doesn't have such a fixture yet so you'll need to add it.

You need to explain to me what exactly module-scope fixture does? It is created once per module (rather than once per test)?

Yes. "module" is just a test file, and a module-scoped fixture is created when the first test needs it, and destroyed when the test file ends. So if 5 tests use the same fixture, it is only created once.
Here is some untested example code:

# Note: See comments above why DynamoDB has a bug (?) in LATEST making it difficult to use the # same test-table fixture for multiple tests that do actual CDC data checks. But it should be fine on # both Alternator and DynamoDB for checking error handling and such @pytest.fixture(scope="module") def table1(dynamodb, dynamodbstreams): with create_stream_test_table(dynamodb, StreamViewType='KEYS_ONLY') as table: yield table

github-actions · 2026-02-10T23:55:28Z

@radoslawcybulski, this PR has merge conflicts with the base branch. Please resolve the conflicts so we can merge it.

ScyllaPiotr · 2026-02-18T15:55:41Z

I'll probably end up merging this patch before I fully understand it.

Something which bothers me is that this patch doesn't just read the CHILD_SHARDS parameter and ignores it - it really has a lot of code handling the actual CHILD_SHARDS. But then, we don't actually get a test for this.

@nyh I think we should target for the code maintainer to merge the code if and only if either a code or an associated test set is fully understood and accepted. Do you agree?

radoslawcybulski · 2026-04-13T07:01:42Z

@nyh will you merge or should i ask @scylladb/scylla-maint for it?

nyh · 2026-04-13T12:45:04Z

@nyh will you merge or should i ask @scylladb/scylla-maint for it?

I can (and so can any other maintainer), but we're, yet again, in a merge freeze :-(

Cherry-pick of f96c1e9 from radoslawcybulski/add-child-shards-filtering (PR scylladb#28189) with conflict resolution to integrate CHILD_SHARDS shard filter support into the tablet stream branch. Add a `CHILD_SHARDS` filter to `DescribeStream` command. When used, user needs to pass a parent stream shard id as json's ShardFilter.ShardId field. DescribeStream will then return only the list of stream shards that are direct descendants of passed parent stream shard. Each stream shard covers a consecutive part of token space. A stream shard Q is considered to be a child of stream shard W, when at least one token belongs to token spaces from both streams. The filtering algorithm itself is somewhat complicated - more details in comments in streams.cc. CHILD_SHARDS is an Amazon's functionality and is required by KCL. Add unit tests. Conflict resolution: kept the tablet/vnode branching for populating topologies from HEAD, added ShardFilter parsing after it. Removed duplicate replica/database.hh include introduced by the merge. Fixes the following CI test failures caused by missing CHILD_SHARDS support on this branch: - test_parent_children_merge - test_parent_children_split - test_parent_filtering Fixes: scylladb#25160 Fixes SCYLLADB-532

nyh · 2026-04-14T12:49:03Z

I came here to merge this PR, but see you pushed a new version and its CI hasn't finished yet. What did you change in the latest version?

scylladbbot · 2026-04-14T16:49:06Z

🔴 CI State: FAILURE

Mode	Stage	Status	Queue Time	Run Time	Node
	Checkout	✅		2m	spider9.cloudius-systems.com
	Framework test	✅		0m	spider9.cloudius-systems.com
	Build	✅	23m	21m	spider3.cloudius-systems.com
dev	test.py	✅		16m	spider3.cloudius-systems.com
	Unit Tests Custom 🔹 test_get_records_too_high_limit 🔹 test_stream_arn_unchanging 🔹 test_stream_list_tables 🔹 test_stream_table_name_length_192_create 🔹 test_stream_table_name_length_192_update 🔹 test_stream_table_name_length_222_create 🔹 test_stream_table_name_length_222_update 🔹 test/boost/alternator_unit_test.cc	✅	0m	4m	spider3.cloudius-systems.com
release	test.py	✅		31m	spider3.cloudius-systems.com
	Unit Tests Custom 🔹 test_get_records_too_high_limit 🔹 test_stream_arn_unchanging 🔹 test_stream_list_tables 🔹 test_stream_table_name_length_192_create 🔹 test_stream_table_name_length_192_update 🔹 test_stream_table_name_length_222_create 🔹 test_stream_table_name_length_222_update 🔹 test/boost/alternator_unit_test.cc	✅	0m	7m	spider3.cloudius-systems.com
debug	test.py	❌		2h 51m	spider3.cloudius-systems.com
dtest	dtest with consistent topology changes	✅		48m
	dtest with tablets	✅		56m

Failed Tests (1/76271):

test_sstables_incrementally_released_during_streaming.debug.2 🔍 🎫 📂

📄 Elasticsearch analysis for test_sstables_incrementally_released_during_streaming

Period	CI Failures / Total	CI Pass Rate (%)	Next Failures / Total	Next Pass Rate (%)
Week (7d)	1/1066	99.91	0/688	100.0
Month (30d)	2/5009	99.96	0/2664	100.0
Quarter (90d)	7/8837	99.92	0/4152	100.0

Recent 10 Failures

Test Name	Arch	Mode	Jenkins Node	Jenkins Job	Ref	Timestamp	Days Ago
test_sstables_incrementally_released_during_streaming.debug.2	x86_64	debug	spider3.cloudius-systems.com	scylla-master/scylla-ci	PR#28189	2026-04-14 13:56	today
test_sstables_incrementally_released_during_streaming.dev.1	x86_64	dev	spider4.cloudius-systems.com	scylla-master/scylla-ci	PR#29171	2026-03-24 08:01	21
test_sstables_incrementally_released_during_streaming.debug.3	x86_64	debug	spider12.cloudius-systems.com	scylla-master/scylla-ci	PR#28520	2026-03-05 11:41	40
test_sstables_incrementally_released_during_streaming.debug.3	x86_64	debug	spider14.cloudius-systems.com	scylla-master/scylla-ci	PR#28338	2026-02-27 14:12	46
test_sstables_incrementally_released_during_streaming.debug.3	x86_64	debug	spider14.cloudius-systems.com	scylla-master/scylla-ci	PR#28761	2026-02-27 00:53	46
test_sstables_incrementally_released_during_streaming.debug.3	x86_64	debug	spider5.cloudius-systems.com	scylla-master/scylla-ci	PR#28338	2026-02-27 00:19	46
test_sstables_incrementally_released_during_streaming.dev.1	x86_64	dev	spider17.cloudius-systems.com	scylla-master/scylla-ci	PR#25998	2026-02-23 11:58	50

Build Details:

Total Queue: 23m
Total Run: 4h 36m

Note: To re-trigger CI for this PR, comment: @scylladbbot trigger-ci

swasik · 2026-04-14T17:40:17Z

I came here to merge this PR, but see you pushed a new version and its CI hasn't finished yet. What did you change in the latest version?

License version to 1.1

scylladbbot · 2026-04-14T21:51:37Z

🟢 CI State: SUCCESS

Mode	Stage	Status	Queue Time	Run Time	Node
	Checkout	✅		2m	spider9.cloudius-systems.com
	Framework test	✅		0m	spider9.cloudius-systems.com
	Build	✅	41m	19m	spider17.cloudius-systems.com
dev	test.py	✅		13m	spider17.cloudius-systems.com
	Unit Tests Custom 🔹 test_get_records_too_high_limit 🔹 test_stream_arn_unchanging 🔹 test_stream_list_tables 🔹 test_stream_table_name_length_192_create 🔹 test_stream_table_name_length_192_update 🔹 test_stream_table_name_length_222_create 🔹 test_stream_table_name_length_222_update 🔹 test/boost/alternator_unit_test.cc	✅	0m	3m	spider17.cloudius-systems.com
release	test.py	✅		23m	spider17.cloudius-systems.com
	Unit Tests Custom 🔹 test_get_records_too_high_limit 🔹 test_stream_arn_unchanging 🔹 test_stream_list_tables 🔹 test_stream_table_name_length_192_create 🔹 test_stream_table_name_length_192_update 🔹 test_stream_table_name_length_222_create 🔹 test_stream_table_name_length_222_update 🔹 test/boost/alternator_unit_test.cc	✅	0m	5m	spider17.cloudius-systems.com
debug	test.py	✅		2h 11m	spider17.cloudius-systems.com
	Unit Tests Custom 🔹 test_get_records_too_high_limit 🔹 test_stream_arn_unchanging 🔹 test_stream_list_tables 🔹 test_stream_table_name_length_192_create 🔹 test_stream_table_name_length_192_update 🔹 test_stream_table_name_length_222_create 🔹 test_stream_table_name_length_222_update 🔹 test/boost/alternator_unit_test.cc	✅	0m	12m	spider17.cloudius-systems.com
dtest	dtest with tablets	✅		1h 19m
	dtest with consistent topology changes	✅		1h 27m

Build Details:

Total Queue: 41m
Total Run: 4h 9m

nyh · 2026-04-15T06:20:29Z

Now there's a merge conflict :-( @radoslawcybulski please rebase.

radoslawcybulski · 2026-04-15T07:13:02Z

@nyh rebased.
@swasik updated.

scylladbbot · 2026-04-15T12:12:20Z

🟢 CI State: SUCCESS

Mode	Stage	Status	Queue Time	Run Time	Node
	Checkout	✅		2m	spider9.cloudius-systems.com
	Framework test	✅		0m	spider9.cloudius-systems.com
	Build	✅	10m	43m	spider6.cloudius-systems.com
dev	test.py	✅		15m	spider6.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	4m	spider6.cloudius-systems.com
release	test.py	✅		31m	spider6.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	6m	spider6.cloudius-systems.com
debug	test.py	✅		2h 45m	spider6.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	16m	spider6.cloudius-systems.com
dtest	dtest with consistent topology changes	✅		1h 33m
	dtest with tablets	✅		1h 38m

Build Details:

Total Queue: 10m
Total Run: 4h 53m

nyh · 2026-04-15T19:23:13Z

Unfortunately I can't merge because this patch has a conflict with another patch already merged into next, the conflict is in alternator/streams.cc (maybe @ScyllaPiotr's auditing patch?).

You'll see the conflict in github when next is promoted, or if you want to see it sooner you can try rebasing your patch on next and see the conflict.

Unfortunately, this is what happens when you have a very long merge freeze just because a branch cutoff date, and everyone is trying to merge stuff at the same time...

radoslawcybulski · 2026-04-15T19:32:48Z

@nyh i see, the fix is rather obvious (the conflict is unlucky), is there anything that can be done or do i need to wait for next to master promotion?

scylladbbot · 2026-04-16T02:32:14Z

🔴 CI State: FAILURE

Mode	Stage	Status	Queue Time	Run Time	Node
	Checkout	✅		2m	spider9.cloudius-systems.com
	Framework test	✅		0m	spider9.cloudius-systems.com
	Build	✅	2h 23m	47m	spider1.cloudius-systems.com
dev	test.py	✅		16m	spider1.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	4m	spider1.cloudius-systems.com
release	test.py	✅		31m	spider1.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	6m	spider1.cloudius-systems.com
debug	test.py	❌		2h 51m	spider1.cloudius-systems.com
dtest	dtest with tablets	✅		2h 2m
	dtest with consistent topology changes	✅		2h 33m

Failed Tests (2/77896):

📄 Elasticsearch analysis for test_audit_table_auth_multinode

Period	CI Failures / Total	CI Pass Rate (%)	Next Failures / Total	Next Pass Rate (%)
Week (7d)	1/1198	99.92	0/647	100.0
Month (30d)	1/4026	99.98	0/1781	100.0
Quarter (90d)	2/4046	99.95	0/1781	100.0

Recent 10 Failures

Test Name	Arch	Mode	Jenkins Node	Jenkins Job	Ref	Timestamp	Days Ago
test_audit_table_auth_multinode.debug.1	x86_64	debug	spider1.cloudius-systems.com	scylla-master/scylla-ci	PR#28189	2026-04-15 23:40	today
test_audit_table_auth_multinode.release.3	x86_64	release	spider1.cloudius-systems.com	scylla-master/scylla-ci	PR#28650	2026-03-04 17:11	42

Build Details:

Total Queue: 2h 23m
Total Run: 7h 1m

Note: To re-trigger CI for this PR, comment: @scylladbbot trigger-ci

Add a `CHILD_SHARDS` filter to `DescribeStream` command. When used, user need to pass a parent stream shard id as json's ShardFilter.ShardId field. DescribeStream will then return only list of stream shards, that are direct descendants of passed parent stream shard. Each stream shard cover a consecutive part of token space. A stream shard Q is considered to be a child of stream shard W, when at least one token belongs to token spaces from both streams. The filtering algorithm itself is somewhat complicated - more details in comments in streams.cc. CHILD_SHARDS is a Amazon's functionality and is required by KCL. Add unit tests. Fixes: scylladb#25160

ScyllaPiotr · 2026-04-16T11:40:29Z

test_audit_table_auth_multinode.debug.1 🔍 🎫 📂

https://scylladb.atlassian.net/browse/SCYLLADB-1589

scylladbbot · 2026-04-16T15:24:33Z

🟢 CI State: SUCCESS

Mode	Stage	Status	Queue Time	Run Time	Node
	Checkout	✅		2m	spider9.cloudius-systems.com
	Framework test	✅		0m	spider9.cloudius-systems.com
	Build	✅	3h 45m	38m	spider14.cloudius-systems.com
dev	test.py	✅		13m	spider14.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	3m	spider14.cloudius-systems.com
release	test.py	✅		22m	spider14.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	5m	spider14.cloudius-systems.com
debug	test.py	✅		2h 12m	spider14.cloudius-systems.com
	Unit Tests Custom 🔹 test_stream_shard_filtering_missing_shard_id 🔹 test_stream_shard_filtering_missing_type 🔹 test_stream_shard_filtering_simple_no_children 🔹 test_stream_shard_filtering_wrong_shard_id 🔹 test_stream_shard_filtering_wrong_type 🔹 test/boost/alternator_unit_test.cc	✅	0m	11m	spider14.cloudius-systems.com
dtest	dtest with tablets	✅		44m
	dtest with consistent topology changes	✅		50m

Build Details:

Total Queue: 3h 45m
Total Run: 7h 31m

nyh · 2026-04-16T15:33:39Z

@nyh i see, the fix is rather obvious (the conflict is unlucky), is there anything that can be done or do i need to wait for next to master promotion?

I don't see any magic solution. Whenever you do a conflict resolution, there is a chance of a mistake, so we need to run the CI on the result. So although you can play around with the conflict resolution in advance, eventally we'll need to run the CI on the real code, based on the real master, so anyway you needed to wait for that :-(

The real solution is just not to have so many conflicts... Not everyone should work at the same files at the same time, and if we do that - don't ask everyone to push their patches in a one-week window between a long merge freeze and a release date.

radoslawcybulski requested a review from nyh as a code owner January 15, 2026 21:13

radoslawcybulski self-assigned this Jan 15, 2026

scylladbbot added the status/ci_in_progress label Jan 15, 2026

scylladbbot removed the status/ci_in_progress label Jan 16, 2026

radoslawcybulski force-pushed the add-child-shards-filtering branch from d177b52 to e1208be Compare January 26, 2026 11:05

scylladbbot added status/ci_in_progress and removed status/ci_in_progress labels Jan 26, 2026

radoslawcybulski force-pushed the add-child-shards-filtering branch 2 times, most recently from 30b868f to cb66f1e Compare February 5, 2026 09:03

radoslawcybulski added area/alternator Alternator related Issues backport/none Backport is not required labels Feb 5, 2026

scylladbbot added the status/ci_in_progress label Feb 5, 2026

radoslawcybulski force-pushed the add-child-shards-filtering branch from cb66f1e to 37e4d9e Compare February 5, 2026 10:18

scylladbbot added status/ci_in_progress and removed status/ci_in_progress labels Feb 5, 2026

github-actions Bot added the conflicts label Feb 5, 2026

nyh reviewed Feb 5, 2026

View reviewed changes

radoslawcybulski force-pushed the add-child-shards-filtering branch from 37e4d9e to a3eef77 Compare February 5, 2026 20:30

radoslawcybulski requested a review from nyh February 11, 2026 08:30

radoslawcybulski force-pushed the add-child-shards-filtering branch 2 times, most recently from b00e446 to 36f4a77 Compare February 16, 2026 12:12

ScyllaPiotr mentioned this pull request Apr 13, 2026

alternator: add CHILD_SHARDS filtering radoslawcybulski/scylladb#4

Open

radoslawcybulski force-pushed the add-child-shards-filtering branch from f96c1e9 to 5e413e8 Compare April 14, 2026 12:12

scylladbbot added the status/ci_in_progress label Apr 14, 2026

scylladbbot removed the status/ci_in_progress label Apr 14, 2026

scylladbbot added the status/ci_in_progress label Apr 14, 2026

scylladbbot removed the status/ci_in_progress label Apr 14, 2026

radoslawcybulski force-pushed the add-child-shards-filtering branch from 5e413e8 to fa9b521 Compare April 15, 2026 07:12

scylladbbot added the status/ci_in_progress label Apr 15, 2026

scylladbbot removed the status/ci_in_progress label Apr 15, 2026

radoslawcybulski force-pushed the add-child-shards-filtering branch from fa9b521 to 0120d40 Compare April 15, 2026 19:30

scylladbbot added the status/ci_in_progress label Apr 15, 2026

scylladbbot removed the status/ci_in_progress label Apr 16, 2026

radoslawcybulski force-pushed the add-child-shards-filtering branch from 0120d40 to 203338c Compare April 16, 2026 07:52

Conversation

radoslawcybulski commented Jan 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nyh commented Jan 15, 2026

Uh oh!

scylladbbot commented Jan 16, 2026

🟢 CI State: SUCCESS

Build Details:

Uh oh!

scylladbbot commented Jan 26, 2026

🟢 CI State: SUCCESS

Build Details:

Uh oh!

radoslawcybulski commented Feb 5, 2026

Uh oh!

scylladbbot commented Feb 5, 2026

🟢 CI State: SUCCESS

Build Details:

Uh oh!

github-actions Bot commented Feb 5, 2026

Uh oh!

nyh left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

radoslawcybulski Feb 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Feb 10, 2026

Uh oh!

ScyllaPiotr commented Feb 18, 2026

Uh oh!

radoslawcybulski commented Apr 13, 2026

Uh oh!

radoslawcybulski commented Jan 15, 2026 •

edited

Loading

radoslawcybulski Feb 5, 2026 •

edited

Loading