view: revert cleanup filter that doesn't work with tablets#16670
scylladb-promoter merged 2 commits into scylladb:master
Conversation
🔴 CI State: FAILURE ❌ - Build Details:
The self-contained-header check for dht/partition_filter.hh failed. I'll need to fix it and push a new version.
🔴 CI State: FAILURE ❌ - Build Failed Tests (6/23683):
Build Details:
Known flaky test: #16420 :-(
🟢 CI State: SUCCESS ✅ - Build Details:
@bhalevy please review, as it is your code which I'm reverting here.
This PR title is confusing, since it does not fix the dtest code; rather, it fixes Scylla so the dtest will stop failing.
denesb left a comment:
Maybe we need a test which confirms that MV update generation indeed ignores data not owned by the current node.
    // have had other updates.
    std::move(sstables.begin(), sstables.end(), std::back_inserter(_sstables_with_tables[t]));
    // Sleep a bit, to avoid a tight loop repeatedly spamming the log with the same message.
    seastar::sleep(std::chrono::seconds(1)).get();
Maybe not in this PR, but we should make this an exponential back-off, with a ceiling of, say, 1 minute.
I don't know if it's important. Basically, if the generator fails after a long time (e.g., after working for 10 minutes), it doesn't matter if we sleep for 1 second or for 60 seconds. If it fails after a short time (in this case, it failed after a fraction of a millisecond), it again doesn't matter whether we sleep for 1 or 60 seconds - both are negligible performance-wise, and with both the log spam is at a manageable level.
    // have had other updates.
    std::move(sstables.begin(), sstables.end(), std::back_inserter(_sstables_with_tables[t]));
    // Sleep a bit, to avoid a tight loop repeatedly spamming the log with the same message.
    seastar::sleep(std::chrono::seconds(1)).get();
I think that @kostja wanted to standardize on 0.1 seconds (100ms) sleeps.
Regarding the log spam, shouldn't we use logger::rate_limit?
That said, if you think that 1s sleep is more appropriate for this use case, I'm good with that.
Where did he want to standardize on this?
In my response to Botond above, I tried to explain why I think it doesn't matter what the sleep is (within reason); I don't think it needs to be smart or even standardized. It's not something that a user will ever encounter (it's not a user-facing timeout or delay). The only purpose of this sleep is to reduce log spam - from a million lines per second to just 1 per second - and 0.1 or 10 lines per second would be just as acceptable.
    inject_failure("view_update_generator_consume_staging_sstable");
-   auto result = staging_sstable_reader.consume_in_thread(view_updating_consumer(*this, s, std::move(permit), *t, sstables, _as, staging_sstable_reader_handle),
-           dht::incremental_owned_ranges_checker::make_partition_filter(_db.get_keyspace_local_ranges(s->ks_name())));
+   auto result = staging_sstable_reader.consume_in_thread(view_updating_consumer(*this, s, std::move(permit), *t, sstables, _as, staging_sstable_reader_handle));
nit: s/with tables/with tablets/ in commit message.
We already have a bunch of dtests that check multi-node MV operations. The specific dtest that this PR fixes is one of these tests - it involves view building and concurrent range ownership movements (tablets moving around). To be honest, if I thought we had a problem, I would write a new test to confirm it, but in this case I don't think we have a problem: if a base replica does not own a range, it simply has no way to send the view update to anyone - if it's the Nth base replica it sends the update to the Nth view replica, but if it's not a base replica at all, it has no one to send anything to. So I don't think such a problem can even exist.
The "view update generator" is responsible for generating view updates for staging sstables (such as those coming from repair). If the processing fails, the code retries - immediately. If there is some persistent bug, such as issue scylladb#16598, we will have a tight loop of error messages, potentially a gigabyte of identical messages every second.

In this patch we simply add a sleep of one second after view update generation fails, before retrying. We can still get many identical error messages if there is some bug, but not more than one per second.

Refs scylladb#16598.

Signed-off-by: Nadav Har'El <[email protected]>
The title of the actual patch (the second patch) was exactly like you wanted: "view: revert cleanup filter that doesn't work with tablets". I just had a silly title for the merge commit. I'll change it.
This patch reverts commit 10f8f13 from November 2022. That commit added to the "view update generator" (the code which builds view updates for staging sstables) a filter that ignores ranges that do not belong to this node. However:

1. I believe this filter was never necessary, because the view update code already silently ignores base updates which do not belong to this replica (see get_view_natural_endpoint()). After all, the view update code needs to know that this replica is the Nth owner of the base update to send its update to the Nth view replica, but if no such N exists, no view update is sent.

2. The code introduced for that filter used a per-keyspace replication map, which was ok for vnodes but no longer works for tablets, and causes the operation using it to fail.

3. The filter was used every time the "view update generator" was used, regardless of whether any cleanup was necessary or not, so every such operation would fail with tablets.

So, for example, the dtest test_mvs_populating_from_existing_data fails with tablets:

* This test has view building in parallel with automatic tablet movement.
* Tablet movement is done by streaming.
* When streaming happens before view building has finished, the "view update generator" is run on the streamed sstables. This causes the problematic code to be called.

Before this patch, the dtest test_mvs_populating_from_existing_data fails when tablets are enabled. After this patch, it passes.

Fixes scylladb#16598

Signed-off-by: Nadav Har'El <[email protected]>
Rebased, and fixed a typo in one commit message and in the title of the pull request. I did not change the 1-second sleep in the first patch. This first patch wasn't even the main point of this PR - I just had a hard time debugging this issue because, before the fix in the second patch, I would get millions of log messages which completely overwhelmed test.py. So the first patch is just to make this manageable - a failure in view_update_generator will now be logged just once a second, not a million times a second.
🔴 CI State: FAILURE ❌ - Build Failed Tests (2/23733): Build Details:
This test didn't change in this PR. It seems to be flakiness in the topology test framework, in the concurrent server creation function. I'll retry the CI. You can see the failure in https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/5759/testReport/junit/(root)/non-boost%20tests/Tests___Unit_Tests___topology_experimental_raft_test_mv_tablets_debug_1/ :
🟢 CI State: SUCCESS ✅ - Build Details:
There are no known issues with this that haven't been fixed already. The failure report points to one of the servers failing to boot. Logs of this server: https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/5759/artifact/testlog/x86_64/debug/scylla-1410.log
Curious that it happened during repair. Node 127.54.45.48 (https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/5759/artifact/testlog/x86_64/debug/scylla-1409.log) didn't report anything interesting in its logs during this repair time.
I also recommend, in general, doing a shallow 5-minute investigation (like I did above) before concluding that this is a known issue --- there can be trillions of different reasons why bootstrapping a server might fail, and different root causes should be different issues IMO ("adding servers fails" is too generic). I haven't seen this
You're making a mountain out of a molehill. I was trying to babysit this PR, which contains a real fix which really is about work assigned to me, not to debug random unrelated issues in areas of the code I'm not familiar with. I actually did spend 5 minutes, found that add_servers failure, and "showed my work" above. This fit with what you told me about a known bug in add_servers (which I didn't know had already been fixed). The insistence of the topology test developers on not putting all the evidence of failures in a single file (which I insisted on when I wrote test/cql-pytest/run) meant I obviously missed some of the evidence in other files, which it seems you are more experienced at finding than I was. I don't think I should be executed for that :-)
Opened a new issue #16821 on the CI failure that we saw above (and @kbr-scylla analyzed).
The goal of this PR is to fix Scylla so that the dtest test_mvs_populating_from_existing_data, which starts to fail when tablets are enabled, will pass.
The main fix (the second patch) is reverting code which doesn't work with tablets, and I explain why I think this code was not necessary in the first place.
Fixes #16598