Skip to content

cql_server::connection: Process rebounce message in case of multiple shard migrations#20493

Merged
scylladb-promoter merged 8 commits intoscylladb:masterfrom
bitpathfinder:15465_fix_multiple_query_rebounce
Sep 20, 2024
Merged

cql_server::connection: Process rebounce message in case of multiple shard migrations#20493
scylladb-promoter merged 8 commits intoscylladb:masterfrom
bitpathfinder:15465_fix_multiple_query_rebounce

Conversation

@bitpathfinder
Copy link
Copy Markdown
Contributor

@bitpathfinder bitpathfinder commented Sep 9, 2024

During a query execution, the query can be re-bounced to another shard if the requested data is located there. Previous implementation assumed that the shard cannot be changed after first re-bounce, however with the introduction of Tablets, data could be migrated to another shard after the query was already re-bounced, causing a failure of the query execution. To avoid this issue, the query is re-bounced as needed until it is executed on the correct shard.

Fixes #15465

@bitpathfinder bitpathfinder added the backport/none Backport is not required label Sep 9, 2024
@avikivity
Copy link
Copy Markdown
Member

This PR inherits from #17309 "cql_server::connection: process: rebounce msg if needed" What was added in this PR (compared with PR 17309)

  • Changes are rebased with the latest master
  • A test reproducing the issue was added
  • Changed std::invoke_result_t concept to std::is_invocable_r

Please make the cover letter stand alone and not rely on the reader knowing anything about 17309.

@bitpathfinder bitpathfinder force-pushed the 15465_fix_multiple_query_rebounce branch from e91a1a2 to 7876a1f Compare September 9, 2024 10:02
@bitpathfinder
Copy link
Copy Markdown
Contributor Author

New version: Fixing a build error.

@bitpathfinder
Copy link
Copy Markdown
Contributor Author

This PR inherits from #17309 "cql_server::connection: process: rebounce msg if needed" What was added in this PR (compared with PR 17309)

  • Changes are rebased with the latest master
  • A test reproducing the issue was added
  • Changed std::invoke_result_t concept to std::is_invocable_r

Please make the cover letter stand alone and not rely on the reader knowing anything about 17309.

PR description was updated.

@bitpathfinder
Copy link
Copy Markdown
Contributor Author

@avikivity after discussing this item with @kbr-scylla, we think that we should like split the fix for message multi-rebounce and selection of the shard for executing the query (as per these comments in the base PR) to different issues since these are different problems. What's your opinion on that?

@scylladb-promoter
Copy link
Copy Markdown
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Unit Tests Custom
The following new/updated tests ran 100 times for each mode:
🔹 topology_custom/test_query_rebounce
✅ - Container Test
✅ - dtest with tablets
✅ - dtest
✅ - dtest with topology changes
✅ - Unit Tests

Build Details:

  • Duration: 3 hr 29 min
  • Builder: spider1.cloudius-systems.com

@avikivity
Copy link
Copy Markdown
Member

Fix the case when a query message is re-bounced several times (for example, with tablets migration).

7 commits, 1 sentence to explain.

Comment thread transport/server.hh Outdated
Comment thread transport/server.cc Outdated
[this, is = std::move(is), cs = cs.move_to_other_shard(), stream, permit = std::move(permit), process_fn,
gt = tracing::global_trace_state_ptr(std::move(trace_state)),
cached_vals = std::move(bounce_msg->take_cached_pk_function_calls()), dialect] (cql_server& server) {
cached_vals = std::move(cached_vals), dialect] (cql_server& server) mutable {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That mutable here is actually make things worse. cahced_vals should be copied, not moved since we copy between cpu here and move will move pointers instead. It accidentally worked as intended because of missing mutable, but in fact cached_vals should not be moved below.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I guess the limitation is because of the single-threaded custom allocator?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I guess the limitation is because of the single-threaded custom allocator?

Yes. We do have cross shard free (mostly because of exception which we do not control where they are allocated), but it is less efficient (it is a single consumer multiple producers queue with all the locking that it entails).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But when cached values only? Basically everything should be copied.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, now cached values are copied, but lambda is still mutable due to further move of cached values in the process_fn

Comment thread cql3/statements/modification_statement.cc Outdated
Comment thread cql3/statements/modification_statement.cc Outdated
Comment thread cql3/statements/modification_statement.cc
@bitpathfinder bitpathfinder force-pushed the 15465_fix_multiple_query_rebounce branch from 7876a1f to 4056777 Compare September 10, 2024 16:04
@bitpathfinder
Copy link
Copy Markdown
Contributor Author

New version (addressing review comments):

  • Removed std::atomic.
  • Removed redundant constexpr if..
  • Made test rebounce to different shards.
  • Removed std::move for cached values.

@scylladb-promoter
Copy link
Copy Markdown
Contributor

🔴 CI State: FAILURE

✅ - Build
✅ - Unit Tests Custom
The following new/updated tests ran 100 times for each mode:
🔹 topology_custom/test_query_rebounce
✅ - Container Test
❌ - dtest with tablets
✅ - dtest with topology changes
✅ - dtest
✅ - Unit Tests

Build Details:

  • Duration: 3 hr 8 min
  • Builder: spider3.cloudius-systems.com

Comment thread transport/server.cc Outdated
@bitpathfinder bitpathfinder self-assigned this Sep 11, 2024
@kbr-scylla
Copy link
Copy Markdown
Contributor

So it can be used for a template concept in the next patch.

Signed-off-by: Benny Halevy <[email protected]>
Quoting Avi Kivity:
> Out of scope: we should consider detemplating this.

As a follow-up we should consider that and pass
a function object as process_fn, just make sure
there are no drawbacks.

Signed-off-by: Benny Halevy <[email protected]>
So it can safely passed between shards, as will be needed
in the following patch that handles a (re)bounce_to_shard result
from process_fn that's called by `process_on_shard` on the
`move_to_shard`.

With that in mind, pass the `bounce_to_shard` payload
to `process_on_shard` rather than the foreign shared ptr
since the latter grabs what it needs from it on entry
and the shared_ptr can be released on the calling shard.

Signed-off-by: Benny Halevy <[email protected]>
It is currently unused in `process_on_shard`, which
generates an empty service_permit.

The next patch may call process_on_shard in a loop,
so it can't simply move the permit to the callee
and better hold on to it until processing completes.

`cql_server::connection::process` was turned into
a coroutine in this patch to hold on to the permit parameter
in a simple way. This is a preliminary step to changing
`if (bounce_msg)` to `while (bounce_msg)` that will allow
rebouncing the message in case it moved yet again when
yielding in `process_on_shard`.

Signed-off-by: Benny Halevy <[email protected]>
@bitpathfinder bitpathfinder force-pushed the 15465_fix_multiple_query_rebounce branch from 4056777 to 1f15f09 Compare September 12, 2024 09:34
@bitpathfinder
Copy link
Copy Markdown
Contributor Author

New version:

  • Rebased to the latest master.
  • Addressed review issues.

@scylladb-promoter
Copy link
Copy Markdown
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Container Test
✅ - dtest with topology changes
✅ - dtest
✅ - dtest with tablets

Build Details:

  • Duration: 1 hr 45 min
  • Builder: spider1.cloudius-systems.com

Comment thread transport/server.cc
Comment thread cql3/statements/modification_statement.cc
Comment thread cql3/statements/modification_statement.cc
bitpathfinder and others added 3 commits September 17, 2024 14:54
…shard

`cql_server::connection::process_on_shard` is made a co-routine to
make sure captured objects' lifetime is managed by the source shard,
avoiding error prone inter-shard objects transfers.
Rebounce the msg to another shard if needed,
e.g. in the case of tablet migration.

An example for that, as given by Tomasz Grabiec:
> Bouncing happens when executing LWT statement in
> modification_statement::execute_with_condition by returning a
> special result message kind. The code assumes that after
> jumping to the shard from the bounce request, the result
> message is the regular one and not yet another bounce.
> There is no problem with vnodes, because shards don't change.
> With tablets, they can change at run time on migration.

Fixes scylladb#15465

Signed-off-by: Benny Halevy <[email protected]>
The test emulates several LWT(Lightweight Transaction) query rebounces. Currently, the code
that processes queries does not expect that a query may be rebounced more than once.
It was impossible with the VNodes, but with intruduction of the Tablets, data can be moved
between shards by the balancer thus a query can be rebounced to different shards multiple times.
@scylladb-promoter
Copy link
Copy Markdown
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Unit Tests Custom
The following new/updated tests ran 100 times for each mode:
🔹 topology_custom/test_query_rebounce
✅ - Container Test
✅ - dtest
✅ - dtest with topology changes
✅ - dtest with tablets
✅ - Unit Tests

Build Details:

  • Duration: 6 hr 46 min
  • Builder: i-046e50f0de2689a35 (m5ad.8xlarge)

@bitpathfinder
Copy link
Copy Markdown
Contributor Author

New version:

  • fixed review issues.

@avikivity
Copy link
Copy Markdown
Member

New version:

  • fixed review issues.

In the future, itemize the changes. "fixed review issues" asks the reviewer to go back to the discussion and discover what the review issues were, esp. which are newer issues and which are older issues.

Comment thread transport/server.cc
@bitpathfinder
Copy link
Copy Markdown
Contributor Author

New version:

  • fixed review issues.

In the future, itemize the changes. "fixed review issues" asks the reviewer to go back to the discussion and discover what the review issues were, esp. which are newer issues and which are older issues.

I thought as I comment on each of review threads that it was fixed/addressed(or not), it would be a repetition to mention the same items in the summary.

@avikivity
Copy link
Copy Markdown
Member

New version:

  • fixed review issues.

In the future, itemize the changes. "fixed review issues" asks the reviewer to go back to the discussion and discover what the review issues were, esp. which are newer issues and which are older issues.

I thought as I comment on each of review threads that it was fixed/addressed(or not), it would be a repetition to mention the same items in the summary.

These get scattered around, they're not ordered in time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport/none Backport is not required promoted-to-master

Projects

None yet

Development

Successfully merging this pull request may close these issues.

tablets: Shard bouncing should not assume that shards don't change

7 participants