cql_server::connection: Process rebounce message in case of multiple shard migrations #20493
Conversation
Please make the cover letter stand alone and not rely on the reader knowing anything about 17309.
e91a1a2 to 7876a1f
New version: Fixing a build error.
PR description was updated.
@avikivity after discussing this item with @kbr-scylla, we think that we should split the fix for the message multi-rebounce and the selection of the shard for executing the query (as per these comments in the base PR) into separate issues, since these are different problems. What's your opinion on that?
🟢 CI State: SUCCESS ✅ - Build Details:
7 commits, 1 sentence to explain.
[this, is = std::move(is), cs = cs.move_to_other_shard(), stream, permit = std::move(permit), process_fn,
 gt = tracing::global_trace_state_ptr(std::move(trace_state)),
 cached_vals = std::move(bounce_msg->take_cached_pk_function_calls()), dialect] (cql_server& server) {
 cached_vals = std::move(cached_vals), dialect] (cql_server& server) mutable {
That mutable here actually makes things worse. cached_vals should be copied, not moved, since we copy between CPUs here and a move will move pointers instead. It accidentally worked as intended because of the missing mutable, but in fact cached_vals should not be moved below.
Good catch. I guess the limitation is because of the single-threaded custom allocator?
> Good catch. I guess the limitation is because of the single-threaded custom allocator?
Yes. We do have cross-shard free (mostly because of exceptions, which we do not control where they are allocated), but it is less efficient (it is a single-consumer, multiple-producer queue with all the locking that entails).
But why only the cached values? Basically everything should be copied.
Fixed; now the cached values are copied, but the lambda is still mutable due to a further move of the cached values in process_fn.
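To make the copy-vs-move point above concrete, here is a minimal sketch, assuming plain Seastar and simplified stand-in names (`cached_values`, `execute_on` are illustrative, not the actual cql_server code). Copying the cached values into each cross-shard submission keeps the caller's data intact for a possible further bounce, whereas moving would hand over pointers to memory owned by the source shard and leave the caller with a moved-from object:

```cpp
#include <seastar/core/future.hh>
#include <seastar/core/smp.hh>
#include <cstdint>
#include <vector>

// Illustrative stand-in for the cached PK function call results.
using cached_values = std::vector<std::vector<int8_t>>;

seastar::future<> execute_on(unsigned shard, const cached_values& vals) {
    // Capture a copy: the caller's `vals` stays valid on the calling shard,
    // so it can be submitted again if the statement bounces once more.
    return seastar::smp::submit_to(shard, [vals] {
        (void)vals;  // ... run the prepared statement using the copied values ...
        return seastar::make_ready_future<>();
    });
}
```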
7876a1f to 4056777
New version (addressing review comments):
🔴 CI State: FAILURE - Build Details:
So it can be used for a template concept in the next patch. Signed-off-by: Benny Halevy <[email protected]>
Quoting Avi Kivity: > Out of scope: we should consider detemplating this. As a follow-up we should consider that and pass a function object as process_fn, just make sure there are no drawbacks. Signed-off-by: Benny Halevy <[email protected]>
So it can be safely passed between shards, as will be needed in the following patch that handles a (re)bounce_to_shard result from process_fn that's called by `process_on_shard` on the `move_to_shard` target. With that in mind, pass the `bounce_to_shard` payload to `process_on_shard` rather than the foreign shared ptr, since `process_on_shard` grabs what it needs from it on entry and the shared_ptr can then be released on the calling shard. Signed-off-by: Benny Halevy <[email protected]>
It is currently unused in `process_on_shard`, which generates an empty service_permit. The next patch may call `process_on_shard` in a loop, so it can't simply move the permit to the callee and should instead hold on to it until processing completes. `cql_server::connection::process` was turned into a coroutine in this patch to hold on to the permit parameter in a simple way. This is a preliminary step to changing `if (bounce_msg)` to `while (bounce_msg)`, which will allow rebouncing the message in case it moved yet again while yielding in `process_on_shard`. Signed-off-by: Benny Halevy <[email protected]>
Signed-off-by: Benny Halevy <[email protected]>
4056777 to 1f15f09
New version:
🟢 CI State: SUCCESS ✅ - Build Details:
…shard `cql_server::connection::process_on_shard` is made a coroutine to make sure captured objects' lifetime is managed by the source shard, avoiding error-prone inter-shard object transfers.
Rebounce the msg to another shard if needed, e.g. in the case of tablet migration. An example of that, as given by Tomasz Grabiec: > Bouncing happens when executing an LWT statement in > modification_statement::execute_with_condition by returning a > special result message kind. The code assumes that after > jumping to the shard from the bounce request, the result > message is the regular one and not yet another bounce. > There is no problem with vnodes, because shards don't change. > With tablets, they can change at run time on migration. Fixes scylladb#15465 Signed-off-by: Benny Halevy <[email protected]>
The test emulates several LWT (Lightweight Transaction) query rebounces. Currently, the code that processes queries does not expect that a query may be rebounced more than once. That was impossible with vnodes, but with the introduction of tablets, data can be moved between shards by the balancer, so a query can be rebounced to different shards multiple times.
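A rough illustration of the emulation idea, with hypothetical names (`fake_result`, `bouncing_process_fn`) that are not the actual test code: a stub process function keeps answering with a bounce a few times before returning a regular result, which exercises the multi-rebounce path.

```cpp
// Hypothetical stand-ins for the test's stubs; not the real test code.
struct fake_result {
    bool is_bounce = false;   // true: "data lives on another shard, bounce there"
    unsigned target_shard = 0;
};

struct bouncing_process_fn {
    int remaining_bounces = 3;   // emulate three consecutive tablet migrations
    fake_result operator()() {
        if (remaining_bounces-- > 0) {
            // Still migrating: ask the caller to bounce again.
            return {true, static_cast<unsigned>(remaining_bounces % 2)};
        }
        // Finally settled: return a regular (non-bounce) result.
        return {false, 0};
    }
};
```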
1f15f09 to 68740f5
🟢 CI State: SUCCESS ✅ - Build Details:
New version:
In the future, itemize the changes. "fixed review issues" asks the reviewer to go back to the discussion and discover what the review issues were, esp. which are newer issues and which are older issues.
I thought that, since I comment on each of the review threads whether it was fixed/addressed (or not), it would be repetitive to mention the same items in the summary.
These get scattered around; they're not ordered in time.
During query execution, the query can be re-bounced to another shard if the requested data is located there. The previous implementation assumed that the shard cannot change after the first re-bounce; however, with the introduction of Tablets, data can be migrated to another shard after the query was already re-bounced, causing the query execution to fail. To avoid this, the query is now re-bounced as needed until it is executed on the correct shard.
Fixes #15465
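For readers who want the shape of the fix at a glance, here is a minimal sketch of the resulting control flow, assuming Seastar coroutines. All names (`request`, `result`, `process_fn`, `process_on_shard`) and the emulated migration counter are simplified stand-ins, not the actual ScyllaDB identifiers:

```cpp
#include <seastar/core/coroutine.hh>
#include <seastar/core/future.hh>

struct request {};
struct result {
    bool bounce = false;      // "execute me on target_shard instead"
    unsigned target_shard = 0;
};

static int emulated_migrations = 2;  // pretend the tablet moves twice more

// Stand-in for running the statement locally: the data is elsewhere, so bounce.
seastar::future<result> process_fn(const request&) {
    co_return result{true, 1};
}

// Stand-in for running the statement on another shard; it may bounce again.
seastar::future<result> process_on_shard(unsigned shard, const request&) {
    if (emulated_migrations-- > 0) {
        co_return result{true, shard + 1};   // data migrated yet again
    }
    co_return result{false, shard};          // regular result: query executed
}

seastar::future<result> process(request req) {
    auto res = co_await process_fn(req);
    // Previously a single `if (bounce)` was enough, since with vnodes the
    // owning shard could not change. With tablets it can, so keep re-bouncing
    // until the statement actually executes on the owning shard.
    while (res.bounce) {
        res = co_await process_on_shard(res.target_shard, req);
    }
    co_return res;
}
```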