Skip to content

Conversation

@alexmiller-apple
Copy link
Contributor

@alexmiller-apple alexmiller-apple commented May 15, 2019

A profile on mighty showed that we're spending ~10%-15% of CPU doing peekMessagesFromMemory, which are largely going to be for spilled pieces of data, and thus completely wasted.

This involves a protocol change, so it's not 6.1 patchable.

There's another way to go with this, which would be to do this as a heuristic entirely on the TLog, via looking at if the request is for a begin that's within N million versions of the current version, and then taking endVersion an knownDurableVersion from the peek reply, and doing the same heuristic on the client side. That feels potentially trickier to feel comfortable with, but would be 6.1 patchable instead.

If a peek is entirely fulfilled from spilled data, then it's likely that
the next peek will be also.  It is thus wasteful for each of these peeks
to call peekMessagesFromMemory, which memcpy's excessively, and then
throw all that data away without using it.

Now, TLogs will give a hint back to peek cursors about if the provided
reply was served entirely from the spilled data, which peek curors then
feed back as the hint into their next request.

At some point, a cursor will send a request for only spilled data, get
an incomplete response, and then be told to send its next request as one
that peeks from memory as well, and then it will fully catch up.
If there's some spilled data, there's probably a lot of spilled data,
and now we can pull all of it faster.
@alexmiller-apple
Copy link
Contributor Author

alexmiller-apple commented May 15, 2019

I realized I was down to a one-line diff to address #1529 ParallelPeekGetMore when we've peeked only spilled data.

I haven't fully thought through if this is a good idea in all cases, but it does pass correctness ¯\_(ツ)_/¯

@alexmiller-apple
Copy link
Contributor Author

@fdb-build, test this please

expectedBegin = res.end;
self->futureResults.pop_front();
self->results = res;
self->onlySpilled = res.onlySpilled;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When switching from parallelGetMore back to regular GetMore, we still want to process the requests we already have outstanding to the TLogs to both avoid wasting work by the logs and to avoid an oscillation where the tlog spills more just as we stop using parallelGetMore. I think this can be done by uses the size of futureResults as another indicator that this function should be called, and then avoid issuing more requests is onlySpilled==false

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants