Save TLog resources by letting peek request only spilled data. #1584

alexmiller-apple · 2019-05-15T01:43:11Z

A profile on mighty showed that we're spending ~10%-15% of CPU in peekMessagesFromMemory. Most of those calls are made for peeks that end up being served from spilled data, so that work is completely wasted.

This involves a protocol change, so it's not 6.1 patchable.

There's another way to go with this, which would be to do this as a heuristic entirely on the TLog: check whether the request's begin is within N million versions of the current version, then take endVersion and knownDurableVersion from the peek reply and apply the same heuristic on the client side. That feels potentially trickier to get comfortable with, but it would be 6.1 patchable.
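For comparison, the TLog-side half of that heuristic could be as small as the sketch below. This is only an illustration and not code from this PR: `likelyOnlySpilled` and `lagThreshold` are hypothetical names, and the threshold is left as a parameter because no value for N is chosen above. A begin that is within the threshold of the current version is treated as still in memory; anything further behind is assumed to be spilled.

```cpp
#include <cassert>
#include <cstdint>

using Version = int64_t;

// Hypothetical helper, not part of FoundationDB.
// Returns true when a peek starting at `begin` trails `currentVersion` by more
// than `lagThreshold` versions, i.e. when the data it wants has probably been
// spilled and the in-memory peek can be skipped.
bool likelyOnlySpilled(Version begin, Version currentVersion, Version lagThreshold) {
    assert(lagThreshold >= 0);
    if (begin >= currentVersion) {
        return false; // caught up (or ahead): only memory can serve this peek
    }
    return currentVersion - begin > lagThreshold;
}
```

The client side would presumably evaluate the same predicate using values taken from the peek reply, which is the part that is harder to reason about.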

If a peek is entirely fulfilled from spilled data, then it's likely that the next peek will be also. It is thus wasteful for each of these peeks to call peekMessagesFromMemory, which memcpy's excessively, and then throw all that data away without using it. Now, TLogs give a hint back to peek cursors about whether the reply was served entirely from spilled data, and peek cursors feed that hint back into their next request. At some point, a cursor will send a request for only spilled data, get an incomplete response, and be told to send its next request as one that peeks from memory as well, and then it will fully catch up.
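The hint loop described above can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyTLog`, `drain`, the one-message-per-version layout, and the fixed `batch` size are hypothetical, and only the `onlySpilled` flag on the request and reply mirrors what this PR adds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Version = int64_t;

struct PeekRequest { Version begin; bool onlySpilled; };
struct PeekReply   { Version end;   bool onlySpilled; };

// Toy TLog: versions [0, spilledEnd) are on disk, [spilledEnd, memoryEnd) are in memory.
struct ToyTLog {
    Version spilledEnd, memoryEnd, batch;
    int memoryPeeks; // how many times the expensive in-memory peek ran

    ToyTLog(Version s, Version m, Version b) : spilledEnd(s), memoryEnd(m), batch(b), memoryPeeks(0) {
        assert(b > 0 && s <= m);
    }

    PeekReply peek(const PeekRequest& req) {
        if (req.begin < spilledEnd) {
            // Without the hint, the memory copy is made up front and may be thrown away.
            if (!req.onlySpilled) ++memoryPeeks;
            Version diskEnd = std::min(spilledEnd, req.begin + batch);
            if (diskEnd - req.begin >= batch) return {diskEnd, true}; // filled from disk alone
            if (req.onlySpilled) return {diskEnd, false};             // incomplete: include memory next time
            return {std::min(memoryEnd, req.begin + batch), false};   // top up from memory
        }
        if (req.onlySpilled) return {req.begin, false}; // nothing spilled here: retry with memory
        ++memoryPeeks;
        return {std::min(memoryEnd, req.begin + batch), false};
    }
};

// Toy cursor: feeds each reply's hint into the next request until caught up.
// Returns the number of requests sent.
int drain(ToyTLog& log) {
    PeekRequest req{0, false};
    int requests = 0;
    while (req.begin < log.memoryEnd) {
        PeekReply reply = log.peek(req);
        ++requests;
        req.begin = reply.end;
        req.onlySpilled = reply.onlySpilled;
    }
    return requests;
}
```

With 100 spilled versions, 20 in-memory versions, and a batch of 10, this toy cursor sends one extra request (the spilled-only request that comes back empty) but runs the in-memory peek 3 times instead of 12.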

If there's some spilled data, there's probably a lot of spilled data, and now we can pull all of it faster.

alexmiller-apple · 2019-05-15T07:08:04Z

I realized I was down to a one-line diff to address #1529 ParallelPeekGetMore when we've peeked only spilled data.

I haven't fully thought through if this is a good idea in all cases, but it does pass correctness ¯\_(ツ)_/¯

alexmiller-apple · 2019-05-17T19:43:06Z

@fdb-build, test this please

etschannen · 2019-05-23T18:13:16Z

fdbserver/LogSystemPeekCursor.actor.cpp

 					expectedBegin = res.end;
 					self->futureResults.pop_front();
 					self->results = res;
+					self->onlySpilled = res.onlySpilled;


When switching from parallelGetMore back to regular GetMore, we still want to process the requests we already have outstanding to the TLogs, both to avoid wasting work by the logs and to avoid an oscillation where the tlog spills more just as we stop using parallelGetMore. I think this can be done by using the size of futureResults as another indicator that this function should be called, and then avoiding issuing more requests if onlySpilled==false.
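A minimal sketch of that suggestion, assuming a simplified cursor: `ToyCursor`, `useParallelPath`, `mayIssueMore`, and `consumeOne` are hypothetical names, and replies are modeled as already-resolved values rather than futures. Only `futureResults` and `onlySpilled` come from the diff above.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

using Version = int64_t;

struct PeekReply { Version end; bool onlySpilled; };

struct ToyCursor {
    std::deque<PeekReply> futureResults; // replies to requests already sent to the TLog
    bool onlySpilled = false;            // hint from the most recently consumed reply
    Version consumedUpTo = 0;

    // Stay on the parallel path while the hint says so OR replies are still outstanding.
    bool useParallelPath() const { return onlySpilled || !futureResults.empty(); }

    // Only send new parallel requests while the TLog still reports spilled-only replies.
    bool mayIssueMore() const { return onlySpilled; }

    // Consume the oldest outstanding reply and adopt its hint.
    void consumeOne() {
        assert(!futureResults.empty());
        PeekReply res = futureResults.front();
        futureResults.pop_front();
        consumedUpTo = res.end;
        onlySpilled = res.onlySpilled;
    }
};
```

Once a reply arrives with `onlySpilled == false`, the cursor stops issuing new parallel requests but keeps draining what it already asked for, and only falls back to the regular path when `futureResults` is empty.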

MultiCursor already did this.

alexmiller-apple requested a review from etschannen May 15, 2019 01:43

alexmiller-apple assigned etschannen May 15, 2019

And now use spilledOnly as a hint to do parallel peeks.

658e61b

If there's some spilled data, there's probably a lot of spilled data, and now we can pull all of it faster.

etschannen reviewed May 23, 2019

View reviewed changes

alexmiller-apple added 3 commits June 18, 2019 17:33

Merge remote-tracking branch 'upstream/master' into spilled-only-peek

51fd42a

Fully consume parallelPeekMore results before switching back.

ce24db3

Update getMore() contract.

26343f5

MultiCursor already did this.

etschannen merged commit 1c005d5 into apple:master Jun 21, 2019

alexmiller-apple mentioned this pull request Jul 9, 2019

Improve the behavior of parallelPeekMore+onlySpilled. #1812

Merged
