-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Redwood page splitting/building refactor #4616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redwood page splitting/building refactor #4616
Conversation
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
…ing file, and whether or not to insert new records.
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
|
This PR includes #4617 and should be merged after it. |
AWS CodeBuild CI Report
|
…improvements # Conflicts: # fdbserver/VersionedBTree.actor.cpp
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
…newest TLog version only). Moved StorageBytes detail from SpecialCounters to the traceCounters() decorator callback to avoid calling getStorageBytes(), which makes a system call, four extra times on storage servers and eight extra times on logs.
AWS CodeBuild CI Report
|
|
This PR includes #4617 |
AWS CodeBuild CI Report
|
AWS CodeBuild CI Report
|
| // This is a loop because the first expansion may increase per-node overhead which could | ||
| // then require a second expansion. | ||
| while (bytesLeft < 0) { | ||
| int newBlocks = (-bytesLeft + blockSize - 1) / blockSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment about this math as we discussed.
| upperBound->toString(false).c_str()); | ||
| ASSERT(!records.empty()); | ||
|
|
||
| // Leaves can have just one record if it's large, but internal pages should have at least 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding a comment on choosing this number for internal nodes.
| --p.count; | ||
| debug_printf("Skipping first null record, new count=%d\n", p.count); | ||
|
|
||
| // If the page is now empty then it must be the last page in pagesToBuild, otherwise there would |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per discussion, please add a slide or an ascii picture to make it clear visually whenever you get a chance.
sfc-gh-ngoyal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great.
I know you already plan to make the naming consistent. So looking forward to the page/block/node naming to be made consistent. It can be a bit confusing while reading the code as the Pager also uses page as its building block and it's different from what BTree treats as pages.
Backport changes made to TLogServer in apple#4616 back to old TLog implementations. Old implementations were using SpecialCounter, which is string. The new type is int64_t. Without the change, we can have many events like: <Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
Backport changes made to TLogServer in apple#4616 back to old TLog implementations. Old implementations were using SpecialCounter, which is string. The new type is int64_t. Without the change, we can have many events like: <Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
Backport changes made to TLogServer in #4616 back to old TLog implementations. Old implementations were using SpecialCounter, which is string. The new type is int64_t. Without the change, we can have many events like: <Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
Backport changes made to TLogServer in apple#4616 back to old TLog implementations. Old implementations were using SpecialCounter, which is string. The new type is int64_t. Without the change, we can have many events like: <Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
When pages overflow, two or more replacement pages are built. This refactor avoids some problems with the previous logic that could result in many under-filled pages being generated. The new logic will split 1 page into 2 more evenly, and if more than 2 pages are being generated then page boundaries will be selected to maximize space usage in all but the final two pages which will share the remaining slack between them.
Previously it was unclear how much space used by Redwood was due to the remap cleanup window and how much was caused by excessive page slack addressed by this PR, so StorageBytes now includes a temporary space concept. Also, logging of storage bytes has been moved to a more efficient place where it is called 4x or 8x less often without any loss of detail.
Passed 100k correctness of only RedwoodCorrectnessBTree
20210404-084601-satherton_redwood-99a5a8607c0ca3efPassed 100k full correctness, there were errors but they seem similar to recent nightly_master errors.
Code-Reviewer Section
The general guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branchormasterif this is the youngest branch)