Skip to content

Conversation

@sfc-gh-satherton
Copy link
Collaborator

@sfc-gh-satherton sfc-gh-satherton commented Apr 5, 2021

When pages overflow, two or more replacement pages are built. This refactor avoids some problems with the previous logic that could result in many under-filled pages being generated. The new logic will split 1 page into 2 more evenly, and if more than 2 pages are being generated then page boundaries will be selected to maximize space usage in all but the final two pages which will share the remaining slack between them.

Previously it was unclear how much space used by Redwood was due to the remap cleanup window and how much was caused by excessive page slack addressed by this PR, so StorageBytes now includes a temporary space concept. Also, logging of storage bytes has been moved to a more efficient place where it is called 4x or 8x less often without any loss of detail.

Passed 100k correctness of only RedwoodCorrectnessBTree 20210404-084601-satherton_redwood-99a5a8607c0ca3ef
Passed 100k full correctness, there were errors but they seem similar to recent nightly_master errors.

Code-Reviewer Section

The general guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or master if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 684ae96
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 5c93e68
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 5c93e68
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 5c93e68
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 5f89640
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: aacee06
  • Result: FAILED
  • Build Logs (available for 7 days)

@sfc-gh-satherton
Copy link
Collaborator Author

This PR includes #4617 and should be merged after it.

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 394e562
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: f8786da
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 27de8e3
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 5074ac6
  • Result: FAILED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 5e6655f
  • Result: FAILED
  • Build Logs (available for 7 days)

…newest TLog version only). Moved StorageBytes detail from SpecialCounters to the traceCounters() decorator callback to avoid calling getStorageBytes(), which makes a system call, four extra times on storage servers and eight extra times on logs.
@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: cbd77fe
  • Result: FAILED
  • Build Logs (available for 7 days)

@sfc-gh-satherton
Copy link
Collaborator Author

This PR includes #4617

@sfc-gh-satherton sfc-gh-satherton marked this pull request as ready for review April 15, 2021 01:26
@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 3aa237b
  • Result: SUCCEEDED
  • Build Logs (available for 7 days)

@foundationdb-ci
Copy link
Contributor

AWS CodeBuild CI Report

  • CodeBuild project: foundationdb-pull-request-build
  • Commit ID: 561dfc0
  • Result: SUCCEEDED
  • Build Logs (available for 7 days)

// This is a loop because the first expansion may increase per-node overhead which could
// then require a second expansion.
while (bytesLeft < 0) {
int newBlocks = (-bytesLeft + blockSize - 1) / blockSize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment about this math as we discussed.

upperBound->toString(false).c_str());
ASSERT(!records.empty());

// Leaves can have just one record if it's large, but internal pages should have at least 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding a comment on choosing this number for internal nodes.

--p.count;
debug_printf("Skipping first null record, new count=%d\n", p.count);

// If the page is now empty then it must be the last page in pagesToBuild, otherwise there would
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per discussion, please add a slide or an ascii picture to make it clear visually whenever you get a chance.

Copy link
Contributor

@sfc-gh-ngoyal sfc-gh-ngoyal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

I know you already plan to make the naming consistent. So looking forward to the page/block/node naming to be made consistent. It can be a bit confusing while reading the code as the Pager also uses page as its building block and it's different from what BTree treats as pages.

@sfc-gh-ngoyal sfc-gh-ngoyal merged commit 63d82b9 into apple:master May 3, 2021
jzhou77 added a commit to jzhou77/foundationdb that referenced this pull request Jul 20, 2022
Backport changes made to TLogServer in apple#4616 back to old TLog implementations.
Old implementations were using SpecialCounter, which is string. The new type
is int64_t. Without the change, we can have many events like:

<Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
jzhou77 added a commit to jzhou77/foundationdb that referenced this pull request Jul 20, 2022
Backport changes made to TLogServer in apple#4616 back to old TLog implementations.
Old implementations were using SpecialCounter, which is string. The new type
is int64_t. Without the change, we can have many events like:

<Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
jzhou77 added a commit that referenced this pull request Jul 20, 2022
Backport changes made to TLogServer in #4616 back to old TLog implementations.
Old implementations were using SpecialCounter, which is string. The new type
is int64_t. Without the change, we can have many events like:

<Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
1inker pushed a commit to owtech/foundationdb that referenced this pull request Aug 17, 2022
Backport changes made to TLogServer in apple#4616 back to old TLog implementations.
Old implementations were using SpecialCounter, which is string. The new type
is int64_t. Without the change, we can have many events like:

<Event Severity="30" Time="1658165532.958152" DateTime="2022-07-18T17:32:12Z" Type="ScopeEventFieldTypeMismatch" ID="0000000000000000" EventType="TraceEvent.TLogMetrics" FieldName="QueueDiskBytesUsed" OldType="Int64" NewType="String" ThreadID="2451559325190099131" Machine="10.242.248.26:4703" LogGroup="xxxx" Roles="TL" />
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants