@jzhou77
Contributor

@jzhou77 jzhou77 commented Nov 5, 2022

Cherrypick #8745

When version vector (VV) is enabled, the comparison between the storage server version and the read version should use the original read version; otherwise, the client may get a spurious transaction_too_old error. In the failed seed, this prevents the consistency check from finishing. waitForVersion() may use the oldest version in the MVCC window as the actual read version, which often results in transaction_too_old if the MVCC window moves, especially during a readRange(). The change here is to try a higher read version that is still valid but less likely to cause a transaction_too_old error.
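
The idea of preferring a higher, still-valid read version can be pictured with a small standalone model. This is only a sketch, not the actual waitForVersion() code: `WindowState`, `isReadable`, `chooseReadVersion`, and the `kTooOld` sentinel are hypothetical names, and the model assumes the storage server has no commits for the requested range between the version it has applied and the transaction's read version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using Version = std::int64_t;

// Hypothetical sentinel standing in for a transaction_too_old error.
constexpr Version kTooOld = -1;

// Hypothetical snapshot of a storage server's MVCC window (not FDB's real type).
struct WindowState {
    Version oldestVersion; // lowest version that can still be read
    Version latestVersion; // newest version the server has applied
};

// A version is readable while it lies inside the current window.
bool isReadable(const WindowState& s, Version v) {
    return v >= s.oldestVersion && v <= s.latestVersion;
}

// Picks the highest version that is both applied on this server and not
// newer than the transaction's read version. Choosing the top of the valid
// range leaves the most headroom before the window's lower bound passes it.
Version chooseReadVersion(const WindowState& s, Version readVersion) {
    assert(s.oldestVersion <= s.latestVersion);
    const Version chosen = std::min(s.latestVersion, readVersion);
    return chosen >= s.oldestVersion ? chosen : kTooOld;
}
```

In this model, a read pinned to the oldest version in the window becomes unreadable as soon as the window advances, while the version picked by `chooseReadVersion` stays readable for longer, which matters for multi-step reads such as a range read.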

commit: 333a8bf
seed: -f ./foundationdb/tests/fast/MutationLogReaderCorrectness.toml -s 728034139 -b on
clang

500k 20221105-190436-jzhou-52ae8ebf0861e8bd passed.

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

@fdb-windows-ci
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos-m1 on macOS BigSur 11.5.2

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 7121350
  • Duration 0:32:00
  • Result: ❌ FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Monterey 12.x

@fdb-windows-ci
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

@jzhou77 jzhou77 marked this pull request as ready for review November 5, 2022 21:49
@jzhou77 jzhou77 requested review from dlambrig and sbodagala November 7, 2022 16:05
@jzhou77 jzhou77 self-assigned this Nov 7, 2022
@jzhou77 jzhou77 marked this pull request as draft November 7, 2022 21:33
@jzhou77
Contributor Author

jzhou77 commented Nov 7, 2022

Changed the return value of waitForVersion() as the fix instead.

500k 20221107-224558-jzhou-0ff337f1652a9d5e: the 3 failures do not use version vector, so they should be unrelated.

@foundationdb-ci
Contributor

Result of foundationdb-pr-macos on macOS Monterey 12.x

@fdb-windows-ci
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 18add2a
  • Duration 1:15:59
  • Result: ❌ FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

@jzhou77 jzhou77 marked this pull request as ready for review November 8, 2022 00:46
@jzhou77 jzhou77 requested a review from sbodagala November 8, 2022 00:46
@dlambrig
Contributor

dlambrig commented Nov 8, 2022

What test surfaced this problem?

if (data->version.get() < readVersion) {
    // Majority of the case, try using higher version to avoid
    // transaction_too_old error when oldestVersion advances.
    return data->version.get(); // majority of cases

I like this solution. This is correct because this storage server doesn't have any newer commits in the range (commitVersion, readVersion] and so can read at any version in this range.
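
The correctness argument above can be checked on a toy single-key MVCC history. This is a minimal sketch under stated assumptions, not FoundationDB code: `History`, `readAt`, and `sameResultInRange` are hypothetical names, and values are plain integers for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

using Version = std::int64_t;

// Toy single-key MVCC history: commit version -> value written at that version.
using History = std::map<Version, int>;

// Value visible at version v: the write with the largest commit version <= v.
// Returns `fallback` when no write is visible at v.
int readAt(const History& h, Version v, int fallback = -1) {
    auto it = h.upper_bound(v); // first write strictly newer than v
    if (it == h.begin()) return fallback;
    return std::prev(it)->second;
}

// True when every version in [commitVersion, readVersion] observes the same
// value, i.e. the history has no write in (commitVersion, readVersion].
bool sameResultInRange(const History& h, Version commitVersion, Version readVersion) {
    assert(commitVersion <= readVersion);
    const int expected = readAt(h, readVersion);
    for (Version v = commitVersion; v <= readVersion; ++v) {
        if (readAt(h, v) != expected) return false;
    }
    return true;
}
```

With writes at versions 10 and 20 only, every read version from 20 through 30 sees the value written at 20, so the server is free to pick any of them. Adding a write at 25 breaks that equivalence, which is why the argument depends on the range containing no newer commits.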

sbodagala previously approved these changes Nov 8, 2022
@jzhou77
Contributor Author

jzhou77 commented Nov 8, 2022

What test surfaced this problem?

A simulation test found this:

commit: 333a8bf
seed: -f ./foundationdb/tests/fast/MutationLogReaderCorrectness.toml -s 728034139 -b on
clang

@jzhou77 jzhou77 requested a review from sbodagala November 8, 2022 17:06
@foundationdb-ci
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

@foundationdb-ci
Contributor

Result of foundationdb-pr on Linux CentOS 7

@fdb-windows-ci
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

@jzhou77 jzhou77 merged commit 3a4641a into apple:release-7.1 Nov 8, 2022