Skip to content

Conversation

@flowguru
Copy link
Contributor

@flowguru flowguru commented Nov 10, 2022

A fraction of byteLimit will be used as the limit to fetch index. For the indexes fetched, fetch records for them in batch.

byteLimit always count the index size, it also count record if exist, it at least return 1 index-record entry and always include the last entry despite that adding the last entry despite it might exceed limit.

There is a Knob STRICTLY_ENFORCE_BYTE_LIMIT, when it is set, records will be discarded once the byteLimit is hit, despite they are fetched. Otherwise, return the whole batch.

joshua 20221111-061921-haofu-18a7457297fa0065

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 2993b06
  • Duration 0:14:56
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: fdf79cf
  • Duration 0:21:35
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 2993b06
  • Duration 0:27:33
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: fdf79cf
  • Duration 0:32:58
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS BigSur 11.5.2

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Monterey 12.x

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 2993b06
  • Duration 1:57:05
  • Result: ❌ FAILED
  • Error: Error while executing command: if $(grep -q -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log); then echo "TESTS FAILED SEE THESE LOGS:"; echo ; grep -l -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log; exit 1; fi. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: fdf79cf
  • Duration 2:03:42
  • Result: ❌ FAILED
  • Error: Error while executing command: if $(grep -q -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log); then echo "TESTS FAILED SEE THESE LOGS:"; echo ; grep -l -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log; exit 1; fi. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@fdb-windows-ci
Copy link
Collaborator

Doxense CI Report for Windows 10

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 333b70b
  • Duration 0:16:52
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 333b70b
  • Duration 0:37:09
  • Result: ❌ FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS BigSur 11.5.2

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Monterey 12.x


// create a temporary byte limit for index fetching ONLY, this should be excessive
// because readRange is cheap when reading additional bytes
const double fraction = 0.8;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption would actually be that the primary is a much smaller fraction--maybe closer to 20% than 80%. That's just a guess though. Perhaps this should be a knob?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this probably should be a knob, but I suppose the other side would be, if we're assuming that overriding from the primary scan is cheap (because it's a single scan), and we only really are concerned about overreading from the secondary scans (because those are random reads and are potentially larger), then I suppose having this overestimate (as a heuristic) is fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, will make it a Knob.

@fdb-windows-ci
Copy link
Collaborator

Doxense CI Report for Windows 10

Comment on lines 4881 to 4871
*remainingLimitBytes -= size;
++cnt;
result.data.push_back(result.arena, kvms[i]);
if (SERVER_KNOBS->STRICTLY_ENFORCE_BYTE_LIMIT && *remainingLimitBytes <= 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alecgrieser I checked the byte here and before each batch

"storageserver.mapKeyValues.BeforeLoop");
for (; offset < sz; offset += SERVER_KNOBS->MAX_PARALLEL_QUICK_GET_VALUE) {

for (; offset<sz&& * remainingLimitBytes> 0; offset += SERVER_KNOBS->MAX_PARALLEL_QUICK_GET_VALUE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this condition (remainingLimitBytes > 0) correct with the EXACT streaming mode? It's a relatively rare streaming mode, but in that case, the user specifies that they want the row limit to be exactly matched, and so they specify a _byte) limit of 0 (indicating "unlimited"). I believe that in that case, we'd effectively want to ignore the remaining limit bytes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a great question, basically to clarify on the case where we have unlimited byteLimit.

  • it is handled here then here, and set to CLIENT_KNOBS->REPLY_BYTE_LIMIT. So if rowLimit wants more, there is nothing we can do anyways.

However I found a bug -- we actually allow either rowLimit or byteLimit to be empty, rather than only enforce rowLimit code

Comment on lines 4886 to 4887
result.data.back().key = input.data[i + offset].key;
result.data.back().value = input.data[i + offset].value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we similarly need to set the results back() key and value if we terminate early because of the byte limit after a batch (that is, if the STRICTLY_ENFORCE_BYTE_LIMIT knob is false)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in mapSubQuery(, we set the KV if isBoundary is true. I agree it is a bit complicated here -- we need to set both the [0] and [indexResultSize - 1], so we have bool isBoundary = (i + offset) == 0 || (i + offset) == sz - 1; and pass it to mapSubQuery(.

So this code is only for "early" termination, KV will be set in mapSubQuery( in normal termination(i.e. return all entries)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood this comment, this is a good catch, I missed it, will fix.


// create a temporary byte limit for index fetching ONLY, this should be excessive
// because readRange is cheap when reading additional bytes
const double fraction = 0.8;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this probably should be a knob, but I suppose the other side would be, if we're assuming that overriding from the primary scan is cheap (because it's a single scan), and we only really are concerned about overreading from the secondary scans (because those are random reads and are potentially larger), then I suppose having this overestimate (as a heuristic) is fine.

wait(mapKeyValues(data, getKeyValuesReply, req.mapper, &req, req.matchIndex, &remainingLimitBytes));
r = _r;
} catch (Error& e) {
// TODO: cannot allow txn_too_old to reach here, need to handle it gracefully, add test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think transaction_too_old here actually is fine. It'll get propagated up to the user, and it would be (I think) the expected behavior if doing a read-range close to the 5-second limit boundary. That being said, it would be kind of nice if it could return whatever it had already read rather than just throwing an error, but probably not the end of the world,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for explaining and makes total sense, it is something I want to get to the bottom to, before merging the PR.

// Map the scanned range to another list of keys and look up.
GetMappedKeyValuesReply _r =
wait(mapKeyValues(data, getKeyValuesReply, req.mapper, &req, tenantPrefix, req.matchIndex));
wait(mapKeyValues(data, getKeyValuesReply, req.mapper, &req, req.matchIndex, &remainingLimitBytes));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still need remainingLimitBytes to be a pointer? As I understood it, this was (in the older PR) passed as a pointer because it needed to include the secondary fetches and then decrease that value from the primary scans limit. However, with the current design, I don't think that's a thing any more, as the first scan uses a different limit, and then the secondary scans will keep iterating through batches until they have read up to the original limit (I think)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, it is used for stats and metric only now.. but we might still want to keep it?

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 333b70b
  • Duration 1:57:19
  • Result: ❌ FAILED
  • Error: Error while executing command: if $(grep -q -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log); then echo "TESTS FAILED SEE THESE LOGS:"; echo ; grep -l -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log; exit 1; fi. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: de661bc
  • Duration 0:17:21
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

@flowguru flowguru force-pushed the byte branch 2 times, most recently from c75c183 to 03ea326 Compare November 10, 2022 21:27
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS BigSur 11.5.2

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Monterey 12.x

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: dd101bd
  • Duration 1:58:26
  • Result: ❌ FAILED
  • Error: Error while executing command: if $(grep -q -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log); then echo "TESTS FAILED SEE THESE LOGS:"; echo ; grep -l -- "--- FAIL:" ${CODEBUILD_SRC_DIR}/fdb-kubernetes-tests/logs/*.log; exit 1; fi. Reason: exit status 1
  • Build Logs (available for 30 days)
  • Build Artifact (available for 30 days)

@fdb-windows-ci
Copy link
Collaborator

Doxense CI Report for Windows 10

alecgrieser
alecgrieser previously approved these changes Nov 11, 2022
@apple apple deleted a comment from foundationdb-ci Nov 11, 2022
@apple apple deleted a comment from foundationdb-ci Nov 11, 2022
@apple apple deleted a comment from foundationdb-ci Nov 11, 2022
@apple apple deleted a comment from foundationdb-ci Nov 11, 2022
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

init( QUICK_GET_VALUE_FALLBACK, true );
init( QUICK_GET_KEY_VALUES_FALLBACK, true );
init( MAX_PARALLEL_QUICK_GET_VALUE, 50 ); if ( randomize && BUGGIFY ) MAX_PARALLEL_QUICK_GET_VALUE = deterministicRandom()->randomInt(1, 100);
init( STRICTLY_ENFORCE_BYTE_LIMIT, false); if( randomize && BUGGIFY ) STRICTLY_ENFORCE_BYTE_LIMIT = deterministicRandom()->coinflip() ? false : true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

STRICTLY_ENFORCE_BYTE_LIMIT = deterministicRandom()->coinflip();

@flowguru flowguru merged commit 7e78795 into apple:main Nov 11, 2022
flowguru added a commit to flowguru/foundationdb that referenced this pull request Nov 11, 2022
* add bytelimit for prefetch

A fraction of byteLimit will be used as the limit to fetch index.
For the indexes fetched, fetch records for them in batch.

byteLimit always count the index size, it also count record if exist,
it at least return 1 index-record entry and always include the last entry
despite that adding the last entry despite it might exceed limit.

There is a Knob STRICTLY_ENFORCE_BYTE_LIMIT, when it is set, records
will be discarded once the byteLimit is hit, despite they are fetched.
Otherwise, return the whole batch.
@apple apple deleted a comment from foundationdb-ci Nov 11, 2022
@flowguru flowguru deleted the byte branch November 11, 2022 22:38
flowguru added a commit that referenced this pull request Nov 12, 2022
* add bytelimit for prefetch

A fraction of byteLimit will be used as the limit to fetch index.
For the indexes fetched, fetch records for them in batch.

byteLimit always count the index size, it also count record if exist,
it at least return 1 index-record entry and always include the last entry
despite that adding the last entry despite it might exceed limit.

There is a Knob STRICTLY_ENFORCE_BYTE_LIMIT, when it is set, records
will be discarded once the byteLimit is hit, despite they are fetched.
Otherwise, return the whole batch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants