Skip to content

Conversation

@alexmiller-apple
Copy link
Contributor

Formerly, they would prefer to peek from the primary's logs. Testing of
a failed region rejoining the cluster revealed that this becomes quite a
strain on the primary logs when extremely large volumes of peek requests
are coming from the Log Routers. It happens that we have satellites
that contain the same mutations with Log Router tags, that have no other
peeking load, so we can prefer to use the satellite to peek rather than
the primary to distribute load across TLogs better.

Unfortunately, this revealed a latent bug in how tagged mutations in the
KnownCommittedVersion->RecoveryVersion gap were copied across
generations when the number of log router tags were decreased.
Satellite TLogs would be assigned log router tags using the
team-building based logic in getPushLocations(), whereas TLogs would
internally re-index tags according to tag.id%logRouterTags. This
mismatch would mean that we could have:

Log0 -2:0 ----- -2:0  Log 0

Log1 -2:1 \
           >--- -2:1,-2:0 (-2:2 mod 2 becomes -2:0)  Log 1
Log2 -2:2 /

And now we have data that's tagged as -2:0 on a TLog that's not the
preferred location for -2:0, and therefore a BestLocationOnly cursor
would miss the mutations.

This was never noticed before, as we never
used a satellite as a preferred location to peek from. Merge cursors
always peek from all locations, and thus a peek for -2:0 that needed
data from the satellites would have gone to both TLogs and merged the
results.

We now take this mod-based re-indexing into account when assigning which
TLogs need to recover which tags from the previous generation, to make
sure that tag.id%logRouterTags always results in the assigned TLog being
the preferred location.

Unfortunately, previously existing will potentially have existing
satellites with log router tags indexed incorrectly, so this transition
needs to be gated on a log_version transition. Old LogSets will have
an old LogVersion, and we won't prefer the sattelite for peeking. Log
Sets post-6.2 (opt-in) or post-6.3 (default) will be indexed correctly,
and therefore we can safely offload peeking onto the satellites.

@alexmiller-apple
Copy link
Contributor Author

I did finally get a correctness failure when not handling the upgrade, so it is indeed necessary.

Correctness on this is running as 20190703-195123-alexmiller-8c4ae88ea8c5e64c , but previous iterations have already put a few tens of thousands of passes behind this.

Copy link
Contributor

@vishesh vishesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits.

set_config("log_spill:=1"); // VALUE
}
int logVersion = deterministicRandom()->randomInt( 0, 3 );
int logVersion = deterministicRandom()->randomInt( 0, 4 );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this can be ...->randomInt(0, MAX_SUPPORTED) instead?

case 2:
set_config("log_version:=3"); // 6.1
break;
case 3:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to previous comment, this whole switch block perhaps can instead be if (logVersion >= MIN_SUPPORTED && logVersion < MAX_SUPPORTED) { set_config(format("log_version=%d", logVersion)); } so that this doesn't need to change each time version is bumped.

@alexmiller-apple
Copy link
Contributor Author

CTest failure is CI badness:

distcc[2039] (dcc_mkdir) ERROR: mkdir '/opt/distcc/lock' failed: Permission denied

@alexmiller-apple
Copy link
Contributor Author

@fdb-build, run ctest please

@alexmiller-apple alexmiller-apple added this to the 6.2 milestone Jul 5, 2019
And refactor some code to make adding more TLogVersions easier.
Formerly, they would prefer to peek from the primary's logs.  Testing of
a failed region rejoining the cluster revealed that this becomes quite a
strain on the primary logs when extremely large volumes of peek requests
are coming from the Log Routers.  It happens that we have satellites
that contain the same mutations with Log Router tags, that have no other
peeking load, so we can prefer to use the satellite to peek rather than
the primary to distribute load across TLogs better.

Unfortunately, this revealed a latent bug in how tagged mutations in the
KnownCommittedVersion->RecoveryVersion gap were copied across
generations when the number of log router tags were decreased.
Satellite TLogs would be assigned log router tags using the
team-building based logic in getPushLocations(), whereas TLogs would
internally re-index tags according to tag.id%logRouterTags.  This
mismatch would mean that we could have:

    Log0 -2:0 ----- -2:0  Log 0

    Log1 -2:1 \
               >--- -2:1,-2:0 (-2:2 mod 2 becomes -2:0)  Log 1
    Log2 -2:2 /

And now we have data that's tagged as -2:0 on a TLog that's not the
preferred location for -2:0, and therefore a BestLocationOnly cursor
would miss the mutations.

This was never noticed before, as we never
used a satellite as a preferred location to peek from.  Merge cursors
always peek from all locations, and thus a peek for -2:0 that needed
data from the satellites would have gone to both TLogs and merged the
results.

We now take this mod-based re-indexing into account when assigning which
TLogs need to recover which tags from the previous generation, to make
sure that tag.id%logRouterTags always results in the assigned TLog being
the preferred location.

Unfortunately, previously existing will potentially have existing
satellites with log router tags indexed incorrectly, so this transition
needs to be gated on a `log_version` transition.  Old LogSets will have
an old LogVersion, and we won't prefer the sattelite for peeking.  Log
Sets post-6.2 (opt-in) or post-6.3 (default) will be indexed correctly,
and therefore we can safely offload peeking onto the satellites.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants