KAFKA-15010 ZK migration failover support #13758

mumrah · 2023-05-24T22:28:09Z

Previously, if the KRaft controller failed over while metadata changes were pending in the KRaftMigrationDriver queue, the writes to ZK would be lost. This patch adds snapshot reconciliation so that the controller can ensure a consistent state with ZK after a failover.

Internally this adds a new state SYNC_KRAFT_TO_ZK to the KRaftMigrationDriver state machine. The controller passes through this state after the initial migration and each time a controller becomes active thereafter.
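
For readers following the state change, here is a minimal sketch of the ordering described above. It is not the actual KRaftMigrationDriver code: the class `MigrationStateSketch`, its method names, and the reduced state list are assumptions made for illustration. Only `SYNC_KRAFT_TO_ZK` and `KRAFT_CONTROLLER_TO_BROKER_COMM` appear by name in this PR; `ZK_MIGRATION` and `DUAL_WRITE` are assumed labels for the initial load and the steady state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified model of the driver's state ordering.
final class MigrationStateSketch {
    enum State { ZK_MIGRATION, SYNC_KRAFT_TO_ZK, KRAFT_CONTROLLER_TO_BROKER_COMM, DUAL_WRITE }

    // State entered when a controller becomes active. If the initial ZK load
    // already finished, skip it and go straight to reconciliation.
    static State onBecomeActive(boolean initialZkLoadDone) {
        return initialZkLoadDone ? State.SYNC_KRAFT_TO_ZK : State.ZK_MIGRATION;
    }

    // Successor state; the sync state sits between the initial load and
    // broker communication, so it is never bypassed.
    static State next(State current) {
        switch (current) {
            case ZK_MIGRATION:
                return State.SYNC_KRAFT_TO_ZK;
            case SYNC_KRAFT_TO_ZK:
                return State.KRAFT_CONTROLLER_TO_BROKER_COMM;
            default:
                return State.DUAL_WRITE;
        }
    }

    // Collects every state visited from activation until dual-write.
    static List<State> pathToDualWrite(boolean initialZkLoadDone) {
        List<State> path = new ArrayList<>();
        State state = onBecomeActive(initialZkLoadDone);
        path.add(state);
        while (state != State.DUAL_WRITE) {
            state = next(state);
            path.add(state);
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(pathToDualWrite(false)); // first activation, ZK not yet loaded
        System.out.println(pathToDualWrite(true));  // failover after the initial load
    }
}
```

In this model both paths include `SYNC_KRAFT_TO_ZK`, which is the property the description relies on: a newly active controller reconciles with ZK before resuming dual-write.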

mumrah · 2023-05-26T03:17:43Z

metadata/src/main/java/org/apache/kafka/metadata/migration/KRaftMigrationZkWriter.java

-            Map<String, Double> quotaMap = clientQuotasImage.entities().get(entity).quotaMap();
+            Map<String, Double> quotaMap = getClientQuotaMapForEntity(clientQuotasImage, entity);


There was an NPE here prior to this patch. If the SCRAM credentials changed but ClientQuotas did not, the get(entity) call would return null, and the chained quotaMap() call would then throw.
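
The body of `getClientQuotaMapForEntity` is not shown in this thread, so the following is only a sketch of the null-safe lookup pattern the comment describes. `QuotaLookupSketch`, `EntityQuotas`, and `quotaMapForEntity` are hypothetical stand-ins, not Kafka classes, and returning an empty map for a missing entity is an assumption about the intended behavior.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the image types; not the real Kafka classes.
final class QuotaLookupSketch {
    static final class EntityQuotas {
        private final Map<String, Double> quotas;

        EntityQuotas(Map<String, Double> quotas) {
            this.quotas = quotas;
        }

        Map<String, Double> quotaMap() {
            return quotas;
        }
    }

    // Null-safe lookup: an entity with SCRAM changes but no quota entry
    // yields an empty map instead of a NullPointerException.
    static Map<String, Double> quotaMapForEntity(Map<String, EntityQuotas> entities, String entity) {
        EntityQuotas quotas = entities.get(entity);
        return quotas == null ? Collections.emptyMap() : quotas.quotaMap();
    }

    public static void main(String[] args) {
        Map<String, EntityQuotas> entities = new HashMap<>();
        entities.put("user=alice",
            new EntityQuotas(Collections.singletonMap("producer_byte_rate", 1024.0)));
        System.out.println(quotaMapForEntity(entities, "user=alice"));
        System.out.println(quotaMapForEntity(entities, "user=bob")); // no quota entry
    }
}
```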

cmccabe · 2023-05-30T20:50:03Z

core/src/main/scala/kafka/zk/ZkMigrationClient.scala

    new util.HashSet[Integer](zkClient.getSortedBrokerList.map(Integer.valueOf).toSet.asJava)
  }

+  override def readProducerId(): util.Optional[java.lang.Long] = {


This doesn't seem to be returning nextProducerId; it's returning the current producer ID.

Why not just have writeProducerId take a ProducerIdsBlock object so we don't have to do a bunch of dubious translation?

Oh right, in KRaft when we give out a new block, we persist the next ID in the log.
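
To make the distinction concrete, here is a small sketch of the arithmetic involved, assuming a block is described by its first ID and its size. `ProducerIdBlockSketch` and `IdBlock` are hypothetical names and do not reproduce the ProducerIdsBlock API; the point is only that the "next" ID to persist is one past the end of the block that was just handed out, not an ID inside it.

```java
// Hypothetical model of a producer ID block; not the Kafka ProducerIdsBlock class.
final class ProducerIdBlockSketch {
    static final class IdBlock {
        final long firstId;
        final int size;

        IdBlock(long firstId, int size) {
            this.firstId = firstId;
            this.size = size;
        }

        // Last ID that belongs to this block (inclusive).
        long lastId() {
            return firstId + size - 1;
        }

        // First ID of the following block, i.e. the "next producer ID"
        // recorded once this block has been given out.
        long nextBlockFirstId() {
            return firstId + size;
        }
    }

    public static void main(String[] args) {
        IdBlock block = new IdBlock(1000L, 1000);
        System.out.println("last ID in block: " + block.lastId());
        System.out.println("next producer ID: " + block.nextBlockFirstId());
    }
}
```

Mixing the two values up is an off-by-a-block error: treating an ID inside the current block as the next ID could let a later allocation overlap IDs that were already handed out.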

cmccabe · 2023-05-30T20:52:49Z

metadata/src/main/java/org/apache/kafka/metadata/migration/KRaftMigrationDriver.java

+        applyMigrationOperation("Recovering migration state from ZK", zkMigrationClient::getOrCreateMigrationRecoveryState);
        String maybeDone = migrationLeadershipState.zkMigrationComplete() ? "done" : "not done";
-        log.info("Recovered migration state {}. ZK migration is {}.", migrationLeadershipState, maybeDone);
+        log.info("ZK migration is {}.", maybeDone);


I'd really prefer to say something like "Initial ZK load is done" / "Initial ZK load is not done".

We should also change ZkMigrationLeadershipState.zkMigrationComplete -> ZkMigrationLeadershipState.initialZkLoadComplete

After all, the whole process here is technically "ZK migration", not just the initial load.

cmccabe · 2023-05-30T20:54:39Z

metadata/src/main/java/org/apache/kafka/metadata/migration/KRaftMigrationDriver.java

                    offsetAndEpochAfterMigration.epoch());
-                applyMigrationOperation("Finished migrating ZK data", state -> zkMigrationClient.setMigrationRecoveryState(newState));
-                transitionTo(MigrationDriverState.KRAFT_CONTROLLER_TO_BROKER_COMM);
+                applyMigrationOperation("Finished migrating ZK data to KRaft", state -> zkMigrationClient.setMigrationRecoveryState(newState));


Maybe add a comment here about how we always go through the SYNC_KRAFT_TO_ZK state, even immediately after we've loaded from ZK.

cmccabe

LGTM once comments are addressed (the producer ID comment is the most urgent).

This patch adds snapshot reconciliation during ZK to KRaft migration. This reconciliation happens whenever a snapshot is loaded by KRaft, or during a controller failover. Prior to this patch, it was possible to miss metadata updates coming from KRaft when dual-writing to ZK.

Internally this adds a new state SYNC_KRAFT_TO_ZK to the KRaftMigrationDriver state machine. The controller passes through this state after the initial ZK migration and each time a controller becomes active.

Logging during dual-write was enhanced to include a count of write operations happening.

Reviewers: Colin P. McCabe <[email protected]>
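
As a rough illustration of what reconciling a KRaft snapshot against ZK can look like, the sketch below treats both sides as flat key/value maps, takes KRaft as the source of truth, and plans the writes needed to bring ZK in line, with a count of operations. Everything here (`ReconcileSketch`, `Plan`, the flat-map model, and the choice to delete ZK-only keys) is an assumption for illustration; the actual KRaftMigrationZkWriter logic is per-metadata-type and is not reproduced in this thread.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical flat-map model of snapshot reconciliation; values are assumed non-null.
final class ReconcileSketch {
    static final class Plan {
        final Map<String, String> upserts; // keys to create or overwrite in ZK
        final Set<String> deletes;         // keys present in ZK but absent from KRaft

        Plan(Map<String, String> upserts, Set<String> deletes) {
            this.upserts = upserts;
            this.deletes = deletes;
        }

        // Number of ZK writes this plan would issue.
        int operationCount() {
            return upserts.size() + deletes.size();
        }
    }

    static Plan reconcile(Map<String, String> kraftSnapshot, Map<String, String> zkState) {
        Map<String, String> upserts = new TreeMap<>();
        for (Map.Entry<String, String> entry : kraftSnapshot.entrySet()) {
            // Missing or stale in ZK: needs a write.
            if (!entry.getValue().equals(zkState.get(entry.getKey()))) {
                upserts.put(entry.getKey(), entry.getValue());
            }
        }
        Set<String> deletes = new TreeSet<>(zkState.keySet());
        deletes.removeAll(kraftSnapshot.keySet());
        return new Plan(upserts, deletes);
    }

    public static void main(String[] args) {
        Map<String, String> kraft = new TreeMap<>();
        kraft.put("topic-a", "v1");
        kraft.put("topic-b", "v2");
        Map<String, String> zk = new TreeMap<>();
        zk.put("topic-a", "v1");
        zk.put("topic-b", "stale");
        zk.put("topic-c", "orphan");
        Plan plan = reconcile(kraft, zk);
        System.out.println("upserts=" + plan.upserts + " deletes=" + plan.deletes
            + " operations=" + plan.operationCount());
    }
}
```

Because the plan is computed from the full snapshot rather than from queued deltas, it does not depend on writes that were still pending when the previous controller went away.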

…tream-3.5 * commit 'c2f6f29ca6e1306ac77ec726bac4cd09bd1aa80b': (76 commits) KAFKA-15019: Improve handling of broker heartbeat timeouts (apache#13759) KAFKA-15003: Fix ZK sync logic for partition assignments (apache#13735) MINOR: Add 3.5 upgrade steps for ZK and KRaft (apache#13792) KAFKA-15010 ZK migration failover support (apache#13758) KAFKA-15017 Fix snapshot load in dual write mode for ClientQuotas and SCRAM (apache#13757) MINOR: Update LICENSE-binary following snappy upgrade (apache#13791) Upgrade to snappy v1.1.10.0 (apache#13786) KAFKA-15004: Fix configuration dual-write during migration (apache#13767) KAFKA-8713: JsonConverter replace.null.with.default should prevent emitting default for Struct fields (apache#13781) KAFKA-14996: Handle overly large user operations on the kcontroller (apache#13742) ...

mumrah added 9 commits May 23, 2023 12:22

WIP

1318cd6

Improve logging during dual-write, include a controller failover in s…

78e53e5

…ystem test

Merge remote-tracking branch 'origin/trunk' into KAKFA-15010-migratio…

a1f3ea5

…n-failover

cleanup after merge

09637f6

Add a system test for snapshot reconciliation

415f9e5

Add a system test for snapshot reconciliation

77360ff

Merge remote-tracking branch 'mumrah/KAKFA-15010-migration-failover' …

c20c31b

…into KAKFA-15010-migration-failover

Merge remote-tracking branch 'origin/trunk' into KAKFA-15010-migratio…

b1a8a31

…n-failover

Adjust system test

08822d6

mumrah commented May 26, 2023

mumrah added the kraft label May 26, 2023

mumrah added 4 commits May 27, 2023 17:43

Merge remote-tracking branch 'origin/trunk' into KAKFA-15010-migratio…

2dacad9

…n-failover

Fixup after merge

e36da55

Add unit test for KRaftMigrationZkWriter

fbe1b2c

Test updates

96bb357

cmccabe reviewed May 30, 2023

cmccabe approved these changes May 30, 2023

mumrah added 4 commits May 31, 2023 11:02

Fix producer ID read from ZK, address other PR feedback

ed069df

Merge remote-tracking branch 'origin/trunk' into KAKFA-15010-migratio…

ce2aa1c

…n-failover

fixup after merge

345599d

update system test for new log

779d088

mumrah merged commit d27ba5b into apache:trunk Jun 1, 2023

chia7712 mentioned this pull request Jun 22, 2023

MINOR: fix flaky ZkMigrationIntegrationTest.testNewAndChangedTopicsIn… #13902

Merged
