Skip to content

Conversation

@mumrah
Copy link
Member

@mumrah mumrah commented May 26, 2023

This PR builds on top of #13736.

Fixes the following:

  • Topic configs are not sycned while handling snapshot.
  • New broker/topic configs in KRaft that did not exist in ZK will not be sync'd to ZK.
  • The sensitive configs are not encoded while writing them to Zookeeper.
  • Handle topic configs in ConfigMigrationClient and KRaftMigrationZkWriter#handleConfigsSnapshot

Added tests to ensure we no longer have the above mentioned issues.

Co-Authored-By: Akhilesh C [email protected]

akhileshchg and others added 5 commits May 19, 2023 15:13
…we might

miss updating configs or incorrectly update them in Zookeeper.

Issue#1: Topic configs are not sycned while handling snapshot.
Issue#2: Any broker or topic config resource that is new and not already present in Zookeeper will never be updated in Zookeeper.
Issue#3: The senstive configs are not encoded while writing them to Zookeeper.
Add readTopicConfigs to ConfigMigrationClient
topicClient.iterateTopics(
util.EnumSet.allOf(classOf[TopicVisitorInterest]),
new TopicVisitor() {
override def visitTopic(topicName: String, topicId: Uuid, assignments: util.Map[Integer, util.List[Integer]]): Unit = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just fixing the indentation here

@mumrah mumrah marked this pull request as ready for review May 26, 2023 17:15
@mumrah mumrah added the kraft label May 26, 2023
partitionRegistration.addingReplicas.foreach(brokerIdConsumer.accept(_))
topicBatch.add(new ApiMessageAndVersion(record, 0.toShort))
}
// This breaks the abstraction a bit, but the topic configs belong in the topic batch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really required for the topic config records to come right after the topics. It would be OK to do it in a separate section as we do with snapshots.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering the fact that we don't atomically apply the migration records during the migration. I think it's possible for the controller or broker to publish the migration metadata before it's all committed. In this case, I think it's probably safer to include the config records with the topic batch.

This won't be an issue once we implement KIP-868.

Copy link
Contributor

@cmccabe cmccabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left two minor comments but maybe we can address those in a follow-on if needed.

@mumrah
Copy link
Member Author

mumrah commented May 27, 2023

Got a reasonable test run:

[Build / JDK 17 and Scala 2.13 / org.apache.kafka.tools.MetadataQuorumCommandTest.[6] Type=Raft-Isolated, Name=testDescribeQuorumReplicationSuccessful, MetadataVersion=3.5-IV2, Security=PLAINTEXT](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13767/6/testReport/junit/org.apache.kafka.tools/MetadataQuorumCommandTest/Build___JDK_17_and_Scala_2_13____6__Type_Raft_Isolated__Name_testDescribeQuorumReplicationSuccessful__MetadataVersion_3_5_IV2__Security_PLAINTEXT/)
[Build / JDK 8 and Scala 2.12 / kafka.admin.DescribeConsumerGroupTest.testDescribeStateOfExistingGroupWithRoundRobinAssignor()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13767/6/testReport/junit/kafka.admin/DescribeConsumerGroupTest/Build___JDK_8_and_Scala_2_12___testDescribeStateOfExistingGroupWithRoundRobinAssignor__/)
[Build / JDK 11 and Scala 2.13 / integration.kafka.server.FetchFromFollowerIntegrationTest.testRackAwareRangeAssignor()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13767/6/testReport/junit/integration.kafka.server/FetchFromFollowerIntegrationTest/Build___JDK_11_and_Scala_2_13___testRackAwareRangeAssignor__/)
[Build / JDK 11 and Scala 2.13 / kafka.admin.TopicCommandIntegrationTest.testDescribeUnderMinIsrPartitionsMixed(String).quorum=zk](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13767/6/testReport/junit/kafka.admin/TopicCommandIntegrationTest/Build___JDK_11_and_Scala_2_13___testDescribeUnderMinIsrPartitionsMixed_String__quorum_zk/)
[Build / JDK 11 and Scala 2.13 / org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated()](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-13767/6/testReport/junit/org.apache.kafka.trogdor.coordinator/CoordinatorTest/Build___JDK_11_and_Scala_2_13___testTaskRequestWithOldStartMsGetsUpdated__/)

@mumrah mumrah merged commit 7a679af into apache:trunk May 27, 2023
mumrah added a commit that referenced this pull request May 31, 2023
This patch fixes several small bugs with configuration dual-write during migration.

* Topic configs are not written back to ZK while handling snapshot.
* New broker/topic configs in KRaft that did not exist in ZK will not be written to ZK.
* The sensitive configs are not encoded while writing them to Zookeeper.
* Handle topic configs in ConfigMigrationClient and KRaftMigrationZkWriter#handleConfigsSnapshot

Added tests to ensure we no longer have the above mentioned issues.

Co-authored-by: Akhilesh Chaganti <[email protected]>
Reviewers: Colin P. McCabe <[email protected]>
emissionnebula added a commit to confluentinc/kafka that referenced this pull request Jun 16, 2023
…tream-3.5

* commit 'c2f6f29ca6e1306ac77ec726bac4cd09bd1aa80b': (76 commits)
  KAFKA-15019: Improve handling of broker heartbeat timeouts (apache#13759)
  KAFKA-15003: Fix ZK sync logic for partition assignments (apache#13735)
  MINOR: Add 3.5 upgrade steps for ZK and KRaft (apache#13792)
  KAFKA-15010 ZK migration failover support (apache#13758)
  KAFKA-15017 Fix snapshot load in dual write mode for ClientQuotas and SCRAM  (apache#13757)
  MINOR: Update LICENSE-binary following snappy upgrade (apache#13791)
  Upgrade to snappy v1.1.10.0 (apache#13786)
  KAFKA-15004: Fix configuration dual-write during migration (apache#13767)
  KAFKA-8713: JsonConverter replace.null.with.default should prevent emitting default for Struct fields (apache#13781)
  KAFKA-14996: Handle overly large user operations on the kcontroller (apache#13742)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants