Skip to content

KAFKA-2983: Remove Scala consumers and related code#5230

Merged
ijuma merged 16 commits intoapache:trunkfrom
ijuma:kafka-2983-remove-scala-consumers
Jun 19, 2018
Merged

KAFKA-2983: Remove Scala consumers and related code#5230
ijuma merged 16 commits intoapache:trunkfrom
ijuma:kafka-2983-remove-scala-consumers

Conversation

@ijuma
Copy link
Copy Markdown
Member

@ijuma ijuma commented Jun 14, 2018

  • Removed Scala consumers (SimpleConsumer and ZooKeeperConsumerConnector)
    and their tests.
  • Removed Scala request/response/message classes.
  • Removed any mention of new consumer or new producer in the code
    with the exception of MirrorMaker where the new.consumer option was
    never deprecated so we have to keep it for now. The non-code
    documentation has not been updated either, that will be done
    separately.
  • Removed a number of tools that only made sense in the context
    of the Scala consumers (see upgrade notes).
  • Updated some tools that worked with both Scala and Java consumers
    so that they only support the latter (see upgrade notes).
  • Removed BaseConsumer and related classes apart from BaseRecord
    which is used in MirrorMakerMessageHandler. The latter is a pluggable
    interface so effectively public API.
  • Removed ZkUtils methods that were only used by the old consumers.
  • Removed ZkUtils.registerBroker and ZKCheckedEphemeral since
    the broker now uses the methods in KafkaZkClient and no-one else
    should be using that method.
  • Updated system tests so that they don't use the Scala consumers except
    for multi-version tests.
  • Updated LogDirFailureTest so that the consumer offsets topic would
    continue to be available after all the failures. This was necessary for it
    to work with the Java consumer.
  • Some multi-version system tests had not been updated to include
    recently released Kafka versions, fixed it.
  • Updated findBugs and checkstyle configs not to refer to deleted
    classes and packages.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 14, 2018

@lindong28, do you have time to take a look? I started a system tests run, but maybe you could take an initial pass, if you have time.

@lindong28
Copy link
Copy Markdown
Member

@ijuma Sure! I will review it.

@lindong28 lindong28 self-assigned this Jun 14, 2018
@ijuma ijuma force-pushed the kafka-2983-remove-scala-consumers branch from 81407be to 92da9bd Compare June 14, 2018 22:13
@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 14, 2018

@lindong28 I rebased as a PR had been merged that had broken this branch. Also added a description to the pull request of what is included in the PR.

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 14, 2018

@rajinisivaram, @lindong28 said that he will be able to review this PR.

Copy link
Copy Markdown
Member

@lindong28 lindong28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch! Just have some minor comments. LGTM. I think we can commit the patch after it passes all system tests.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally we need two while loops because mirrorMakerConsumer.receive will block infinitely for old consumer if there is no data. I think it should be safe to just remote the second while loop.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I'll keep the existing code (will just update the comment), but will file a JIRA for simplifying MirrorMaker.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is OK.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. What I was wondering about is whether there were cases where TimeoutException is thrown that we don't want to catch and continue. ConsumerTimeoutException is only thrown by NewShinyConsumer if poll doesn't return any records. Maybe to be safe, I should maintain the exact behaviour and then we can a separate PR to simplify MirrorMaker (instead of being in the middle of this huge one).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that Blacklist is only used in test. Do we still need this?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we only need to keep Whitelist.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since broker is still using zookeeper, would this test class still be useful?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The broker doesn't use ZkUtils anymore and hence why I removed the ZKCheckedEphemeral code and tests. There is a KafkaZkClient.CheckedEphemeral that hasn't been removed and it's still used.

@andrasbeni
Copy link
Copy Markdown
Contributor

@ijuma nit: could you please also remove ZookeeperConsumerConnector related <Match> from gradle/findbugs-exclude.xml ?

Copy link
Copy Markdown
Contributor

@omkreddy omkreddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove options/code related to old metrics reporter?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also remove "Legacy APIs" , "Old Consumer Configs" sections from docs.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as I said in the PR description the documentation hasn't been updated and we can do that separately.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can remove "New" from comment. also ProducerConfig constructor is public now

Copy link
Copy Markdown
Contributor

@omkreddy omkreddy Jun 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also remove ImportZkOffsets?

Also we have not removed ZK based offset management. that means we are going to maintain ZK based offset management some more time? Existing users can continue to use older release scala consumers if they want?

In that case, Can we deprecate ZK based offset management and mention in docs?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, removed ImportZkOffsets. What do you mean by "we have not removed ZK based offset management?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we want the broker to still support everything for now. We need a KIP to discuss removal and deprecation of broker functionality related to the old consumer. I restored a change in ZkData as I hadn't realised that the broker was relying on that functionality itself.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we include ExportZkOffsets

@ijuma ijuma force-pushed the kafka-2983-remove-scala-consumers branch from 92da9bd to 6f10455 Compare June 15, 2018 14:10
@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 15, 2018

Rebased and addressed the comments so far. Filed KAFKA-7062 and KAFKA-7063 for follow-ups. If anyone wants to pick them up, feel free. :)

@ijuma ijuma force-pushed the kafka-2983-remove-scala-consumers branch 2 times, most recently from aceb212 to dabfa0d Compare June 17, 2018 16:32
@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 17, 2018

@lindong28, I fixed a few system test issues. One of the remaining ones is log_dir_failure_test.py. 3 of 4 are failing since they have been switched to use the new consumer. I took an initial look and the issue is that the coordinator is not available for the duration of the produce/consume test for the code below:

# Verify the following:
            # 1) The broker with offline directory is the only in-sync broker of the partition of topic2
            # 2) Messages can still be produced and consumed from topic2
            self.producer = VerifiableProducer(self.test_context, self.num_producers, self.kafka, self.topic2,
                                               throughput=self.producer_throughput, offline_nodes=offline_nodes)
            self.consumer = ConsoleConsumer(self.test_context, self.num_consumers, self.kafka, self.topic2, group_id="test-consumer-group-2",
                                            consumer_timeout_ms=90000, message_validator=is_int)
            self.consumer_start_timeout_sec = 90
            self.start_producer_and_consumer()

Since you wrote this test, is there a chance you could take a look? It's a bit worrying that this doesn't work with the Java consumer.

@lindong28
Copy link
Copy Markdown
Member

@ijuma Sure, I will look into this issue.

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 17, 2018

I'm running the broker system tests in Jenkins again (https://jenkins.confluent.io/view/All/job/system-test-kafka-branch-builder/1803/console), but I believe the only remaining failing test is log_dir_failure_test.py.

@ijuma ijuma force-pushed the kafka-2983-remove-scala-consumers branch from dabfa0d to 7a8493f Compare June 18, 2018 07:24
@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 18, 2018

@lindong28 do you think you'll have time to look into this today? If not, I'll partially disable the test, file a JIRA and merge this PR. These changes don't introduce the issue after all, they are just exposing the fact that the test as it stands doesn't work with the new consumer.

@lindong28
Copy link
Copy Markdown
Member

@ijuma Yes! I am looking into this now. Sorry for the delay.

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 18, 2018

@lindong28 Thanks! To be clear, there was no delay on your part, it's just that the timeline for this PR to make it into 2.0 is very short. :) So, I was trying to understand your availability to figure out the best path forward. I will wait in case there's an easy tweak to the test to make it pass with the new consumer.

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 18, 2018

@lindong28 interestingly, only 1 of 4 log dir failure tests failed in the run I started last night:

https://jenkins.confluent.io/view/All/job/system-test-kafka-branch-builder/1804/consoleFull

So, it looks flaky instead of failing deterministically.

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 18, 2018

@lindong28 Looking at the test a bit more, when we "Shutdown all other brokers so that the broker with offline log dir is the only online broker", how do we ensure that there are no consumer_offsets partitions that become offline?

@lindong28
Copy link
Copy Markdown
Member

@ijuma Sure thing. I would be more than happy to make this into 2.0 to make it shiny instead of disabling the test :) Mostly likely I will be able to fix it today.

That is a very good point. The default offsets.topic.replication.factor in Kafka server is 3. Since we start three brokers in the test before we start producer and consumer, the offset topic will be created with RF=3. And I verified that the consumer offset topic is replicated across all three brokers and the test still failed.

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 18, 2018

Right. Do I understand correctly that we have 2 out of 3 brokers down and the remaining online broker has one log dir offline? It seems like some consumer offset partitions could be in the offline log dir, which would cause the coordinator to never become available again.

@lindong28
Copy link
Copy Markdown
Member

@ijuma Ah I see you point. Yes you are right. If the consumer offset topic is in the offline log directory of the only online broker then it will cause the CoordinatorNotAvailableException. Let me try to fix this problem. Thanks!

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 18, 2018

@lindong28 yes :)

@ijuma ijuma force-pushed the kafka-2983-remove-scala-consumers branch from 7a8493f to 2e94ea1 Compare June 18, 2018 22:02
@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 19, 2018

@lindong28
Copy link
Copy Markdown
Member

lindong28 commented Jun 19, 2018

@ijuma Great! The patch looks good overall when I reviewed a few days ago. A few minor commits have been made to ensure that we pass integration tests and system tests. Do you need me to go over this again?

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 19, 2018

@lindong28, it's enough for you to review the commits added since your last review

Copy link
Copy Markdown
Member

@lindong28 lindong28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left only one minor comment.

.ofType(classOf[java.lang.Integer])
val skipMessageOnErrorOpt = parser.accepts("skip-message-on-error", "If there is an error when processing a message, " +
"skip it instead of halt.")
val csvMetricsReporterEnabledOpt = parser.accepts("csv-reporter-enabled", "If set, the CSV metrics reporter will be enabled")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is csvMetricsReporterEnabledOpt only used with scala consumer? It seems that KafkaCSVMetricsReporter is not deprecated or removed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KafkaCSVMetricsReporter is based on Yammer Metrics. This is still used by the broker and hence why it's not deprecated. The Java clients use Kafka Metrics instead. So, it seems to me that there will be no metrics reported if this is enabled with the Java consumer making it useless. Am I missing something?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I didn't know that KafkaCSVMetricsReporter is based on Yammer metrics. Then it makes sense to remove it since only scala clients use Yammer metrics. Thanks for the explanation. I have no further comments.

@ijuma
Copy link
Copy Markdown
Member Author

ijuma commented Jun 19, 2018

Thanks for the reviews. Merging to trunk and 2.0.

@ijuma ijuma merged commit cc4dce9 into apache:trunk Jun 19, 2018
@ijuma ijuma deleted the kafka-2983-remove-scala-consumers branch June 19, 2018 14:33
ijuma added a commit that referenced this pull request Jun 19, 2018
- Removed Scala consumers (`SimpleConsumer` and `ZooKeeperConsumerConnector`)
and their tests.
- Removed Scala request/response/message classes.
- Removed any mention of new consumer or new producer in the code
with the exception of MirrorMaker where the new.consumer option was
never deprecated so we have to keep it for now. The non-code
documentation has not been updated either, that will be done
separately.
- Removed a number of tools that only made sense in the context
of the Scala consumers (see upgrade notes).
- Updated some tools that worked with both Scala and Java consumers
so that they only support the latter (see upgrade notes).
- Removed `BaseConsumer` and related classes apart from `BaseRecord`
which is used in `MirrorMakerMessageHandler`. The latter is a pluggable
interface so effectively public API.
- Removed `ZkUtils` methods that were only used by the old consumers.
- Removed `ZkUtils.registerBroker` and `ZKCheckedEphemeral` since
the broker now uses the methods in `KafkaZkClient` and no-one else
should be using that method.
- Updated system tests so that they don't use the Scala consumers except
for multi-version tests.
- Updated LogDirFailureTest so that the consumer offsets topic would
continue to be available after all the failures. This was necessary for it
to work with the Java consumer.
- Some multi-version system tests had not been updated to include
recently released Kafka versions, fixed it.
- Updated findBugs and checkstyle configs not to refer to deleted
classes and packages.

Reviewers: Dong Lin <[email protected]>, Manikumar Reddy <[email protected]>
ying-zheng pushed a commit to ying-zheng/kafka that referenced this pull request Jul 6, 2018
- Removed Scala consumers (`SimpleConsumer` and `ZooKeeperConsumerConnector`)
and their tests.
- Removed Scala request/response/message classes.
- Removed any mention of new consumer or new producer in the code
with the exception of MirrorMaker where the new.consumer option was
never deprecated so we have to keep it for now. The non-code
documentation has not been updated either, that will be done
separately.
- Removed a number of tools that only made sense in the context
of the Scala consumers (see upgrade notes).
- Updated some tools that worked with both Scala and Java consumers
so that they only support the latter (see upgrade notes).
- Removed `BaseConsumer` and related classes apart from `BaseRecord`
which is used in `MirrorMakerMessageHandler`. The latter is a pluggable
interface so effectively public API.
- Removed `ZkUtils` methods that were only used by the old consumers.
- Removed `ZkUtils.registerBroker` and `ZKCheckedEphemeral` since
the broker now uses the methods in `KafkaZkClient` and no-one else
should be using that method.
- Updated system tests so that they don't use the Scala consumers except
for multi-version tests.
- Updated LogDirFailureTest so that the consumer offsets topic would
continue to be available after all the failures. This was necessary for it
to work with the Java consumer.
- Some multi-version system tests had not been updated to include
recently released Kafka versions, fixed it.
- Updated findBugs and checkstyle configs not to refer to deleted
classes and packages.

Reviewers: Dong Lin <[email protected]>, Manikumar Reddy <[email protected]>
jonlee2 added a commit to linkedin/kafka that referenced this pull request Jan 8, 2019
xiowu0 pushed a commit to linkedin/kafka that referenced this pull request Jun 11, 2019
…hich was removed by "KAFKA-2983: Remove Scala consumers and related code (apache#5230)" (commit cc4dce9).

TICKET =
LI_DESCRIPTION =

EXIT_CRITERIA = MANUAL ["describe exit criteria"]
xiowu0 pushed a commit to xiowu0/kafka that referenced this pull request Jul 10, 2019
…hich was removed by "KAFKA-2983: Remove Scala consumers and related code (apache#5230)" (commit cc4dce9).

TICKET =
LI_DESCRIPTION =

EXIT_CRITERIA = MANUAL ["describe exit criteria"]
xiowu0 pushed a commit to linkedin/kafka that referenced this pull request Jul 17, 2019
…hich was removed by "KAFKA-2983: Remove Scala consumers and related code (apache#5230)" (commit cc4dce9).

TICKET =
LI_DESCRIPTION =
This patch included fix by Jon lee from 2.0-li branch and
a follow up fix by xiongqi welsey wu for 2.3 branch.
EXIT_CRITERIA = MANUAL ["When old scala consumer is no longer needed"]
xiowu0 pushed a commit to linkedin/kafka that referenced this pull request Aug 22, 2019
…hich was removed by "KAFKA-2983: Remove Scala consumers and related code (apache#5230)" (commit cc4dce9) in upstream.

TICKET =
LI_DESCRIPTION =
This patch included fix by Jon Lee from 2.0-li branch and
a follow up fix by Xiongqi Welsey Wu for 2.3-li branch.
EXIT_CRITERIA = MANUAL ["When old Scala consumer is no longer needed"]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants