Skip to content

Conversation

@ijuma
Copy link
Member

@ijuma ijuma commented Jun 24, 2019

Scala 2.13 support was added to build via #5454. This PR adjusts the code so that
it compiles with 2.11, 2.12 and 2.13.

Changes:

  • Add scala-collection-compat dependency.
  • Import scala.collection.Seq in a number of places for consistent behavior between
    Scala 2.11, 2.12 and 2.13.
  • Remove wildcard imports that were causing the Java classes to have priority over the
    Scala ones, related Scala issue: Explicit imports now shadow locally defined identifiers scala/scala#6589.
  • Replace parallel collection usage with Future. The former is no longer included by
    default in the standard library.
  • Replace val _: Unit workaround with one that is more concise and works with Scala 2.13
  • Replace filterKeys with filter when we expect a Map. filterKeys returns a view
    that doesn't implement the Map trait in Scala 2.13.
  • Replace mapValues with map or add a toMap as an additional transformation
    when we expect a Map. mapValues returns a view that doesn't implement the
    Map trait in Scala 2.13.
  • Replace breakOut with iterator and to, breakOut was removed in Scala
    2.13.
  • Replace to() with toMap, toIndexedSeq and toSet
  • Replace mutable.Buffer.-- with filterNot.
  • ControlException is an abstract class in Scala 2.13.
  • Variable arguments can only receive arrays or immutable.Seq in Scala 2.13.
  • Use Factory instead of CanBuildFrom in DecodeJson. CanBuildFrom behaves
    a bit differently in Scala 2.13 and it's been deprecated. Factory has the behavior
    we need and it's available via the compat library.
  • Fix failing tests due to behavior change in Scala 2.13,
    "Map.values.map is not strict in Scala 2.13" (Map.values.map is not strict in Scala 2.13 scala/bug#11589).
  • Use Java collections instead of Scala ones in StreamResetter (a Java class).
  • Adjust CheckpointFile.write to take an Iterable instead of Seq to avoid
    unnecessary collection copies.
  • Fix DelayedElectLeader to use a Map instead of Set and avoid to call that
    doesn't work in Scala 2.13.
  • Use unordered map for mapping in SimpleAclAuthorizer, mapping of ordered
    maps require an Ordering in Scala 2.13 for safety reasons.
  • Adapt ConsumerGroupCommand to compile with Scala 2.13.
  • CoreUtils.min takes an Iterable instead of TraversableOnce, the latter does
    not exist in Scala 2.13.
  • Replace Unit with () in a couple places. Scala 2.13 is stricter when it expects
    a value instead of a type.
  • Fix bug in CustomQuotaCallbackTest where we did not necessarily set partitionRatio
    correctly, forall can terminate early.
  • Add a couple of spotbugs exclusions that are needed by code generated by Scala 2.13
  • Remove unused variables, simplify some code and remove procedure syntax in a few
    places.
  • Remove unused CoreUtils.JSONEscapeString.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma
Copy link
Member Author

ijuma commented Jun 24, 2019

retest this please

@ijuma ijuma closed this Jun 24, 2019
@ijuma ijuma reopened this Jun 24, 2019
@ijuma ijuma closed this Jun 24, 2019
@ijuma ijuma reopened this Jun 24, 2019
@ijuma
Copy link
Member Author

ijuma commented Jun 24, 2019

retest this please

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hachikuji @jsancio Do you know why this was a Set? It seems to me that a Map makes more sense, but I thought I'd ask for a second opinion and I know you have made changes to related code recently.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, a Map makes more sense. Thanks for the change.

@ijuma
Copy link
Member Author

ijuma commented Jun 24, 2019

Not sure why the PR builder for Scala 2.13 is not triggering automatically. I started it manually:

https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/16/

Copy link
Member

@jsancio jsancio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thanks for doing this @ijuma

What your thought on removing our usage of deprecated classes and methods like JavaConverters, filterKeys, etc?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the suggested way of doing this do that it has the same behaviour in Scala 2.11, 2.12 and 2.13 is ...asScala.view.filterKeys(_.startsWith("...")).toMap.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Views have a lot of problems in 2.11 and 2.12 so we generally stay away from them (they were completely rewritten in Scala 2.13). For code like this where performance doesn't matter, it's fine to go with the simple approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you remember what are these issues?
I am aware of GC pressure for long lived "view" objects since they hold on to the underlining collection. I don't think this issue applies here since filterKeys creates a "view" which is converted to a strict Map and we never store a reference to the "view".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several bugs, particularly in 2.11. You can search the bug tracker if you like. People working on the standard library don't recommend views, they suggest iterator instead. The migration notes for 2.13 also don't mention views as the replacement for the idiom, the example given:

mapValues and filterKeys now return a MapView instead of a Map kvs.mapValues(f) kvs.mapValues(f).toMap

https://docs.scala-lang.org/overviews/core/collections-migration-213.html

A recent article talks about two issues: https://www.scala-lang.org/blog/2017/11/28/view-based-collections.html

I also recall an article showing that views didn't even perform better than the strict equivalent in many scenarios they were designed for.

Personally, I don't think it's worth the additional complexity (particularly around debugging).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Thanks for the information. The deprecation notice for filterKeys says:

(Since version 2.13.0) Use .view.filterKeys(f). A future version will include a strict version of this method (for now, .view.filterKeys(p).toMap).

Looks like there may be some inconsistency in the documentation. Like you say, in this particular code it doesn't matter much but maybe we should do what you did in a few other places:

scalaMap.filter { case (key, value) => predicate(key) }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I changed filterKeys to filter here and a few other places.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified by merging these two lines to getCommittedOffsets(groupId).keys.toSeq.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think importing scala.collection.{Set, Seq, Map, etc} causes a lot of issues. I think this is going to be even more problematic with 2.13. I think we should ban these imports and only allow the Predef imports, scala.collection.mutable and scala.collection.immutable. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on the issues you are thinking of? The Predef imports are all about immutable collections and they have different performance characteristics and I don't think it makes sense to make that change for Scala 2.13 support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we take Map as an example, the issue is that collection.Map, collection.immutable.Map and collection.mutable.Map are all different types. collection.Map is a parent both mutable and immutable but the reality is that collection.Map cannot be safely downcasted to collection.immutable.Map or collection.mutable.Map.

[info] Compiling 1 Scala source to target/scala-2.12/classes ...
[error] TestCollection.scala:16:22: type mismatch;
[error]  found   : Map[String,String] (in scala.collection)
[error]  required: Map[String,String] (in scala.collection.immutable)
[error]   expectImmutableMap(map)
[error]                      ^
[error] TestCollection.scala:18:22: type mismatch;
[error]  found   : Map[String,String] (in scala.collection.mutable)
[error]  required: Map[String,String] (in scala.collection.immutable)
[error]   expectImmutableMap(mutableMap)
[error]                      ^
[error] TestCollection.scala:20:20: type mismatch;
[error]  found   : Map[String,String] (in scala.collection)
[error]  required: Map[String,String] (in scala.collection.mutable)
[error]   expectMutableMap(map)
[error]                    ^
[error] TestCollection.scala:21:20: type mismatch;
[error]  found   : Map[String,String] (in scala.collection.immutable)
[error]  required: Map[String,String] (in scala.collection.mutable)
[error]   expectMutableMap(immutableMap)
[error]                    ^
[error] four errors found

It is very rare that we have code that is generic over mutable vs immutable collection. I think that this conversions are simplify is we require a concrete version like collection.immtable.Map or collection.mutable.Map.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Predef imports are all about immutable collections

This is true except for Seq but they are fixing this in Scala 2.13. In the other Scala version Seq was a alias for collection.Seq which is the parent class for collection.immutable.Seq and collection.mutable.Seq.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth understanding the context. The Kafka code was written with the assumption that the Seq in Predef was not immutable. In Scala 2.13 it has become immutable. The simplest change to maintain existing assumptions is to add import scala.collection.Seq everywhere. In the future, we can consider other changes, I don't think they should be in the Scala 2.13 PR.

In this particular class, I did add an import to scala.collection.Map and I can remove that if that's what you are objecting to.

It is very rare that we have code that is generic over mutable vs immutable collection.

I don't think this is true, a lot of code just does map/filter/flatMap/foreach/etc. Such code doesn't care whether the collection is mutable or immutable.

but the reality is that collection.Map cannot be safely downcasted to collection.immutable.Map or collection.mutable.Map

I don't think you should ever downcast collections outside of very specialized collection utility code (e.g. a method that converts from one collection to another could do this for best performance).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also. I think the recommend way to guarantee the same behaviour across Scala version (lazy eval) is to do ...all.get().asScala.view.mapValues(_.asScala).toMap

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I responded to the other comment. Let's discuss it there and we'll do here whatever we agree there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about proposedReplicaAssignment.view.filterKeys(...).toMap?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment about views being discouraged in Scala 2.11 and Scala 2.12.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about newConfig.originals.asScala.view.filterKeys(...).filterKeys(...).toMap?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this in the other comment about views. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find that one line imports like some.package.{One, Two, Three} lead to unnecessary merge conflicts. How about encouraging one line imports like:

import some.package.One
import some.package.Two
...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the current style. If we want to change it, let's start a discussion in the mailing list.

@ijuma
Copy link
Member Author

ijuma commented Jun 26, 2019

3 test failures remaining:

kafka.api.CustomQuotaCallbackTest.testCustomQuotaCallback
org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes.testShouldCountClicksPerRegionWithNamedRepartitionTopic
org.apache.kafka.streams.scala.StreamToTableJoinScalaIntegrationTestImplicitSerdes.testShouldCountClicksPerRegion

@debasishg Any idea why the Streams tests are failing with a ClassCastException?

Exception in thread "stream-table-join-scala-integration-test-ad01dbce-3a61-4cd8-97a7-af0bb5f57698-StreamThread-1" java.lang.ClassCastException: class [B cannot be cast to class java.lang.String ([B and java.lang.String are in module java.base of loader 'bootstrap')
at org.apache.kafka.streams.scala.FunctionsCompatConversions$MapperFromFunction$$anon$4.apply(FunctionsCompatConversions.scala:51)
at org.apache.kafka.streams.kstream.internals.KStreamKTableJoinProcessor.process(KStreamKTableJoinProcessor.java:80)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:201)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:180)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:133)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:87)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:348)
at org.apache.kafka.streams.processor.internals.AssignedStreamsTasks.process(AssignedStreamsTasks.java:199)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:420)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:866)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:781)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:750)

@ijuma
Copy link
Member Author

ijuma commented Jun 27, 2019

I figured out the reason for the Streams failure and pushed a fix (8139e52d2aa92322504). I'm not sure if this is an intentional change in Scala 2.13, but the behavior of something like the following has changed:

package org.apache.kafka.streams.scala.utils
import org.apache.kafka.streams._
import org.apache.kafka.streams.scala.kstream._

Assume the existance of org.apache.kafka.streams.StreamsBuilder and org.apache.kafka.streams.scala.StreamsBuilder. In 2.11 and 2.12, the latter is in scope. In 2.13, it's the former. It leads to hard to debug behavior.

@ennru
Copy link

ennru commented Jun 27, 2019

Sounds like "Explicit imports now shadow locally defined identifiers" scala/scala#6589

@debasishg
Copy link
Contributor

apologies @ijuma - I am offsite in some team meeting and could not take a look. Thanks for the fix.

@ijuma
Copy link
Member Author

ijuma commented Jun 28, 2019

Thanks for the reference @ennru. No worries @debasishg. :)

@ijuma
Copy link
Member Author

ijuma commented Jun 29, 2019

I fixed the bug causing the remaining test to fail. Rebased and pushed.

@ijuma ijuma requested a review from omkreddy June 29, 2019 16:10
@ijuma
Copy link
Member Author

ijuma commented Jun 29, 2019

@ijuma
Copy link
Member Author

ijuma commented Jun 29, 2019

retest this please

@ijuma
Copy link
Member Author

ijuma commented Jun 30, 2019

Scala 2.11 and 2.12 builds are green. Scala 2.13 had one seemingly unrelated test failure:
https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/23/

Started another 2.13 build just in case. This is ready for review.

@ijuma
Copy link
Member Author

ijuma commented Jun 30, 2019

Last run for Scala 2.13 had 2 flaky Connect failures that passed in the previous run:

org.apache.kafka.connect.integration.ExampleConnectIntegrationTest.testSourceConnector
org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest.testReconfigConnector

@ijuma ijuma requested a review from gwenshap July 1, 2019 14:00
Copy link
Contributor

@omkreddy omkreddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ijuma
Copy link
Member Author

ijuma commented Jul 2, 2019

Thanks for the review, another run of the Scala 2.13 had another unrelated flaky test failure:

kafka.admin.TopicCommandWithAdminClientTest.testDeleteInternalTopic

Merging to master.

@ijuma ijuma merged commit 6dd4ebc into apache:trunk Jul 2, 2019
@ijuma ijuma deleted the fix-scala-2.13 branch July 2, 2019 13:29
@SethTisue
Copy link

Is there a ticket I can link to from https://github.com/scala/make-release-notes/blob/2.13.x/projects-2.13.md#pending that people can subscribe to track progress on a release that includes 2.13 support?

@ijuma
Copy link
Member Author

ijuma commented Jul 3, 2019

@SethTisue This will be part of 2.4, the next release. We don't have a ticket for upcoming releases, we will have a release plan wiki page linked from https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan (we do releases every 4 months).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants