Skip to content

Conversation

@lrytz
Copy link
Member

@lrytz lrytz commented Nov 22, 2018

Add Java/Scala converters for options, function types and Java stream, as known from the java8-compat library. Rearrange the existing collection converters to fit into the new scheme.

Naming overview:

  • new package scala.jdk with objects CollectionConverters (classic Java collections, similar to collection.JavaConverters in 2.12), StreamConverters, FunctionConverters and OptionConverters
  • These object have explicit conversion methods, e.g., OptionConverters.toJava(Some("hi")). These methods are recommended when writing Java code.
  • Each XConverter has a nested object Ops providing extension methods (asJava, toScala, etc). Using these extension methods is recommended when writing Scala code.
  • Naming convention:
    • as... when a wrapper is created (collection converters), to... when a new independend object is created (options)
    • Plain o.asJava creates the most specific possible Java type for o. Other possible target types can be created with o.asJavaX

Note that the split between explicit conversions and extension methods (in Ops) is new. In 2.12, JavaConverters contained both. Also the function converters in java8-compat mixes them up. We decided to split this systematically now.

Function converters

java8-compat provided full-named explicit conversions (now available as FunctionConverters.asJavaDoubleUnaryOperator, for example) and only two extension methods: asJava and asScala. The asJava extension method creates the most specialized possible Java function type.

If a different Java function type is required, java8-compat required to fall back on the explicit conversion. Here, we instead add additional extension methods for the cases where a more generic version than asJava is available, for example ((x: Int) => x > 0).asJavaFunction.

Option converters

Trivial

Collection Converters

Functionality is the same as in 2.12. The extension methods have been renamed for simplicity. Now, asJava is always avaliable and creates a wrapper for the most common and specialized Java type. Conversions to other Java types have longer extension method names (asJavaDictionary).

scala.collection.convert.ImplicitConversions is deprecated (no replacement, use extension methods). collection.JavaConverters is deprecated in favor of the new CollectionConverters.

Stream Converters

  • scala.collection.Stepper[A] is small interface that provides iterator-like methods hasStep and nextStep. Steppers are specialized for Int, Long and Double (like Java Streams), so consuming a primitive collection with a Stepper doesn't box the elements. Steppers can be converted to java.util.Spliterator, java.util.Iterator and scala.Iterator
  • IterableOnce gets a stepper[S](shape: StepperShape[S]): S virtual method (In java8-compat, this was an extension method). The StepperShape ensures that collections of Int/Long/Double produce primitive steppers.
  • IndexedSeqOps overrides stepper to return S with EfficientSplit.
  • Collections with stepper: Stepper get a asJavaSeqStream extension method
  • Collections with stepper: Stepper with EfficientSplit get a asJavaParStream extension method
  • Arrays and Strings get stepper, asJavaSeqStream and asJavaParStream extension methods
  • A new collection.convert.Accumulator collection type is added, similar to UnrolledBuffer, but with hand-specialized Int/Long/DoubleAccumulator and a generic AnyAccumulator subclasses.
    • The accumulator stepper support efficient splitting, so it's a good collection to run a parallel stream on. Users could convert for example a List[Int] to an IntAccumulator for processing a parallel stream. (Converting to an Array[Int] is also a good option).
    • The collection is a good target collection to collect the result of a parallel stream, because it supports efficient merging, so it can be built in parallel.
    • Accumulator extends Iterable (in java8-compat, the collection was not integrated into the hierarchy). Converting a collection.to(Accumulator) automatically builds a primitive accumulator for primitive element types (java8-compat has an accumulate extension methods to convert from Scala collections).
  • Java streams get a toScala extension method that works the same as the to method in collections. (java8-compat has an accumulate extension method to convert to an Accumulator. This is now done with .toScala(Accumulator).)
  • MapOps gets keyStepper and valueStepper virtual methods. They can be overridden in subclasses to return steppers with EfficientSplit.
  • MapOps gets asJavaSeqKeyStream / asJavaSeqValueStream and asJavaParKeyStream / asJavaParValueStream extension methods

Duration Converters

Convert between Java and Scala duration types, straightforward

Future Converters

...

TODO

Done

  • Decide: move Stepper class(es) to scala.collection package
  • Integrate future converters
  • Rename asJavaSeqStream to only seqStream (and similar for all stream conversions)?, No, keep the asJava part (decided on team meeting)
  • Keep the Accumulator classes in package scala.collection.convert? Yes, the main use case is parallel building, so it's really useful for Streams interop (decided on team meeting)
  • Integrate duration converters? Yes
  • Keep the structure and package name in scala.jdk? Yes (decided on team meeting).
  • port Stepper implementation for concrete collection types
    • they should live close(r) to the collection implementations so that internals don't need to be exposed
    • to do (rough list - i didn't check in detail. need to see which are necessary / useful): Vector, immutable.HashMap/Set (champ - necessary?), mutable.HashMap/HashSet (wait for Stefan's new implementation to be merged), mutable.LinkedHashMap/Set, NumericRange, Range, BitSet, String. done: Iterable, IndexedSeq, Array
  • decide on naming - what should the imports look like for the various java converters (function, options, traditional java collections, java streams, java concurrent)?
    • make sure that the java converter objects (used to import the conversions/extension methods in Scala code) provide consistently named explicit conversion methods that can be used in Java (or make separate objects for that purpose). this is already done in the converters for traditional collection, options, functions, but needs to be done for java streams.
  • clean up the Accumulator classes
    • there might be some unnecessary methods in there now that they're Iterables. Though need to be careful with boxing, the operations in accumulator might avoid boxing when using the hand-specialized subclasses
    • check integration with collections (companion / factory?)
    • check serialization
  • clean up Stepper classes (check operations, are they necessary to prevent boxing? otherwise .iterator.op would do the same)

Non-blockers

@lrytz lrytz added the WIP label Nov 22, 2018
@scala-jenkins scala-jenkins added this to the 2.13.0-RC1 milestone Nov 22, 2018
@smarter
Copy link
Member

smarter commented Nov 22, 2018

Maybe having scala.JavaConverters is too confusing.

Yes I think so, it's way to easy to use the wrong one and not understand why it's not working, there should either be only one JavaConverters that brings everything with it, or two objects with different names.

@SethTisue SethTisue added the library:collections PRs involving changes to the standard collection library label Nov 23, 2018
@lrytz
Copy link
Member Author

lrytz commented Nov 26, 2018

There are a number of decisions to be made about naming.

  • Names of the objects holding the implicit conversions (to provide extension methods, like asJava, asJavaDictionary). Some suggestions:
    • scala.JavaConvertersFunction._, scala.JavaConvertersCollection._
    • Or without Java in the name: scala.FunctionConverters._, scala.CollectionConverters._
    • Or in a package: scala.convert.Functions._, scala.convert.Collections._
    • Keep the one for collections in the scala.collection package? There's a risk of having "collection" twice in its path/name (e.g., scala.collection.CollectionConverters). The current scala.collection.JavaConverters is probably no longer ideal, see comments above.
  • Names of the objects holding the explicit conversion methods (like scala.collection.JavaConverters.mapAsJavaMap(m)). These are mostly intended to be used when writing Java code.
    • The current collections.JavaConverters holds both, explicit conversions and implicit extension method providers. The API is then not very clean: users see a mix of all methods, it's not trivial to see only explicit conversion methods.
  • Keep the generic asJava extension methods which don't mention the target type in the name?
    • For example, the current collections.JavaConverters provides scalaMap.asJava to create a java.util.Map, but also .asJavaDictionary and .asJavaCollection to target different Java types
    • I personally think it's good to have a default asJava for the most common conversion (if there's one). For collections this would mean keep the 2.12 behavior and make asJava convert to "standard" Java collections (not Java 8 streams).

Opinions welcome, I'll try to bring this together in a consistent way.

@lrytz
Copy link
Member Author

lrytz commented Dec 3, 2018

I started adding commits for the stream converters.

As suggested in our team meeting, i made the stepper method a member of IterableOnceOps instead of an extension method. So the core collections depend on the Stepper interface (which itself refers to java.util.Spliterator).

The reasons for this choice:

  • The virtual method would allow adding "better" (faster / more low-level) spliterators in a minor release for specific collection types
  • The stepper implementation can live close to the collection types, which means implementation details don't need to be exposed
  • 3rd party collections can override stepper to provide a better implementation, without requiring users to do an import to get a "better" extension method

The parStream / seqStream methods remain extension methods, so that converting to a stream requires an import SomethingConverters._ (and the core collections API remains independent of the Java streams API)

cc @szeiger @retronym @Ichoran

@sjrd
Copy link
Member

sjrd commented Dec 3, 2018

Hum, I don't like this bit about the streams.

Java collections are one thing, we've implemented them. Function converters, well, we haven't implemented Java's function types, but we could do it. Streams, that's another story. It's a big chunk of code. It's unlikely that we'll implement them in Scala.js anytime soon, if ever.

That means that this PR is reintroducing in the standard library APIs that will compile in Scala.js but won't link, because of missing transitive classes. This is one of the sources of confusion most often seen in Scala.js.

I was looking forward to a future with a standard library that compiles iff it links (e.g., with scala.sys and io moved out in modules). This change is a step in the other direction. :-(

@lrytz
Copy link
Member Author

lrytz commented Dec 3, 2018

I understand your position.

The thinking behind this PR: Scala provides interop with Java out of the box. But in 2.12 we ship integration only with traditional Java collections, even though we require the Java 8 runtime. We should keep our interop up to date with the (core) platform, and not fall (stay) behind.

@lrytz lrytz force-pushed the java8-converters branch 4 times, most recently from aad1a31 to 7f9db09 Compare December 6, 2018 08:40
@lrytz
Copy link
Member Author

lrytz commented Dec 7, 2018

Pushed a few changes to ensure conversions between primitive arrays, steppers, java streams and accumulators don't box. I used https://github.com/google/allocation-instrumenter for that, test code here: https://gist.github.com/lrytz/900dcc6aba5479ae615098bb0d74b1ee.

@lrytz lrytz force-pushed the java8-converters branch 2 times, most recently from 2655a87 to 848c86d Compare December 10, 2018 20:33
@lrytz
Copy link
Member Author

lrytz commented Dec 11, 2018

The core of this PR is done now

  • Added function converters (manually invoked source generator, similar to genprod)
  • Added option converters
  • Added core of stream integration

Function and option converters are the same as in java8-copmpat. For the stream integration there are a few differences in the design, I'll point them out in the summary below.

Details about stream integration:

  • Stepper[A] is a blend of Iterator and Spliterator, it can be converted to both. There are hand-specialized Int/Long/DoubleStepper and a generic AnyStepper[T] subclasses.
  • IterableOnceOps gets a stepper[S](shape: StepperShape[S]): S virtual method. The StepperShape ensures that collections of Int/Long/Double produce primitive steppers. In java8-compat, this was an extension method. See Integrate converters from java8-compat [ci: last-only] #7458 (comment).
  • IndexedSeqOps overrides stepper to return S with EfficientSubstep.
  • IterableOnceOps gets seqStream and parStream extension methods. The parStream method can only be used on collections where stepper returns an S with EfficientSubstep.
  • Similarly, Stepper gets a seqStream and Stepper with EfficientSubstep a parStream extension method. In java8-compat, they were member methods.
  • The extension methods are available when importing XJavaConverters._ (naming is to be decided)
  • ArrayOps gets stepper, and arrays get seqStream/parStream extension methods
  • A new collection.convert.Accumulator collection type is added, similar to UnrolledBuffer, but with hand-specialized Int/Long/DoubleAccumulator and a generic AnyAccumulator subclasses.
    • The accumulator stepper support efficient splitting, so it's a good collection to run a parallel stream on. Users could convert for example a List[Int] to an IntAccumulator for processing a parallel stream. (Converting to an Array[Int] is also a good option).
    • The collection is a good target collection to collect the result of a parallel stream, because it supports efficient merging.
    • Accumulator extends Iterable (in java8-compat, the collection was not integrated into the hierarchy). Converting a collection.to(Accumulator) automatically builds a primitive accumulator for primitive element types. (java8-compat has an accumulate extension methods to convert from Scala collections).
  • Java streams get a toScala extension method that works the same as the to method in collections. (java8-compat has an accumulate extension method to convert to an Accumulator. This is now done with .toScala(Accumulator).)
  • Java spliterrators get a stepper extension method
  • MapOps gets keyStepper and valueStepper virtual methods. They can be overridden in subclasses to return steppers with EfficientSubstep.
  • MapOps gets seqKeyStream / seqValueStream and parKeyStream / parValueStream extension methods. The par versions can only be used on maps where key/valueStepper methods return a stepper with EfficientSubstep.

It would be good to get reviewing going for the parts that are done so far, and of course I'd be very happy if someone is motivated to help me finishing this up :)

@szeiger
Copy link
Contributor

szeiger commented Dec 11, 2018

How about adding stepper to IterableOnce, not just IterableOnceOps. Parallel collections only implement IterableOnce. It would be useful if they could provide an efficient stepper instead of only a sequential iterator.

@lrytz
Copy link
Member Author

lrytz commented Dec 11, 2018

That works for me, I didn't have a good reason to pick one over the other so far.

@Ichoran Ichoran self-assigned this Jan 10, 2019
@adriaanm adriaanm added the prio:blocker release blocker (used only by core team, only near release time) label Jan 10, 2019
@Ichoran
Copy link
Contributor

Ichoran commented Jan 14, 2019

@lrytz - Do you need help finishing this up? It looks like you've made great progress so far, so you're probably the best situated to wrap up the more intricate of the remaining steps. But I still mostly remember how the code works, so I could also try if I get the time.

@adriaanm adriaanm assigned adriaanm and SethTisue and unassigned adriaanm Jan 16, 2019
@adriaanm
Copy link
Contributor

@Ichoran this would be a good PR for you to adopt. Lukas is on leave.

@adriaanm adriaanm changed the title Integrate converters from java8-compat Integrate converters from java8-compat [ci: last-only] Jan 24, 2019
@adriaanm
Copy link
Contributor

In the mean time, I've done a quick rebase

@lrytz lrytz force-pushed the java8-converters branch from 1f39fe1 to c858e42 Compare March 27, 2019 08:02
@lrytz
Copy link
Member Author

lrytz commented Mar 27, 2019 via email

@lrytz
Copy link
Member Author

lrytz commented Mar 27, 2019

@Ichoran for me, this is ready to go. I pushed the fix to BitSetStepper that you suggested, other bugfixes can happen in a follow-up PR. There's still time to RC1, some other PRs are still active.

Should we merge? It would unblock some projects in the community build.

@sjrd
Copy link
Member

sjrd commented Mar 27, 2019

FWIW, I just tested this PR with Scala.js, and

  • tests pass out of the box
  • our code size benchmark is not affected (somehow)

@Ichoran
Copy link
Contributor

Ichoran commented Mar 27, 2019

@lrytz - Sounds good. I didn't get to the tree stepper thing last night, but I'll at least file a bug report and maybe fix it tonight.

sjrd added a commit to sjrd/scala-js that referenced this pull request Apr 1, 2019
sjrd added a commit to sjrd/scala-js that referenced this pull request Apr 1, 2019
adriaanm added a commit to adriaanm/scala that referenced this pull request Apr 2, 2019
Avoid unexpected class casts due to potential misuse in
subclasses of the following cunning trick, which allowed
for centralized implementation of common factory methods,
assuming the right overrides are provided in subclasses:
`protected type IterableCC[X] = CC[X] @uncheckedVariance`

Risk of not overriding correctly illustrated by
https://gist.github.com/lrytz/fe09791208969f07e211c5c8601db125.
It's basically an encoding of MyType, which is only sound under
pretty severe restrictions, which are not enforced by the
compiler.

Instead, we mix in a default implementation of the various
factory methods whenever we are in a trait where the
collection's type is known (and of the regular shape CC[A] or
MapCC[K, V]). It would be nice if we could conditionally mix in
this trait into the *Ops traits whenever `C =:= CC[A]`, but we
don't have that kind of inheritance (private inheritance would
be nice too, I guess, since the `*FactoryDefaults` traits don't
really add much utility to the public signature of these
collection classes).

While we're at it, also add an `empty` method at the top of the
hierarchy to facilitate the switch to dotty's for comprehensions
(since we want the stdlibs between Scala 2 & 3 to be the same,
and 2.14 will try to only make backwards compatible changes from
2.13).

Motivated by the intersection of scala#7458, scala#7929, scala#7800 (better
refchecks and compiling with dotc revealed part of the problem)

🤯

Co-authored-by: Lukas Rytz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

library:collections PRs involving changes to the standard collection library prio:blocker release blocker (used only by core team, only near release time) release-notes worth highlighting in next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.