-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Add immutable map that maintains insertion order (VectorMap/SeqMap) #6854
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@julienrf Here is the PR for adding the new collection type. Also I have signed the CLA, but for some reason the check seems to be failing |
| * @define coll immutable vector map | ||
| * @define Coll `immutable.VectorMap` | ||
| */ | ||
| final class VectorMap[K, +V] private[immutable] ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an Insertion Ordered [Vector] Map. It happens to be based on a Vector. I think that the naming should reflect the interface and contract, not the implementation detail
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you come up with a better name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mkeskells The name of the trait (LinkedMap) is based on the contract, but this is a concrete implementation. It is quite common for implementations to be named based on their structure (see ArraySeq, ChampHash{Map,Set}, ArrayDeque, TrieMap, Tree{Map,Set}, and many more)
| */ | ||
| final class VectorMap[K, +V] private[immutable] ( | ||
| val fields: Vector[K], | ||
| val underlying: Map[K, (Int, V)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make the fields private - they are not part of the public interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Woops missed this, thanks for picking it up, will fix it tonight
julienrf
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mdedetrich! See my comments below.
| * functionality for the abstract methods in `LinkedMap`: | ||
| * | ||
| * @tparam A | ||
| * @tparam B |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are named K and V. Can you also add one line of doc for each?
| extends Map[K, V] | ||
| with Iterable[(K, V)] | ||
| with MapOps[K, V, LinkedMap, LinkedMap[K, V]] | ||
| with Equals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think only the with MapOps[…] line is useful, all the other with … are already inherited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also note that LinkedMap inherit from Map equality, meaning that LinkedMap(1 -> "foo", 2 -> "bar") == LinkedMap(2 -> "bar", 1 -> "foo"). (This is also the case for s.c.m.LinkedHashMap)
We should probably document that clearly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked @Ichoran and this is intentional. Apparently even if we do have LinkedMap's which maintain insertion order, we should not take this into account when checking for equality, i.e. order equality is not a property for any type that implements immutable.Map.
I will document this though
| */ | ||
|
|
||
| trait LinkedMap[K, +V] | ||
| extends Map[K, V] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extends AbstractMap[K, V] yields a smaller bytecode.
| override def foreach[U](f: ((K, V)) => U): Unit = { | ||
| f((key1, value1)) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t think we should have these optimizations yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain what you mean exactly? Are you talking about the specialization's specifically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the small size specialized cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why we shouldn't have these optimizations, even more so in the case of VectorMap the performance/memory usage are issues, and specialization helps in the most common cases (maps of sizes 1-4)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason is to not bloat the scala-library jar. But ok this code is probably very small compared to size of the whole jar…
|
|
||
| override def updated[V1 >: V](key: K, value: V1): VectorMap[K, V1] = { | ||
| if (underlying.get(key).isDefined) { | ||
| val oldKey = underlying(key)._1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lookup in underlying is supposed to be fast but still, I guess it would be more efficient to store the result of underlying.get(key) instead of performing two lookups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can you rename oldKey to oldIndex (or just index)?
| } | ||
|
|
||
| override def tail: VectorMap[K, V] = { | ||
| val tail = fields.last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tail is probably not a good name as it suggests a collection type. What do you think about lastKey?
| override def tail: VectorMap[K, V] = { | ||
| val tail = fields.last | ||
|
|
||
| new VectorMap(fields.slice(1, fields.size), underlying.remove(tail)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fields.tail is probably both more readable and more efficient than using .slice.
| override def init: VectorMap[K, V] = { | ||
| val head = fields.head | ||
|
|
||
| new VectorMap(fields.slice(1, fields.size - 0), underlying.remove(head)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of using fields.init here?
| // Only care about content, not ordering for equality | ||
| override def equals(that: Any): Boolean = | ||
| that match { | ||
| case vmap: VectorMap[K, V] => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be vmap: VectorMap[_, _] => to not have a warning.
| def newBuilder[K, V]: Builder[(K, V), LinkedMap[K, V]] = | ||
| new ImmutableBuilder[(K, V), LinkedMap[K, V]](empty) { | ||
| def addOne(elem: (K, V)): this.type = { elems = elems + elem; this } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should extend MapFactory.Delegate[LinkedMap](VectorMap) instead of implementing empty, from and newBuilder.
|
@julienrf Thanks for the comments, will address these tonight |
|
Is this really necessary? Looks like it could exist in a separate library without any downside. |
|
@sjrd If you read scala/collection-strawman#277, you can see the reasoning why its ideal to put this collection in the stdlib (the current It was also discussed in various SLIP meetings with scala center that it makes sense to add this collection into Scala for 2.13 (i.e. when the collections are redesigned). Arguably the biggest advantage of this PR is more the Note that I am already providing this as a separate library for Scala 2.10 to 2.12 (see https://github.com/mdedetrich/linked-map), however we lose the benefit that the case class JObject(values: LinkedMap[String, JValue])To ensure that key insertion ordering isn't lost, however this ship has already sailed (as discussed in the last SLIP meeting) |
|
I think this is well worth having in the standard library. Whenever possible, we should supply at least one concrete collection in the matrix (mutable, immutable) x (set, map) x ( Nondeterministic ordering can be a real nuisance for reproducibility, particularly in tests, but also in general application code. |
|
I just think there's been way too much of "let's add x to the collections library, because why not?" lately. It's especially infuriating to me because I firmly believe that "let's not add x to the std lib, because why?" is the better approach. All this stuff will need to be supported forever onwards! |
|
@julienrf I have either replied and/or applied your suggestions. Also let me know if you want me to squash my commits.
What do you mean? Afaik no single new collection (apart from this one) has been added in 2.13, instead a lot has been redesigned (and I think some things may have actually been removed). In any case, I agree with @SethTisue, a stdlib should have a general purpose collection that is "good" in all operations while maintaining insertion order, and we currently don't have that for |
790f618 to
a53a8d8
Compare
|
A bunch of methods have been added. |
I think thats a world of difference compared to adding a new collection, they aren't really comparable.
Well yes, its designed to be a replaced for
You are right that In any case, during the last SLIP meeting no one had any real objections to this. Not sure why its being brought up now. |
So if I'm already tense about added methods, you can imagine how afraid I am about entire new collections.
Because I'm not a member of the SLIP committee, so I'm not attending those. I only monitor PRs. I'm just a regular chap raising my anxiousness about the future, as far as the library goes. You're of course free to disregard my comments (you = yourself, the SLIP committee, and scala/scala maintainers). |
Yes, which will hopefully be fixed
You were there at the meeting and I don't remember anyone (including you) raising such concerns.
I am just curious as to why this wasn't raised earlier. |
|
@julienrf I just added a test to verify that the |
5312fda to
e942959
Compare
e942959 to
76b3be0
Compare
| } | ||
|
|
||
| def newBuilder[K, V]: Builder[(K, V), LinkedMap[K, V]] = | ||
| new MapFactory.Delegate[LinkedMap](VectorMap).newBuilder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just call VectorMap.newBuilder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, will add this tonight
|
@mdedetrich Thanks a lot for your work. That looks good to me. We need some time to decide whether we should include this new collection type in the standard library or not. We will come back to you soon. |
|
We have just had a discussion on Gitter with @mdedetrich: https://gitter.im/scala/collection-strawman?at=5b39e80af166440661120e50 We see several solutions:
I’m in favor of solution 2. The impact on the standard library seems reasonable. What do other people think? |
|
@jvican I think you also had an opinion on this matter which came from the SLIP meeting (in context of ScalaJson where it was suggested to add such a collection to stdlib) |
|
Option 2 seems reasonable to me. Side-track: can we find a better name for |
I agree, this is the best name I could come up with that didn't conflict with what we already have (that also isn't too long). I came up with I am open to better suggestions for name, does |
|
|
|
Ah, no, the scabot doesn't talk to travis.. I restarted the travis build. But in any case jenkins is what matters (for now), we're just testing travis. |
66e768c to
07b7358
Compare
|
@julienrf I have added |
Why not. You can just use |
|
@julienrf is this ready for merge, in your opinion? can this make M5? |
|
tentatively marking as M5 blocker since if we're going to ship this, I think we want to get it out there for people to test |
|
@jvican Was this discussed at a SIP meeting? I'm generally in favor of adding this and the implementation looks good after a cursory review. This type of collection comes up frequently in data formats such as JSON or XML but is sorely missing from many collection libraries. I would like to see this added to the standard library eventually but if it is deemed to big a change without a more formal process we can defer the decision until 2.14. |
|
@szeiger We couldn't discuss it at a SIP meeting yet. I'm super happy that there are so many requests to the SIP Committee (we've never been so busy), so I've proposed to increase the frequency of our SIP meetings in the next three months after vacation to cope up with this demand. However, it would make some sense to me to merge this in M5 and remove it in RC1 if the Committee finally votes against it (@adriaanm). We're already doing that for symbol literals, and we could use the experience our users gain from using it to nitpick on its current design in RC1 in case it stays. |
|
@jvican @szeiger @SethTisue Thanks for the update guys, on another note there hasn't been extensive micro bench marking on this collection because I noticed that @retronym was doing it on basic collections that we depend on, so I wanted to wait for this to get merged before attempting to do this, is this okay with you guys? |
|
I lean towards putting this in an external library, but in case you want to merge it in the standard library, then yes, this is good to merge! |
|
My opinion basically mirrors what is said quite eloquently here #6854 (comment). I agree with @jvican that merged for |
|
Thanks, @mdedetrich! 🍰 |
|
@adriaanm Should I delete the branch or do you want it as a reference? |
|
Deleting is fine. We use the merge commits for reference. |
| } | ||
|
|
||
| @SerialVersionUID(3L) | ||
| final class LinkedMap1[K, +V](key1: K, value1: V) extends LinkedMap[K,V] with Serializable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have been private, along with all the other ...N's, IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this must have been overlooked, should I make a PR to fix this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdedetrich That would be awesome!
|
@mdedetrich Is there any reason that there is no |
|
@hepin1989 Don't think so, anyone is free to contribute such a collection (although I suspect its like for Scala 2.13) |
This PR adds an implementation of scala/collection-strawman#277. Currently there isn't any known immutable map which also maintains key insertion order while keeping effectively constant lookup time on key, so the only known implementations are done by combining a Vector with a HasMap (or in Scala's case HashMap/ChampHashMap)
Some other notes on the PR
private val useBaseLinein a very stupid way, I can refactor this into a package private variable however I am under the impression this is some workaround which will get removed latervaluesmethod on aVectorMap)SeqMapwas added. ASeqMapis a type that identifies whether or not aMapmaintains key insertion order, the currentListMapextendsSeqMapas well as the newVectorMapmutable.SeqMaphowever this just reuses current collections instead of introducing a new one (i.e. it usesmutable.LinkedHashMap)