Skip to content

Conversation

@odd
Copy link
Contributor

@odd odd commented Sep 10, 2018

Add tombstones to mark removed slots in the fields vector. The tombstones carry a distance that points to the next occupied slot. If the distance is zero any following slots are also tombstones. This scheme adds at most one indirection for lookup (if slot s is requested and slot s is a tombstone with distance d, return slot s + d instead), but will require adjusting the distance for all immediately preceding tombstones to the current slot upon removal. It has similar or slightly better performance compared to the current VectorMap implementation for all operations except remove where this implementation has a significant advantage (especially for removal of non-consecutive elements) since no shifting of all following elements is necessary as in the current implementation.

@scala-jenkins scala-jenkins added this to the 2.13.0-RC1 milestone Sep 10, 2018
@odd
Copy link
Contributor Author

odd commented Sep 10, 2018

It should be noted that this performance improvement is not enough to make VectorMap as fast as OrderedMap (#7146) for removal of consecutive elements (my benchmarking runs show that for removing 2048 consecutive elements out of a map of size 4096, OrderedMap is over 200 times faster than this implementation of VectorMap and over 1000 times faster than the current implementation of VectorMap).

@odd odd force-pushed the vectormap-tombstones branch from c87b3ff to 3417fae Compare September 10, 2018 21:37
@mdedetrich
Copy link
Contributor

Awesome work, I actually had a local branch open with slightly different implementation, just didn't get around to finishing it (been a bit busy in the last week unfortunately).

My implementation basically stored the tombstones as null values in the underlying map similar to yours, albeit there are some differences.

I will likely have time to compare this weekend (which is when I will review it), but I suspect that this implementation may be a bit faster.

@odd

This comment has been minimized.

@mdedetrich mdedetrich requested a review from Ichoran September 27, 2018 16:52
@mdedetrich
Copy link
Contributor

Unfortunately I haven't had much time to look into this in extensive detail (I have been very busy this weekend). I did some modifications on this PR around a week ago but it appears that as usual its a matter of tradeoffs. I think that consecutive removes are likely to be the least common operation, so personally I am satisfied with the tradeoff. @Ichoran would you be able to look into this?

@Ichoran
Copy link
Contributor

Ichoran commented Sep 28, 2018

I generally don't think the Tombstone with distance information is the right tradeoff to make. If you aren't allowed to do indexing, there's nothing wrong with linear search and the algorithms are easier. You just keep track of the number of tombstones in the Vector and when you have N/log32 N of them, you do a full rebuild of both vector and map costing N log32 N time, which gives an amortized removal time of O((log32 N)^2) per element. Since log32 N is apparently "effectively constant", this should be amortized effectively effectively constant :P

However, with the limited look I've taken, this looks good to me for this approach, and the approach I describe doesn't exist right now. So this PR is better than not, I think.

@lrytz lrytz requested a review from szeiger October 1, 2018 12:35
extends AbstractMap[K, V]
final class VectorMap[K, +V] private (
private[immutable] val fields: Vector[Any],
private[immutable] val underlying: Map[K, (Int, V)], dummy: Boolean)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of the dummy parameter?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine to disambiguate with the secondary constructor

  private[immutable] def this(fields: Vector[K], underlying: Map[K, (Int, V)]) = {
    this(fields, underlying, false)
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szeiger, @hrhino Yes, thats right, the dummy parameter is there to make the erased signatures of the primary and secondary constructors different from each other. Only the secondary constructor having the ostensible correct type of the fields vector should be available to other members of the immutable package, while the primary constructor using the relaxed type of the fields vector should only be available to the VectorMap class itself.

@szeiger szeiger merged commit a84ce5b into scala:2.13.x Oct 19, 2018
@xuwei-k
Copy link
Contributor

xuwei-k commented Oct 20, 2018

scala/bug#11217

@xuwei-k
Copy link
Contributor

xuwei-k commented Oct 20, 2018

scala/bug#11218

@xuwei-k
Copy link
Contributor

xuwei-k commented Oct 21, 2018

scala/bug#11220

@SethTisue
Copy link
Member

looks like this needs a run through scala-collections-laws

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants