Skip to content

Conversation

@szeiger
Copy link
Contributor

@szeiger szeiger commented Oct 19, 2018

  • The old HashSet was based on open addressing. Since the similar OpenHashMap is deprecated to lower the maintenance burden (and nobody seems to be complaining about that deprecation) I removed it (along with the actual implementing class FlatHashTable) instead of adding yet another collection implementation (OpenHashSet).
  • The new HashSet implementation caches the hash codes and uses linked lists for chaining. This is consistent with mutable.HashMap, the immutable HashSet/HashMap, and Java’s mutable HashSet/HashMap.
  • The linked lists are kept ordered by hash code. This costs a few percent of performance for well-distributed hashes but is significantly faster in cases with lots of bucket collisions
  • Hash set computation is fixed to properly implement Scala’s equality semantics for numeric types.
  • The new HashMap is based on the new HashSet.

Fixes scala/scala-dev#567

Note: The following performance data is from the original version of this PR. See the other comments for newer data.

Performance is similar to Java and much better than the old implementation (ohs = OpenHashSet, i.e. the old version plus the equality fix).

Checking for elements not in the set:

[info] Benchmark                           (size)  Mode  Cnt          Score         Error  Units
[info] HashSetBenchmark.hsContainsFalse        10  avgt   40        748.287 ±       7.918  ns/op
[info] HashSetBenchmark.hsContainsFalse       100  avgt   40        714.973 ±       7.860  ns/op
[info] HashSetBenchmark.hsContainsFalse      1000  avgt   40       8567.290 ±     103.751  ns/op
[info] HashSetBenchmark.hsContainsFalse     10000  avgt   40     125550.369 ±    1638.980  ns/op
[info] HashSetBenchmark.javaContainsFalse      10  avgt   40        863.094 ±       8.875  ns/op
[info] HashSetBenchmark.javaContainsFalse     100  avgt   40        716.200 ±       8.042  ns/op
[info] HashSetBenchmark.javaContainsFalse    1000  avgt   40       8317.895 ±     106.585  ns/op
[info] HashSetBenchmark.javaContainsFalse   10000  avgt   40     120763.998 ±    1510.056  ns/op
[info] HashSetBenchmark.ohsContainsFalse       10  avgt   40       1370.404 ±      12.125  ns/op
[info] HashSetBenchmark.ohsContainsFalse      100  avgt   40       1435.686 ±      13.547  ns/op
[info] HashSetBenchmark.ohsContainsFalse     1000  avgt   40      13464.047 ±     144.327  ns/op
[info] HashSetBenchmark.ohsContainsFalse    10000  avgt   40     153473.412 ±    1710.699  ns/op

Checking for elements in the set:

[info] Benchmark                           (size)  Mode  Cnt          Score         Error  Units
[info] HashSetBenchmark.hsContainsTrue         10  avgt   40         71.711 ±       0.687  ns/op
[info] HashSetBenchmark.hsContainsTrue        100  avgt   40        758.101 ±       8.063  ns/op
[info] HashSetBenchmark.hsContainsTrue       1000  avgt   40       8397.221 ±      96.340  ns/op
[info] HashSetBenchmark.hsContainsTrue      10000  avgt   40      92881.757 ±    1073.014  ns/op
[info] HashSetBenchmark.javaContainsTrue       10  avgt   40         75.937 ±       0.962  ns/op
[info] HashSetBenchmark.javaContainsTrue      100  avgt   40        779.305 ±       8.185  ns/op
[info] HashSetBenchmark.javaContainsTrue     1000  avgt   40       8513.105 ±      94.085  ns/op
[info] HashSetBenchmark.javaContainsTrue    10000  avgt   40     100899.011 ±    1037.182  ns/op
[info] HashSetBenchmark.ohsContainsTrue        10  avgt   40         97.382 ±       0.752  ns/op
[info] HashSetBenchmark.ohsContainsTrue       100  avgt   40       1210.185 ±      12.070  ns/op
[info] HashSetBenchmark.ohsContainsTrue      1000  avgt   40      12817.008 ±     122.612  ns/op
[info] HashSetBenchmark.ohsContainsTrue     10000  avgt   40     242145.988 ±    2563.727  ns/op

Iterating over all elements with an Iterator:

[info] Benchmark                           (size)  Mode  Cnt          Score         Error  Units
[info] HashSetBenchmark.hsIterate              10  avgt   40         74.911 ±       0.866  ns/op
[info] HashSetBenchmark.hsIterate             100  avgt   40        744.871 ±       7.981  ns/op
[info] HashSetBenchmark.hsIterate            1000  avgt   40       7286.783 ±      75.084  ns/op
[info] HashSetBenchmark.hsIterate           10000  avgt   40      90625.678 ±     694.380  ns/op
[info] HashSetBenchmark.javaIterate            10  avgt   40         68.800 ±       0.608  ns/op
[info] HashSetBenchmark.javaIterate           100  avgt   40        657.670 ±       6.781  ns/op
[info] HashSetBenchmark.javaIterate          1000  avgt   40       8047.419 ±      87.539  ns/op
[info] HashSetBenchmark.javaIterate         10000  avgt   40      88477.533 ±    1060.863  ns/op
[info] HashSetBenchmark.ohsIterate             10  avgt   40         87.981 ±       0.883  ns/op
[info] HashSetBenchmark.ohsIterate            100  avgt   40        789.141 ±       7.862  ns/op
[info] HashSetBenchmark.ohsIterate           1000  avgt   40      12361.567 ±     146.162  ns/op
[info] HashSetBenchmark.ohsIterate          10000  avgt   40     154239.533 ±    1687.955  ns/op

Building a pre-allocated set without the need to regrow the table: The old implementation doesn't support size hints and is therefore much slower (even more so than in the other benchmarks)

[info] Benchmark                           (size)  Mode  Cnt          Score         Error  Units
[info] HashSetBenchmark.hsBuild                10  avgt   40        226.209 ±       1.676  ns/op
[info] HashSetBenchmark.hsBuild               100  avgt   40       1242.171 ±       8.344  ns/op
[info] HashSetBenchmark.hsBuild              1000  avgt   40      13998.848 ±     143.376  ns/op
[info] HashSetBenchmark.hsBuild             10000  avgt   40     151962.485 ±    2464.036  ns/op
[info] HashSetBenchmark.javaBuild              10  avgt   40        164.747 ±       1.307  ns/op
[info] HashSetBenchmark.javaBuild             100  avgt   40       1409.319 ±      19.177  ns/op
[info] HashSetBenchmark.javaBuild            1000  avgt   40      14951.739 ±     313.738  ns/op
[info] HashSetBenchmark.javaBuild           10000  avgt   40     138224.730 ±    1063.879  ns/op
[info] HashSetBenchmark.ohsBuild               10  avgt   40        304.800 ±       2.440  ns/op
[info] HashSetBenchmark.ohsBuild              100  avgt   40       3530.671 ±      24.469  ns/op
[info] HashSetBenchmark.ohsBuild             1000  avgt   40      50794.295 ±     608.518  ns/op
[info] HashSetBenchmark.ohsBuild            10000  avgt   40     723370.224 ±    8220.495  ns/op

Building the same set without pre-allocating:

[info] Benchmark                           (size)  Mode  Cnt          Score         Error  Units
[info] HashSetBenchmark.hsFillRegular          10  avgt   40         94.806 ±       0.605  ns/op
[info] HashSetBenchmark.hsFillRegular         100  avgt   40       2524.589 ±      24.035  ns/op
[info] HashSetBenchmark.hsFillRegular        1000  avgt   40      25745.204 ±     244.135  ns/op
[info] HashSetBenchmark.hsFillRegular       10000  avgt   40     232893.781 ±    4670.449  ns/op
[info] HashSetBenchmark.javaFillRegular        10  avgt   40        156.766 ±       1.816  ns/op
[info] HashSetBenchmark.javaFillRegular       100  avgt   40       2286.323 ±      26.367  ns/op
[info] HashSetBenchmark.javaFillRegular      1000  avgt   40      23164.913 ±     223.570  ns/op
[info] HashSetBenchmark.javaFillRegular     10000  avgt   40     244691.347 ±    5273.426  ns/op
[info] HashSetBenchmark.ohsFillRegular         10  avgt   40         73.261 ±       0.573  ns/op
[info] HashSetBenchmark.ohsFillRegular        100  avgt   40       2727.078 ±      19.698  ns/op
[info] HashSetBenchmark.ohsFillRegular       1000  avgt   40      50067.958 ±     474.006  ns/op
[info] HashSetBenchmark.ohsFillRegular      10000  avgt   40     688574.114 ±    7641.043  ns/op

Building a set with objects whose hashes are reduced to only 4 significant bits to simulate a badly distributed hash function with many collisions: Keeping the linked lists sorted by hash code provides a significant advantage over the Java implementation:

[info] Benchmark                           (size)  Mode  Cnt          Score         Error  Units
[info] HashSetBenchmark.hsFillColliding        10  avgt   40        290.810 ±       3.183  ns/op
[info] HashSetBenchmark.hsFillColliding       100  avgt   40      10125.033 ±     217.269  ns/op
[info] HashSetBenchmark.hsFillColliding      1000  avgt   40     518557.636 ±    6309.267  ns/op
[info] HashSetBenchmark.hsFillColliding     10000  avgt   40   27453068.847 ±  435504.862  ns/op
[info] HashSetBenchmark.javaFillColliding      10  avgt   40        312.097 ±       7.824  ns/op
[info] HashSetBenchmark.javaFillColliding     100  avgt   40      27746.938 ±     342.631  ns/op
[info] HashSetBenchmark.javaFillColliding    1000  avgt   40    1245683.805 ±   16718.405  ns/op
[info] HashSetBenchmark.javaFillColliding   10000  avgt   40   83481714.270 ±  942107.149  ns/op
[info] HashSetBenchmark.ohsFillColliding       10  avgt   40        357.261 ±       7.076  ns/op
[info] HashSetBenchmark.ohsFillColliding      100  avgt   40      25547.829 ±     343.646  ns/op
[info] HashSetBenchmark.ohsFillColliding     1000  avgt   40    2212242.151 ±   22979.979  ns/op
[info] HashSetBenchmark.ohsFillColliding    10000  avgt   40  161370140.826 ± 1450524.756  ns/op

The next step is comparing the new mutable.HashSet with mutable.HashMap, possibly replacing one with the other. After that I'll look into ordering-aware implementations to address scala/bug#11203. As suggested by @Ichoran I'll keep these separate so they can make use of an Ordering which is more idiomatic in Scala than having the objects themselves implement Comparable. It should also make the implementations simpler and faster.

@scala-jenkins scala-jenkins added this to the 2.13.0-RC1 milestone Oct 19, 2018
@adriaanm
Copy link
Contributor

adriaanm commented Oct 19, 2018

I grepped the jenkins log for "contains object and package with same name", and this hasn't come up in a while at least. Perhaps has something to do with iteration-order-sensitivity in the compiler?

@szeiger
Copy link
Contributor Author

szeiger commented Oct 19, 2018

Ah, right, that could be the issue. I assumed it's some change in my code that trips up the compiler but the failing tests are ones that use the new compiler and library and therefore the new HashSet. So it could be either a bug that the regular tests don't catch or the compiler relying on undocumented behavior of the old version.

@szeiger szeiger force-pushed the wip/mutable-hashset2 branch from 3e16f04 to 4e6d9df Compare October 22, 2018 14:40
@szeiger
Copy link
Contributor Author

szeiger commented Oct 22, 2018

Here's a benchmark for the current mutable.HashMap: It looks like the best course of action is to replace with with the new HashSet implementation, too:

[info] Benchmark                             (size)  (stringsOnly)  Mode  Cnt         Score         Error  Units
[info] HashMapBenchmark2.hmBuild                 10           true  avgt   40       198.147 ±       1.461  ns/op
[info] HashMapBenchmark2.hmBuild                100           true  avgt   40      4468.367 ±      37.205  ns/op
[info] HashMapBenchmark2.hmBuild               1000           true  avgt   40     53254.593 ±    1756.088  ns/op
[info] HashMapBenchmark2.hmBuild              10000           true  avgt   40    595413.189 ±    5949.386  ns/op
[info] HashMapBenchmark2.hmFillColliding         10           true  avgt   40       339.469 ±       2.742  ns/op
[info] HashMapBenchmark2.hmFillColliding        100           true  avgt   40     10647.041 ±      89.089  ns/op
[info] HashMapBenchmark2.hmFillColliding       1000           true  avgt   40    436712.886 ±    4029.167  ns/op
[info] HashMapBenchmark2.hmFillColliding      10000           true  avgt   40  30377856.191 ±  568581.086  ns/op
[info] HashMapBenchmark2.hmFillRegular           10           true  avgt   40       189.827 ±       1.254  ns/op
[info] HashMapBenchmark2.hmFillRegular          100           true  avgt   40      3784.346 ±      33.190  ns/op
[info] HashMapBenchmark2.hmFillRegular         1000           true  avgt   40     45613.680 ±     349.471  ns/op
[info] HashMapBenchmark2.hmFillRegular        10000           true  avgt   40    544986.162 ±    3805.696  ns/op
[info] HashMapBenchmark2.hmGetExisting           10           true  avgt   40       131.165 ±       1.496  ns/op
[info] HashMapBenchmark2.hmGetExisting          100           true  avgt   40      1408.612 ±      17.144  ns/op
[info] HashMapBenchmark2.hmGetExisting         1000           true  avgt   40     16864.645 ±     303.033  ns/op
[info] HashMapBenchmark2.hmGetExisting        10000           true  avgt   40    207976.336 ±    3965.489  ns/op
[info] HashMapBenchmark2.hmGetNone               10           true  avgt   40      1349.524 ±      18.218  ns/op
[info] HashMapBenchmark2.hmGetNone              100           true  avgt   40      1284.957 ±      13.222  ns/op
[info] HashMapBenchmark2.hmGetNone             1000           true  avgt   40     14666.559 ±     165.990  ns/op
[info] HashMapBenchmark2.hmGetNone            10000           true  avgt   40    262150.290 ±    2593.403  ns/op
[info] HashMapBenchmark2.hmIterateEntries        10           true  avgt   40       116.187 ±       0.909  ns/op
[info] HashMapBenchmark2.hmIterateEntries       100           true  avgt   40      1277.117 ±      10.004  ns/op
[info] HashMapBenchmark2.hmIterateEntries      1000           true  avgt   40     13795.134 ±     125.189  ns/op
[info] HashMapBenchmark2.hmIterateEntries     10000           true  avgt   40    201888.734 ±    2998.804  ns/op
[info] HashMapBenchmark2.hmIterateKeys           10           true  avgt   40        85.813 ±       0.880  ns/op
[info] HashMapBenchmark2.hmIterateKeys          100           true  avgt   40       962.237 ±      24.899  ns/op
[info] HashMapBenchmark2.hmIterateKeys         1000           true  avgt   40     11091.241 ±     128.050  ns/op
[info] HashMapBenchmark2.hmIterateKeys        10000           true  avgt   40    187876.268 ±    3764.055  ns/op
[info] HashMapBenchmark2.hsBuild                 10           true  avgt   40       222.691 ±       1.796  ns/op
[info] HashMapBenchmark2.hsBuild                100           true  avgt   40      1222.176 ±       8.702  ns/op
[info] HashMapBenchmark2.hsBuild               1000           true  avgt   40     14609.603 ±     162.673  ns/op
[info] HashMapBenchmark2.hsBuild              10000           true  avgt   40    155206.517 ±    1969.871  ns/op
[info] HashMapBenchmark2.hsContainsFalse         10           true  avgt   40       748.925 ±       5.280  ns/op
[info] HashMapBenchmark2.hsContainsFalse        100           true  avgt   40       755.388 ±      20.851  ns/op
[info] HashMapBenchmark2.hsContainsFalse       1000           true  avgt   40      8845.421 ±     217.852  ns/op
[info] HashMapBenchmark2.hsContainsFalse      10000           true  avgt   40    123012.407 ±    1304.482  ns/op
[info] HashMapBenchmark2.hsContainsTrue          10           true  avgt   40        70.622 ±       0.691  ns/op
[info] HashMapBenchmark2.hsContainsTrue         100           true  avgt   40       744.912 ±       9.023  ns/op
[info] HashMapBenchmark2.hsContainsTrue        1000           true  avgt   40      9109.007 ±     451.839  ns/op
[info] HashMapBenchmark2.hsContainsTrue       10000           true  avgt   40     92709.332 ±    1768.244  ns/op
[info] HashMapBenchmark2.hsFillColliding         10           true  avgt   40       279.265 ±       2.421  ns/op
[info] HashMapBenchmark2.hsFillColliding        100           true  avgt   40     11517.848 ±      98.005  ns/op
[info] HashMapBenchmark2.hsFillColliding       1000           true  avgt   40    613951.739 ±    6338.090  ns/op
[info] HashMapBenchmark2.hsFillColliding      10000           true  avgt   40  34523033.981 ±  962620.675  ns/op
[info] HashMapBenchmark2.hsFillRegular           10           true  avgt   40        94.644 ±       0.729  ns/op
[info] HashMapBenchmark2.hsFillRegular          100           true  avgt   40      2464.642 ±      22.182  ns/op
[info] HashMapBenchmark2.hsFillRegular         1000           true  avgt   40     24845.702 ±     123.177  ns/op
[info] HashMapBenchmark2.hsFillRegular        10000           true  avgt   40    222687.689 ±    4293.883  ns/op
[info] HashMapBenchmark2.hsIterate               10           true  avgt   40        72.626 ±       0.589  ns/op
[info] HashMapBenchmark2.hsIterate              100           true  avgt   40       742.590 ±       6.430  ns/op
[info] HashMapBenchmark2.hsIterate             1000           true  avgt   40      7245.587 ±      55.877  ns/op
[info] HashMapBenchmark2.hsIterate            10000           true  avgt   40     90038.617 ±     907.301  ns/op
[info] HashMapBenchmark2.javaBuild               10           true  avgt   40       174.356 ±       1.368  ns/op
[info] HashMapBenchmark2.javaBuild              100           true  avgt   40      1537.434 ±       9.147  ns/op
[info] HashMapBenchmark2.javaBuild             1000           true  avgt   40     16231.568 ±     131.413  ns/op
[info] HashMapBenchmark2.javaBuild            10000           true  avgt   40    152196.650 ±   11341.081  ns/op
[info] HashMapBenchmark2.javaFillColliding       10           true  avgt   40       330.123 ±       5.207  ns/op
[info] HashMapBenchmark2.javaFillColliding      100           true  avgt   40     27454.670 ±     438.167  ns/op
[info] HashMapBenchmark2.javaFillColliding     1000           true  avgt   40   1224476.917 ±   12419.393  ns/op
[info] HashMapBenchmark2.javaFillColliding    10000           true  avgt   40  81818556.068 ± 2456671.420  ns/op
[info] HashMapBenchmark2.javaFillRegular         10           true  avgt   40       160.923 ±       1.307  ns/op
[info] HashMapBenchmark2.javaFillRegular        100           true  avgt   40      2371.674 ±      25.878  ns/op
[info] HashMapBenchmark2.javaFillRegular       1000           true  avgt   40     23974.757 ±     220.219  ns/op
[info] HashMapBenchmark2.javaFillRegular      10000           true  avgt   40    265580.503 ±    3230.095  ns/op
[info] HashMapBenchmark2.javaGetExisting         10           true  avgt   40        90.856 ±       2.367  ns/op
[info] HashMapBenchmark2.javaGetExisting        100           true  avgt   40      1186.387 ±      37.586  ns/op
[info] HashMapBenchmark2.javaGetExisting       1000           true  avgt   40     15866.093 ±     117.440  ns/op
[info] HashMapBenchmark2.javaGetExisting      10000           true  avgt   40    169240.259 ±    1341.993  ns/op
[info] HashMapBenchmark2.javaGetNone             10           true  avgt   40      1427.607 ±     102.196  ns/op
[info] HashMapBenchmark2.javaGetNone            100           true  avgt   40      1124.080 ±       9.363  ns/op
[info] HashMapBenchmark2.javaGetNone           1000           true  avgt   40     12415.260 ±      95.176  ns/op
[info] HashMapBenchmark2.javaGetNone          10000           true  avgt   40    137307.804 ±    1148.475  ns/op
[info] HashMapBenchmark2.javaIterateEntries      10           true  avgt   40        62.912 ±       0.539  ns/op
[info] HashMapBenchmark2.javaIterateEntries     100           true  avgt   40       657.961 ±       6.162  ns/op
[info] HashMapBenchmark2.javaIterateEntries    1000           true  avgt   40      7185.347 ±      85.826  ns/op
[info] HashMapBenchmark2.javaIterateEntries   10000           true  avgt   40    100833.197 ±    5036.191  ns/op
[info] HashMapBenchmark2.javaIterateKeys         10           true  avgt   40        65.991 ±       0.764  ns/op
[info] HashMapBenchmark2.javaIterateKeys        100           true  avgt   40       668.754 ±       7.057  ns/op
[info] HashMapBenchmark2.javaIterateKeys       1000           true  avgt   40      7329.816 ±      64.975  ns/op
[info] HashMapBenchmark2.javaIterateKeys      10000           true  avgt   40     96768.064 ±    4602.843  ns/op

@javax-swing
Copy link

Not sure if this is helpful, but I know in java land you can transform any Map<K,Boolean> into a Set<K>. Not sure if we lose some performance benefits due to the wrapping, but might be quite handy (also we could use Unit). It also means you only have to encode behaviours once, for example.

TreeMap -> TreeSet, LinkedHashMap -> LinkedHashSet (ListMap in scala??) HashMap -> HashSet etc etc

I guess it could be worth a benchmark

@szeiger
Copy link
Contributor Author

szeiger commented Oct 23, 2018

Sharing an implementation between Set and Map is the obvious thing to do. Java does it (HashSet simply wraps HashMap). Our old design was also prepared for this by separating the actual hash table implementation from HashMap and HashSet (but HashSet uses a different implementation anyway). You're wasting one pointer per element if you do it this way, which shouldn't be too bad. Another option would be to have different node implementations for maps and sets. This will need some benchmarking to make sure HotSpot can optimize it properly.

@javax-swing
Copy link

I suppose what I’m wondering is if there should be something like abstract class MapBasedSet[A](backedBy : mutable.Map[A,Unit]) extends mutable.Set[A] to make it more convenient and uniform to implement the different flavours of Set.

@szeiger
Copy link
Contributor Author

szeiger commented Oct 23, 2018

@javax-swing: Are there other cases than HashMap / HashSet where you would want to use it? Just for HashSet my take is that we should share the code (either by extending a common base class with private methods, wrapping a HashMap, or delegating the functionality to some static methods like we do for TreeMap and TreeSet) if the performance impact is negligible, otherwise we can reuse it by copy & paste.

@szeiger
Copy link
Contributor Author

szeiger commented Oct 23, 2018

BTW, the first performance results show that my new HashMap is much faster than the old one except in collision resistance. Maybe the old Scala hash mapping provides an advantage over Java's simpler one (that I'm using). Investigating further.

@javax-swing
Copy link

@szeiger not really, just wondering whether the abstraction is useful. It could simplify the implementations for some of the other kinds of sets (for example TreeSet). In java you can do Collections.newSetFromMap for Map types that have no equivalent Set implementation.

@szeiger szeiger force-pushed the wip/mutable-hashset2 branch from 4e6d9df to 8640fd8 Compare October 23, 2018 22:22
@szeiger
Copy link
Contributor Author

szeiger commented Oct 24, 2018

TreeSet is a SortedSet. A wrapper for Map -> Set wouldn't be very useful for it. You'd need a different one for SortedMap -> SortedSet.

@szeiger szeiger force-pushed the wip/mutable-hashset2 branch from 8640fd8 to a724245 Compare October 24, 2018 13:03
@javax-swing
Copy link

Ah ok makes sense :)

@szeiger
Copy link
Contributor Author

szeiger commented Oct 25, 2018

I did some benchmarks on an abstracted version that allows different Node implementations so that HashSet could have a Node without the value. Some variations are worse than others but in all versions that I tried there was a significant performance penalty. So if we want to it be fast, the choices are separate implementations or wrapping a Map in a Set (and thus wasting one pointer per element).

@szeiger
Copy link
Contributor Author

szeiger commented Oct 25, 2018

Here's the new mutable.HashMap implementation. Firstly, the original HashMap compared to my new HashSet. There's clearly a lot of room for improvement:

[info] Benchmark                             (size)  (stringsOnly)  Mode  Cnt         Score         Error  Units
[info] HashMapBenchmark2.hmBuild                 10           true  avgt   40       198.147 ±       1.461  ns/op
[info] HashMapBenchmark2.hmBuild                100           true  avgt   40      4468.367 ±      37.205  ns/op
[info] HashMapBenchmark2.hmBuild               1000           true  avgt   40     53254.593 ±    1756.088  ns/op
[info] HashMapBenchmark2.hmBuild              10000           true  avgt   40    595413.189 ±    5949.386  ns/op
[info] HashMapBenchmark2.hmFillColliding         10           true  avgt   40       339.469 ±       2.742  ns/op
[info] HashMapBenchmark2.hmFillColliding        100           true  avgt   40     10647.041 ±      89.089  ns/op
[info] HashMapBenchmark2.hmFillColliding       1000           true  avgt   40    436712.886 ±    4029.167  ns/op
[info] HashMapBenchmark2.hmFillColliding      10000           true  avgt   40  30377856.191 ±  568581.086  ns/op
[info] HashMapBenchmark2.hmFillRegular           10           true  avgt   40       189.827 ±       1.254  ns/op
[info] HashMapBenchmark2.hmFillRegular          100           true  avgt   40      3784.346 ±      33.190  ns/op
[info] HashMapBenchmark2.hmFillRegular         1000           true  avgt   40     45613.680 ±     349.471  ns/op
[info] HashMapBenchmark2.hmFillRegular        10000           true  avgt   40    544986.162 ±    3805.696  ns/op
[info] HashMapBenchmark2.hmGetExisting           10           true  avgt   40       131.165 ±       1.496  ns/op
[info] HashMapBenchmark2.hmGetExisting          100           true  avgt   40      1408.612 ±      17.144  ns/op
[info] HashMapBenchmark2.hmGetExisting         1000           true  avgt   40     16864.645 ±     303.033  ns/op
[info] HashMapBenchmark2.hmGetExisting        10000           true  avgt   40    207976.336 ±    3965.489  ns/op
[info] HashMapBenchmark2.hmGetNone               10           true  avgt   40      1349.524 ±      18.218  ns/op
[info] HashMapBenchmark2.hmGetNone              100           true  avgt   40      1284.957 ±      13.222  ns/op
[info] HashMapBenchmark2.hmGetNone             1000           true  avgt   40     14666.559 ±     165.990  ns/op
[info] HashMapBenchmark2.hmGetNone            10000           true  avgt   40    262150.290 ±    2593.403  ns/op
[info] HashMapBenchmark2.hmIterateEntries        10           true  avgt   40       116.187 ±       0.909  ns/op
[info] HashMapBenchmark2.hmIterateEntries       100           true  avgt   40      1277.117 ±      10.004  ns/op
[info] HashMapBenchmark2.hmIterateEntries      1000           true  avgt   40     13795.134 ±     125.189  ns/op
[info] HashMapBenchmark2.hmIterateEntries     10000           true  avgt   40    201888.734 ±    2998.804  ns/op
[info] HashMapBenchmark2.hmIterateKeys           10           true  avgt   40        85.813 ±       0.880  ns/op
[info] HashMapBenchmark2.hmIterateKeys          100           true  avgt   40       962.237 ±      24.899  ns/op
[info] HashMapBenchmark2.hmIterateKeys         1000           true  avgt   40     11091.241 ±     128.050  ns/op
[info] HashMapBenchmark2.hmIterateKeys        10000           true  avgt   40    187876.268 ±    3764.055  ns/op
[info] HashMapBenchmark2.hsBuild                 10           true  avgt   40       222.691 ±       1.796  ns/op
[info] HashMapBenchmark2.hsBuild                100           true  avgt   40      1222.176 ±       8.702  ns/op
[info] HashMapBenchmark2.hsBuild               1000           true  avgt   40     14609.603 ±     162.673  ns/op
[info] HashMapBenchmark2.hsBuild              10000           true  avgt   40    155206.517 ±    1969.871  ns/op
[info] HashMapBenchmark2.hsContainsFalse         10           true  avgt   40       748.925 ±       5.280  ns/op
[info] HashMapBenchmark2.hsContainsFalse        100           true  avgt   40       755.388 ±      20.851  ns/op
[info] HashMapBenchmark2.hsContainsFalse       1000           true  avgt   40      8845.421 ±     217.852  ns/op
[info] HashMapBenchmark2.hsContainsFalse      10000           true  avgt   40    123012.407 ±    1304.482  ns/op
[info] HashMapBenchmark2.hsContainsTrue          10           true  avgt   40        70.622 ±       0.691  ns/op
[info] HashMapBenchmark2.hsContainsTrue         100           true  avgt   40       744.912 ±       9.023  ns/op
[info] HashMapBenchmark2.hsContainsTrue        1000           true  avgt   40      9109.007 ±     451.839  ns/op
[info] HashMapBenchmark2.hsContainsTrue       10000           true  avgt   40     92709.332 ±    1768.244  ns/op
[info] HashMapBenchmark2.hsFillColliding         10           true  avgt   40       279.265 ±       2.421  ns/op
[info] HashMapBenchmark2.hsFillColliding        100           true  avgt   40     11517.848 ±      98.005  ns/op
[info] HashMapBenchmark2.hsFillColliding       1000           true  avgt   40    613951.739 ±    6338.090  ns/op
[info] HashMapBenchmark2.hsFillColliding      10000           true  avgt   40  34523033.981 ±  962620.675  ns/op
[info] HashMapBenchmark2.hsFillRegular           10           true  avgt   40        94.644 ±       0.729  ns/op
[info] HashMapBenchmark2.hsFillRegular          100           true  avgt   40      2464.642 ±      22.182  ns/op
[info] HashMapBenchmark2.hsFillRegular         1000           true  avgt   40     24845.702 ±     123.177  ns/op
[info] HashMapBenchmark2.hsFillRegular        10000           true  avgt   40    222687.689 ±    4293.883  ns/op
[info] HashMapBenchmark2.hsIterate               10           true  avgt   40        72.626 ±       0.589  ns/op
[info] HashMapBenchmark2.hsIterate              100           true  avgt   40       742.590 ±       6.430  ns/op
[info] HashMapBenchmark2.hsIterate             1000           true  avgt   40      7245.587 ±      55.877  ns/op
[info] HashMapBenchmark2.hsIterate            10000           true  avgt   40     90038.617 ±     907.301  ns/op

Notice the FillColliding results. My new HashSet easily beats the old HashSet and the Java versions in this test but the old HashMap is still a bit better. This appears to be because of the different improvement algorithm. The results vary a lot by hash distribution though. In particular, a well-distributed set does much worse with our old improvement algorithm than the new one (which is the same on that Java uses). I didn't find a way to keep the slightly better performance in this test without sacrificing a lot more performance in other tests.

The performance data from Java is pretty much the same as it was for HashSet, which is to be expected because HashSet is simply a wrapper for HashMap:

[info] HashMapBenchmark2.javaBuild               10           true  avgt   40       174.356 ±       1.368  ns/op
[info] HashMapBenchmark2.javaBuild              100           true  avgt   40      1537.434 ±       9.147  ns/op
[info] HashMapBenchmark2.javaBuild             1000           true  avgt   40     16231.568 ±     131.413  ns/op
[info] HashMapBenchmark2.javaBuild            10000           true  avgt   40    152196.650 ±   11341.081  ns/op
[info] HashMapBenchmark2.javaFillColliding       10           true  avgt   40       330.123 ±       5.207  ns/op
[info] HashMapBenchmark2.javaFillColliding      100           true  avgt   40     27454.670 ±     438.167  ns/op
[info] HashMapBenchmark2.javaFillColliding     1000           true  avgt   40   1224476.917 ±   12419.393  ns/op
[info] HashMapBenchmark2.javaFillColliding    10000           true  avgt   40  81818556.068 ± 2456671.420  ns/op
[info] HashMapBenchmark2.javaFillRegular         10           true  avgt   40       160.923 ±       1.307  ns/op
[info] HashMapBenchmark2.javaFillRegular        100           true  avgt   40      2371.674 ±      25.878  ns/op
[info] HashMapBenchmark2.javaFillRegular       1000           true  avgt   40     23974.757 ±     220.219  ns/op
[info] HashMapBenchmark2.javaFillRegular      10000           true  avgt   40    265580.503 ±    3230.095  ns/op
[info] HashMapBenchmark2.javaGetExisting         10           true  avgt   40        90.856 ±       2.367  ns/op
[info] HashMapBenchmark2.javaGetExisting        100           true  avgt   40      1186.387 ±      37.586  ns/op
[info] HashMapBenchmark2.javaGetExisting       1000           true  avgt   40     15866.093 ±     117.440  ns/op
[info] HashMapBenchmark2.javaGetExisting      10000           true  avgt   40    169240.259 ±    1341.993  ns/op
[info] HashMapBenchmark2.javaGetNone             10           true  avgt   40      1427.607 ±     102.196  ns/op
[info] HashMapBenchmark2.javaGetNone            100           true  avgt   40      1124.080 ±       9.363  ns/op
[info] HashMapBenchmark2.javaGetNone           1000           true  avgt   40     12415.260 ±      95.176  ns/op
[info] HashMapBenchmark2.javaGetNone          10000           true  avgt   40    137307.804 ±    1148.475  ns/op
[info] HashMapBenchmark2.javaIterateEntries      10           true  avgt   40        62.912 ±       0.539  ns/op
[info] HashMapBenchmark2.javaIterateEntries     100           true  avgt   40       657.961 ±       6.162  ns/op
[info] HashMapBenchmark2.javaIterateEntries    1000           true  avgt   40      7185.347 ±      85.826  ns/op
[info] HashMapBenchmark2.javaIterateEntries   10000           true  avgt   40    100833.197 ±    5036.191  ns/op
[info] HashMapBenchmark2.javaIterateKeys         10           true  avgt   40        65.991 ±       0.764  ns/op
[info] HashMapBenchmark2.javaIterateKeys        100           true  avgt   40       668.754 ±       7.057  ns/op
[info] HashMapBenchmark2.javaIterateKeys       1000           true  avgt   40      7329.816 ±      64.975  ns/op
[info] HashMapBenchmark2.javaIterateKeys      10000           true  avgt   40     96768.064 ±    4602.843  ns/op

This is with the new HashMap: As expected, it's almost the same as the new HashSet but a bit slower overall because it has to manage the values in addition to the keys.

[info] Benchmark                           (size)  (stringsOnly)  Mode  Cnt         Score         Error  Units
[info] HashMapBenchmark2.hmBuild               10           true  avgt   40       185.009 ±       1.081  ns/op
[info] HashMapBenchmark2.hmBuild              100           true  avgt   40      1254.796 ±       7.530  ns/op
[info] HashMapBenchmark2.hmBuild             1000           true  avgt   40     15242.737 ±     360.208  ns/op
[info] HashMapBenchmark2.hmBuild            10000           true  avgt   40    165898.071 ±    2754.387  ns/op
[info] HashMapBenchmark2.hmFillColliding       10           true  avgt   40       286.895 ±       2.651  ns/op
[info] HashMapBenchmark2.hmFillColliding      100           true  avgt   40     10862.676 ±      82.548  ns/op
[info] HashMapBenchmark2.hmFillColliding     1000           true  avgt   40    737324.837 ±    5171.888  ns/op
[info] HashMapBenchmark2.hmFillColliding    10000           true  avgt   40  36404860.695 ±  325866.057  ns/op
[info] HashMapBenchmark2.hmFillRegular         10           true  avgt   40       101.064 ±       1.184  ns/op
[info] HashMapBenchmark2.hmFillRegular        100           true  avgt   40      2515.200 ±      32.176  ns/op
[info] HashMapBenchmark2.hmFillRegular       1000           true  avgt   40     21288.364 ±     159.990  ns/op
[info] HashMapBenchmark2.hmFillRegular      10000           true  avgt   40    232007.148 ±    5381.845  ns/op
[info] HashMapBenchmark2.hmGetExisting         10           true  avgt   40        84.468 ±       0.485  ns/op
[info] HashMapBenchmark2.hmGetExisting        100           true  avgt   40       970.738 ±      11.597  ns/op
[info] HashMapBenchmark2.hmGetExisting       1000           true  avgt   40     11222.019 ±     355.519  ns/op
[info] HashMapBenchmark2.hmGetExisting      10000           true  avgt   40    113588.278 ±     784.311  ns/op
[info] HashMapBenchmark2.hmGetNone             10           true  avgt   40       818.958 ±       6.737  ns/op
[info] HashMapBenchmark2.hmGetNone            100           true  avgt   40       779.973 ±       8.717  ns/op
[info] HashMapBenchmark2.hmGetNone           1000           true  avgt   40      9020.509 ±      76.060  ns/op
[info] HashMapBenchmark2.hmGetNone          10000           true  avgt   40    131132.682 ±    2733.884  ns/op
[info] HashMapBenchmark2.hmIterateEntries      10           true  avgt   40        96.126 ±       0.633  ns/op
[info] HashMapBenchmark2.hmIterateEntries     100           true  avgt   40      1029.361 ±       8.173  ns/op
[info] HashMapBenchmark2.hmIterateEntries    1000           true  avgt   40     10124.667 ±      61.876  ns/op
[info] HashMapBenchmark2.hmIterateEntries   10000           true  avgt   40    117623.404 ±    1289.257  ns/op
[info] HashMapBenchmark2.hmIterateKeys         10           true  avgt   40        74.530 ±       0.670  ns/op
[info] HashMapBenchmark2.hmIterateKeys        100           true  avgt   40       742.909 ±       6.941  ns/op
[info] HashMapBenchmark2.hmIterateKeys       1000           true  avgt   40      7656.561 ±      74.257  ns/op
[info] HashMapBenchmark2.hmIterateKeys      10000           true  avgt   40     89944.618 ±    1404.339  ns/op

@szeiger szeiger force-pushed the wip/mutable-hashset2 branch 2 times, most recently from 8addfaa to 0c8e94a Compare October 25, 2018 17:47
@szeiger
Copy link
Contributor Author

szeiger commented Oct 26, 2018

It's probably best to start a new PR for the order-aware implementations. I consider this PR ready.

val e = table.findEntry(key)
if (e eq null) default else e.value
}
override def put(key: K, value: V): Option[V] = put0(key, value, true) match {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put0 could probably return None rather than null. This would avoid the extra pattern match here. Unless it’s faster to return null in put0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I ran a benchmark for this. I assumed that null would be faster in general and the extra null check when you do care about the result (which is only in put) is cheap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, did some benchmarks. They're not entirely conclusive but if anything, using null is a little bit faster.

@javax-swing
Copy link

@szeiger not sure if this would be interesting to you, but i had a crack at implementing the more efficient grow function that you hint at in the comments.

https://scastie.scala-lang.org/javax-swing/atAzPs8oTiWvRvuQcNlyxw/1

@lrytz
Copy link
Member

lrytz commented Nov 1, 2018

Since OpenHashMap is deprecated, does it make sense to introduce OpenHashSet?

@lrytz
Copy link
Member

lrytz commented Nov 1, 2018

(could also summarize why OpenHashMap was deprecated? scala/collection-strawman#388)

@lrytz
Copy link
Member

lrytz commented Nov 1, 2018

if we want to it be fast, the choices are separate implementations or wrapping a Map in a Set (and thus wasting one pointer per element)

You went for the separate implemenation, I think it's the right choice. À la limite we could have a template and some code gen, but it's probably not worth the effort. Having separate classes might allow the JIT to do a better job optimizing each one individually.

Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work! A couple of questions in review comments.

* @define willNotTerminateInf
*/
class HashMap[K, V]
class HashMap[K, V](initialCapacity: Int, loadFactor: Double)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it final..? I guess the main use case for extending it is to override default. We can make that default function an assignable field (withDefaultValue?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 3 other subclasses in the compiler: QuantMap in scala.reflect.internal.util.Statistics, result in scala.tools.nsc.doc.model.IndexModelFactory and ReponseMap in scala.tools.nsc.interactive.Global. Plus some more in test cases. Making it final would probably break a lot of code if we already have that much in our own codebase.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should make it final / deprecatedOverriding. The 3 examples don't do much more than adding defaults, I think people should use composition not inheritance here. Allowing subclasses of HashMap makes it hard for us to change things, to optimize the implementaiton, and even changes that look safe might break subclasses in unexpected ways - you cannot anticipate what implementation details subclasses rely on.

For convenience, we could provide a non-final DelegatingMap that people could extend to add methods to a map. But just doing it for mutable.Map seems unsystematic. We have many other final collection types in the library, I don't see why mutable.HashMap should be an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the default method is not specific to HashMap, it is defined in collection.Map and we already have withDefault to create a delegate.

@szeiger
Copy link
Contributor Author

szeiger commented Nov 2, 2018

"Template and code gen" - Something like C++ templates would be useful here. I suppose we'd have to settle for something more ad-hoc and more horrifying.

I think the reason for deprecating OpenHashMap was simply to have fewer seldom-used stuff to maintain in the future. So far nobody has complained about the deprecation. I left OpenHashSet in the PR for now to make it easy to benchmark but it's probably better to delete it.

@szeiger
Copy link
Contributor Author

szeiger commented Nov 2, 2018

@javax-swing Did you run any benchmarks on this algorithm? I assume it would be beneficial to keep the separate version for the common 2x case but maybe the generic one can be just as fast.

@javax-swing
Copy link

@szeiger no I didn’t actually. I will have a go at that when I find some free time. I’ve never used jmh before so it might be fun!

I initially hashed it out on my phone to combat boredom 😁 and the solution seems surprisingly small. The caveat being that the array of last values could be quite large and therefore take up a fair chunk of memory of the difference in the size of the tables is big enough

@javax-swing
Copy link

@szeiger I’ve attempted some benchmarking https://github.com/javax-swing/hashtable-grow-brenchmarking

From initial tests there is a performance degradation when the size doubles, I will post some results once I’m happy with it.i suspect that it’s a combination of the array access and the null checks, I made a second variant using pre nodes to try and avoid the null checks.

I’m yet to test other grow factors but hopefully will find time this evening

@SethTisue SethTisue added the library:collections PRs involving changes to the standard collection library label Nov 8, 2018
@rgwilton
Copy link

Is OpenHashSet/OpenHashMap a good name for the old implementations? Using "Open" in the name makes it hard to know whether it is referring to "Open Hashing" or "Open Addressing (aka closed hashing).

@javax-swing
Copy link

javax-swing commented Nov 14, 2018

@szeiger i managed to get some more time for running some benchmarks. The generic grow algorithm shows some improvements, but only when the number of elements is large or the grow factor is large.

Below
32 * larger seems to show improvements on most tests
8 * larger seems to show improvements when there are around 10k elements

[info] GrowSizeBenchmark.newGrowTable             8       0           16  avgt    5     156.161 ▒     4.419  ns/op
[info] GrowSizeBenchmark.newGrowTable             8       0           64  avgt    5     436.776 ▒    25.040  ns/op
[info] GrowSizeBenchmark.newGrowTable             8     128           16  avgt    5    2852.723 ▒    63.419  ns/op
[info] GrowSizeBenchmark.newGrowTable             8     128           64  avgt    5    3270.859 ▒   263.706  ns/op
[info] GrowSizeBenchmark.newGrowTable             8    1024           16  avgt    5   20385.663 ▒  4410.996  ns/op
[info] GrowSizeBenchmark.newGrowTable             8    1024           64  avgt    5   19983.632 ▒  1376.031  ns/op
[info] GrowSizeBenchmark.newGrowTable             8   10000           16  avgt    5  200773.121 ▒  3204.177  ns/op
[info] GrowSizeBenchmark.newGrowTable             8   10000           64  avgt    5  220680.034 ▒  8924.886  ns/op
[info] GrowSizeBenchmark.newGrowTable            32       0           16  avgt    5     412.579 ▒    13.749  ns/op
[info] GrowSizeBenchmark.newGrowTable            32       0           64  avgt    5    1021.043 ▒    25.310  ns/op
[info] GrowSizeBenchmark.newGrowTable            32     128           16  avgt    5    3266.072 ▒   174.989  ns/op
[info] GrowSizeBenchmark.newGrowTable            32     128           64  avgt    5    4785.978 ▒    47.669  ns/op
[info] GrowSizeBenchmark.newGrowTable            32    1024           16  avgt    5   19973.773 ▒   520.525  ns/op
[info] GrowSizeBenchmark.newGrowTable            32    1024           64  avgt    5   25127.110 ▒  1003.695  ns/op
[info] GrowSizeBenchmark.newGrowTable            32   10000           16  avgt    5  203457.902 ▒  2180.770  ns/op
[info] GrowSizeBenchmark.newGrowTable            32   10000           64  avgt    5  221839.585 ▒  1478.719  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8       0           16  avgt    5     151.684 ▒     5.976  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8       0           64  avgt    5     487.589 ▒    23.563  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8     128           16  avgt    5    2734.678 ▒    24.681  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8     128           64  avgt    5    3560.635 ▒    35.069  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8    1024           16  avgt    5   19443.320 ▒   164.319  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8    1024           64  avgt    5   20569.981 ▒  1118.741  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8   10000           16  avgt    5  304450.817 ▒  5200.511  ns/op
[info] GrowSizeBenchmark.oldGrowTable             8   10000           64  avgt    5  349487.762 ▒  5840.703  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32       0           16  avgt    5     476.227 ▒     2.758  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32       0           64  avgt    5    1661.838 ▒   101.561  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32     128           16  avgt    5    4420.828 ▒    49.427  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32     128           64  avgt    5    6540.070 ▒   123.674  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32    1024           16  avgt    5   25933.232 ▒   509.800  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32    1024           64  avgt    5   31897.182 ▒   266.313  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32   10000           16  avgt    5  459758.960 ▒  7342.867  ns/op
[info] GrowSizeBenchmark.oldGrowTable            32   10000           64  avgt    5  506833.684 ▒ 57157.718  ns/op

Below are the results from doubling in size

[info] GrowSizeBenchmark.newGrowTable             2       0           16  avgt    5      73.067 ▒    0.413  ns/op
[info] GrowSizeBenchmark.newGrowTable             2       0           64  avgt    5     191.984 ▒    3.914  ns/op
[info] GrowSizeBenchmark.newGrowTable             2     128           16  avgt    5    2375.526 ▒   55.216  ns/op
[info] GrowSizeBenchmark.newGrowTable             2     128           64  avgt    5    2624.187 ▒   29.425  ns/op
[info] GrowSizeBenchmark.newGrowTable             2    1024           16  avgt    5   18870.624 ▒  956.631  ns/op
[info] GrowSizeBenchmark.newGrowTable             2    1024           64  avgt    5   18812.588 ▒  341.455  ns/op
[info] GrowSizeBenchmark.newGrowTable             2   10000           16  avgt    5  198778.298 ▒ 6770.789  ns/op
[info] GrowSizeBenchmark.newGrowTable             2   10000           64  avgt    5  208260.078 ▒ 2953.858  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2       0           16  avgt    5      65.140 ▒    1.249  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2       0           64  avgt    5     189.794 ▒    2.241  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2     128           16  avgt    5    1759.949 ▒  151.061  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2     128           64  avgt    5    1987.609 ▒   56.253  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2    1024           16  avgt    5   14416.956 ▒   60.558  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2    1024           64  avgt    5   14289.335 ▒  303.965  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2   10000           16  avgt    5  204695.216 ▒ 8549.972  ns/op
[info] GrowSizeBenchmark.oldGrowTable             2   10000           64  avgt    5  192506.925 ▒  979.378  ns/op

From these numbers i'm not sure it's worth changing it, since 99% of the time i imagine that the hashtable will double in size. And people could avoid a costly call to grow the hashtable by supplying the size hint on construction. What do you think?

@szeiger szeiger force-pushed the wip/mutable-hashset2 branch from 0c8e94a to 7b40d5c Compare November 22, 2018 16:15
@szeiger
Copy link
Contributor Author

szeiger commented Nov 22, 2018

Updated to address review comments. Important changes:

  • No OpenHashSet, the old implementation is removed
  • HashMap is now marked @deprecatedInheritance

@javax-swing While working on OrderAwareHashMap I noticed that growing the table would iterate through all buckets even when the map is empty. I fixed it there and now I've backported this change to HashSet and HashMap. When the collection is empty it will simply allocate a new array of the correct size. The only remaining use case for growing to more than double the size is bulk-appending a large collection with known size to a non-empty set/map (or explicitly setting a size hint on a non-empty set/map). Since this happens only in specific methods (in particular, not when appending single elements) no hot code paths would need to be changed. You could always go through the normal growTable with the 2x algorithm and only check for larger factors in sizeHint. This would allow the use of a a better algorithm in these cases without sacrificing any peformance in other cases. It's still a lot of extra complexity though, so I'm not sure it's worth optimizing for.

- The old HashSet was based on open addressing. Since the similar
  `OpenHashMap` is deprecated I removed it (along with the actual
  implementing class `FlatHashTable`) instead of adding yet another
  collection implementation (`OpenHashSet`).
- The new HashSet implementation caches the hash codes and uses linked
  lists for chaining. This is consistent with mutable.HashMap, the
  immutable HashSet/HashMap, and Java’s mutable HashSet/HashMap.
- The linked lists are kept ordered by hash code. This costs a few
  percent of performance for well-distributed hashes but is
  significantly faster in cases with lots of bucket collisions
- Hash set computation is fixed to properly implement Scala’s equality
  semantics for numeric types.
- The new HashMap is based on the new HashSet.

Fixes scala/scala-dev#567
@szeiger szeiger force-pushed the wip/mutable-hashset2 branch from 7b40d5c to 3a6f637 Compare November 22, 2018 17:12
override def addAll(xs: IterableOnce[A]): this.type = {
sizeHint(xs.knownSize)
super.addAll(xs)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could take advantage of the cached hashcodes in both this implementation of mutable.HashSet as well as in immutable.HashSet.

Maybe an other optimization:

When case xs: mutable.HashMap[A] if xs.table.length >= this.table.length, then each of those buckets maps onto exactly one bucket in this.table. Maybe each bucket in that could be "merge joined" into the bucket in this, since they're both sorted linked lists? Maybe it would get complicated with the extra housekeeping of making sure to resize the table when this gets too big

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could check for equal table sizes after growing this table. If you check before growing and delay growing until you've added the new data you could end up adding a lot of data in an inefficient way to linked lists even though you know you'll have to split them afterwards.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm but whether or not this table should grow depends on the overlap of the two HashMaps. It's probably not worth doing I guess.

def iterator: Iterator[(K, V)] = {
if (isEmpty) Iterator.empty
else table.entriesIterator.map(e => (e.key, e.value))
override def addAll(xs: IterableOnce[(K, V)]): this.type = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to move this override to mutable.MapOps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sizeHint comes from Builder which extends Growable. It would make sense to override addAll (which is concrete in Growable) in Builder to set the size hint.

@lrytz
Copy link
Member

lrytz commented Dec 11, 2018

I forgot to mention: could you update the PR description? Say that OpenHashMap and OpenHashSet are removed. And it would be good to give an explanation why OpenHashMap was deprecated.

@szeiger szeiger changed the title A new mutable.HashSet New mutable.HashSet and mutable.HashMap implementations Dec 11, 2018
@szeiger szeiger merged commit 051dc0b into scala:2.13.x Dec 11, 2018
@joshlemer
Copy link
Contributor

@szeiger Should the equivalent changes to this also be done for LinkedHashMap and LinkedHashSet?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

library:collections PRs involving changes to the standard collection library release-notes worth highlighting in next release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Issues with mutable.HashMap

8 participants