Skip to content

Conversation

@benaadams
Copy link
Member

@benaadams benaadams commented Sep 7, 2020

Resolves #41939

Fast .Clone() extension method can be made on top of this change as

public static class DictionaryExtensions
{
    public static Dictionary<TKey, TValue> Clone<TKey, TValue>(this Dictionary<TKey, TValue> source) where TKey : notnull
    {
        return new Dictionary<TKey, TValue>(source, source.Comparer);
    }
}

String keys are the biggest improvement since they cost most to rehash

Method NetFx InitialCount Mean Allocated Ratio Change
Clone_Int_Full NetFx 3000 62.09 us 66.1 KB 2.45 x 0.4
Clone_Int_Full Base 3000 25.32 us 65.9 KB 1.00 -
Clone_Int_Full PR 3000 10.56 us 65.9 KB 0.42 x 2.4
Clone_Int_HalfRemoved NetFx 3000 32.43 us 31.4 KB 2.73 x 0.4
Clone_Int_HalfRemoved Base 3000 11.88 us 31.3 KB 1.00 -
Clone_Int_HalfRemoved PR 3000 6.86 us 31.3 KB 0.58 x 1.7
Clone_String_Full NetFx 3000 111.84 us 92.4 KB 1.86 x 0.5
Clone_String_Full Base 3000 60.01 us 92.3 KB 1.00 -
Clone_String_Full PR 3000 26.84 us 92.3 KB 0.45 x 2.2
Clone_String_HalfRemoved NetFx 3000 58.25 us 43.9 KB 1.39 x 0.7
Clone_String_HalfRemoved Base 3000 42.04 us 43.8 KB 1.00 -
Clone_String_HalfRemoved PR 3000 14.73 us 43.8 KB 0.35 x 2.9

https://github.com/dotnet/performance/compare/master...benaadams:DictionaryClone?expand=1

@ghost
Copy link

ghost commented Sep 7, 2020

Tagging subscribers to this area: @eiriktsarpalis, @jeffhandley
See info in area-owners.md if you want to be subscribed.

@mjsabby
Copy link
Contributor

mjsabby commented Sep 7, 2020

@danmosemsft it would be nice if we also made this improvement in DictionarySlim. For specific cache-like use cases we roll our own dictionary and it would be nice to not have to do that.

@benaadams
Copy link
Member Author

runtime (Libraries Test Run release coreclr Windows_NT x86 Release) issue #41494

@benaadams
Copy link
Member Author

Helix gateway timeout on "runtime (Libraries Test Run release coreclr Windows_NT x86 Release)"

@benaadams
Copy link
Member Author

Broken compress variant, but I'm quite impressed there are tests for these scenarios :)

@benaadams benaadams force-pushed the Dcitionary-clone branch 2 times, most recently from 0eea740 to 36abff2 Compare September 8, 2020 05:15
@benaadams benaadams force-pushed the Dcitionary-clone branch 2 times, most recently from 387ec43 to dc6f15e Compare September 8, 2020 05:48
@jkotas
Copy link
Member

jkotas commented Sep 8, 2020

It would be nice to share performance numbers for the different cases.

@jkotas jkotas added the tenet-performance Performance related issue label Sep 8, 2020
@danmoseley
Copy link
Member

@danmosemsft it would be nice if we also made this improvement in DictionarySlim. For specific cache-like use cases we roll our own dictionary and it would be nice to not have to do that.

@mjsabby the copy constructor specifically? It would be easy to add if someone would offer a PR.

I am unsure what to do with DictionarySlim. It's encouraging that you're using it, or plan to. That may suggest we should remove the "Experimental" or even put it into the core collections. But @GrabYourPitchforks had concerns about it's "unsafeness" (I think related to mutating the entry in place). Also the name may suggest to people it's optimized for small number of items, which it's not (I happened to just run across @benaadams type with the same name)

@GrabYourPitchforks
Copy link
Member

The code LGTM. But we don't have evidence that this is a hot path for anybody.

I'm somewhat paranoid that we're going to use this code as a new performance profile, then in the future we'll discover some obscure edge case which means we need to go down a somewhat slower path, then some automated system will kick in and say "Ah, but you regressed this method's performance by 17%!", and then we'll spend days chasing it down - all while continuing not to have evidence that this is actually a hot path for real-world apps. Basically, I'm afraid that it's signing us up for continuing maintenance work for a code path that we didn't want to invest significant resources into.

@benaadams
Copy link
Member Author

all while continuing not to have evidence that this is actually a hot path for real-world apps.

Signals Rosyln would move away from ImmuntableDictionary for a fast Dictionary clone method dotnet/roslyn#47503 (comment)

Yes, ImmutableDictionary is really bad. We just haven't had cases yet that fell into the patterns for which we have production-ready alternatives. The main scenario that we have the ability to improve today is collections with the following characteristics:

  • The dictionary is built up from empty, as opposed to built by transforming a previous collection through add/remove operations
  • The collection has more than a few elements, but less than ~10K elements

For this we could switch to an immutable wrapper around Dictionary<TKey, TValue>, similar to how we use ImmutableArray<T> instead of ImmutableList<T>.

/cc @sharwell

and

My gut tells me this should be a slam dunk win. The vast majority (probably nearly all) of cases where we use ImmutableDict, it's not to benefit from what ID is decent at, but just to ensure it will not be mutated post creation. For that purpose it's actually a pretty terrible choice and adds a tremendous amount of overhead for functionality that we're not needing in those scenarios.

/cc @CyrusNajmabadi

Also for lockless read Dictionary, with very infrequent updates (where reallocating the entire dictionary is more lightweight than taking a lock on every read, or using ConcurrentDictionary) e.g.

class LookupTable
{
    private Dictionary<int, string>? _lookUp;
    private readonly object _lock = new ();

    public bool TryGetValue(int key, [MaybeNullWhen(false)] out string value)
    {
        // No locks required for read
        var lookUp = _lookUp;
        if (lookUp is null)
        {
            value = default;
            return false;
        }

        return lookUp.TryGetValue(key, out value);
    }

    public void Add(int key, string value)
    {
        lock (_lock)
        {
            // Clone + Add
            var dict = _lookUp?.Clone() ?? new (capacity: 1);
            dict.Add(key, value);
            // Switch reference
            _lookUp = dict;
        }
    }

    public void AddRange(IEnumerable<KeyValuePair<int, string>> collection)
    {
        lock (_lock)
        {
            // Clone + Add
            var dict = _lookUp?.Clone() ?? new(capacity: 8);
            foreach (var entry in collection)
            {
                dict.Add(entry.Key, entry.Value);
            }
            // Switch reference
            _lookUp = dict;
        }
    }
}

@benaadams
Copy link
Member Author

Its also why I wanted to add a clone extension method in #41939

public static class DictionaryExtensions
{
    public static Dictionary<TKey, TValue> Clone<TKey, TValue>(this Dictionary<TKey, TValue> source) where TKey : notnull
    {
        return new Dictionary<TKey, TValue>(source, source.Comparer);
    }
}

Because it's not clear that passing a Dictionary as an IDictionary to the Dictionary constructor is a fast way to copy it and on Full Framework it really isn't https://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs,106

@benaadams
Copy link
Member Author

Released TrimExcess and Copy might as well share the same entity copying code

@jkotas
Copy link
Member

jkotas commented Sep 14, 2020

Released TrimExcess and Copy might as well share the same entity copying code

Nice!

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@eiriktsarpalis
Copy link
Member

Is there any pending feedback on this PR? If not I'll just merge.

@eiriktsarpalis eiriktsarpalis added this to the 6.0.0 milestone Sep 30, 2020
@eiriktsarpalis eiriktsarpalis self-assigned this Sep 30, 2020
@benaadams
Copy link
Member Author

Don't think so; CI wasn't behaving (one job failed in unrelated area), when it was was going to rebase for a clean set of greens, but looking at the open PRs doesn't look like CI is there yet...

@benaadams
Copy link
Member Author

Squashed and Rebased

@eiriktsarpalis
Copy link
Member

Failing tests seem like unrelated timeouts.

@eiriktsarpalis eiriktsarpalis merged commit a4b566a into dotnet:master Oct 13, 2020
@eiriktsarpalis
Copy link
Member

Thanks @benaadams!

@benaadams benaadams deleted the Dcitionary-clone branch October 13, 2020 16:37
@benaadams
Copy link
Member Author

Aside it should also improve SparseTensor<T> .Clone() and .ToSparseTensor()

public override Tensor<T> Clone()
{
var valueCopy = new Dictionary<int, T>(values);
return new SparseTensor<T>(valueCopy, dimensions, IsReversedStride);
}

public override SparseTensor<T> ToSparseTensor()
{
var valueCopy = new Dictionary<int, T>(values);
return new SparseTensor<T>(valueCopy, dimensions, IsReversedStride);
}

@DrewScoggins
Copy link
Member

Just wanted to update that we are seeing these wins in the performance lab for both strings and ints.

Run Information

Architecture x64
OS Windows 10.0.18362
Changes diff

Regressions in System.Collections.CtorFromCollection

Benchmark Baseline Test Test/Base Modality Baseline Outlier Baseline ETL Comapre ETL
Dictionary 15.46 μs 4.37 μs 0.28 True

graph
Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.CtorFromCollection<String>*'
Details

Histogram

System.Collections.CtorFromCollection.Dictionary(Size: 512)

[ 3227.117 ;  5690.502) | @@@@@@@@@@@@@@@@@@@@
[ 5690.502 ;  7941.971) | 
[ 7941.971 ; 10193.440) | 
[10193.440 ; 12444.908) | 
[12444.908 ; 14336.416) | 
[14336.416 ; 16587.885) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

Run Information

Architecture x64
OS Windows 10.0.18362
Changes diff

Regressions in System.Collections.CtorFromCollection

Benchmark Baseline Test Test/Base Modality Baseline Outlier Baseline ETL Comapre ETL
Dictionary 6.25 μs 1.94 μs 0.31 True

graph
Historical Data in Reporting System

Repro

git clone https://github.com/dotnet/performance.git
py .\performance\scripts\benchmarks_ci.py -f netcoreapp5.0 --filter 'System.Collections.CtorFromCollection<Int32>*'
Details

Histogram

System.Collections.CtorFromCollection.Dictionary(Size: 512)

[1496.277 ; 2414.652) | @@@@@@@@@@@@@@@@@@@@
[2414.652 ; 3307.499) | 
[3307.499 ; 4200.346) | 
[4200.346 ; 5093.193) | 
[5093.193 ; 5775.253) | 
[5775.253 ; 6668.100) | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
[6668.100 ; 7562.448) | @@@@@@@@@@@@@

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

@danmoseley
Copy link
Member

Love it - thank you @DrewScoggins for following up with that positive data.

@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fast Dictionary<K,V>.Clone() api