-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Description
Background and Motivation
There isn't a fast way to copy a Dictionary, either for snapshotting dotnet/roslyn#47503 (comment); or where its mostly entirely read only with an occasional write where updating it via cloning + adds and then switching the dictionary reference would allow a lock free read path that may be more suitable than switching to a ConcurrentDictionary.
Roslyn currently uses .ToImmutableDictionary dotnet/roslyn#47503 (comment) for snapshotting which is quite expensive, when a fast .Clone and readonly wrapper would be more suitable (as its snapshots)
Proposed API
namespace System.Collections.Generic
{
public static class DictionaryExtensions
{
public static Dictionary<TKey, TValue> Clone<TKey, TValue>(this Dictionary<TKey, TValue> source) where TKey : notnull
}
}Usage Examples
For lockless reads, with very infrequent updates (where reallocating the entire dictionary is more lightweight than taking a lock on every read, or using ConcurrentDictionary)
class LookupTable
{
private Dictionary<int, string>? _lookUp;
private readonly object _lock = new ();
public bool TryGetValue(int key, [MaybeNullWhen(false)] out string value)
{
// No locks required for read
var lookUp = _lookUp;
if (lookUp is null)
{
value = default;
return false;
}
return lookUp.TryGetValue(key, out value);
}
public void Add(int key, string value)
{
lock (_lock)
{
// Clone + Add
var dict = _lookUp?.Clone() ?? new (capacity: 1);
dict.Add(key, value);
// Switch reference
_lookUp = dict;
}
}
public void AddRange(IEnumerable<KeyValuePair<int, string>> collection)
{
lock (_lock)
{
// Clone + Add
var dict = _lookUp?.Clone() ?? new(capacity: 8);
foreach (var entry in collection)
{
dict.Add(entry.Key, entry.Value);
}
// Switch reference
_lookUp = dict;
}
}
}Alternative Designs
Add to System.Linq.Enumerable and add ToDictionary overload without having to specify both a Func<KeyValue<TKey, TElement>, TKey> keySelector and Func<KeyValue<TKey, TElement>, TElement> elementSelector; however dict.ToDictionary() seems an unclear way of copying?
namespace System.Linq
{
public static partial class Enumerable
{
public static ToDictionary<TKey, TValue> Clone<TKey, TValue>(this Dictionary<TKey, TValue> source) where TKey : notnull
}
}Risks
Code bloat
Adding directly to Dictionary as an instance method would cause generic asm code bloat as it would be use for particular dictionaries rather than every dictionary; hence an extension method would be better.
Performance
Currently can copy via foreach on the dictionary with an .Add for each element to an empty (correctly sized) Dictionary or via Linq.Enumerable.ToDictionary; which uses IEnumerable, however these methods are slow as need to reshash and bucket all the elements and second is additionally alocatey.
In order to create a fast Dictionary Clone method it would be better to do it in bulk duplicating its internal arrays, so it would need to be in the same assembly to have access to these internals e.g.
public static class DictionaryExtensions
{
public static Dictionary<TKey, TValue> Clone<TKey, TValue>(this Dictionary<TKey, TValue> source) where TKey : notnull
{
Dictionary<TKey, TValue> dict = new ();
int[]? buckets = source._buckets;
if (buckets is not null)
{
dict._buckets = new int[buckets.Length];
buckets.CopyTo(dict._buckets.AsSpan());
}
Dictionary<TKey, TValue>.Entry[]? entries = source._entries;
if (entries is not null)
{
dict._entries = new Dictionary<TKey, TValue>.Entry[entries.Length];
entries.CopyTo(dict._entries.AsSpan());
}
#if TARGET_64BIT
dict._fastModMultiplier = source._fastModMultiplier;
#endif
dict._count = source._count;
dict._freeList = source._freeList;
dict._freeCount = source._freeCount;
dict._version = source._version;
dict._comparer = source._comparer;
return dict;
}
}