Don't box every single value type dictionary key. #46460

lezzi · 2020-12-30T02:28:12Z

Deserialization of dictionaries from Stream always causes keys boxing for value types because ReadStackFrame.DictionaryKey is an object. Saving key in the state (boxing) is only needed when value can't be fully read (Read call returns false).

In the best-case scenario when an entire dictionary fits in a buffer - it should completely remove unnecessary heap allocations for keys.

ahsonkhan · 2020-12-30T20:38:35Z

This is great. How did you detect it?

In the best-case scenario when an entire dictionary fits in a buffer - it should completely remove unnecessary heap allocations for keys.

Can we run some benchmarks to get a sense of how much heap allocations we save? It might help modifying and running this benchmark on a dictionary input that you expect to get performance wins on (with value type keys). Please share the benchmarkdotnet results when you have them:
https://github.com/dotnet/performance/blob/0b094042e7dc114c8cd038b1ae90dad0e5696fa2/src/benchmarks/micro/libraries/System.Text.Json/Serializer/ReadJson.cs#L57-L64

....Json/src/System/Text/Json/Serialization/Converters/Collection/DictionaryDefaultConverter.cs

lezzi · 2020-12-31T01:08:19Z

This is great. How did you detect it?

In the best-case scenario when an entire dictionary fits in a buffer - it should completely remove unnecessary heap allocations for keys.

Can we run some benchmarks to get a sense of how much heap allocations we save? It might help modifying and running this benchmark on a dictionary input that you expect to get performance wins on (with value type keys). Please share the benchmarkdotnet results when you have them:
https://github.com/dotnet/performance/blob/0b094042e7dc114c8cd038b1ae90dad0e5696fa2/src/benchmarks/micro/libraries/System.Text.Json/Serializer/ReadJson.cs#L57-L64

I am currently working on the high-performance dynamodb library which involves some low-level optimized JSON deserialization. I was able to detect several inefficiencies during some internal benchmark comparisons.

Here is the benchmark result for Dictionary<int, string> with 100 items. I think the most important column for this case is Allocated, where we can see that the 2KB difference disappears after the PR. It also becomes slightly faster.

BenchmarkDotNet=v0.12.1.1466-nightly, OS=Windows 10.0.19041.572 (2004/May2020Update/20H1)
Intel Core i7-7820X CPU 3.60GHz (Kaby Lake), 1 CPU, 6 logical and 3 physical cores
.NET SDK=6.0.100-alpha.1.20630.6
  [Host]     : .NET 6.0.0 (6.0.20.62903), X64 RyuJIT
  Job-NDNXAQ : .NET 6.0.0 (42.42.42.42424), X64 RyuJIT
  Job-EVXCXL : .NET 6.0.0 (42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  WarmupCount=1

Method	Job	Toolchain	Mean	Error	StdDev	Median	Min	Max	Ratio	RatioSD	Gen 0	Gen 1	Gen 2	Allocated
DeserializeFromString	Job-NDNXAQ	master	19.96 μs	0.156 μs	0.138 μs	19.93 μs	19.77 μs	20.20 μs	1.00	0.00	3.2861	0.2347	-	19 KB
DeserializeFromString	Job-EVXCXL	PR	20.05 μs	0.304 μs	0.312 μs	19.94 μs	19.77 μs	20.84 μs	1.01	0.02	3.2727	0.1596	-	19 KB

DeserializeFromUtf8Bytes	Job-NDNXAQ	master	19.56 μs	0.232 μs	0.194 μs	19.54 μs	19.30 μs	19.97 μs	1.00	0.00	3.2396	0.1580	-	19 KB
DeserializeFromUtf8Bytes	Job-EVXCXL	PR	19.52 μs	0.138 μs	0.122 μs	19.52 μs	19.34 μs	19.78 μs	1.00	0.01	3.2233	0.1572	-	19 KB

DeserializeFromStream	Job-NDNXAQ	master	21.66 μs	0.187 μs	0.175 μs	21.58 μs	21.49 μs	22.17 μs	1.00	0.00	3.6509	0.2608	-	21 KB
DeserializeFromStream	Job-EVXCXL	PR	21.25 μs	0.201 μs	0.178 μs	21.27 μs	20.90 μs	21.57 μs	0.98	0.01	3.2669	0.1675	-	19 KB

It is also worth noting another potential optimization that I am considering to create a PR for. On the benchmark, you can see that deserialization from Stream is visibly slower than from bytes array even when entire JSON fits in a single buffer. I suspect it happens because we don't use UseFastPath for a single buffer case. But it is unrelated and probably not worth discussing in this thread.

During slow path deserialization fallback to storing dictionary key as object only when Read returns false.

steveharter

Thanks!

ericstj · 2021-02-17T23:37:41Z

Looks like we have a couple approvals: Is this ready to merge? Might want to get fresh test results (current are from Jan 5)

layomia

Thanks for the PR; this good to merge. The perf results shouldn't be stale given the changes.

It is also worth noting another potential optimization that I am considering to create a PR for. On the benchmark, you can see that deserialization from Stream is visibly slower than from bytes array even when entire JSON fits in a single buffer. I suspect it happens because we don't use UseFastPath for a single buffer case. But it is unrelated and probably not worth discussing in this thread.

We'd be happy to take a PR exploring this if we can detect a single buffer case.

Dotnet-GitSync-Bot added the area-System.Text.Json label Dec 30, 2020

lezzi force-pushed the json-dict-key-boxing branch 2 times, most recently from da5acaa to 43ef1d7 Compare December 30, 2020 14:11

lezzi marked this pull request as ready for review December 30, 2020 16:00

lezzi requested review from ahsonkhan, eiriktsarpalis, jozkee, layomia and steveharter as code owners December 30, 2020 16:00

ahsonkhan reviewed Dec 30, 2020

View reviewed changes

....Json/src/System/Text/Json/Serialization/Converters/Collection/DictionaryDefaultConverter.cs Outdated Show resolved Hide resolved

....Json/src/System/Text/Json/Serialization/Converters/Collection/DictionaryDefaultConverter.cs Outdated Show resolved Hide resolved

Don't box every single value type dictionary key.

f2fb38f

During slow path deserialization fallback to storing dictionary key as object only when Read returns false.

lezzi force-pushed the json-dict-key-boxing branch from 43ef1d7 to f2fb38f Compare January 5, 2021 19:58

stephentoub added this to the 6.0.0 milestone Jan 15, 2021

stephentoub approved these changes Jan 15, 2021

View reviewed changes

steveharter added the tenet-performance Performance related issue label Jan 15, 2021

steveharter approved these changes Jan 15, 2021

View reviewed changes

ericstj assigned layomia Feb 17, 2021

Base automatically changed from master to main March 1, 2021 09:07

layomia approved these changes Mar 3, 2021

View reviewed changes

layomia merged commit 6919fc2 into dotnet:main Mar 3, 2021

ghost locked as resolved and limited conversation to collaborators Apr 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Don't box every single value type dictionary key. #46460

Don't box every single value type dictionary key. #46460

Uh oh!

lezzi commented Dec 30, 2020 •

edited

Loading

Uh oh!

ahsonkhan commented Dec 30, 2020

Uh oh!

Uh oh!

Uh oh!

lezzi commented Dec 31, 2020

Uh oh!

steveharter left a comment

Uh oh!

ericstj commented Feb 17, 2021

Uh oh!

layomia left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

Don't box every single value type dictionary key. #46460

Don't box every single value type dictionary key. #46460

Uh oh!

Conversation

lezzi commented Dec 30, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ahsonkhan commented Dec 30, 2020

Uh oh!

Uh oh!

Uh oh!

lezzi commented Dec 31, 2020

Uh oh!

steveharter left a comment

Choose a reason for hiding this comment

Uh oh!

ericstj commented Feb 17, 2021

Uh oh!

layomia left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

lezzi commented Dec 30, 2020 •

edited

Loading