Improve MultipartReader performance #51426
Merged
There are 3 distinct changes here; I'll show benchmark numbers for each of them below, as well as a summary at the end of this comment. Originally, this change was going to replace our Boyer-Moore string matching algorithm with IndexOf, as described in #49223. But while testing that out and writing microbenchmarks, I noticed some other changes that could significantly improve perf regardless of Boyer-Moore vs. IndexOf. Switching to IndexOf turned out to be more involved than anticipated, so it will be a follow-up item: it does show additional perf improvements on top of this PR, but requires more work.
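For context, the follow-up idea has roughly the shape below. This is only a sketch with made-up names (`BoundarySearchSketch`, `FindBoundary`), not the actual `MultipartReaderStream` code: the span-based `IndexOf` overload stands in for the hand-rolled Boyer-Moore skip-table scan.

```csharp
using System;

static class BoundarySearchSketch
{
    // Illustrative only: ReadOnlySpan<byte>.IndexOf(ReadOnlySpan<byte>) is
    // vectorized on modern runtimes, which is why it can compete with (or beat)
    // a hand-rolled Boyer-Moore scan for typical multipart boundary lengths.
    public static int FindBoundary(ReadOnlySpan<byte> buffer, ReadOnlySpan<byte> boundaryBytes)
        => buffer.IndexOf(boundaryBytes);
}
```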
The first commit adds a loop to line processing: previously, every iteration would check whether there were items in the buffer and make a function call into the core loop, whereas now the buffered data is scanned in a single tight loop.
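A minimal sketch of that pattern, assuming a hypothetical `ScanForLineEnd` helper rather than the real buffered-stream internals:

```csharp
using System;

static class LineScanSketch
{
    // Scan the already-buffered bytes in one tight loop instead of re-checking
    // the buffer and calling into a core routine once per byte.
    // Returns the index just past '\n' if a full line is buffered, or -1.
    public static int ScanForLineEnd(ReadOnlySpan<byte> buffered)
    {
        for (int i = 0; i < buffered.Length; i++)
        {
            if (buffered[i] == (byte)'\n')
            {
                return i + 1;
            }
        }
        return -1; // need more data; the caller refills the buffer and retries
    }
}
```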
Original perf
First commit perf
The second commit builds on the first. Since we're now looping over the buffered data, we can wait until the end of the loop and write the whole span of data to the MemoryStream, instead of writing one byte at a time.
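A sketch of that idea, with hypothetical names (`buffered`, `consumed`, `destination`) standing in for the real stream internals:

```csharp
using System;
using System.IO;

static class BulkWriteSketch
{
    // One Write of 'consumed' bytes replaces 'consumed' individual WriteByte calls
    // made inside the per-byte loop.
    public static void CopyScannedBytes(ReadOnlySpan<byte> buffered, int consumed, MemoryStream destination)
    {
        destination.Write(buffered.Slice(0, consumed));
    }
}
```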
Second commit perf
The third commit adds another scenario to the microbenchmarks: a 10-million-byte payload. It also improves the performance of reading the section by using a 4 KB buffer instead of the default buffer size of 1 (in specific cases).
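The sketch below shows why the buffer size matters when draining a large section; `SectionCopySketch`, `CopySectionAsync`, and `ReadBufferSize` are hypothetical names, not the actual code path touched by this PR.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class SectionCopySketch
{
    private const int ReadBufferSize = 4096; // instead of issuing 1-byte reads

    // Drains a section stream into a destination using a 4 KB read buffer,
    // so a 10-million-byte payload takes thousands of reads rather than millions.
    public static async Task CopySectionAsync(Stream section, Stream destination)
    {
        var buffer = new byte[ReadBufferSize];
        int read;
        while ((read = await section.ReadAsync(buffer)) > 0)
        {
            await destination.WriteAsync(buffer.AsMemory(0, read));
        }
    }
}
```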
Third commit perf
And because the third commit added a new scenario, I reran the benchmarks from the second commit for that scenario as well.
Large read perf w/second commit
In summary, perf improved by 25-70% for the normal scenarios and by over 100x in the 10-million-byte scenario (when application code was reading the section).