@BrennanConroy

This PR contains three distinct changes. I'll show benchmark numbers for each of them below, with a summary at the end of this comment. Originally, this change was going to replace our Boyer-Moore string matching algorithm with IndexOf, as described in #49223. But while testing that and writing microbenchmarks, I noticed other changes that could significantly improve perf regardless of Boyer-Moore vs. IndexOf. Switching to IndexOf was more involved than anticipated, so it will be a follow-up item: it does show further perf improvements on top of this PR, but it requires additional work.
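For readers unfamiliar with the IndexOf approach mentioned here, the following is a minimal sketch of searching a buffer for a multipart boundary with the span-based `IndexOf` extension method. It is not the code from this PR or the follow-up; the class name, method name, and sample boundary are made up for illustration, and it ignores the case where a boundary straddles two buffers.

```csharp
using System;
using System.Text;

public static class BoundarySearchSketch
{
    // Hypothetical helper: returns the index of the first occurrence of
    // `boundary` inside `buffer`, or -1 if it is not present.
    public static int FindBoundary(ReadOnlySpan<byte> buffer, ReadOnlySpan<byte> boundary)
        => buffer.IndexOf(boundary);

    public static void Main()
    {
        byte[] data = Encoding.ASCII.GetBytes("part one\r\n--abc123\r\npart two");
        byte[] boundary = Encoding.ASCII.GetBytes("\r\n--abc123");

        // "part one" is 8 bytes long, so the boundary starts at index 8.
        Console.WriteLine(FindBoundary(data, boundary)); // prints 8
    }
}
```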

The first commit adds a loop to the line-processing code. Previously, every iteration checked whether there were items in the buffer and made a function call into the core loop.
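The shape of that change can be sketched roughly as follows. This is an illustrative reader, not the actual ASP.NET Core code: `LineScannerSketch`, its fields, and the 16-byte buffer are assumptions chosen to keep the example small. The point is that the "is the buffer empty?" check runs once per refill, while a tight inner loop consumes everything already buffered.

```csharp
using System;
using System.IO;
using System.Text;

public sealed class LineScannerSketch
{
    private readonly Stream _inner;
    private readonly byte[] _buffer = new byte[16]; // tiny on purpose, to force refills
    private int _offset;
    private int _count;

    public LineScannerSketch(Stream inner) => _inner = inner;

    // Number of times the inner stream was asked for more data.
    public int RefillCount { get; private set; }

    // Reads one CRLF-terminated ASCII line and returns it without the CRLF.
    public string ReadLine()
    {
        var line = new MemoryStream();
        bool sawCR = false;
        while (true)
        {
            // The emptiness check happens once per buffer, not once per byte.
            if (_count == 0)
            {
                _offset = 0;
                _count = _inner.Read(_buffer, 0, _buffer.Length);
                RefillCount++;
                if (_count == 0) break; // end of stream
            }

            // Tight inner loop over the bytes that are already buffered.
            while (_count > 0)
            {
                byte b = _buffer[_offset++];
                _count--;
                if (sawCR && b == (byte)'\n')
                {
                    line.SetLength(line.Length - 1); // drop the trailing CR
                    return Encoding.ASCII.GetString(line.ToArray());
                }
                sawCR = b == (byte)'\r';
                line.WriteByte(b); // still one byte at a time in this sketch
            }
        }
        return Encoding.ASCII.GetString(line.ToArray());
    }

    public static void Main()
    {
        var input = Encoding.ASCII.GetBytes("Content-Type: text/plain\r\nnext\r\n");
        var scanner = new LineScannerSketch(new MemoryStream(input));
        Console.WriteLine(scanner.ReadLine());
        Console.WriteLine(scanner.ReadLine());
        Console.WriteLine(scanner.RefillCount);
    }
}
```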

Original perf

| Method | BoundarySize | SectionCount | Mean | Error | StdDev | Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MultipartReaderParsing | 6 | 1 | 2.430 us | 0.0477 us | 0.0446 us | 411,568.4 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 6 | 1 | 2.348 us | 0.0205 us | 0.0160 us | 425,842.8 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsing | 6 | 2 | 4.431 us | 0.0318 us | 0.0298 us | 225,666.9 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 6 | 2 | 4.354 us | 0.0239 us | 0.0200 us | 229,691.5 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 6 | 3 | 6.496 us | 0.0313 us | 0.0277 us | 153,936.6 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 6 | 3 | 6.500 us | 0.0229 us | 0.0214 us | 153,853.8 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 28 | 1 | 2.552 us | 0.0161 us | 0.0126 us | 391,809.5 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 28 | 1 | 2.461 us | 0.0213 us | 0.0188 us | 406,284.6 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsing | 28 | 2 | 4.613 us | 0.0424 us | 0.0331 us | 216,794.2 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 28 | 2 | 4.566 us | 0.0460 us | 0.0408 us | 219,024.4 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 28 | 3 | 6.698 us | 0.0282 us | 0.0235 us | 149,304.9 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 28 | 3 | 6.646 us | 0.0801 us | 0.0669 us | 150,460.5 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 70 | 1 | 2.852 us | 0.0154 us | 0.0136 us | 350,661.7 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 70 | 1 | 2.824 us | 0.0196 us | 0.0174 us | 354,142.2 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsing | 70 | 2 | 4.913 us | 0.0277 us | 0.0232 us | 203,548.5 | 0.0610 | - | - | 11 KB |
| MultipartReaderParsingWithRead | 70 | 2 | 4.801 us | 0.0223 us | 0.0186 us | 208,291.7 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 70 | 3 | 7.099 us | 0.0811 us | 0.0677 us | 140,870.4 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 70 | 3 | 7.167 us | 0.0339 us | 0.0283 us | 139,520.1 | 0.0763 | - | - | 13 KB |

First commit perf

| Method | BoundarySize | SectionCount | Mean | Error | StdDev | Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MultipartReaderParsing | 6 | 1 | 1.930 us | 0.0211 us | 0.0176 us | 518,005.3 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 6 | 1 | 1.933 us | 0.0301 us | 0.0252 us | 517,363.7 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsing | 6 | 2 | 2.959 us | 0.0148 us | 0.0116 us | 337,960.1 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 6 | 2 | 3.020 us | 0.0457 us | 0.0449 us | 331,173.2 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 6 | 3 | 4.281 us | 0.0200 us | 0.0177 us | 233,588.9 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 6 | 3 | 4.300 us | 0.0360 us | 0.0319 us | 232,555.0 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 28 | 1 | 2.069 us | 0.0189 us | 0.0177 us | 483,327.7 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 28 | 1 | 1.990 us | 0.0169 us | 0.0166 us | 502,619.3 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsing | 28 | 2 | 3.165 us | 0.0162 us | 0.0135 us | 315,918.8 | 0.0648 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 28 | 2 | 3.092 us | 0.0114 us | 0.0096 us | 323,372.1 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 28 | 3 | 4.403 us | 0.0294 us | 0.0230 us | 227,119.7 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 28 | 3 | 4.433 us | 0.0366 us | 0.0343 us | 225,575.7 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 70 | 1 | 2.369 us | 0.0134 us | 0.0112 us | 422,046.3 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 70 | 1 | 2.333 us | 0.0401 us | 0.0313 us | 428,680.4 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsing | 70 | 2 | 3.446 us | 0.0255 us | 0.0239 us | 290,156.9 | 0.0648 | - | - | 11 KB |
| MultipartReaderParsingWithRead | 70 | 2 | 3.362 us | 0.0162 us | 0.0144 us | 297,424.8 | 0.0648 | - | - | 10 KB |
| MultipartReaderParsing | 70 | 3 | 4.921 us | 0.0757 us | 0.1011 us | 203,192.9 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 70 | 3 | 4.915 us | 0.0467 us | 0.0365 us | 203,457.4 | 0.0763 | - | - | 13 KB |

The first commit gives a 25-50% increase in throughput (Op/s).

The second commit builds on the first. Since we're now looping over the buffered data, we can wait until the loop ends and write the whole span of data to the MemoryStream at once, instead of writing one byte at a time.
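As a rough illustration of the batching idea (again a sketch, not the PR's code; `BatchedWriteSketch` and `CopyLine` are hypothetical names), the scan finds the end of the line first and then hands the whole slice to `MemoryStream.Write` in one call:

```csharp
using System;
using System.IO;
using System.Text;

public static class BatchedWriteSketch
{
    // Copies the bytes before the first '\n' in `buffered` into `destination`
    // with a single Write call. Returns how many bytes of `buffered` were
    // consumed, including the '\n' when one was found.
    public static int CopyLine(ReadOnlySpan<byte> buffered, MemoryStream destination)
    {
        int newline = buffered.IndexOf((byte)'\n');
        int length = newline >= 0 ? newline : buffered.Length;

        // One span write instead of `length` separate WriteByte calls.
        destination.Write(buffered.Slice(0, length));

        return newline >= 0 ? length + 1 : length;
    }

    public static void Main()
    {
        var destination = new MemoryStream();
        int consumed = CopyLine(Encoding.ASCII.GetBytes("abc\ndef"), destination);
        Console.WriteLine(consumed);                                         // prints 4
        Console.WriteLine(Encoding.ASCII.GetString(destination.ToArray())); // prints abc
    }
}
```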

Second commit perf

| Method | BoundarySize | SectionCount | Mean | Error | StdDev | Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MultipartReaderParsing | 6 | 1 | 1.941 us | 0.0185 us | 0.0144 us | 515,106.4 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 6 | 1 | 1.844 us | 0.0030 us | 0.0025 us | 542,203.8 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsing | 6 | 2 | 2.870 us | 0.0343 us | 0.0305 us | 348,462.1 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 6 | 2 | 2.771 us | 0.0177 us | 0.0148 us | 360,920.4 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 6 | 3 | 3.875 us | 0.0397 us | 0.0332 us | 258,083.7 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 6 | 3 | 4.063 us | 0.0313 us | 0.0261 us | 246,119.5 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 28 | 1 | 2.031 us | 0.0219 us | 0.0183 us | 492,319.1 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 28 | 1 | 1.981 us | 0.0389 us | 0.0325 us | 504,839.4 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsing | 28 | 2 | 3.059 us | 0.0296 us | 0.0263 us | 326,853.8 | 0.0648 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 28 | 2 | 2.901 us | 0.0318 us | 0.0266 us | 344,709.2 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 28 | 3 | 4.191 us | 0.0324 us | 0.0270 us | 238,618.2 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 28 | 3 | 4.189 us | 0.0331 us | 0.0310 us | 238,740.8 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 70 | 1 | 2.340 us | 0.0357 us | 0.0279 us | 427,285.7 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 70 | 1 | 2.280 us | 0.0386 us | 0.0379 us | 438,583.9 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsing | 70 | 2 | 3.338 us | 0.0119 us | 0.0106 us | 299,621.5 | 0.0648 | - | - | 11 KB |
| MultipartReaderParsingWithRead | 70 | 2 | 3.218 us | 0.0126 us | 0.0099 us | 310,736.6 | 0.0648 | - | - | 10 KB |
| MultipartReaderParsing | 70 | 3 | 4.492 us | 0.0382 us | 0.0319 us | 222,634.3 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 70 | 3 | 4.598 us | 0.0228 us | 0.0190 us | 217,469.4 | 0.0763 | - | - | 13 KB |

The second commit adds another 4-7% perf improvement on top of the first.

The third commit adds another scenario to the microbenchmarks: a 10m byte payload. It also improves the performance of reading the section by using a 4k buffer size instead of the default of 1 (in specific cases).
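To see why the buffer size matters so much for a large payload, here is a small sketch that counts how many `Read` calls it takes to drain a stream with a given buffer size. It does not reproduce where or how the PR sets the buffer size; `BufferSizeSketch`, `CountReads`, and the 8,192-byte payload are assumptions for illustration only.

```csharp
using System;
using System.IO;

public static class BufferSizeSketch
{
    // Drains `source` using a buffer of `bufferSize` bytes and returns the
    // number of Read calls made, including the final call that returns 0.
    public static int CountReads(Stream source, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        int calls = 0;
        while (true)
        {
            calls++;
            if (source.Read(buffer, 0, buffer.Length) == 0)
            {
                return calls;
            }
        }
    }

    public static void Main()
    {
        // 8,192 bytes with a 1-byte buffer: one call per byte plus the final 0-byte read.
        Console.WriteLine(CountReads(new MemoryStream(new byte[8192]), 1));    // prints 8193

        // The same payload with a 4,096-byte buffer: two full reads plus the final one.
        Console.WriteLine(CountReads(new MemoryStream(new byte[8192]), 4096)); // prints 3
    }
}
```

Each `Read` call carries fixed overhead, so the call count scales with payload size divided by buffer size; for a multi-megabyte body that difference dominates the total time.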

Third commit perf

| Method | BoundarySize | SectionCount | LargePayload | Mean | Error | StdDev | Op/s | Gen 0 | Gen 1 | Gen 2 | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MultipartReaderParsing | 6 | 1 | False | 1.823 us | 0.0198 us | 0.0155 us | 548,438.9 | 0.0477 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 6 | 1 | False | 1.857 us | 0.0366 us | 0.0476 us | 538,527.0 | 0.0477 | - | - | 8 KB |
| MultipartReaderParsing | 6 | 1 | True | 8,263.851 us | 107.2549 us | 89.5627 us | 121.0 | - | - | - | 8 KB |
| MultipartReaderParsingWithRead | 6 | 1 | True | 6,886.287 us | 27.4054 us | 24.2942 us | 145.2 | - | - | - | 8 KB |
| MultipartReaderParsing | 6 | 2 | False | 2.806 us | 0.0123 us | 0.0096 us | 356,399.3 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 6 | 2 | False | 2.724 us | 0.0115 us | 0.0108 us | 367,070.9 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 6 | 2 | True | 8,179.129 us | 39.6340 us | 33.0962 us | 122.3 | - | - | - | 10 KB |
| MultipartReaderParsingWithRead | 6 | 2 | True | 6,827.206 us | 18.2311 us | 15.2238 us | 146.5 | - | - | - | 10 KB |
| MultipartReaderParsing | 6 | 3 | False | 4.025 us | 0.0513 us | 0.0454 us | 248,425.4 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 6 | 3 | False | 3.784 us | 0.0224 us | 0.0187 us | 264,296.3 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 6 | 3 | True | 8,170.066 us | 49.6645 us | 41.4721 us | 122.4 | - | - | - | 13 KB |
| MultipartReaderParsingWithRead | 6 | 3 | True | 6,834.893 us | 23.3473 us | 18.2280 us | 146.3 | - | - | - | 13 KB |
| MultipartReaderParsing | 28 | 1 | False | 2.008 us | 0.0086 us | 0.0081 us | 498,112.7 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 28 | 1 | False | 1.960 us | 0.0226 us | 0.0200 us | 510,094.9 | 0.0458 | - | - | 8 KB |
| MultipartReaderParsing | 28 | 1 | True | 2,118.427 us | 20.4851 us | 18.1595 us | 472.0 | - | - | - | 8 KB |
| MultipartReaderParsingWithRead | 28 | 1 | True | 1,523.984 us | 28.8977 us | 27.0309 us | 656.2 | - | - | - | 8 KB |
| MultipartReaderParsing | 28 | 2 | False | 3.026 us | 0.0387 us | 0.0323 us | 330,426.7 | 0.0648 | - | - | 10 KB |
| MultipartReaderParsingWithRead | 28 | 2 | False | 2.877 us | 0.0121 us | 0.0108 us | 347,579.3 | 0.0610 | - | - | 10 KB |
| MultipartReaderParsing | 28 | 2 | True | 2,030.905 us | 34.0778 us | 55.0294 us | 492.4 | - | - | - | 10 KB |
| MultipartReaderParsingWithRead | 28 | 2 | True | 1,496.762 us | 17.8483 us | 13.9348 us | 668.1 | - | - | - | 10 KB |
| MultipartReaderParsing | 28 | 3 | False | 4.256 us | 0.0228 us | 0.0214 us | 234,956.3 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 28 | 3 | False | 3.982 us | 0.0306 us | 0.0271 us | 251,112.9 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 28 | 3 | True | 1,991.156 us | 13.3217 us | 11.8094 us | 502.2 | - | - | - | 13 KB |
| MultipartReaderParsingWithRead | 28 | 3 | True | 1,510.914 us | 7.8215 us | 7.3162 us | 661.9 | - | - | - | 13 KB |
| MultipartReaderParsing | 70 | 1 | False | 2.391 us | 0.0180 us | 0.0177 us | 418,175.1 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsingWithRead | 70 | 1 | False | 2.290 us | 0.0084 us | 0.0070 us | 436,597.6 | 0.0496 | - | - | 8 KB |
| MultipartReaderParsing | 70 | 1 | True | 1,547.745 us | 12.4537 us | 11.6492 us | 646.1 | - | - | - | 8 KB |
| MultipartReaderParsingWithRead | 70 | 1 | True | 1,139.972 us | 16.1206 us | 14.2905 us | 877.2 | - | - | - | 8 KB |
| MultipartReaderParsing | 70 | 2 | False | 3.299 us | 0.0137 us | 0.0114 us | 303,109.8 | 0.0648 | - | - | 11 KB |
| MultipartReaderParsingWithRead | 70 | 2 | False | 3.164 us | 0.0137 us | 0.0114 us | 316,092.7 | 0.0648 | - | - | 10 KB |
| MultipartReaderParsing | 70 | 2 | True | 1,542.542 us | 21.7543 us | 18.1658 us | 648.3 | - | - | - | 11 KB |
| MultipartReaderParsingWithRead | 70 | 2 | True | 1,086.460 us | 16.1358 us | 14.3039 us | 920.4 | - | - | - | 10 KB |
| MultipartReaderParsing | 70 | 3 | False | 4.552 us | 0.0187 us | 0.0156 us | 219,675.0 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsingWithRead | 70 | 3 | False | 4.260 us | 0.0189 us | 0.0148 us | 234,727.6 | 0.0763 | - | - | 13 KB |
| MultipartReaderParsing | 70 | 3 | True | 1,549.801 us | 26.7600 us | 33.8428 us | 645.2 | - | - | - | 13 KB |
| MultipartReaderParsingWithRead | 70 | 3 | True | 1,108.870 us | 15.8322 us | 13.2206 us | 901.8 | - | - | - | 13 KB |

Because the third commit added a new scenario, I reran the second commit's benchmarks for just that scenario:

Large read perf w/second commit

| Method | BoundarySize | SectionCount | LargePayload | Mean | Error | StdDev | Op/s | Allocated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MultipartReaderParsing | 6 | 1 | True | 8,254.744 us | 117.0584 us | 103.7693 us | 121.142 | 8 KB |
| MultipartReaderParsingWithRead | 6 | 1 | True | 952,975.147 us | 11,113.2019 us | 10,395.2960 us | 1.049 | 8 KB |
| MultipartReaderParsing | 6 | 2 | True | 8,197.130 us | 109.4690 us | 102.3973 us | 121.994 | 10 KB |
| MultipartReaderParsingWithRead | 6 | 2 | True | 951,177.454 us | 5,712.7427 us | 4,770.3989 us | 1.051 | 11 KB |
| MultipartReaderParsing | 6 | 3 | True | 8,159.616 us | 35.4344 us | 29.5893 us | 122.555 | 13 KB |
| MultipartReaderParsingWithRead | 6 | 3 | True | 827,570.520 us | 15,964.3541 us | 29,984.9266 us | 1.208 | 13 KB |
| MultipartReaderParsing | 28 | 1 | True | 2,116.655 us | 16.0088 us | 12.4986 us | 472.444 | 8 KB |
| MultipartReaderParsingWithRead | 28 | 1 | True | 150,968.389 us | 448.1713 us | 863.4740 us | 6.624 | 8 KB |
| MultipartReaderParsing | 28 | 2 | True | 2,015.013 us | 35.7963 us | 27.9474 us | 496.275 | 10 KB |
| MultipartReaderParsingWithRead | 28 | 2 | True | 150,779.372 us | 426.1735 us | 355.8742 us | 6.632 | 10 KB |
| MultipartReaderParsing | 28 | 3 | True | 2,055.590 us | 39.6574 us | 44.0791 us | 486.478 | 13 KB |
| MultipartReaderParsingWithRead | 28 | 3 | True | 152,140.258 us | 235.8942 us | 220.6556 us | 6.573 | 13 KB |
| MultipartReaderParsing | 70 | 1 | True | 1,598.819 us | 30.4808 us | 37.4332 us | 625.462 | 8 KB |
| MultipartReaderParsingWithRead | 70 | 1 | True | 103,558.622 us | 389.8718 us | 304.3862 us | 9.656 | 8 KB |
| MultipartReaderParsing | 70 | 2 | True | 1,502.007 us | 7.5966 us | 6.3435 us | 665.776 | 11 KB |
| MultipartReaderParsingWithRead | 70 | 2 | True | 105,796.179 us | 350.5218 us | 327.8783 us | 9.452 | 11 KB |
| MultipartReaderParsing | 70 | 3 | True | 1,492.644 us | 13.2869 us | 11.0952 us | 669.952 | 13 KB |
| MultipartReaderParsingWithRead | 70 | 3 | True | 105,299.893 us | 204.3570 us | 191.1557 us | 9.497 | 13 KB |

You can see that the benchmarks that call `section.Body.CopyTo(...)` (the MultipartReaderParsingWithRead rows) got 100-150x faster in the large payload scenario.
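As a worked check on a single pair of rows (MultipartReaderParsingWithRead, BoundarySize 6, SectionCount 1, LargePayload True), dividing the second-commit mean by the third-commit mean from the tables above gives:

$$
\frac{952{,}975.147\ \text{us}}{6{,}886.287\ \text{us}} \approx 138.4
$$

This is one row pair only; the other boundary sizes and section counts give different ratios.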

In summary, perf improved by 25-70% for the normal scenarios and by over 100x in the 10m byte scenario (when application code reads the section).

@ghost ghost added the area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions label Oct 16, 2023
@davidfowl

> instead of writing one byte at a time.

😢

@BrennanConroy BrennanConroy merged commit 742cbd9 into main Oct 20, 2023
@BrennanConroy BrennanConroy deleted the brecon/mpr branch October 20, 2023 01:08
@ghost ghost added this to the 9.0-preview1 milestone Oct 20, 2023
Labels

area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions Perf