Use MSVC Intrinsics for TZCNT and BSF in Brotli encoder #35315
|
cc @ahsonkhan
Full compress perf test results, v1.0.7 base vs this PR (edited for space):
  System.IO.Compression.Brotli.PerformanceTests.dll | Average | STDEV.S | Min | Max
:----------------------------------------------------------------------:| --------:| -------:| --------:| --------:
- Compress_Canterbury_WithoutState("alice29.txt", Fastest) | 1.219 | 0.146 | 1.107 | 1.868
+ Compress_Canterbury_WithoutState("alice29.txt", Fastest) | 0.905 | 0.092 | 0.842 | 1.462
- Compress_Canterbury_WithoutState("alice29.txt", NoCompression) | 0.968 | 0.139 | 0.862 | 1.549
+ Compress_Canterbury_WithoutState("alice29.txt", NoCompression) | 0.600 | 0.030 | 0.584 | 0.773
- Compress_Canterbury_WithoutState("alice29.txt", Optimal) | 269.169 | 8.673 | 261.588 | 311.027
+ Compress_Canterbury_WithoutState("alice29.txt", Optimal) | 243.610 | 6.038 | 235.890 | 275.083
- Compress_Canterbury_WithoutState("asyoulik.txt", Fastest) | 1.056 | 0.168 | 0.917 | 1.940
+ Compress_Canterbury_WithoutState("asyoulik.txt", Fastest) | 0.844 | 0.164 | 0.731 | 1.678
- Compress_Canterbury_WithoutState("asyoulik.txt", NoCompression) | 0.875 | 0.214 | 0.743 | 2.532
+ Compress_Canterbury_WithoutState("asyoulik.txt", NoCompression) | 0.548 | 0.044 | 0.513 | 0.840
- Compress_Canterbury_WithoutState("asyoulik.txt", Optimal) | 209.849 | 3.078 | 204.861 | 216.708
+ Compress_Canterbury_WithoutState("asyoulik.txt", Optimal) | 192.353 | 2.988 | 187.355 | 202.614
- Compress_Canterbury_WithoutState("cp.html", Fastest) | 0.170 | 0.018 | 0.158 | 0.286
+ Compress_Canterbury_WithoutState("cp.html", Fastest) | 0.120 | 0.014 | 0.112 | 0.189
- Compress_Canterbury_WithoutState("cp.html", NoCompression) | 0.142 | 0.037 | 0.125 | 0.347
+ Compress_Canterbury_WithoutState("cp.html", NoCompression) | 0.097 | 0.003 | 0.094 | 0.113
- Compress_Canterbury_WithoutState("cp.html", Optimal) | 36.413 | 1.355 | 34.136 | 40.683
+ Compress_Canterbury_WithoutState("cp.html", Optimal) | 34.616 | 4.462 | 31.215 | 56.121
- Compress_Canterbury_WithoutState("fields.c", Fastest) | 0.082 | 0.020 | 0.070 | 0.157
+ Compress_Canterbury_WithoutState("fields.c", Fastest) | 0.049 | 0.007 | 0.044 | 0.088
- Compress_Canterbury_WithoutState("fields.c", NoCompression) | 0.059 | 0.006 | 0.054 | 0.090
+ Compress_Canterbury_WithoutState("fields.c", NoCompression) | 0.039 | 0.002 | 0.038 | 0.051
- Compress_Canterbury_WithoutState("fields.c", Optimal) | 17.749 | 0.879 | 16.289 | 22.570
+ Compress_Canterbury_WithoutState("fields.c", Optimal) | 16.045 | 0.905 | 14.840 | 20.842
- Compress_Canterbury_WithoutState("grammar.lsp", Fastest) | 0.026 | 0.006 | 0.024 | 0.072
+ Compress_Canterbury_WithoutState("grammar.lsp", Fastest) | 0.024 | 0.009 | 0.020 | 0.083
- Compress_Canterbury_WithoutState("grammar.lsp", NoCompression) | 0.019 | 0.002 | 0.017 | 0.031
+ Compress_Canterbury_WithoutState("grammar.lsp", NoCompression) | 0.018 | 0.002 | 0.016 | 0.025
- Compress_Canterbury_WithoutState("grammar.lsp", Optimal) | 6.770 | 0.614 | 6.038 | 9.409
+ Compress_Canterbury_WithoutState("grammar.lsp", Optimal) | 6.109 | 0.454 | 5.405 | 7.606
- Compress_Canterbury_WithoutState("kennedy.xls", Fastest) | 4.408 | 0.570 | 3.887 | 6.367
+ Compress_Canterbury_WithoutState("kennedy.xls", Fastest) | 3.622 | 0.387 | 3.228 | 4.853
- Compress_Canterbury_WithoutState("kennedy.xls", NoCompression) | 2.867 | 0.201 | 2.664 | 3.680
+ Compress_Canterbury_WithoutState("kennedy.xls", NoCompression) | 2.371 | 0.190 | 2.157 | 3.565
- Compress_Canterbury_WithoutState("kennedy.xls", Optimal) | 2666.949 | 39.429 | 2638.025 | 2723.019
+ Compress_Canterbury_WithoutState("kennedy.xls", Optimal) | 2551.466 | 3.404 | 2549.538 | 2556.553
- Compress_Canterbury_WithoutState("lcet10.txt", Fastest) | 3.683 | 0.391 | 3.325 | 5.012
+ Compress_Canterbury_WithoutState("lcet10.txt", Fastest) | 2.829 | 0.270 | 2.508 | 3.618
- Compress_Canterbury_WithoutState("lcet10.txt", NoCompression) | 2.661 | 0.234 | 2.413 | 3.569
+ Compress_Canterbury_WithoutState("lcet10.txt", NoCompression) | 2.018 | 2.072 | 1.510 | 19.780
- Compress_Canterbury_WithoutState("lcet10.txt", Optimal) | 837.574 | 16.980 | 824.885 | 880.225
+ Compress_Canterbury_WithoutState("lcet10.txt", Optimal) | 752.714 | 8.551 | 742.157 | 768.464
- Compress_Canterbury_WithoutState("plrabn12.txt", Fastest) | 4.419 | 0.436 | 3.969 | 6.265
+ Compress_Canterbury_WithoutState("plrabn12.txt", Fastest) | 3.575 | 0.459 | 3.092 | 5.088
- Compress_Canterbury_WithoutState("plrabn12.txt", NoCompression) | 3.517 | 0.357 | 3.106 | 4.685
+ Compress_Canterbury_WithoutState("plrabn12.txt", NoCompression) | 2.225 | 0.147 | 2.086 | 2.984
- Compress_Canterbury_WithoutState("plrabn12.txt", Optimal) | 906.004 | 7.378 | 897.405 | 917.548
+ Compress_Canterbury_WithoutState("plrabn12.txt", Optimal) | 828.987 | 14.865 | 814.300 | 866.763
- Compress_Canterbury_WithoutState("ptt5", Fastest) | 1.199 | 0.132 | 1.075 | 1.703
+ Compress_Canterbury_WithoutState("ptt5", Fastest) | 0.944 | 0.109 | 0.836 | 1.341
- Compress_Canterbury_WithoutState("ptt5", NoCompression) | 1.072 | 0.124 | 0.977 | 1.764
+ Compress_Canterbury_WithoutState("ptt5", NoCompression) | 0.780 | 0.109 | 0.706 | 1.596
- Compress_Canterbury_WithoutState("ptt5", Optimal) | 1355.112 | 11.336 | 1339.058 | 1371.698
+ Compress_Canterbury_WithoutState("ptt5", Optimal) | 1052.047 | 5.641 | 1043.890 | 1058.504
- Compress_Canterbury_WithoutState("sum", Fastest) | 0.269 | 0.075 | 0.201 | 0.510
+ Compress_Canterbury_WithoutState("sum", Fastest) | 0.222 | 0.068 | 0.162 | 0.427
- Compress_Canterbury_WithoutState("sum", NoCompression) | 0.174 | 0.031 | 0.152 | 0.335
+ Compress_Canterbury_WithoutState("sum", NoCompression) | 0.123 | 0.012 | 0.116 | 0.182
- Compress_Canterbury_WithoutState("sum", Optimal) | 62.296 | 2.100 | 58.130 | 74.047
+ Compress_Canterbury_WithoutState("sum", Optimal) | 58.627 | 5.462 | 53.762 | 84.067
- Compress_Canterbury_WithoutState("TestDocument.doc", Fastest) | 0.101 | 0.028 | 0.087 | 0.252
+ Compress_Canterbury_WithoutState("TestDocument.doc", Fastest) | 0.080 | 0.020 | 0.071 | 0.214
- Compress_Canterbury_WithoutState("TestDocument.doc", NoCompression) | 0.068 | 0.015 | 0.058 | 0.166
+ Compress_Canterbury_WithoutState("TestDocument.doc", NoCompression) | 0.053 | 0.007 | 0.048 | 0.093
- Compress_Canterbury_WithoutState("TestDocument.doc", Optimal) | 37.074 | 3.662 | 33.985 | 68.344
+ Compress_Canterbury_WithoutState("TestDocument.doc", Optimal) | 32.475 | 1.099 | 30.288 | 36.389
- Compress_Canterbury_WithoutState("TestDocument.docx", Fastest) | 0.072 | 0.012 | 0.066 | 0.135
+ Compress_Canterbury_WithoutState("TestDocument.docx", Fastest) | 0.065 | 0.009 | 0.061 | 0.126
- Compress_Canterbury_WithoutState("TestDocument.docx", NoCompression) | 0.068 | 0.007 | 0.063 | 0.109
+ Compress_Canterbury_WithoutState("TestDocument.docx", NoCompression) | 0.073 | 0.016 | 0.062 | 0.160
- Compress_Canterbury_WithoutState("TestDocument.docx", Optimal) | 37.430 | 1.468 | 34.866 | 44.357
+ Compress_Canterbury_WithoutState("TestDocument.docx", Optimal) | 36.162 | 1.009 | 34.349 | 41.145
- Compress_Canterbury_WithoutState("TestDocument.pdf", Fastest) | 0.425 | 0.109 | 0.370 | 1.241
+ Compress_Canterbury_WithoutState("TestDocument.pdf", Fastest) | 0.392 | 0.043 | 0.371 | 0.646
- Compress_Canterbury_WithoutState("TestDocument.pdf", NoCompression) | 0.322 | 0.034 | 0.301 | 0.492
+ Compress_Canterbury_WithoutState("TestDocument.pdf", NoCompression) | 0.313 | 0.030 | 0.297 | 0.484
- Compress_Canterbury_WithoutState("TestDocument.pdf", Optimal) | 566.702 | 3.411 | 561.637 | 573.785
+ Compress_Canterbury_WithoutState("TestDocument.pdf", Optimal) | 578.972 | 17.271 | 564.081 | 644.136
- Compress_Canterbury_WithoutState("TestDocument.txt", Fastest) | 0.095 | 0.020 | 0.082 | 0.206
+ Compress_Canterbury_WithoutState("TestDocument.txt", Fastest) | 0.089 | 0.023 | 0.075 | 0.203
- Compress_Canterbury_WithoutState("TestDocument.txt", NoCompression) | 0.026 | 0.008 | 0.023 | 0.095
+ Compress_Canterbury_WithoutState("TestDocument.txt", NoCompression) | 0.026 | 0.007 | 0.025 | 0.096
- Compress_Canterbury_WithoutState("TestDocument.txt", Optimal) | 4.081 | 0.335 | 3.657 | 5.216
+ Compress_Canterbury_WithoutState("TestDocument.txt", Optimal) | 3.888 | 0.402 | 3.358 | 5.408
- Compress_Canterbury_WithoutState("xargs.1", Fastest) | 0.032 | 0.004 | 0.030 | 0.054
+ Compress_Canterbury_WithoutState("xargs.1", Fastest) | 0.026 | 0.003 | 0.024 | 0.049
- Compress_Canterbury_WithoutState("xargs.1", NoCompression) | 0.022 | 0.003 | 0.020 | 0.042
+ Compress_Canterbury_WithoutState("xargs.1", NoCompression) | 0.022 | 0.009 | 0.019 | 0.072
- Compress_Canterbury_WithoutState("xargs.1", Optimal) | 7.794 | 0.720 | 6.799 | 11.297
+ Compress_Canterbury_WithoutState("xargs.1", Optimal) | 7.094 | 0.579 | 6.454 | 10.696
- Compress_Canterbury_WithState("alice29.txt", Fastest) | 1.272 | 0.196 | 1.085 | 2.001
+ Compress_Canterbury_WithState("alice29.txt", Fastest) | 0.976 | 0.170 | 0.822 | 1.588
- Compress_Canterbury_WithState("alice29.txt", NoCompression) | 1.012 | 0.207 | 0.866 | 2.055
+ Compress_Canterbury_WithState("alice29.txt", NoCompression) | 0.607 | 0.032 | 0.594 | 0.869
- Compress_Canterbury_WithState("alice29.txt", Optimal) | 266.630 | 3.853 | 258.220 | 275.258
+ Compress_Canterbury_WithState("alice29.txt", Optimal) | 241.458 | 4.078 | 234.457 | 252.477
- Compress_Canterbury_WithState("asyoulik.txt", Fastest) | 1.065 | 0.202 | 0.924 | 2.048
+ Compress_Canterbury_WithState("asyoulik.txt", Fastest) | 0.789 | 0.083 | 0.745 | 1.228
- Compress_Canterbury_WithState("asyoulik.txt", NoCompression) | 0.884 | 0.134 | 0.757 | 1.317
+ Compress_Canterbury_WithState("asyoulik.txt", NoCompression) | 0.562 | 0.065 | 0.508 | 0.910
- Compress_Canterbury_WithState("asyoulik.txt", Optimal) | 210.875 | 2.871 | 205.030 | 218.782
+ Compress_Canterbury_WithState("asyoulik.txt", Optimal) | 191.385 | 2.517 | 186.355 | 197.966
- Compress_Canterbury_WithState("cp.html", Fastest) | 0.195 | 0.038 | 0.161 | 0.327
+ Compress_Canterbury_WithState("cp.html", Fastest) | 0.125 | 0.028 | 0.112 | 0.275
- Compress_Canterbury_WithState("cp.html", NoCompression) | 0.147 | 0.020 | 0.140 | 0.267
+ Compress_Canterbury_WithState("cp.html", NoCompression) | 0.111 | 0.005 | 0.106 | 0.133
- Compress_Canterbury_WithState("cp.html", Optimal) | 36.039 | 1.416 | 33.062 | 41.167
+ Compress_Canterbury_WithState("cp.html", Optimal) | 32.322 | 1.029 | 30.369 | 35.513
- Compress_Canterbury_WithState("fields.c", Fastest) | 0.076 | 0.010 | 0.073 | 0.157
+ Compress_Canterbury_WithState("fields.c", Fastest) | 0.057 | 0.010 | 0.048 | 0.100
- Compress_Canterbury_WithState("fields.c", NoCompression) | 0.071 | 0.008 | 0.065 | 0.116
+ Compress_Canterbury_WithState("fields.c", NoCompression) | 0.061 | 0.019 | 0.046 | 0.116
- Compress_Canterbury_WithState("fields.c", Optimal) | 17.252 | 0.772 | 15.661 | 20.173
+ Compress_Canterbury_WithState("fields.c", Optimal) | 15.555 | 0.780 | 14.352 | 19.407
- Compress_Canterbury_WithState("grammar.lsp", Fastest) | 0.029 | 0.007 | 0.024 | 0.073
+ Compress_Canterbury_WithState("grammar.lsp", Fastest) | 0.022 | 0.005 | 0.020 | 0.059
- Compress_Canterbury_WithState("grammar.lsp", NoCompression) | 0.029 | 0.020 | 0.023 | 0.212
+ Compress_Canterbury_WithState("grammar.lsp", NoCompression) | 0.025 | 0.007 | 0.022 | 0.066
- Compress_Canterbury_WithState("grammar.lsp", Optimal) | 6.827 | 0.601 | 5.940 | 8.774
+ Compress_Canterbury_WithState("grammar.lsp", Optimal) | 6.169 | 0.557 | 5.438 | 7.779
- Compress_Canterbury_WithState("kennedy.xls", Fastest) | 4.086 | 0.446 | 3.662 | 5.758
+ Compress_Canterbury_WithState("kennedy.xls", Fastest) | 3.673 | 0.411 | 3.201 | 5.142
- Compress_Canterbury_WithState("kennedy.xls", NoCompression) | 3.051 | 0.443 | 2.627 | 4.761
+ Compress_Canterbury_WithState("kennedy.xls", NoCompression) | 2.574 | 0.543 | 2.250 | 6.299
- Compress_Canterbury_WithState("kennedy.xls", Optimal) | 2663.118 | 5.475 | 2655.548 | 2667.177
+ Compress_Canterbury_WithState("kennedy.xls", Optimal) | 2570.798 | 16.426 | 2551.147 | 2584.402
- Compress_Canterbury_WithState("lcet10.txt", Fastest) | 3.403 | 0.375 | 2.980 | 4.818
+ Compress_Canterbury_WithState("lcet10.txt", Fastest) | 2.541 | 0.290 | 2.256 | 3.949
- Compress_Canterbury_WithState("lcet10.txt", NoCompression) | 2.695 | 0.308 | 2.389 | 3.787
+ Compress_Canterbury_WithState("lcet10.txt", NoCompression) | 1.699 | 0.196 | 1.524 | 2.625
- Compress_Canterbury_WithState("lcet10.txt", Optimal) | 841.948 | 21.321 | 827.811 | 908.402
+ Compress_Canterbury_WithState("lcet10.txt", Optimal) | 745.196 | 5.925 | 738.894 | 758.802
- Compress_Canterbury_WithState("plrabn12.txt", Fastest) | 4.439 | 0.403 | 3.923 | 5.569
+ Compress_Canterbury_WithState("plrabn12.txt", Fastest) | 3.556 | 0.403 | 3.091 | 5.205
- Compress_Canterbury_WithState("plrabn12.txt", NoCompression) | 3.405 | 0.312 | 3.096 | 4.843
+ Compress_Canterbury_WithState("plrabn12.txt", NoCompression) | 2.531 | 1.016 | 2.052 | 8.305
- Compress_Canterbury_WithState("plrabn12.txt", Optimal) | 902.867 | 9.481 | 892.144 | 926.019
+ Compress_Canterbury_WithState("plrabn12.txt", Optimal) | 822.698 | 8.306 | 809.662 | 845.065
- Compress_Canterbury_WithState("ptt5", Fastest) | 1.202 | 0.145 | 1.087 | 1.767
+ Compress_Canterbury_WithState("ptt5", Fastest) | 1.259 | 0.691 | 0.883 | 3.881
- Compress_Canterbury_WithState("ptt5", NoCompression) | 1.108 | 0.162 | 0.959 | 1.750
+ Compress_Canterbury_WithState("ptt5", NoCompression) | 0.767 | 0.054 | 0.730 | 1.126
- Compress_Canterbury_WithState("ptt5", Optimal) | 1354.923 | 8.380 | 1343.445 | 1368.249
+ Compress_Canterbury_WithState("ptt5", Optimal) | 1058.934 | 27.511 | 1041.584 | 1133.862
- Compress_Canterbury_WithState("sum", Fastest) | 0.227 | 0.048 | 0.202 | 0.526
+ Compress_Canterbury_WithState("sum", Fastest) | 0.179 | 0.028 | 0.166 | 0.312
- Compress_Canterbury_WithState("sum", NoCompression) | 0.182 | 0.025 | 0.170 | 0.371
+ Compress_Canterbury_WithState("sum", NoCompression) | 0.135 | 0.010 | 0.126 | 0.176
- Compress_Canterbury_WithState("sum", Optimal) | 62.217 | 3.787 | 58.357 | 86.804
+ Compress_Canterbury_WithState("sum", Optimal) | 55.420 | 1.424 | 52.720 | 61.378
- Compress_Canterbury_WithState("TestDocument.doc", Fastest) | 0.101 | 0.033 | 0.086 | 0.309
+ Compress_Canterbury_WithState("TestDocument.doc", Fastest) | 0.080 | 0.027 | 0.072 | 0.311
- Compress_Canterbury_WithState("TestDocument.doc", NoCompression) | 0.081 | 0.014 | 0.071 | 0.132
+ Compress_Canterbury_WithState("TestDocument.doc", NoCompression) | 0.059 | 0.003 | 0.057 | 0.074
- Compress_Canterbury_WithState("TestDocument.doc", Optimal) | 37.342 | 3.309 | 34.386 | 55.877
+ Compress_Canterbury_WithState("TestDocument.doc", Optimal) | 32.642 | 1.220 | 30.462 | 37.163
- Compress_Canterbury_WithState("TestDocument.docx", Fastest) | 0.074 | 0.015 | 0.068 | 0.181
+ Compress_Canterbury_WithState("TestDocument.docx", Fastest) | 0.062 | 0.003 | 0.060 | 0.076
- Compress_Canterbury_WithState("TestDocument.docx", NoCompression) | 0.075 | 0.005 | 0.072 | 0.110
+ Compress_Canterbury_WithState("TestDocument.docx", NoCompression) | 0.086 | 0.026 | 0.070 | 0.232
- Compress_Canterbury_WithState("TestDocument.docx", Optimal) | 36.927 | 1.076 | 34.656 | 39.801
+ Compress_Canterbury_WithState("TestDocument.docx", Optimal) | 35.890 | 1.166 | 33.864 | 39.971
- Compress_Canterbury_WithState("TestDocument.pdf", Fastest) | 0.416 | 0.100 | 0.381 | 1.154
+ Compress_Canterbury_WithState("TestDocument.pdf", Fastest) | 0.474 | 0.255 | 0.357 | 2.208
- Compress_Canterbury_WithState("TestDocument.pdf", NoCompression) | 0.337 | 0.052 | 0.299 | 0.601
+ Compress_Canterbury_WithState("TestDocument.pdf", NoCompression) | 0.317 | 0.026 | 0.299 | 0.471
- Compress_Canterbury_WithState("TestDocument.pdf", Optimal) | 567.716 | 7.671 | 556.144 | 592.784
+ Compress_Canterbury_WithState("TestDocument.pdf", Optimal) | 580.094 | 19.408 | 568.391 | 651.445
- Compress_Canterbury_WithState("TestDocument.txt", Fastest) | 0.029 | 0.037 | 0.018 | 0.347
+ Compress_Canterbury_WithState("TestDocument.txt", Fastest) | 0.021 | 0.010 | 0.016 | 0.073
- Compress_Canterbury_WithState("TestDocument.txt", NoCompression) | 0.034 | 0.009 | 0.028 | 0.079
+ Compress_Canterbury_WithState("TestDocument.txt", NoCompression) | 0.037 | 0.018 | 0.030 | 0.151
- Compress_Canterbury_WithState("TestDocument.txt", Optimal) | 3.936 | 0.562 | 3.296 | 5.983
+ Compress_Canterbury_WithState("TestDocument.txt", Optimal) | 4.032 | 0.834 | 3.150 | 6.454
- Compress_Canterbury_WithState("xargs.1", Fastest) | 0.034 | 0.004 | 0.030 | 0.049
+ Compress_Canterbury_WithState("xargs.1", Fastest) | 0.027 | 0.003 | 0.025 | 0.041
- Compress_Canterbury_WithState("xargs.1", NoCompression) | 0.034 | 0.008 | 0.027 | 0.074
+ Compress_Canterbury_WithState("xargs.1", NoCompression) | 0.027 | 0.003 | 0.025 | 0.045
- Compress_Canterbury_WithState("xargs.1", Optimal) | 7.487 | 0.589 | 6.542 | 9.754
+ Compress_Canterbury_WithState("xargs.1", Optimal) | 6.745 | 0.490 | 5.998 | 7.881
- Compress_Canterbury("alice29.txt", Fastest) | 1.235 | 0.137 | 1.130 | 1.944
+ Compress_Canterbury("alice29.txt", Fastest) | 1.039 | 0.148 | 0.927 | 1.620
- Compress_Canterbury("alice29.txt", NoCompression) | 1.018 | 0.129 | 0.901 | 1.752
+ Compress_Canterbury("alice29.txt", NoCompression) | 0.671 | 0.048 | 0.635 | 0.939
- Compress_Canterbury("alice29.txt", Optimal) | 270.635 | 11.905 | 258.136 | 333.321
+ Compress_Canterbury("alice29.txt", Optimal) | 245.969 | 7.757 | 231.059 | 265.503
- Compress_Canterbury("asyoulik.txt", Fastest) | 1.096 | 0.195 | 0.956 | 2.518
+ Compress_Canterbury("asyoulik.txt", Fastest) | 0.880 | 0.145 | 0.776 | 1.422
- Compress_Canterbury("asyoulik.txt", NoCompression) | 0.890 | 0.120 | 0.785 | 1.448
+ Compress_Canterbury("asyoulik.txt", NoCompression) | 0.597 | 0.034 | 0.579 | 0.795
- Compress_Canterbury("asyoulik.txt", Optimal) | 211.859 | 4.600 | 203.336 | 227.932
+ Compress_Canterbury("asyoulik.txt", Optimal) | 192.562 | 7.051 | 184.170 | 217.607
- Compress_Canterbury("cp.html", Fastest) | 0.183 | 0.058 | 0.165 | 0.628
+ Compress_Canterbury("cp.html", Fastest) | 0.168 | 0.073 | 0.117 | 0.827
- Compress_Canterbury("cp.html", NoCompression) | 0.168 | 0.054 | 0.144 | 0.618
+ Compress_Canterbury("cp.html", NoCompression) | 0.135 | 0.052 | 0.117 | 0.612
- Compress_Canterbury("cp.html", Optimal) | 36.090 | 1.396 | 33.851 | 42.423
+ Compress_Canterbury("cp.html", Optimal) | 32.115 | 2.783 | 30.041 | 51.049
- Compress_Canterbury("fields.c", Fastest) | 0.086 | 0.014 | 0.076 | 0.145
+ Compress_Canterbury("fields.c", Fastest) | 0.085 | 0.021 | 0.052 | 0.191
- Compress_Canterbury("fields.c", NoCompression) | 0.083 | 0.014 | 0.072 | 0.143
+ Compress_Canterbury("fields.c", NoCompression) | 0.059 | 0.008 | 0.051 | 0.086
- Compress_Canterbury("fields.c", Optimal) | 17.243 | 0.900 | 15.720 | 20.332
+ Compress_Canterbury("fields.c", Optimal) | 15.710 | 0.816 | 14.618 | 19.892
- Compress_Canterbury("grammar.lsp", Fastest) | 0.041 | 0.055 | 0.027 | 0.579
+ Compress_Canterbury("grammar.lsp", Fastest) | 0.040 | 0.068 | 0.023 | 0.683
- Compress_Canterbury("grammar.lsp", NoCompression) | 0.029 | 0.005 | 0.025 | 0.054
+ Compress_Canterbury("grammar.lsp", NoCompression) | 0.027 | 0.004 | 0.024 | 0.051
- Compress_Canterbury("grammar.lsp", Optimal) | 6.664 | 0.460 | 5.954 | 8.447
+ Compress_Canterbury("grammar.lsp", Optimal) | 6.377 | 0.652 | 5.811 | 10.402
- Compress_Canterbury("kennedy.xls", Fastest) | 4.218 | 0.923 | 3.652 | 12.475
+ Compress_Canterbury("kennedy.xls", Fastest) | 3.829 | 0.470 | 3.321 | 5.960
- Compress_Canterbury("kennedy.xls", NoCompression) | 3.036 | 0.362 | 2.639 | 4.199
+ Compress_Canterbury("kennedy.xls", NoCompression) | 2.530 | 0.303 | 2.325 | 4.520
- Compress_Canterbury("kennedy.xls", Optimal) | 2654.240 | 21.585 | 2636.523 | 2682.780
+ Compress_Canterbury("kennedy.xls", Optimal) | 2585.241 | 12.884 | 2574.880 | 2601.915
- Compress_Canterbury("lcet10.txt", Fastest) | 3.430 | 0.338 | 3.055 | 4.876
+ Compress_Canterbury("lcet10.txt", Fastest) | 2.571 | 0.148 | 2.435 | 3.371
- Compress_Canterbury("lcet10.txt", NoCompression) | 2.738 | 0.277 | 2.400 | 3.730
+ Compress_Canterbury("lcet10.txt", NoCompression) | 1.716 | 0.143 | 1.633 | 2.600
- Compress_Canterbury("lcet10.txt", Optimal) | 833.566 | 7.028 | 822.675 | 845.543
+ Compress_Canterbury("lcet10.txt", Optimal) | 758.690 | 14.802 | 730.686 | 789.723
- Compress_Canterbury("plrabn12.txt", Fastest) | 4.477 | 0.428 | 3.928 | 6.103
+ Compress_Canterbury("plrabn12.txt", Fastest) | 3.447 | 0.202 | 3.196 | 4.251
- Compress_Canterbury("plrabn12.txt", NoCompression) | 3.572 | 0.363 | 3.107 | 4.795
+ Compress_Canterbury("plrabn12.txt", NoCompression) | 2.322 | 0.163 | 2.194 | 3.235
- Compress_Canterbury("plrabn12.txt", Optimal) | 907.135 | 7.493 | 893.440 | 921.431
+ Compress_Canterbury("plrabn12.txt", Optimal) | 834.578 | 14.729 | 816.497 | 863.284
- Compress_Canterbury("ptt5", Fastest) | 1.266 | 0.202 | 1.093 | 2.178
+ Compress_Canterbury("ptt5", Fastest) | 1.041 | 0.137 | 0.943 | 1.690
- Compress_Canterbury("ptt5", NoCompression) | 1.096 | 0.101 | 1.021 | 1.446
+ Compress_Canterbury("ptt5", NoCompression) | 0.811 | 0.074 | 0.756 | 1.118
- Compress_Canterbury("ptt5", Optimal) | 1345.753 | 8.623 | 1334.310 | 1355.328
+ Compress_Canterbury("ptt5", Optimal) | 1048.621 | 17.723 | 1016.943 | 1075.439
- Compress_Canterbury("sum", Fastest) | 0.227 | 0.059 | 0.204 | 0.721
+ Compress_Canterbury("sum", Fastest) | 0.186 | 0.051 | 0.171 | 0.641
- Compress_Canterbury("sum", NoCompression) | 0.193 | 0.054 | 0.174 | 0.667
+ Compress_Canterbury("sum", NoCompression) | 0.152 | 0.069 | 0.139 | 0.812
- Compress_Canterbury("sum", Optimal) | 61.731 | 2.182 | 58.163 | 70.890
+ Compress_Canterbury("sum", Optimal) | 57.020 | 3.504 | 52.442 | 71.705
- Compress_Canterbury("TestDocument.doc", Fastest) | 0.098 | 0.013 | 0.091 | 0.195
+ Compress_Canterbury("TestDocument.doc", Fastest) | 0.113 | 0.025 | 0.078 | 0.164
- Compress_Canterbury("TestDocument.doc", NoCompression) | 0.079 | 0.006 | 0.074 | 0.104
+ Compress_Canterbury("TestDocument.doc", NoCompression) | 0.069 | 0.018 | 0.061 | 0.238
- Compress_Canterbury("TestDocument.doc", Optimal) | 36.733 | 1.231 | 34.786 | 40.521
+ Compress_Canterbury("TestDocument.doc", Optimal) | 32.599 | 2.655 | 30.229 | 44.355
- Compress_Canterbury("TestDocument.docx", Fastest) | 0.076 | 0.007 | 0.070 | 0.117
+ Compress_Canterbury("TestDocument.docx", Fastest) | 0.115 | 0.009 | 0.103 | 0.160
- Compress_Canterbury("TestDocument.docx", NoCompression) | 0.089 | 0.018 | 0.076 | 0.152
+ Compress_Canterbury("TestDocument.docx", NoCompression) | 0.082 | 0.009 | 0.076 | 0.127
- Compress_Canterbury("TestDocument.docx", Optimal) | 37.472 | 1.558 | 34.181 | 43.270
+ Compress_Canterbury("TestDocument.docx", Optimal) | 36.659 | 2.236 | 33.908 | 46.430
- Compress_Canterbury("TestDocument.pdf", Fastest) | 0.464 | 0.063 | 0.409 | 0.716
+ Compress_Canterbury("TestDocument.pdf", Fastest) | 0.495 | 0.082 | 0.415 | 0.843
- Compress_Canterbury("TestDocument.pdf", NoCompression) | 0.386 | 0.042 | 0.337 | 0.533
+ Compress_Canterbury("TestDocument.pdf", NoCompression) | 0.389 | 0.042 | 0.363 | 0.582
- Compress_Canterbury("TestDocument.pdf", Optimal) | 567.276 | 4.635 | 560.703 | 578.078
+ Compress_Canterbury("TestDocument.pdf", Optimal) | 575.646 | 13.921 | 557.289 | 608.856
- Compress_Canterbury("TestDocument.txt", Fastest) | 0.038 | 0.030 | 0.022 | 0.285
+ Compress_Canterbury("TestDocument.txt", Fastest) | 0.026 | 0.009 | 0.020 | 0.078
- Compress_Canterbury("TestDocument.txt", NoCompression) | 0.044 | 0.011 | 0.032 | 0.105
+ Compress_Canterbury("TestDocument.txt", NoCompression) | 0.041 | 0.009 | 0.034 | 0.073
- Compress_Canterbury("TestDocument.txt", Optimal) | 3.852 | 0.580 | 3.301 | 6.302
+ Compress_Canterbury("TestDocument.txt", Optimal) | 3.494 | 0.194 | 3.239 | 4.282
- Compress_Canterbury("xargs.1", Fastest) | 0.040 | 0.010 | 0.033 | 0.082
+ Compress_Canterbury("xargs.1", Fastest) | 0.032 | 0.005 | 0.028 | 0.052
- Compress_Canterbury("xargs.1", NoCompression) | 0.039 | 0.045 | 0.029 | 0.475
+ Compress_Canterbury("xargs.1", NoCompression) | 0.051 | 0.061 | 0.029 | 0.633
- Compress_Canterbury("xargs.1", Optimal) | 7.464 | 0.516 | 6.646 | 8.969
+ Compress_Canterbury("xargs.1", Optimal) | 6.800 | 0.383 | 6.155 | 8.174 |
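As a reading aid for the table above, the relative change between a `-` (v1.0.7 base) row and its matching `+` (this PR) row can be computed from the Average column. This is only a minimal sketch: the helper name `percent_reduction` is hypothetical and is not part of the test suite, and since the table does not state units, only the unit-independent ratio is used.

```c
#include <assert.h>

/* Hypothetical helper (not from the PR): percent reduction in average time
 * when going from the base build to the PR build.
 * Positive means the PR build was faster; negative means it was slower. */
static double percent_reduction(double base_avg, double pr_avg) {
    assert(base_avg > 0.0);
    return 100.0 * (base_avg - pr_avg) / base_avg;
}
```

For example, `percent_reduction(1.219, 0.905)` (alice29.txt, Fastest, WithoutState) comes out at roughly 25.8, while `percent_reduction(566.702, 578.972)` (TestDocument.pdf, Optimal, WithoutState) is about -2.2, a small slowdown that sits close to the reported standard deviation.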
|
@buyaa-n @ahsonkhan @ViktorHofer can you please take a look? This PR is 2+ weeks old and hasn't had a code review yet... |
|
Rebased on master post v1.0.7 merge |
|
From #35172 (comment)
Ideally, the change from this PR wouldn't be required explicitly and we could just update the brotli version, once available, and keep the sources in sync that way. Otherwise, we have to keep track of these source differences explicitly, which increases the maintenance burden since we effectively end up having a fork that has to be consolidated when the next update becomes available. Maybe it's the right trade-off here given the perf gain, but I am not sure. Do we have some scenario where this perf improvement shows up significantly to help motivate this change? @stephentoub, what are your thoughts on this? |
I would like for us to avoid having forked code long-term; ideally the only reason we have the code in the repo at all is because we need to build it to ensure it's deployed with .NET Core. My preference thus would be to not diverge at all, and just wait for an updated checkpoint from upstream that includes the intrinsics change. However, if there is a meaningful improvement here (e.g. an answer to @ahsonkhan's question "Do we have some scenario where this perf improvement shows up significantly to help motivate this change?"), I'd be ok accepting this PR for the very short term, but with the understanding that it's temporary and that we will simply overwrite the changes the next time we take a source code copy of Brotli, rather than trying to merge. If at that point the PR to upstream went through, great, we'll get the improvements, and if at that point it hadn't, well, there must be a reason for it and we'll simply sync up with what's upstream and lose this intrinsics change. |
The scenario would be any place anyone is using
I agree that maintaining this patch long-term should be a non-goal. If it gets clobbered by the next upstream sync, we'd be no worse off than we are now. But hopefully by that time, the matching PR will be merged, meaning we just get the benefits for longer if we take it now. |
|
cc: @joshfree @ahsonkhan, can you make a call on this and either review/merge it or close it? Thanks. |
|
Just noticed that @mjsabby mentioned Bing using the netcore Brotli support here https://devblogs.microsoft.com/dotnet/bing-com-runs-on-net-core-2-1/ If that's still true, that might be your compelling use case. |
|
@saucecontrol any idea why your upstream PR hasn't been picked up yet? Since everyone agrees that would be ideal.. |
|
I could only speculate, but some background might help. When I initially submitted the upstream PR over a year ago, the platform/compiler abstraction in the project (platform.h) was going through a redesign, and the structure of my initial changes didn't fit with what the project maintainers were planning. They have since solidified that structure, and I've updated my PR to match, which seems to meet with their approval. In the meantime, work on the project has stalled a bit, so it's probably just a matter of their getting back into it. That said, it looks to me like the next release on their side might not come any time soon, so it made sense to open a patch PR here instead of waiting. |
|
I'm in agreement here that we should not fork brotli from our upstream without a super compelling case. While the performance win is super awesome, I'm in favor of pushing on the original upstream PR and then picking up the latest version of brotli from there. |
Thanks, @joshfree. Given this, I'm going to close this PR. @saucecontrol, thank you very much for your efforts here. To whatever extent we can influence it, let's try to get your changes integrated upstream, and then we'll happily consume an updated version of the official source. |
|
This patch was finally merged in the upstream repo today. I'd still like to get
If you look at the post-v1.0.7 commits in the Brotli repo, you can see references to "Shared-Brotli". This is effectively Brotli 2.0, with a new RFC and a new IANA designation (
Now that these specific changes have been merged upstream, would a revival of this PR in the runtime repo be accepted, or would a sync to current master in the upstream repo be acceptable, or do we need to wait for the next official release tag? |
@ericstj, opinion? |
|
If they will release by 5.0, then I can see us taking a change now, followed by another when they release before 5.0. If there isn't a planned release, I'd feel reluctant about us taking a random snapshot of their codebase, since we don't have a read on the quality and it would mean we'd be applying a higher stability promise than the code owners. Is there a way we can get a 1.0.8 release? |
|
Put another way, we could find ourselves in a position where we have to roll back the preview we ingested because they didn't release a stable version by (say) Aug. |
|
I think it's unlikely we'll see a new Brotli release before the cut for 5.0. If there is a release, it's even more unlikely it would be a minor 1.0.8 version, based on the fact that some of the Shared Brotli code has been merged into master already.
Because current master includes changes for the next major version, I would agree. The targeted change in this PR was built on v1.0.7, so it would be safe to apply in isolation, and the risk that it would be clobbered by a future upstream sync has been eliminated now that it's merged there. If the ruling is that we'd only sync on a release milestone upstream, that does answer my question, though. Just wanted to clarify that since there was discussion about diverging from the Google codebase at one point, which has been partially addressed. |
There are 2 places in the Brotli encoder where GCC intrinsics for TZCNT and BSR are used. I have added the MSVC equivalents so we don't incur a performance penalty compared to GCC (and compatible) builds. I submitted a matching PR over in the Google repo, which looks like it will be picked up.
I noticed a slight performance regression between 2.1 and the 3.0 previews, which had been updated to Brotli v1.0.5. This change makes up that perf difference and more.
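To illustrate the kind of change described above, here is a minimal sketch of pairing GCC builtins with their MSVC intrinsic equivalents. It is not the actual patch: the function names `sketch_tzcnt64` and `sketch_bsr32` are hypothetical, the choice of operand widths is an assumption, and the plain-C fallback loops are added only so the example compiles anywhere. `_BitScanForward64`, `_BitScanReverse`, `__builtin_ctzll` and `__builtin_clz` are the real MSVC and GCC names.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>  /* _BitScanForward64, _BitScanReverse */
#endif

/* Hypothetical name: count trailing zero bits of a non-zero 64-bit value. */
static inline uint32_t sketch_tzcnt64(uint64_t x) {
    assert(x != 0);  /* result is undefined for zero on every path below */
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_ARM64))
    unsigned long index;
    _BitScanForward64(&index, x);
    return (uint32_t)index;
#elif defined(__GNUC__) || defined(__clang__)
    return (uint32_t)__builtin_ctzll(x);
#else
    /* Portable fallback: shift until the lowest set bit reaches bit 0. */
    uint32_t n = 0;
    while ((x & 1u) == 0) { x >>= 1; ++n; }
    return n;
#endif
}

/* Hypothetical name: index of the highest set bit of a non-zero 32-bit
 * value, i.e. floor(log2(x)). */
static inline uint32_t sketch_bsr32(uint32_t x) {
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanReverse(&index, x);
    return (uint32_t)index;
#elif defined(__GNUC__) || defined(__clang__)
    return 31u ^ (uint32_t)__builtin_clz(x);
#else
    /* Portable fallback: count how many times x can be halved. */
    uint32_t n = 0;
    while (x >>= 1) ++n;
    return n;
#endif
}
```

Without an MSVC branch like the ones above, an MSVC build would take the slower loop-style fallback, which is the performance penalty relative to GCC-compatible builds that the description refers to.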