{"id":3542,"date":"2018-05-09T00:39:00","date_gmt":"2018-05-09T00:39:00","guid":{"rendered":"https:\/\/webtide.com\/?p=3542"},"modified":"2018-05-09T00:39:00","modified_gmt":"2018-05-09T00:39:00","slug":"fast-multipart-formdata","status":"publish","type":"post","link":"https:\/\/webtide.com\/fast-multipart-formdata\/","title":{"rendered":"Fast MultiPart FormData"},"content":{"rendered":"<p>Jetty&#8217;s venerable <a href=\"https:\/\/github.com\/eclipse\/jetty.project\/blob\/jetty-9.4.10.v20180503\/jetty-util\/src\/main\/java\/org\/eclipse\/jetty\/util\/MultiPartInputStreamParser.java\">MultiPartInputStreamParser<\/a> for parsing MultiPart form-data has been deprecated and replaced by the much more efficient <a href=\"https:\/\/github.com\/eclipse\/jetty.project\/blob\/jetty-9.4.10.v20180503\/jetty-http\/src\/main\/java\/org\/eclipse\/jetty\/http\/MultiPartFormInputStream.java\">MultiPartFormInputStream<\/a>, based on a new <a href=\"https:\/\/github.com\/eclipse\/jetty.project\/blob\/jetty-9.4.10.v20180503\/jetty-http\/src\/main\/java\/org\/eclipse\/jetty\/http\/MultiPartParser.java\">MultiPartParser<\/a>. The new parser is much faster, but less forgiving of non-compliant input, so we have kept a legacy mode that still uses the old parser, enhanced so that compliance violations can be logged.<\/p>\n<h2><strong>Benchmarks<\/strong><\/h2>\n<p>We have achieved a roughly sevenfold speed-up in the parsing of large uploaded content, and even small content is parsed about twice as fast.<br \/>\nWe performed a JMH benchmark of the (new) HTTP MultiPartFormInputStream vs the (old) UTIL MultiPartInputStreamParser. 
Our tests were:<\/p>\n<ul>\n<li><em>testLargeGenerated:<\/em>\u00a0 parses a 10MB file of random binary data<\/li>\n<li><em>testParser:<\/em>\u00a0 parses a series of small multipart forms captured by a browser<\/li>\n<\/ul>\n<p>Our results clearly show that the new multipart processing is superior in terms of speed to the old processing:<\/p>\n<pre class=\"lang:default decode:true \"># Run complete. Total time: 00:02:09\nBenchmark                              (parserType)  Mode  Cnt  Score   Error  Units\nMultiPartBenchmark.testLargeGenerated          UTIL  avgt   10  0.252 \u00b1 0.025   s\/op\nMultiPartBenchmark.testLargeGenerated          HTTP  avgt   10  0.035 \u00b1 0.004   s\/op\nMultiPartBenchmark.testParser                  UTIL  avgt   10  0.028 \u00b1 0.005   s\/op\nMultiPartBenchmark.testParser                  HTTP  avgt   10  0.015 \u00b1 0.006   s\/op\n<\/pre>\n<h2><strong>How To Use<\/strong><\/h2>\n<p>By default in Jetty 9.4, the old MultiPartInputStreamParser will be used. 
The default will be switched to the new MultiPartFormInputStream in jetty-10.\u00a0 To use the new parser (available since release <a href=\"https:\/\/github.com\/eclipse\/jetty.project\/tree\/jetty-9.4.10.v20180503\"><strong>9.4.10<\/strong><\/a>), change the <a href=\"#complianceModes\">compliance mode<\/a> in the server.ini file from LEGACY to RFC7578 by uncommenting the property shown below and setting its value to RFC7578.<\/p>\n<pre class=\"lang:default decode:true\">## multipart\/form-data compliance mode of: LEGACY(slow), RFC7578(fast)\n# jetty.httpConfig.multiPartFormDataCompliance=LEGACY<\/pre>\n<p>The compliance mode can also be set programmatically on the HttpConfiguration instance, which can be obtained from the HttpConnectionFactory of the connector.<\/p>\n<pre class=\"lang:default decode:true\">connector.getConnectionFactory(HttpConnectionFactory.class).getHttpConfiguration()\n.setMultiPartFormDataCompliance(MultiPartFormDataCompliance.RFC7578);\n<\/pre>\n<h2 id=\"complianceModes\"><strong>Compliance Modes<\/strong><\/h2>\n<p>There are now two compliance modes for MultiPart form parsing:<\/p>\n<ul>\n<li><strong>LEGACY<\/strong> mode uses the old MultiPartInputStreamParser in jetty-util. It is slower, but more forgiving of formats that do not comply with RFC7578.<\/li>\n<li><strong>RFC7578<\/strong> mode uses the new MultiPartFormInputStream in jetty-http. It is faster than LEGACY mode, but it may reject badly formatted MultiPart forms that were previously accepted.<\/li>\n<\/ul>\n<p>The default compliance mode is currently LEGACY; however, this will be changed to RFC7578 in a future release.<\/p>\n<h2><strong>Legacy Mode Compliance Warnings<\/strong><\/h2>\n<p>When the old MultiPartInputStreamParser accepts a format that does not comply with the RFC, a violation is recorded as an attribute in the request. 
These violations include:<\/p>\n<ul>\n<li><span class=\"s2\">CR_LINE_TERMINATION: <\/span>\n<ul>\n<li><span class=\"s2\"><a href=\"https:\/\/tools.ietf.org\/html\/rfc2046#section-4.1.1\">Carriage return used as line break instead of CRLF<br \/>\n<\/a><\/span><\/li>\n<\/ul>\n<\/li>\n<li><span class=\"s2\">LF_LINE_TERMINATION: <\/span>\n<ul>\n<li><span class=\"s2\"><a href=\"https:\/\/tools.ietf.org\/html\/rfc2046#section-4.1.1\">Line feed used as line break instead of CRLF<br \/>\n<\/a><\/span><\/li>\n<\/ul>\n<\/li>\n<li><span class=\"s2\">NO_CRLF_AFTER_PREAMBLE:\u00a0 <\/span>\n<ul>\n<li><span class=\"s2\"><a href=\"https:\/\/tools.ietf.org\/html\/rfc2046#section-5.1.1\">CRLF did not appear after preamble before initial boundary<br \/>\n<\/a><\/span><\/li>\n<\/ul>\n<\/li>\n<li><span class=\"s2\">BASE64_TRANSFER_ENCODING: <\/span>\n<ul>\n<li><span class=\"s2\"><a href=\"https:\/\/tools.ietf.org\/html\/rfc7578#section-4.7\">Content transfer encoding has been deprecated<br \/>\n<\/a><\/span><\/li>\n<\/ul>\n<\/li>\n<li><span class=\"s2\">QUOTED_PRINTABLE_TRANSFER_ENCODING: <\/span>\n<ul>\n<li><span class=\"s2\"><a href=\"https:\/\/tools.ietf.org\/html\/rfc7578#section-4.7\">Content transfer encoding has been deprecated<\/a><\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>The list of violations as Strings can be obtained from the request by accessing the attribute\u00a0 <strong>HttpCompliance.VIOLATIONS_ATTR<\/strong>.<\/p>\n<pre class=\"lang:java decode:true\">(List&lt;String&gt;)request.getAttribute(HttpCompliance.VIOLATIONS_ATTR);<\/pre>\n<p>Each violation string gives the name of the violation followed by a link to the RFC describing that particular violation.<br \/>\nHere&#8217;s an example:<br \/>\nCR_LINE_TERMINATION: https:\/\/tools.ietf.org\/html\/rfc2046#section-4.1.1<br \/>\nNO_CRLF_AFTER_PREAMBLE: https:\/\/tools.ietf.org\/html\/rfc2046#section-5.1.1<\/p>\n<h2><strong>The Future<\/strong><\/h2>\n<p>The parser is async capable, so expect further innovations with 
non-blocking uploads and possibly <a href=\"http:\/\/www.reactive-streams.org\/\">reactive<\/a> parts.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Jetty&#8217;s venerable MultiPartInputStreamParser for parsing MultiPart form-data has been deprecated and replaced by the much more efficient MultiPartFormInputStream, based on a new MultiPartParser. This is much faster, but less forgiving of non-compliant format. So we have implemented a legacy mode to access the old parser, but with enhancements to make [&hellip;]<\/p>\n","protected":false},"author":11,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[42,1],"tags":[],"class_list":["post-3542","post","type-post","status-publish","format-standard","hentry","category-performance","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/posts\/3542","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/comments?post=3542"}],"version-history":[{"count":0,"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/posts\/3542\/revisions"}],"wp:attachment":[{"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/media?parent=3542"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/categories?post=3542"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/webtide.com\/wp-json\/wp\/v2\/tags?post=3542"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}