
parser: accelerate DecodingLayerParser #660

Merged

notti merged 2 commits into google:master from yerden:decoding-layer-container on Jun 20, 2019

Conversation

@yerden
Contributor

@yerden yerden commented Jun 2, 2019

Hello, everyone,

This is a followup to #657.

I've investigated various ways to gain some extra performance for DecodingLayerParser. In doing so, I was pursuing these goals:

  • avoid breaking API;
  • avoid performance degradation in existing API;
  • provide maximum flexibility to suit end-user's needs.

So I ended up adding DecodingLayerContainer, which stores DecodingLayer values and allows them to be looked up by LayerType. Several implementations of DecodingLayerContainer are also provided: DecodingLayerMap, DecodingLayerSparse and DecodingLayerArray.

DecodingLayerParser is refactored to use DecodingLayerContainer. DecodingLayerMap is used by default in the current implementation, as it is the most versatile option. Additionally, the new decoding logic omits the redundant lookup of the first LayerType value. On the other hand, the additional level of indirection may impose some overhead.

Of course, users may write their own DecodingLayerContainer implementation if necessary. Those in need may refer to the LayersDecoder function implementation in the package.

Here are benchmark results compared with the master branch (roughly averaged values):

benchmark                                     old ns/op     new ns/op     delta
BenchmarkDecodingLayerParserIgnorePanic-4     133           130           -2.26%
BenchmarkDecodingLayerParserHandlePanic-4     188           181           -3.72%

As you can see, the performance is nearly the same (or even slightly better) with the default DecodingLayerMap container.

Some additional benchmarks were added to test the new containers:

BenchmarkDecodingLayerParserIgnorePanic-4               10000000               130 ns/op
BenchmarkDecodingLayerParserHandlePanic-4               10000000               179 ns/op

BenchmarkDecodingLayerParserSparseIgnorePanic-4         20000000               113 ns/op (-13.1%)
BenchmarkDecodingLayerParserSparseHandlePanic-4         10000000               163 ns/op (-8.9%)

BenchmarkDecodingLayerParserArrayIgnorePanic-4          10000000               122 ns/op (-6.2%)
BenchmarkDecodingLayerParserArrayHandlePanic-4          10000000               170 ns/op (-5.0%)

BenchmarkDecodingLayerParserMapIgnorePanic-4            10000000               130 ns/op (~)
BenchmarkDecodingLayerParserMapHandlePanic-4            10000000               178 ns/op (~)

As you can see, DecodingLayerSparse container is the fastest with standard LayerType values.

The following three benchmarks show how DecodingLayerContainer may be used as a standalone layer-decoding tool. As you can see, DecodingLayerParser with an identical container imposes some additional overhead, though it adds some abstraction.

BenchmarkDecodingLayerArray-4                           20000000               118 ns/op (-3.3%)
BenchmarkDecodingLayerMap-4                             10000000               124 ns/op (-4.6%)
BenchmarkDecodingLayerSparse-4                          20000000               106 ns/op (-6.2%)

Tests were run on a laptop (Core i5-3320M), Go 1.12.5. Please cross-check the results, since I don't have access to another lightly loaded physical machine.

One caveat: the LayersDecoder method in all the presented DecodingLayerContainer implementations is a carbon copy of the package's LayersDecoder function, which leads to some code redundancy. This could be avoided simply by making an indirect call to LayersDecoder(), but in that case the performance gains are much less convincing. I believe this is due to inlining done by the compiler.

@yerden
Contributor Author

yerden commented Jun 3, 2019

Sorry, force-pushed some changes, quite a few fixups came up.

@notti
Contributor

notti commented Jun 10, 2019

Test results on my laptop (Core i5-5300U)

name                                    old time/op  new time/op  delta
DecodingLayerParserIgnorePanic-4         138ns ± 0%   136ns ± 0%   -1.81%  (p=0.000 n=8+10)
DecodingLayerParserHandlePanic-4         192ns ± 0%   187ns ± 1%   -2.71%  (p=0.000 n=9+9)
DecodingLayerParserSparseIgnorePanic-4   138ns ± 0%   125ns ± 0%   -9.42%  (p=0.000 n=8+8)
DecodingLayerParserSparseHandlePanic-4   192ns ± 0%   175ns ± 1%   -9.06%  (p=0.000 n=9+9)
DecodingLayerParserArrayIgnorePanic-4    138ns ± 0%   131ns ± 0%   -5.07%  (p=0.000 n=8+7)
DecodingLayerParserArrayHandlePanic-4    192ns ± 0%   181ns ± 2%   -6.00%  (p=0.000 n=9+10)
DecodingLayerParserMapIgnorePanic-4      138ns ± 0%   137ns ± 1%   -0.80%  (p=0.002 n=8+10)
DecodingLayerParserMapHandlePanic-4      192ns ± 0%   190ns ± 1%   -1.43%  (p=0.000 n=9+10)
DecodingLayerArray-4                     138ns ± 0%   128ns ± 0%   -7.25%  (p=0.002 n=8+10)
DecodingLayerMap-4                       138ns ± 0%   133ns ± 2%   -3.62%  (p=0.000 n=8+10)
DecodingLayerSparse-4                    138ns ± 0%   114ns ± 0%  -17.03%  (p=0.000 n=8+10)

(all compared to the old implementation of DLP; last 3 compared to old implementation of DecodingLayerParserIgnorePanic)

@notti
Contributor

notti commented Jun 10, 2019

Hmm, right now this looks like a lot of code duplication: LayersDecoder looks the same in every container. Can this somehow be reduced to one implementation? Otherwise, this looks good to me.

@yerden
Contributor Author

yerden commented Jun 11, 2019

@notti thanks for the feedback. Unfortunately, as I stated earlier, an attempt to keep a single LayersDecoder function and use it by reference in every DecodingLayerContainer implementation makes the performance gain much less palpable in the new containers, and in the case of DecodingLayerMap it even leads to some performance loss. The reason, I believe, is the compiler's inability to inline the Decoder() method inside an indirect LayersDecoder() call. I'll take a look at what can be done here.

@yerden
Contributor Author

yerden commented Jun 11, 2019

@notti I've managed to reduce code duplication a little in 627aaa6 with no performance degradation.

@notti
Contributor

notti commented Jun 17, 2019

Hmm, there are still five copies of the same function. I think there is no other way without losing inlining/other optimizations. So, as a last resort: how about generating the code, like gen2.go does in layers?

@yerden
Contributor Author

yerden commented Jun 18, 2019

@notti thanks for the suggestion, please see the squashed revision.

Contributor

@notti notti left a comment


Sorry for this last nitpick. With this last thing I'd say it is ready to merge. Thanks for the patience so far.

yerden added 2 commits June 20, 2019 10:03
Changes:
* add DecodingLayerContainer to contain/search DecodingLayer values:
  * add DecodingLayerFunc type as a decoder closure;
  * add DecodingLayerMap, DecodingLayerSparse, DecodingLayerArray
    as implementations of DecodingLayerContainer;
* add code generation for the reference LayersDecoder method
  implementation.
* refactor DecodingLayerParser:
  * use DecodingLayerMap by default;
  * add SetDecodingLayerContainer method to initialize parser.

Signed-off-by: Yerden Zhumabekov <[email protected]>
@yerden yerden closed this Jun 20, 2019
@yerden yerden reopened this Jun 20, 2019
@yerden
Contributor Author

yerden commented Jun 20, 2019

Windows tests were failing; I closed and reopened the PR to trigger a new build, and now they pass. A mystery.

@notti notti merged commit 836b571 into google:master Jun 20, 2019
@notti
Contributor

notti commented Jun 20, 2019

> Windows tests were failing, I closed and reopened PR to trigger new build and they are fine now. A mystery.

Yeah somehow travis ci is not completely reliable.

Thanks for the contribution and patience with my requests.

@yerden
Contributor Author

yerden commented Jun 20, 2019

> Thanks for the contribution and patience with my requests.

Not at all, thanks to you I learned some new tricks :)

@yerden yerden deleted the decoding-layer-container branch June 20, 2019 15:45
tklauser added a commit to tklauser/gopacket that referenced this pull request Dec 16, 2020
Before v1.1.18, passing an empty data slice to
(*DecodingLayerParser).DecodeLayers would return a nil error. Commit
04f6565 (PR google#660) changed this behavior by passing the empty data
to the layers, so that an error is returned from the first layer that
considers an empty slice to be invalid.

Strictly speaking this is an API breakage, so restore the previous
behavior by returning nil straight away on empty data.

Fixes google#846

Signed-off-by: Tobias Klauser <[email protected]>