Description
MultipartReaderStream currently uses a Boyer-Moore substring search implementation as part of finding the next boundary:
aspnetcore/src/Http/WebUtilities/src/MultipartReaderStream.cs, lines 279 to 294 in 53aad98:

    var matchBytesLengthMinusOne = matchBytes.Length - 1;
    var matchBytesLastByte = matchBytes[matchBytesLengthMinusOne];
    var segmentEndMinusMatchBytesLength = segment1.Offset + segment1.Count - matchBytes.Length;
    matchOffset = segment1.Offset;
    while (matchOffset < segmentEndMinusMatchBytesLength)
    {
        var lookaheadTailChar = segment1.Array![matchOffset + matchBytesLengthMinusOne];
        if (lookaheadTailChar == matchBytesLastByte &&
            CompareBuffers(segment1.Array, matchOffset, matchBytes, 0, matchBytesLengthMinusOne) == 0)
        {
            matchCount = matchBytes.Length;
            return true;
        }
        matchOffset += _boundary.GetSkipValue(lookaheadTailChar);
    }

We should check whether this is still faster than simply using the vectorized `IndexOf` in `MemoryExtensions`:
https://github.com/dotnet/runtime/blob/7b91fd42a64732681472afc8d5a52c5bc5eb0c8a/src/libraries/System.Private.CoreLib/src/System/MemoryExtensions.cs#L1694
Regex also used Boyer-Moore, but in .NET 7 it removed Boyer-Moore entirely and switched to IndexOf, for significant gains in most cases. See https://devblogs.microsoft.com/dotnet/regular-expression-improvements-in-dotnet-7/#leading-vectorization for details.
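
For comparison, here is a minimal sketch of what the full-match search could look like on top of the span-based `IndexOf`. The class and method names (`BoundarySearchSketch`, `TryFindFullMatch`) are hypothetical and not part of `MultipartReaderStream`. The sketch only covers finding a complete occurrence of the boundary bytes; handling of a boundary that is split across the end of the segment is out of scope here.

```csharp
using System;

public static class BoundarySearchSketch
{
    // Hypothetical helper, not the aspnetcore implementation.
    // Looks for the first complete occurrence of matchBytes inside segment.
    // On success, matchOffset is an index into segment.Array (not into the
    // segment itself), mirroring how matchOffset is used in the loop above.
    public static bool TryFindFullMatch(
        ArraySegment<byte> segment,
        ReadOnlySpan<byte> matchBytes,
        out int matchOffset)
    {
        // MemoryExtensions.IndexOf returns an index relative to the start
        // of the span, or -1 when the value is not found.
        int index = segment.AsSpan().IndexOf(matchBytes);
        if (index < 0)
        {
            matchOffset = 0;
            return false;
        }

        // Translate the span-relative index back to an array offset.
        matchOffset = segment.Offset + index;
        return true;
    }
}
```

Whether this actually beats the skip-table loop for typical boundary lengths and body contents is exactly what would need to be measured, for example with a benchmark over representative multipart payloads.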