-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Closed
Labels
area-System.Text.RegularExpressionstenet-performancePerformance related issuePerformance related issue
Milestone
Description
Given an alternation like "abc|def|ghi", we will currently do an IndexOfAny to find the next position for a possible match. At that point, we try to avoid false positives by doing a few set lookups, e.g. we IndexOfAny('a', 'd', 'g'), and then we test whether the next character is in the set [beh]. But specifically for expressions that are small alternations, or that begin with small alternations, or that begin with constructs we can convert to small alternations, we could instead use Aho-Corasick to actually validate whether any of the strings in the alternation are a prefix of the text at the current location. Rust uses a vectorized implementation named Teddy.
Metadata
Metadata
Assignees
Labels
area-System.Text.RegularExpressionstenet-performancePerformance related issuePerformance related issue