Skip to content

<regex>: Implement small vector optimization for state frames#6147

Merged
StephanTLavavej merged 11 commits into
microsoft:mainfrom
muellerj2:regex-small-vector-optimization-for-state-frames
Mar 10, 2026
Merged

<regex>: Implement small vector optimization for state frames#6147
StephanTLavavej merged 11 commits into
microsoft:mainfrom
muellerj2:regex-small-vector-optimization-for-state-frames

Conversation

@muellerj2
Copy link
Copy Markdown

Resolves #5969.

This adds a minimal implementation of a vector variant that supports small vector optimization. As its initial storage area, it uses what is left of the stack-allocated buffer after all the other buffers used by the matcher have been initialized. If more memory is necessary, it reallocates like std::vector at the same rate of growth (factor 1.5).

Drive-by change:

  • Handle fancy pointers more cleanly in the buffers (although there can't be any fancy pointers yet because we haven't wired up the allocators).
  • Unify the allocation guards for vector, _Rx_fixed_size_buffer and _Rx_small_vector.

Benchmarks

regex_search
benchmark before [ns] after [ns] speedup
bm_lorem_search/"^bibe"/2 54.4085 56.25 0.97
bm_lorem_search/"^bibe"/3 58.5938 54.4085 1.08
bm_lorem_search/"^bibe"/4 55.8036 53.0134 1.05
bm_lorem_search/"bibe"/2 3138.95 2982.01 1.05
bm_lorem_search/"bibe"/3 6277.9 5580.36 1.12
bm_lorem_search/"bibe"/4 11300.2 11718.8 0.96
bm_lorem_search/"bibe".collate/2 2999.44 2982.01 1.01
bm_lorem_search/"bibe".collate/3 6277.9 5625 1.12
bm_lorem_search/"bibe".collate/4 11997.8 10986.3 1.09
bm_lorem_search/"(bibe)"/2 5440.85 3431.91 1.59
bm_lorem_search/"(bibe)"/3 10602.7 7149.83 1.48
bm_lorem_search/"(bibe)"/4 21484.4 13811.3 1.56
bm_lorem_search/"(bibe)+"/2 9416.81 4464.29 2.11
bm_lorem_search/"(bibe)+"/3 18415.3 9416.85 1.96
bm_lorem_search/"(bibe)+"/4 36900.6 16741.2 2.20
bm_lorem_search/"(?:bibe)+"/2 5998.88 3955.08 1.52
bm_lorem_search/"(?:bibe)+"/3 11718.8 8021.76 1.46
bm_lorem_search/"(?:bibe)+"/4 22495.6 15729.7 1.43
bm_lorem_search/R"(\bbibe)"/2 83701.6 83705.4 1.00
bm_lorem_search/R"(\bbibe)"/3 168795 163923 1.03
bm_lorem_search/R"(\bbibe)"/4 337672 327846 1.03
bm_lorem_search/R"(\Bibe)"/2 188354 187976 1.00
bm_lorem_search/R"(\Bibe)"/3 395570 368968 1.07
bm_lorem_search/R"(\Bibe)"/4 749860 739397 1.01
bm_lorem_search/R"((?=....)bibe)"/2 5156.25 4171.31 1.24
bm_lorem_search/R"((?=....)bibe)"/3 9765.62 8196.15 1.19
bm_lorem_search/R"((?=....)bibe)"/4 19043 14997.2 1.27
bm_lorem_search/R"((?=bibe)....)"/2 4708.44 3766.73 1.25
bm_lorem_search/R"((?=bibe)....)"/3 9207.55 7114.96 1.29
bm_lorem_search/R"((?=bibe)....)"/4 17996.8 14299.7 1.26
bm_lorem_search/R"((?!lorem)bibe)"/2 4289.91 3609.79 1.19
bm_lorem_search/R"((?!lorem)bibe)"/3 8579.76 7533.48 1.14
bm_lorem_search/R"((?!lorem)bibe)"/4 17578.3 13497.4 1.30
regex_match
benchmark before [ns] after [ns] speedup
bm_match_sequence_of_as/"a*"/100 1349.75 1143.97 1.18
bm_match_sequence_of_as/"a*"/200 2563.47 2539.06 1.01
bm_match_sequence_of_as/"a*"/400 4541.02 4080.63 1.11
bm_match_sequence_of_as/"a*?"/100 2490.24 2511.16 0.99
bm_match_sequence_of_as/"a*?"/200 4882.81 4757.26 1.03
bm_match_sequence_of_as/"a*?"/400 9626.07 9207.55 1.05
bm_match_sequence_of_as/"(?:a)*"/100 2008.93 1649.69 1.22
bm_match_sequence_of_as/"(?:a)*"/200 3606.31 3002.93 1.20
bm_match_sequence_of_as/"(?:a)*"/400 6975.45 6138.39 1.14
bm_match_sequence_of_as/"(a)*"/100 2786.7 2301.89 1.21
bm_match_sequence_of_as/"(a)*"/200 5000 4394.53 1.14
bm_match_sequence_of_as/"(a)*"/400 9765.62 8719.31 1.12
bm_match_sequence_of_as/"(a)*?"/100 3690 3138.95 1.18
bm_match_sequence_of_as/"(a)*?"/200 6975.45 6277.9 1.11
bm_match_sequence_of_as/"(a)*?"/400 13113.8 11962.9 1.10
bm_match_sequence_of_as/"(?:b|a)*"/100 5081.62 3923.69 1.30
bm_match_sequence_of_as/"(?:b|a)*"/200 9068.08 8370.54 1.08
bm_match_sequence_of_as/"(?:b|a)*"/400 16043.5 15694.8 1.02
bm_match_sequence_of_as/"(b|a)*"/100 6556.92 7847.38 0.84
bm_match_sequence_of_as/"(b|a)*"/200 18833.9 17159.8 1.10
bm_match_sequence_of_as/"(b|a)*"/400 51618.3 35296.2 1.46
bm_match_sequence_of_as/"(a)(?:b|a)*"/100 4813.07 4098.07 1.17
bm_match_sequence_of_as/"(a)(?:b|a)*"/200 8719.31 8370.54 1.04
bm_match_sequence_of_as/"(a)(?:b|a)*"/400 16741.1 16043.5 1.04
bm_match_sequence_of_as/"(a)(b|a)*"/100 6452.29 8196.15 0.79
bm_match_sequence_of_as/"(a)(b|a)*"/200 15066.9 17264.3 0.87
bm_match_sequence_of_as/"(a)(b|a)*"/400 53125 35992.7 1.48
bm_match_sequence_of_as/"(a)(?:b|a)*c"/100 5580.36 5000 1.12
bm_match_sequence_of_as/"(a)(?:b|a)*c"/200 10498 9416.81 1.11
bm_match_sequence_of_as/"(a)(?:b|a)*c"/400 19949.5 18415.3 1.08
bm_match_sequence_of_9a1b/"(?:ab)"/100 16741.1 14648.4 1.14
bm_match_sequence_of_9a1b/"(?:ab)"/200 33691.9 28599.3 1.18
bm_match_sequence_of_9a1b/"(?:ab)"/400 75334.8 64062.5 1.18

@muellerj2 muellerj2 requested a review from a team as a code owner March 6, 2026 21:45
@github-project-automation github-project-automation Bot moved this to Initial Review in STL Code Reviews Mar 6, 2026
@StephanTLavavej StephanTLavavej added performance Must go faster regex meow is a substring of homeowner labels Mar 6, 2026
@StephanTLavavej StephanTLavavej self-assigned this Mar 6, 2026
Comment thread stl/inc/regex
Comment thread stl/inc/regex
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
Comment thread stl/inc/regex Outdated
@StephanTLavavej StephanTLavavej removed their assignment Mar 9, 2026
@StephanTLavavej StephanTLavavej moved this from Initial Review to Merging in STL Code Reviews Mar 9, 2026
@StephanTLavavej
Copy link
Copy Markdown
Member

I'm mirroring this to the MSVC-internal repo. Please notify me if any further changes are pushed, otherwise no action is required.

@StephanTLavavej StephanTLavavej merged commit 26b7ca5 into microsoft:main Mar 10, 2026
49 checks passed
@github-project-automation github-project-automation Bot moved this from Merging to Done in STL Code Reviews Mar 10, 2026
@StephanTLavavej
Copy link
Copy Markdown
Member

Thanks for this incredible series of PRs for 14.51! 😻 📈 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Must go faster regex meow is a substring of homeowner

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

<regex>: Avoid heap allocations for few capturing groups and stack frames

4 participants