Combine the suffixes for shorter regular expression. #1651
Merged
Problem
Per #1647, the glossary regular expression fails to execute when the glossary contains a large number of items.
This is because regular expressions have a limit on their overall length.
Solution
This solution is not ideal; ideally this should not be using a regular expression at all.
This solution simply combines the regex such that the suffix matches are only included once, rather than individually on every single term.
For Translate.WordPress.org glossaries, it reduces the regex length...
Roughly a 50-60% decrease in length.
Initially there were no logical changes I could see, but now that I'm PR'ing it, I can see that there's a "small" change - the terms are no longer sorted by length, as they're only sorted within their "suffix group".
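To illustrate why the order of terms inside an alternation can matter, here is a minimal Python sketch. The term list and sample text are hypothetical and chosen only to show the effect; the PR does not say which glossary terms, if any, are affected in practice, and the real implementation is not in Python.

```python
import re

# Hypothetical glossary terms where one term is a prefix of another.
terms = ["zip", "zip code", "url"]


def alternation(ordered_terms):
    """Build a word-bounded alternation from terms in the given order."""
    body = "|".join(re.escape(t) for t in ordered_terms)
    return re.compile(r"\b(" + body + r")\b")


text = "Enter your zip code"

# Terms in their listed order: the shorter alternative is tried first.
as_listed = alternation(terms)
# Terms sorted longest first: the longer alternative is tried first.
longest_first = alternation(sorted(terms, key=len, reverse=True))

first_as_listed = as_listed.search(text).group(1)
first_longest = longest_first.search(text).group(1)

print(first_as_listed)  # zip
print(first_longest)    # zip code
```

A regex engine takes the first alternative that lets the whole pattern match, so sorting by length only within each suffix group could change which term wins when terms from different groups overlap.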
Before:
\b(favorited(?:s|es|ed|ing)?|favorites(?:s|es|ed|ing)?|zip code|url(?:s|es|ed|ing)?)\b
After:
\b((?:favorited|favorites|url)(?:s|es|ed|ing)?)|(?:zip code))\b
(Yes, those suffixes are wildly inaccurate, but that's not the purpose of this issue/PR.)
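The grouping idea behind the two patterns above can be sketched in Python as follows. This is an illustration only, not the PR's code: the function names, the `(term, takes_suffixes)` tuple shape, and the single shared suffix list are assumptions made for the example, and the grouped pattern it emits is a balanced variant of the one shown.

```python
import re

# Illustrative suffixes taken from the example patterns above.
SUFFIXES = ("s", "es", "ed", "ing")
SUFFIX_GROUP = "(?:" + "|".join(SUFFIXES) + ")?"

# Hypothetical input shape: (term, whether the term takes suffixes).
terms = [
    ("favorited", True),
    ("favorites", True),
    ("zip code", False),
    ("url", True),
]


def per_term_pattern(items):
    """Old shape: the suffix group is repeated after every suffixed term."""
    parts = [re.escape(t) + (SUFFIX_GROUP if takes else "") for t, takes in items]
    return r"\b(" + "|".join(parts) + r")\b"


def grouped_pattern(items):
    """New shape: suffixed terms share one suffix group."""
    suffixed = [re.escape(t) for t, takes in items if takes]
    plain = [re.escape(t) for t, takes in items if not takes]
    groups = []
    if suffixed:
        groups.append("(?:" + "|".join(suffixed) + ")" + SUFFIX_GROUP)
    if plain:
        groups.append("(?:" + "|".join(plain) + ")")
    return r"\b(" + "|".join(groups) + r")\b"


old_pattern = per_term_pattern(terms)
new_pattern = grouped_pattern(terms)

text = "My favorites and urls near the zip code"
old_matches = [m.group(0) for m in re.finditer(old_pattern, text)]
new_matches = [m.group(0) for m in re.finditer(new_pattern, text)]

print(len(old_pattern), len(new_pattern))
print(old_matches == new_matches)
```

With only four terms the saving is small; the reduction grows with the number of terms that share a suffix set, since each one no longer carries its own copy of the suffix group.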
To-do
Testing Instructions
Screenshots or screencast