Charset: Rely on new UTF-8 pipeline for mb_substr() fallback. #9829
…ess#9829) The existing polyfill for `mb_substr()` contains a number of issues leaving plenty of opportunity for improvement. Specifically, the following are all deficiencies: it relies on Unicode PCRE support, assumes input strings are valid UTF-8, splits input strings into an array of characters (1,000 at a time, iterating until complete), and re-joins them at the end. This patch provides an updated polyfill which will reliably parse UTF-8 strings even in the presence of invalid bytes. It computes boundaries for the substring extraction with zero allocations and then returns a single `substr()` call at the end. This change improves the reliability of UTF-8 string handling and removes behavioral variability based on the runtime system. Github-PR: 9829 Github-PR-URL: WordPress#9829 Trac-Ticket: 63863 Trac-Ticket-URL: https://core.trac.wordpress.org/ticket/63863
The existing polyfill for `mb_substr()` contains a number of issues leaving plenty of opportunity for improvement. Specifically, the following are all deficiencies: it relies on Unicode PCRE support, assumes input strings are valid UTF-8, splits input strings into an array of characters (1,000 at a time, iterating until complete), and re-joins them at the end. This patch provides an updated polyfill which will reliably parse UTF-8 strings even in the presence of invalid bytes. It computes boundaries for the substring extraction with zero allocations and then returns a single `substr()` call at the end. This change improves the reliability of UTF-8 string handling and removes behavioral variability based on the runtime system. Developed in #9829 Discussed in https://core.trac.wordpress.org/ticket/63863 See #63863. git-svn-id: https://develop.svn.wordpress.org/trunk@60969 602fd350-edb4-49c9-b593-d223f7449a82
The existing polyfill for `mb_substr()` contains a number of issues leaving plenty of opportunity for improvement. Specifically, the following are all deficiencies: it relies on Unicode PCRE support, assumes input strings are valid UTF-8, splits input strings into an array of characters (1,000 at a time, iterating until complete), and re-joins them at the end. This patch provides an updated polyfill which will reliably parse UTF-8 strings even in the presence of invalid bytes. It computes boundaries for the substring extraction with zero allocations and then returns a single `substr()` call at the end. This change improves the reliability of UTF-8 string handling and removes behavioral variability based on the runtime system. Developed in WordPress/wordpress-develop#9829 Discussed in https://core.trac.wordpress.org/ticket/63863 See #63863. Built from https://develop.svn.wordpress.org/trunk@60969 git-svn-id: http://core.svn.wordpress.org/trunk@60305 1a063a9b-81f0-0310-95a4-ce76da25c4cd
Trac ticket: Core-63863
See: #9825, #9830, #9498, #9826, #9827, #9798, #9828, (#9829)
Update the polyfill of `mb_substr()` to rely on the new UTF-8 pipeline.
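
The commit message describes the approach as computing the substring's byte boundaries and then making one `substr()` call. A minimal sketch of that idea follows; it is not the code from this patch. The function names `sketch_utf8_span()` and `sketch_mb_substr()` are hypothetical, and counting each invalid byte as one character is an assumption made to keep the example short; the actual polyfill may group invalid bytes differently.

```php
<?php
/**
 * Hypothetical helper: byte length of the character starting at $at.
 * Returns 1 for ASCII and, by assumption, 1 for any invalid byte.
 */
function sketch_utf8_span( string $s, int $at, int $end ): int {
	$b = ord( $s[ $at ] );
	if ( $b < 0x80 ) {
		return 1;
	}
	if ( $b >= 0xC2 && $b <= 0xDF ) {
		$need = 1;
	} elseif ( $b >= 0xE0 && $b <= 0xEF ) {
		$need = 2;
	} elseif ( $b >= 0xF0 && $b <= 0xF4 ) {
		$need = 3;
	} else {
		return 1; // Stray continuation byte or invalid lead byte.
	}
	if ( $at + $need >= $end ) {
		return 1; // Sequence is truncated by the end of the string.
	}
	for ( $i = 1; $i <= $need; $i++ ) {
		$c  = ord( $s[ $at + $i ] );
		$lo = 0x80;
		$hi = 0xBF;
		if ( 1 === $i ) {
			// The second byte has a narrower range for some lead bytes.
			if ( 0xE0 === $b ) {
				$lo = 0xA0; // Reject overlong three-byte forms.
			} elseif ( 0xED === $b ) {
				$hi = 0x9F; // Reject surrogate halves.
			} elseif ( 0xF0 === $b ) {
				$lo = 0x90; // Reject overlong four-byte forms.
			} elseif ( 0xF4 === $b ) {
				$hi = 0x8F; // Reject code points above U+10FFFF.
			}
		}
		if ( $c < $lo || $c > $hi ) {
			return 1;
		}
	}
	return $need + 1;
}

/**
 * Hypothetical sketch of a UTF-8 substring without PCRE or mbstring.
 * It only advances integer offsets; no array of characters is built.
 */
function sketch_mb_substr( string $s, int $start, ?int $length = null ): string {
	$end = strlen( $s );

	// Negative arguments count from the end, so count characters first.
	if ( $start < 0 || ( null !== $length && $length < 0 ) ) {
		$total = 0;
		for ( $at = 0; $at < $end; $at += sketch_utf8_span( $s, $at, $end ) ) {
			$total++;
		}
		if ( $start < 0 ) {
			$start = max( 0, $total + $start );
		}
		if ( null !== $length && $length < 0 ) {
			$length = max( 0, $total - $start + $length );
		}
	}

	// Find the byte offset of the first wanted character.
	$at = 0;
	for ( $skipped = 0; $skipped < $start && $at < $end; $skipped++ ) {
		$at += sketch_utf8_span( $s, $at, $end );
	}
	$from = $at;

	if ( null === $length ) {
		return substr( $s, $from );
	}

	// Find the byte offset just past the last wanted character.
	for ( $taken = 0; $taken < $length && $at < $end; $taken++ ) {
		$at += sketch_utf8_span( $s, $at, $end );
	}
	return substr( $s, $from, $at - $from );
}
```

For example, `sketch_mb_substr( 'héllo wörld', 1, 4 )` returns `'éllo'`. Because the scan validates every sequence itself, a string containing stray bytes is still sliced on predictable boundaries instead of failing the way a Unicode-mode regular expression can.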