-
Notifications
You must be signed in to change notification settings - Fork 3.2k
Allow block attributes strings to terminate in \ character #9558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow block attributes strings to terminate in \ character #9558
Conversation
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
| return strtr( | ||
| $encoded_attributes, | ||
| array( | ||
| '\\\\' => '\\u005c', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this to align with a different approach in WordPress/gutenberg#71291.
By replacing escaped \ characters with their escape sequence (first — strtr operates on the longest key first and does not perform replacements in already replaced segments), it makes the replacement of escaped quotes \" much easier.
The return value is a string where all the occurrences of the array keys have been replaced by the corresponding values. The longest keys will be tried first. Once a substring has been replaced, its new value will not be searched again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It says the longest keys are replaced first, but '\\\\' is not the longest. It is the same length as '--' and '\\"'. It doesn't seem to indicate that the first array key is also replaced first.
If this is the case, should there be a separate strstr() call first specifically for replacing '\\\\'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a really good point. The two relevant character sequences for search are \\ and \" because they have some overlap.
In the character sequence \\" we don't want to process \" first to produce \\u0022, we want to process \\ first to produce \u005c".
The PR includes tests where it's clear it's working well. I believe the processing is from the start of the string towards the end.
var_dump( strtr(
'\\\\"',
array(
'\\"' => '[dbl-quote]',
'\\\\' => '[backslash]',
)
) );
// string(12) "[backslash]""
var_dump( strtr(
'\\"\\\\',
array(
'\\"' => '[dbl-quote]',
'\\\\' => '[backslash]',
)
) );
// string(22) "[dbl-quote][backslash]"In both cases, we get the desired result and it seems to be the behavior for any reasonably recent PHP version (7.0+ and even earlier I believe).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can add a specific test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a test for this. When run against trunk it fails like this:
_Blocks_Serialize
✘ Attribute encoding
┐
├ Failed asserting that two strings are identical.
┊ ---·Expected
┊ +++·Actual
┊ @@ @@
┊ -'<!--·wp:test·{"lt":"\u003c","gt":"\u003e","amp":"\u0026","bs":"\u005c","quot":"\u0022","bs-bs-quot":"\u005c\u005c\u0022"}·/-->'
┊ +'<!--·wp:test·{"lt":"\u003c","gt":"\u003e","amp":"\u0026","bs":"\\u0022,"quot":"\u0022","bs-bs-quot":"\\\\\u0022"}·/-->'
│
╵ /Users/jonsurrell/a8c/wordpress-develop/trunk/tests/phpunit/tests/blocks/serialize.php:35
┴
You can see the bad escaping there: "bs":"\\u0022. In this test on the branch we can confirm that the character sequence \\\" is replaced as desired with \u005c\u005c\u0022.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That quote is BS.
3f8fa52 to
b041b24
Compare
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN: To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Co-authored-by: Shail Mehta <[email protected]>
|
Experimenting with using Gemini CLI to help with code reviewing. gh pr checkout https://github.com/WordPress/wordpress-develop/pull/9558;
git diff trunk... | gemini -m "gemini-2.5-pro" -p "Review these changes to WordPress core. Anything amiss?"Output:
|
westonruter
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also looks good to me!
Fix an issue in the block editor where attributes that terminate in the \ character are mis-encoded and cause block attributes to be lost. This change encodes `\` characters with their Unicode escape sequence `\u005c` instead of their escaped form `\\`. This makes the replacement of escaped double quotes `\"` much simpler because the preceding `\` character must be the escape character for the quote and an escaped character itself. Escaping of `\"` was originally introduced in #6619 to address an issue where `wp_kses_stripslashes()` would replace escaped double quotes `\"` with plain quotes `"` and break JSON syntax (`{"str":"\""}` becomes `{"str":"""}`). There is a companion ticket for WordPress Core: https://core.trac.wordpress.org/ticket/63917 And an associated PR to apply the same updated JSON encoding: WordPress/wordpress-develop#9558 See a related ticket about `wp_kses_stripslashes()` and its purpose today: https://core.trac.wordpress.org/ticket/63881 --- Unlinked contributors: ehti, Alexius08. Co-authored-by: sirreal <[email protected]> Co-authored-by: dmsnell <[email protected]> Co-authored-by: Mamaduka <[email protected]> Co-authored-by: skorasaurus <[email protected]> Co-authored-by: youknowriad <[email protected]>
…butes. Corrects an issue with block attribute encoding where JSON strings ending in the `\` character would be misencoded and cause block attributes to be lost. Client-side block serialization was updated with matching logic in WordPress/gutenberg@10453ab. Developed in #9558. Props jonsurrell, westonruter, mamaduka, dmsnell, shailu25. Fixes #63917. git-svn-id: https://develop.svn.wordpress.org/trunk@60708 602fd350-edb4-49c9-b593-d223f7449a82
…butes. Corrects an issue with block attribute encoding where JSON strings ending in the `\` character would be misencoded and cause block attributes to be lost. Client-side block serialization was updated with matching logic in WordPress/gutenberg@10453ab. Developed in WordPress/wordpress-develop#9558. Props jonsurrell, westonruter, mamaduka, dmsnell, shailu25. Fixes #63917. Built from https://develop.svn.wordpress.org/trunk@60708 git-svn-id: http://core.svn.wordpress.org/trunk@60044 1a063a9b-81f0-0310-95a4-ce76da25c4cd
…butes. Corrects an issue with block attribute encoding where JSON strings ending in the `\` character would be misencoded and cause block attributes to be lost. Client-side block serialization was updated with matching logic in WordPress/gutenberg@10453ab. Developed in WordPress/wordpress-develop#9558. Props jonsurrell, westonruter, mamaduka, dmsnell, shailu25. Fixes #63917. Built from https://develop.svn.wordpress.org/trunk@60708 git-svn-id: https://core.svn.wordpress.org/trunk@60044 1a063a9b-81f0-0310-95a4-ce76da25c4cd
Fix an issue in the block editor where attributes that terminate in the \ character are mis-encoded and cause block attributes to be lost. This change encodes `\` characters with their Unicode escape sequence `\u005c` instead of their escaped form `\\`. This makes the replacement of escaped double quotes `\"` much simpler because the preceding `\` character must be the escape character for the quote and an escaped character itself. Escaping of `\"` was originally introduced in WordPress/gutenberg#6619 to address an issue where `wp_kses_stripslashes()` would replace escaped double quotes `\"` with plain quotes `"` and break JSON syntax (`{"str":"\""}` becomes `{"str":"""}`). There is a companion ticket for WordPress Core: https://core.trac.wordpress.org/ticket/63917 And an associated PR to apply the same updated JSON encoding: WordPress/wordpress-develop#9558 See a related ticket about `wp_kses_stripslashes()` and its purpose today: https://core.trac.wordpress.org/ticket/63881 --- Unlinked contributors: ehti, Alexius08. Co-authored-by: sirreal <[email protected]> Co-authored-by: dmsnell <[email protected]> Co-authored-by: Mamaduka <[email protected]> Co-authored-by: skorasaurus <[email protected]> Co-authored-by: youknowriad <[email protected]> Source: WordPress/gutenberg@10453ab
…butes. Corrects an issue with block attribute encoding where JSON strings ending in the `\` character would be misencoded and cause block attributes to be lost. Client-side block serialization was updated with matching logic in WordPress/gutenberg@10453ab. Developed in WordPress#9558. Props jonsurrell, westonruter, mamaduka, dmsnell, shailu25. Fixes #63917. git-svn-id: https://develop.svn.wordpress.org/trunk@60708 602fd350-edb4-49c9-b593-d223f7449a82
Change the JSON encoding of block attributes to address problems with strings ending in the
\character.The encoding is updated to replace the escaped
\character with its unicode escape\u005c. This simplifies the replacement of escaped"characters with its unicode escape\u0022. This addresses a bug where plain, syntactic JSON"characters following an escaped\character were replaced, breaking the JSON syntax and causing block attributes to be ignored entirely.By replacing escaped
\characters, subsequent replacements become simpler. Escaped"replacement (applied bywp_kses_stripslashes()and the motivation for WordPress/gutenberg#6619 that introduced the escaped"replacement) can be maintained as a simple replacement with no need for backtracking or other complex regular expression techniques.This should also make the syntactic JSON
"characters safe fromwp_kses_stripslashes()because escaped slashes are been replaced.This applies the same encoding as WordPress/gutenberg#71291.
This likely requires #9559.See WordPress/gutenberg#16508 and WordPress/gutenberg#6619.
Trac ticket: https://core.trac.wordpress.org/ticket/63917
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.