#64177 closed defect (bug) (fixed)
Command Palette: Encoded ampersands in URLs
| Reported by: |
|
Owned by: |
|
|---|---|---|---|
| Milestone: | 6.9 | Priority: | normal |
| Severity: | normal | Version: | 6.9 |
| Component: | Administration | Keywords: | has-patch needs-testing |
| Focuses: | Cc: |
Description
I noticed a bug with the Command Palette in 6.9 while testing it with the Web Stories plugin.
That plugin has a React app with some sub-pages, the routing looks like so: wp-admin/edit.php?post_type=web-story&page=stories-dashboard#/templates-gallery
In the command palette, these URLs are broken because the ampersand is encoded: wp-admin/edit.php?post_type=web-story&page=stories-dashboard#/templates-gallery
Thus, one cannot navigate to those URLs using the command palette.
Looks like this would need to be fixed in wp_enqueue_command_palette_assets()
Change History (24)
This ticket was mentioned in PR #10446 on WordPress/wordpress-develop by @wildworks.
4 months ago
#2
- Keywords has-patch added
#5
@
4 months ago
As a tested with 6.9 there is still persist and not redirect to the selected page from command palette(ctrl + k)
Please show the attach video
https://www.loom.com/share/d91696b594274d6889db056b0664f2ca
1.Stories->Dashboard
2.Stories->Explore Templates
3.Stories->Catagories
4.Stories->Tags
5.Stories->Settings
#6
@
4 months ago
GB 72898 has been merged. Regarding the core patch, I plan to commit it by 6.9 Beta3.
This ticket was mentioned in Slack in #core-committers by wildworks. View the logs.
4 months ago
#9
@
4 months ago
@tusharaddweb Thanks for the testing! This issue should be fixed in Beta 3. If the problem persists, please let us know.
#10
@
4 months ago
It would be worth revisiting this as I think there are a few more gaps we could close.
It looks like this is potentially coming in from the esc_url() call in menu_page_url() when there’s a parent slug pointing to a defined parent page. In the other condition WordPress calls add_query_arg() which in turn calls urlencode_deep() which should be replacing the & with %26.
If that’s the case it would be preferable, I think to ensure that we turn the plaintext $menu_slug into its URL-escaped variety before calling esc_url() on it.
There’s another point in the patch that’s easy to overlook: we should be careful about calling html_entity_decode() and also about passing get_bloginfo( 'charset' ). First of all, we have WP_HTML_Decoder::decode_attribute() which will more accurately decode values that are from or encoded for an HTML attribute. Second, that function in combination with the blog_charset is generally unsafe.
For stronger interoperability and to avoid security issues, we should only UTF-8 values when percent-encoding. This is complicated because even if we expect to print out something like an ISO-8859-1 page, we’re not printing the bytes here but rather the URL-encoding of a query arg, which will always be ASCII.
What can happen is that if we decode into a non-UTF-8 locale then we can generate invalid UTF-8 in the percent-encoding and that leads to ill-defined behaviors and can lead to exploits.
<?php $iso_8859_1 = html_entity_decode( '©©', ENT_QUOTES, 'ISO-8859-1' ); // contains the bytes 0xA9 0xA9, which are invalid UTF-8 echo urlencode( $iso_8859_1 ); // %A9%A9 $utf_8 = html_entity_decode( '©©', ENT_QUOTES, 'UTF-8' ); // contains the bytes 0xC2 0xA9 0xC2 0xA9 echo urlencode( $utf_8 ); // %C2%A9%C2%A9 echo urlencode( WP_HTML_Decoder::decode_attribute( '©©' ) ); // %C2%A9%C2%A9
More than likely, if we use blog_charset we will present links that are also broken if the site is using anything for blog_charset other than UTF-8. WP_HTML_Decoder::decode_attribute() is also spec-compliant and so will produce an equivalent match to how the browser decodes the value, whereas html_entity_decode() is not able to do that reliably.
#11
@
4 months ago
@dmsnell Thanks for the PR! If I understand correctly, do I need to address the following two points?
- Escape
$menu_slugbeforehand - Use
WP_HTML_Decoder::decode_attribute()instead ofhtml_entity_decode().
If that's the case, I'd like to work to ensure we meet the deadline for RC1.
#12
@
4 months ago
@wildworks if we escape $menu_slug beforehand we shouldn’t have the reported problem to begin with, because urlencode() will replace & with %26. though I might have been confused because I thought the problem was when we encountered menu slugs like One & Two. Either way, because that second clause is directly creating the URL and passing $menu_slug without percent-encoding we can predict this issue, even if it’s not what was reported here.
that would leave other related issues which are caused by the fact that the generated URL runs through esc_url() way before it’s sent to the browser (why we want late-escaping…). it seems like we want actual URLs, like the ones we would type into the address bar in a browser, to be printed in the SCRIPT tag as serialized into JSON.
to do this we would need to undo what esc_url() did, and so therefore yes, I suggest that we run it through WP_HTML_Decoder::decode_attribute().
this stuff is so complicated it’s easy to overlook the tiny details.
#13
@
4 months ago
- Resolution fixed deleted
- Status changed from closed to reopened
@dmsnell I see, let's reopen this ticket and try to improve it before 6.9 RC1.
This ticket was mentioned in PR #10480 on WordPress/wordpress-develop by @wildworks.
3 months ago
#14
#15
@
3 months ago
I have submitted a PR: https://github.com/WordPress/wordpress-develop/pull/10480/
This PR addresses both this ticket and #64196.
@wildworks commented on PR #10480:
3 months ago
#16
I'll prepare the commit message now, ready for when I commit this pull request.
Command Palette: Use WP_HTML_Processor and WP_HTML_Decoder for menu labels and URLs. Replace regex-based HTML tag removal with WP_HTML_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected. Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() for URL decoding to use the modern HTML API for consistent attribute decoding. Follow-up to [61124], [61126], [61127], [61142]. Props: wildworks, dmsnell, madhavishah01. Fixes #64177, #64196.
@wildworks commented on PR #10480:
3 months ago
#18
@dmsnell @mukeshpanchal27 @peterwilsoncc If there are no other blockers, I'd like to commit this before the RC1 release, but what do you think? This PR can be considered a code quality improvement, so if there are any blockers, we can punt it to a future release.
#19
@
3 months ago
- Milestone changed from 6.9 to 7.0
There's less than an hour until the RC1 commit freeze. I'm not entirely confident about PR 10480, so let's punt this ticket to 7.0.
If this patch needs to be included in version 6.9, let's reconsider it.
#20
@
3 months ago
- Component changed from General to Administration
- Milestone changed from 7.0 to 6.9
- Resolution set to fixed
- Status changed from reopened to closed
@wildworks Since there have been commits to this ticket in the 6.9 milestone, the standard workflow is to close this ticket in this milestone and open a follow-up ticket for 7.0.
@dmsnell commented on PR #10480:
3 months ago
#22
@t-hamano let me know your preferences on this based on my feedback. I’m happy to approve the work if we want to get it in still, understanding that it’s now after the RC1 deadline. I believe that with a couple of sign-offs we can still do so.
@wildworks commented on PR #10480:
3 months ago
#23
Thanks for the feedback!
I tried the latter approach using the WP_HTML_Tag_Processor. What do you think?
By the way, I personally feel there is no need to rush this PR into 6.9 🙂
@wildworks commented on PR #10480:
3 months ago
#24
Thanks for the review, @dmsnell!
Finally, could you review the commit message? I'm concerned that the explanation may not be accurate.
Command Palette: Use WP_HTML_Tag_Processor and WP_HTML_Decoder for menu labels and URLs. Replace regex-based HTML tag removal with WP_HTML_Tag_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected. Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() for URL decoding to use the modern HTML API for consistent attribute decoding. Follow-up to [61124], [61126], [61127], [61142]. Props: dmsnell, madhavishah01, peterwilsoncc, wildworks. Fixes #64177, #64196.
@dmsnell commented on PR #10480:
3 months ago
#25
review the commit message?
Generally I just write “use HTML API” rather than noting the specific classes and methods. It might be valuable to tweak the note on “tag removal” since that is what originally led me to misunderstand the problem.
- Command Palette: Use WP_HTML_Tag_Processor and WP_HTML_Decoder for menu labels and URLs. + Command Palette: Use HTML API for more reliable menu labels and URLs. - Replace regex-based HTML tag removal with WP_HTML_Tag_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected. + Replace regex-based HTML parsing with WP_HTML_Tag_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected. - Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() for URL decoding to use the modern HTML API for consistent attribute decoding. + Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() with the menu URL for consistent attribute decoding. Follow-up to [61124], [61126], [61127], [61142]. Props: dmsnell, madhavishah01, peterwilsoncc, wildworks. Fixes #64177, #64196.
I tossed in some minor styling updates in there, which you are free to ignore, but since you asked…
@wildworks commented on PR #10480:
3 months ago
#26
@dmsnell Thanks for the feedback! I will use that message and commit as per your suggestion.
I was able to reproduce the problem. I plan to fix it by Beta 3.