Command Palette: Use WP_HTML_Processor and WP_HTML_Decoder to generate menu label and menu URL #10480

t-hamano · 2025-11-07T03:43:20Z

Trac ticket: https://core.trac.wordpress.org/ticket/64233

This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

t-hamano · 2025-11-07T03:51:43Z

I'll prepare the commit message now, ready for when I commit this pull request.

Command Palette: Use WP_HTML_Tag_Processor and WP_HTML_Decoder for menu labels and URLs.

Replace regex-based HTML tag removal with WP_HTML_Tag_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected.

Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() for URL decoding to use the modern HTML API for consistent attribute decoding.

Follow-up to [61124], [61126], [61127], [61142].

Props: dmsnell, madhavishah01, peterwilsoncc, wildworks.
Fixes #64177, #64196.

github-actions · 2025-11-07T03:52:05Z

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props wildworks, mukesh27, peterwilsoncc, dmsnell, westonruter.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

github-actions · 2025-11-07T03:57:11Z

Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

The Plugin and Theme Directories cannot be accessed within Playground.
All changes will be lost when closing a tab with a Playground instance.
All changes will be lost when refreshing the page.
A fresh instance is created each time the link below is clicked.
Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance, it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

t-hamano · 2025-11-11T07:36:22Z

@dmsnell @mukeshpanchal27 @peterwilsoncc If there are no other blockers, I'd like to commit this before the RC1 release, but what do you think? This PR can be considered a code quality improvement, so if there are any blockers, we can punt it to a future release.

dmsnell · 2025-11-12T02:21:45Z

src/wp-includes/script-loader.php

+		}
+
+		return trim( implode( '', $text_parts ) );
+	};


hey nice job @t-hamano getting this built. I hope it wasn’t too obscure to figure out.

this looks like it should be solid, but I can share a couple of points of feedback.

Finding root-level text nodes

when creating a fragment (with the default <body> context) we will always have an open HTML element and BODY element, meaning that root-level text will always have a depth of 3 (and likewise, the breadcrumb depth will be three).

this means we can eliminate the nested loop and directly check if the depth is 3. we don’t have to capture the root depth. that open HTML and BODY are guarantees with how it works.

On the other hand, we can also test this via the breadcrumbs. I found an issue that we should probably change/fix on matches_breadcrumbs(), because that won’t work here, but for the time being this would.

	while ( $processor->next_token() ) {
		if ( array( 'HTML', 'BODY', '#text' ) !== $processor->get_breadcrumbs() ) {
			continue;
		}

		$text_parts…
	}
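The depth-based check described above could look roughly like the following. This is a minimal sketch, not the code from the pull request: the function name `example_root_text_by_depth()` is hypothetical, and it assumes WordPress is loaded with a version of the HTML API that provides `WP_HTML_Processor::get_current_depth()`.

```php
<?php
/**
 * Hypothetical helper (not part of the PR): collects root-level text
 * from a menu label by checking the token depth.
 *
 * Assumes WordPress is loaded so that WP_HTML_Processor is available.
 */
function example_root_text_by_depth( string $label ): string {
	// Parses the label as a fragment in the default <body> context.
	$processor = WP_HTML_Processor::create_fragment( $label );
	if ( null === $processor ) {
		return '';
	}

	$text_parts = array();
	while ( $processor->next_token() ) {
		// With HTML and BODY always open, root-level text sits at depth 3.
		if ( '#text' === $processor->get_token_name() && 3 === $processor->get_current_depth() ) {
			$text_parts[] = $processor->get_modifiable_text();
		}
	}

	return trim( implode( '', $text_parts ) );
}
```

Compared with the breadcrumb comparison, this avoids building the breadcrumb array on every token, at the cost of relying on the fixed depth of the fragment context.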

Efficiency and reliability

The use of the HTML Processor is particularly convenient because it provides depth automatically. On the other hand, if you find that it’s too slow or fails too frequently (because it receives the fraction of input documents it can’t parse) then we can still adjust the lever on the reliability/practicality spectrum. The Tag Processor will not fail with the same parsing issues the PCRE matches did, even though that can lead to some kinds of parsing failures (with, for example, mismatched tags).

Still, the Tag Processor won’t fail to parse each token and is considerably faster than the fully-fledged HTML Processor. If we were to choose this approach, we’d want to manually track depth, which again, could be wrong because HTML is so wonderfully complex (vs. the HTML Processor, which will not be wrong here).

	$processor = new WP_HTML_Tag_Processor( $label );
	$depth     = 0;
	while ( $processor->next_token() ) {
		$token_name = $processor->get_token_name();
		if ( '#text' === $token_name && 0 === $depth ) {
			$text_parts…
			continue;
		}

		if ( $processor->is_closing_tag() ) {
			--$depth;
		} else if ( ! WP_HTML_Processor::is_void( $token_name ) ) {
			++$depth;
		}
	}

The choice is up to you. The only thing I’d watch out for is that occasionally we get things like “nested” A tags, and those can cause the HTML Processor to abort out of caution.
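One way to notice when the HTML Processor has aborted, so that a caller can fall back to another strategy, is to check `get_last_error()` after the loop ends. The sketch below is an illustration under assumptions rather than the approach taken in the pull request: the function name `example_root_text_or_null()` and the choice to return `null` on failure are hypothetical, and WordPress must be loaded.

```php
<?php
/**
 * Hypothetical helper (not part of the PR): returns root-level text,
 * or null when the HTML Processor could not finish parsing the label.
 *
 * Assumes WordPress is loaded so that WP_HTML_Processor is available.
 */
function example_root_text_or_null( string $label ): ?string {
	$processor = WP_HTML_Processor::create_fragment( $label );
	if ( null === $processor ) {
		return null;
	}

	$text_parts = array();
	while ( $processor->next_token() ) {
		// Only text whose parents are exactly HTML > BODY is root-level.
		if ( array( 'HTML', 'BODY', '#text' ) === $processor->get_breadcrumbs() ) {
			$text_parts[] = $processor->get_modifiable_text();
		}
	}

	// The loop also ends when the processor bails on markup it does not
	// support, so an error here means the collected text may be incomplete.
	if ( null !== $processor->get_last_error() ) {
		return null;
	}

	return trim( implode( '', $text_parts ) );
}
```

A caller could treat a `null` result as the signal to use the Tag Processor loop with manual depth tracking instead.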

dmsnell · 2025-11-12T02:29:41Z

@t-hamano let me know your preferences on this based on my feedback. I’m happy to approve the work if we want to get it in still, understanding that it’s now after the RC1 deadline. I believe that with a couple of sign-offs we can still do so.

t-hamano · 2025-11-17T08:10:50Z

Thanks for the feedback!

I tried the latter approach using the WP_HTML_Tag_Processor. What do you think?

By the way, I personally feel there is no need to rush this PR into 6.9 🙂

dmsnell

everything that seems important checks out. thanks for working through this. I agree on not rushing this into 6.9

dmsnell · 2025-11-17T20:19:59Z

src/wp-includes/script-loader.php

 				$menu_url = $menu_slug;
 			} elseif ( ! empty( menu_page_url( $menu_slug, false ) ) ) {
-				$menu_url = html_entity_decode( menu_page_url( $menu_slug, false ), ENT_QUOTES, get_bloginfo( 'charset' ) );
+				$menu_url = WP_HTML_Decoder::decode_attribute( menu_page_url( $menu_slug, false ) );


let’s also earmark this for a follow-up to address the issue in menu_page_url() that it fails to escape $menu_slug in the case where there’s no parent slug.

maybe we could create a ticket for that?

I took a closer look at the implementation of the menu_page_url function, namely here:

wordpress-develop/src/wp-admin/includes/plugin.php

Lines 1929 to 1933 in cd301d0

	if ( $parent_slug && ! isset( $_parent_pages[ $parent_slug ] ) ) {
		$url = admin_url( add_query_arg( 'page', $menu_slug, $parent_slug ) );
	} else {
		$url = admin_url( 'admin.php?page=' . $menu_slug );
	}

I had assumed that the add_query_arg() function would URL-encode the string, but apparently it doesn't. Look at the following test results. As you can see, the add_query_arg() function does not URL-encode:

	$encoded = add_query_arg( 'page', 'test #1&2', 'admin.php' );
	$direct  = 'admin.php?page=test #1&2';
	echo bin2hex( $encoded ) == bin2hex( $direct ) ? 'Equal' : 'Not Equal';
	// Output: "Equal"

So in the menu_page_url() function, the $menu_slug is not escaped in either case, so we need to fix that. Is my understanding correct?

@t-hamano it’s actually the other case that’s more problematic because it doesn’t even attempt to escape the $menu_slug. I don’t know off-hand, but I think that # is a special case here; you may try passing something like é or 🏴󠁧󠁢󠁥󠁮󠁧󠁿 and see if it encodes that.

The second clause should read…

	} else {
		$url = admin_url( add_query_arg( 'page', $menu_slug, 'admin.php' ) );
	}

or something like that. it’s missing the call to add query args.
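For reference, the escaping under discussion can be illustrated in plain PHP, independent of WordPress. This is only a sketch of what percent-encoding the slug would produce, using the sample slug from the test above; it is not the fix proposed for menu_page_url(), and the variable names are illustrative.

```php
<?php
// Sample slug taken from the add_query_arg() test in this thread.
$menu_slug = 'test #1&2';

// Concatenating the raw slug leaves the space, "#" and "&" unescaped.
$unescaped = 'admin.php?page=' . $menu_slug;

// Percent-encoding the slug first keeps it inside the "page" parameter.
$escaped = 'admin.php?page=' . rawurlencode( $menu_slug );

echo $escaped, "\n"; // prints admin.php?page=test%20%231%262

// Round trip: the encoded query string decodes back to the original slug.
parse_str( (string) parse_url( $escaped, PHP_URL_QUERY ), $query );
echo $query['page'], "\n"; // prints test #1&2
```

Without the encoding step, the `#` would start a URL fragment and the `&` would start a new query parameter, so the slug would not survive the round trip.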

t-hamano · 2025-11-25T03:40:29Z

Thanks for the review, @dmsnell!

Finally, could you review the commit message? I'm concerned that the explanation may not be accurate.

Command Palette: Use WP_HTML_Tag_Processor and WP_HTML_Decoder for menu labels and URLs.

Replace regex-based HTML tag removal with WP_HTML_Tag_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected.

Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() for URL decoding to use the modern HTML API for consistent attribute decoding.

Follow-up to [61124], [61126], [61127], [61142].

Props: dmsnell, madhavishah01, peterwilsoncc, wildworks.
Fixes #64177, #64196.

dmsnell · 2025-11-25T18:39:38Z

review the commit message?

Generally I just write “use HTML API” rather than noting the specific classes and methods. It might be valuable to tweak the note on “tag removal” since that is what originally led me to misunderstand the problem.

- Command Palette: Use WP_HTML_Tag_Processor and WP_HTML_Decoder for menu labels and URLs.
+ Command Palette: Use HTML API for more reliable menu labels and URLs.

- Replace regex-based HTML tag removal with WP_HTML_Tag_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected.
+ Replace regex-based HTML parsing with WP_HTML_Tag_Processor to properly extract text nodes from menu labels. This ensures only root-level text nodes are collected.

- Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() for URL decoding to use the modern HTML API for consistent attribute decoding.
+ Additionally, replace html_entity_decode() with WP_HTML_Decoder::decode_attribute() with the menu URL for consistent attribute decoding.

  Follow-up to [61124], [61126], [61127], [61142].

  Props: dmsnell, madhavishah01, peterwilsoncc, wildworks.
  Fixes #64177, #64196.

I tossed in some minor styling updates in there, which you are free to ignore, but since you asked…

dmsnell

Looks fine from the perspective of the HTML API use. The new version of the code seems clearer in intent to me, making it a bit more explicit that the goal is to strip away any elements with all their content.

Not sure why we do that or want that, but that’s neither here nor there.

t-hamano · 2025-11-26T04:19:19Z

@dmsnell Thanks for the feedback! I will use that message and commit as per your suggestion.

github-actions · 2025-11-26T04:27:14Z

A commit was made that fixes the Trac ticket referenced in the description of this pull request.

SVN changeset: 61310
GitHub commit: 76a8f03

This PR will be closed, but please confirm the accuracy of this and reopen if there is more work to be done.

Command Palette: Use WP_HTML_Processor and WP_HTML_Decoder to generate menu label and menu URL

8b09509

t-hamano requested a review from dmsnell November 7, 2025 03:51

t-hamano marked this pull request as ready for review November 7, 2025 03:51

mukeshpanchal27 reviewed Nov 7, 2025

t-hamano added 2 commits November 7, 2025 17:38

Extract common logic

7a03b40

Fix PHPCS error

1f16c56

peterwilsoncc reviewed Nov 10, 2025

t-hamano added 2 commits November 11, 2025 09:30

Add docblock

e5b6e12

Merge branch 'trunk' into 64177-palette-enhancement

38a0f43

dmsnell reviewed Nov 12, 2025

t-hamano added 3 commits November 17, 2025 16:41

Merge branch 'trunk' into 64177-palette-enhancement

dfe44d1

WIP

b0439d4

Use WP_HTML_Tag_Processor to extract root-level test nodes

c3acbc7

dmsnell approved these changes Nov 17, 2025

westonruter reviewed Nov 25, 2025

t-hamano and others added 3 commits November 25, 2025 16:56

Remove empty line break

55a29e0

Co-authored-by: Weston Ruter <[email protected]>

Add PHP type hints

3f98766

Co-authored-by: Weston Ruter <[email protected]>

Checks if the menu item name is a string

6190673

Co-authored-by: Weston Ruter <[email protected]>

dmsnell approved these changes Nov 25, 2025

github-actions bot closed this Nov 26, 2025

t-hamano deleted the 64177-palette-enhancement branch November 26, 2025 10:13
