Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update OD_HTML_Tag_Processor::next_tag() to allow $query arg and prepare to skip visiting tag closers by default #1872

Merged
merged 8 commits into from
Feb 26, 2025

Conversation

westonruter
Copy link
Member

@westonruter westonruter commented Feb 16, 2025

I realized that the restriction to disallow passing the $query arg to next_tag() is obsolete since the breadcrumbs are tracked by calls to next_token(). The disallowing of the $query arg was a mistake ever since the subclass was introduced in #1302.

The test_next_tag_with_query test case confirms that the breadcrumbs (XPaths) are still computed correctly even when passing queries to next_token().

Before OD_HTML_Tag_Processor was introduced there was the OD_HTML_Tag_Walker class which wrapped an instance of WP_HTML_Tag_Processor. This wrapper walker class at that time was responsible for keeping track of the breadcrumbs and computing the XPaths. It only ever called $processor->next_tag( array( 'tag_closers' => 'visit' ) ) as it didn't integrate with the lower-level next_token() method to be able to observe the consumption of all tokens. This logic was ported over to the OD_HTML_Tag_Processor subclass of WP_HTML_Tag_Processor unnecessarily.

Also, since the OD_HTML_Tag_Walker had to visit all tags including tag closers in order to keep track of the breadcrumbs, this meant that the ported logic into the OD_HTML_Tag_Processor::next_tag() method visited tag closers by default even though WP_HTML_Tag_Processor::next_tag() does not visit tag closers by default. In fact, there was a OD_HTML_Tag_Processor::next_open_tag() method introduced which had the behavior for skipping tag closer visiting. All of this seems like it would be confusing for people expecting that the $context->processor supplied to tag visitors would have a next_tag() method that behaves like they expect, to skip visiting tag closers by default. But currently it doesn't: again, it does visit tag closers by default. This PR intends to move toward restoring the base method's logic. (This PR also soft-deprecates the redundant next_open_tag() method.)

The first step is to add a warning when no $query arg is supplied to the next_tag() method. Previously if a plugin tried to supply a $query it would actually throw an exception. Now, however, they'll get a migration warning when they don't supply the $query argument. The migration warning is to inform developers that the default behavior of next_tag() is changing from visiting tag closers by default to skipping tag visitors by default.

See westonruter/od-intrinsic-dimensions#2 for how another plugin can avoid the migration warning while also avoiding a fatal error when attempting to run with an older version of Optimization Detective which doesn't allow passing $query to next_tag().

I think this necessitates adding an optimization-detective 1.0.0-beta3 milestone. This bumps the version to 1.0.0-beta3 to anticipate this.

Lastly, the handling of the admin bar is now consistent with the handling of NOSCRIPT. While no tag visitor will be explicitly be invoked with either the NOSCRIPT tag or any tag in the admin bar, it is possible for tag visitors to manually walk over those tags if they want to. This was done so that the next_tag() method can be (eventually) unmodified in its behavior from the method in the base class.

@westonruter westonruter added [Type] Enhancement A suggestion for improvement of an existing feature [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Feb 16, 2025
@westonruter westonruter changed the title Try: Allow queries to be passed to OD_HTML_Tag_Processor::next_tag() Update OD_HTML_Tag_Processor::next_tag() to allow query arg and prepare to skip closers by default Feb 18, 2025
@westonruter westonruter changed the title Update OD_HTML_Tag_Processor::next_tag() to allow query arg and prepare to skip closers by default Update OD_HTML_Tag_Processor::next_tag() to allow query arg and prepare to skip visiting tag closers by default Feb 18, 2025
Copy link

codecov bot commented Feb 18, 2025

Codecov Report

Attention: Patch coverage is 94.11765% with 1 line in your changes missing coverage. Please review.

Project coverage is 67.91%. Comparing base (42bd7a8) to head (2f97645).
Report is 105 commits behind head on trunk.

Files with missing lines Patch % Lines
plugins/optimization-detective/load.php 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            trunk    #1872      +/-   ##
==========================================
+ Coverage   66.70%   67.91%   +1.20%     
==========================================
  Files          88       86       -2     
  Lines        7029     7001      -28     
==========================================
+ Hits         4689     4755      +66     
+ Misses       2340     2246      -94     
Flag Coverage Δ
multisite 67.91% <94.11%> (+1.20%) ⬆️
single 37.70% <17.64%> (+0.50%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

github-actions bot commented Feb 18, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: westonruter <[email protected]>
Co-authored-by: felixarntz <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@westonruter westonruter changed the title Update OD_HTML_Tag_Processor::next_tag() to allow query arg and prepare to skip visiting tag closers by default Update OD_HTML_Tag_Processor::next_tag() to allow $query arg and prepare to skip visiting tag closers by default Feb 19, 2025
Copy link
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter Mostly looks good, just one thing I think needs to be changed and one other concern/question.

if (
in_array( 'NOSCRIPT', $processor->get_breadcrumbs(), true )
||
$processor->is_admin_bar()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? I thought we had previously decided to omit any tags within the admin bar anyway as part of OD_HTML_Tag_Processor itself.

Copy link
Member Author

@westonruter westonruter Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this explanation in the PR description:

Lastly, the handling of the admin bar is now consistent with the handling of NOSCRIPT. While no tag visitor will be explicitly be invoked with either the NOSCRIPT tag or any tag in the admin bar, it is possible for tag visitors to manually walk over those tags if they want to. This was done so that the next_tag() method can be (eventually) unmodified in its behavior from the method in the base class.

This will make it easier to switch to WP_HTML_Processor as well, since the next_tag() method won't need to be overridden for that class either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While no tag visitor will be explicitly be invoked with either the NOSCRIPT tag or any tag in the admin bar, it is possible for tag visitors to manually walk over those tags if they want to.

How would that be possible? Can you share an example?

Copy link
Member Author

@westonruter westonruter Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for example, in #1746 is about undoing JS-based lazy loading in favor of native lazy loading. Consider this markup used by lazysizes:

<img src="" data-src="https://example.com/bar.jpg" data-srcset="https://example.com/bar-large.jpg 1000w, https://example.com/bar-large.jpg 1000w" sizes="(max-width: 556px) 100vw, 556px" alt="Bar" class="attachment-large size-large wp-image-2 has-transparency lazyload" width="500" height="300">
<noscript>
	<img src="https://example.com/bar.jpg" srcset="https://example.com/bar-large.jpg 1000w, https://example.com/bar-large.jpg 1000w" sizes="(max-width: 556px) 100vw, 556px" alt="Bar" class="attachment-large size-large wp-image-2 has-transparency lazyload" width="500" height="300">
</noscript>

To optimize this, a tag visitor would need to (1) replace the src with the data-src, (2) replace the srcset with the data-srcset, and finally (3) remove the noscript entirely. A tag visitor implementing this could look like the following:

<?php
add_action( 'od_register_tag_visitors', function ( OD_Tag_Visitor_Registry $registry ) {
	$registry->register( 'native-lazy-loading', function ( OD_Tag_Visitor_Context $context ) {
		$processor = $context->processor;
		if ( $processor->get_tag() !== 'IMG' || ! $processor->has_class( 'lazyload' ) ) {
			return; // Tag is not relevant for this tag visitor.
		}

		$src    = $processor->get_attribute( 'data-src' );
		$srcset = $processor->get_attribute( 'data-srcset' );

		if ( ! is_string( $src ) && ! is_string( $srcset ) ) {
			return;
		}

		// Replace the src attribute with the data-src attribute.
		if ( is_string( $src ) ) {
			$processor->set_attribute( 'src', $src );
			$processor->remove_attribute( 'data-src' );
		}

		// Replace the srcset attribute with the data-srcset attribute.
		if ( is_string( $srcset ) ) {
			$processor->set_attribute( 'srcset', $srcset );
			$processor->remove_attribute( 'data-srcset' );
		}

		$processor->remove_class( 'lazyload' );

		if ( $processor->next_tag() && $processor->get_tag() === 'NOSCRIPT' ) {
			if ( method_exists( $processor, 'remove_tag' ) ) {
				// Note: This is not yet supported by the HTML API! But this is how the NOSCRIPT could in theory be removed in the future.
				$processor->remove_tag();
			} elseif ( $processor->next_tag() && $processor->get_tag() === 'IMG' ) {
				// This alternative prevents the duplicate NOSCRIPT > IMG from being shown.
				$processor->remove_attribute( 'src' ); // Prevent loading the image.
				$processor->remove_attribute( 'srcset' ); // Prevent loading the image.
				$processor->set_attribute( 'hidden', true ); // Prevent the image from being rendered twice.
			}
		}
	} );
} );

So in this case, while tag visitors don't iterate over NOSCRIPT elements or their descendants by default, it is useful for there to still be the possibility to do so during their own iteration outside of the overall "optimization loop".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm maybe I'm not understanding what you mean. My question was related to an admin bar example, not noscript.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've filed an enhancement request in Core-63020 to expose the necessary information in WP_HTML_Processor to be able to construct the XPath in the future without the need for subclassing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method marked as private in 2f97645.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works for now. With the eventual migration to WP_HTML_Processor though, how would you propose to solve this then?

I looked at the 63020 ticket, but don't see how it would address what you flagged above:

then it would need to compute the XPath for every single tag when iterating over the document, and this is more expensive than calling the is_admin_bar() method

Wouldn't we need a method that reads directly from the $open_stack_tags property? For example, something like this:

public function is_within_tag( string $tagname, array $attributes = array() ): bool {
	foreach ( $this->open_stack_tags as $index => $current_tagname ) {
		if ( strtoupper( $tagname ) !== $current_tagname ) {
			continue;
		}
		if ( count( $attributes ) === 0 ) {
			return true;
		}
		if ( ! isset( $this->open_stack_attributes[ $index ] ) ) {
			continue;
		}
		$current_attributes = $this->open_stack_attributes[ $index ];
		if ( $current_attributes === $attributes ) {
			return true;
		}
	}
	return false;
}

Of course, this could become expensive if used excessively or to check for nodes that are deeply nested. So alternatively we could have a method like is_within_root_tag( string $tagname, array $attributes = array() ): bool that does the same as above, but limits it to only checking the first child element of the BODY instead of iterating through all open stack tags.

Just some ideas to consider.

Copy link
Member Author

@westonruter westonruter Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated the description of Core-63020 to make my proposal more explicit. Namely, if there was something like a WP_HTML_Processor::get_element_breadcrumbs() then you could check to see if you are inside of the Admin Bar with this code, similar to what you suggest with is_within_root_tag:

function in_admin_bar( WP_HTML_Processor $processor ): bool {
	$root = $processor->get_element_breadcrumbs()[2] ?? null;
	return (
		null !== $root
		&&
		$root->get_tag() === 'DIV'
		&&
		$root->get_attribute( 'id' ) === 'wpadminbar'
	);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks for clarifying!

Copy link
Member

@felixarntz felixarntz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westonruter I'm okay with unblocking this for now, yet don't think it's a great solution. As an interim solution it's fine, but per #1872 (comment) I'm curious what the eventual solution could be.

@westonruter westonruter merged commit 4a79036 into trunk Feb 26, 2025
19 of 20 checks passed
@westonruter westonruter deleted the try/allow-next-tag-query branch February 26, 2025 19:58
@westonruter
Copy link
Member Author

Follow up PR: #1908

@westonruter westonruter added [Plugin] Embed Optimizer Issues for the Embed Optimizer plugin (formerly Auto Sizes) [Plugin] Image Prioritizer Issues for the Image Prioritizer plugin (dependent on Optimization Detective) labels Mar 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Embed Optimizer Issues for the Embed Optimizer plugin (formerly Auto Sizes) [Plugin] Image Prioritizer Issues for the Image Prioritizer plugin (dependent on Optimization Detective) [Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants