class WP_HTML_Processor {}

Core class used to safely parse and modify an HTML document.

Description

The HTML Processor class properly parses and modifies HTML5 documents.

It supports a subset of the HTML5 specification, and when it encounters unsupported markup, it aborts early to avoid unintentionally breaking the document. The HTML Processor should never break an HTML document.

While the WP_HTML_Tag_Processor is a valuable tool for modifying attributes on individual HTML tags, the HTML Processor is more capable and useful for the following operations:

  • Querying based on nested HTML structure.

Eventually the HTML Processor will also support:

  • Wrapping a tag in surrounding HTML.
  • Unwrapping a tag by removing its parent.
  • Inserting and removing nodes.
  • Reading and changing inner content.
  • Navigating up or around HTML structure.

Usage

Use of this class requires three steps:

  1. Call a static creator method with your input HTML document.
  2. Find the location in the document you are looking for.
  3. Request changes to the document at that location.

Example:

$processor = WP_HTML_Processor::create_fragment( $html );
if ( $processor->next_tag( array( 'breadcrumbs' => array( 'DIV', 'FIGURE', 'IMG' ) ) ) ) {
    $processor->add_class( 'responsive-image' );
}

Breadcrumbs

Breadcrumbs represent the stack of open elements from the root of the document or fragment down to the currently-matched node, if one is currently selected. Call WP_HTML_Processor::get_breadcrumbs() to inspect the breadcrumbs for a matched tag.

Breadcrumbs can specify nested HTML structure and are equivalent to a CSS selector comprising tag names separated by the child combinator, such as "DIV > FIGURE > IMG".

Since all elements find themselves inside a full HTML document when parsed, the return value from get_breadcrumbs() will always contain any implicit outermost elements. For example, when parsing with create_fragment() in the BODY context (the default), any tag in the given HTML document will contain array( 'HTML', 'BODY', … ) in its breadcrumbs.

Despite containing the implied outermost elements in their breadcrumbs, tags may be found with the shortest-matching breadcrumb query. That is, array( 'IMG' ) matches all IMG elements and array( 'P', 'IMG' ) matches all IMG elements directly inside a P element. To ensure that no partial matches erroneously match it’s possible to specify in a query the full breadcrumb match all the way down from the root HTML element.

Example:

$html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
//               ----- Matches here.
$processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'IMG' ) ) );

$html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
//                                  ---- Matches here.
$processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'FIGCAPTION', 'EM' ) ) );

$html = '<div><img></div><img>';
//                       ----- Matches here, because IMG must be a direct child of the implicit BODY.
$processor->next_tag( array( 'breadcrumbs' => array( 'BODY', 'IMG' ) ) );

HTML Support

This class implements a small part of the HTML5 specification.
It’s designed to operate within its support and abort early whenever encountering circumstances it can’t properly handle. This is the principle way in which this class remains as simple as possible without cutting corners and breaking compliance.

Supported elements

If any unsupported element appears in the HTML input the HTML Processor will abort early and stop all processing. This draconian measure ensures that the HTML Processor won’t break any HTML it doesn’t fully understand.

The following list specifies the HTML tags that are supported:

  • Containers: ADDRESS, BLOCKQUOTE, DETAILS, DIALOG, DIV, FOOTER, HEADER, MAIN, MENU, SPAN, SUMMARY.
  • Custom elements: All custom elements are supported. :)
  • Form elements: BUTTON, DATALIST, FIELDSET, INPUT, LABEL, LEGEND, METER, PROGRESS, SEARCH.
  • Formatting elements: B, BIG, CODE, EM, FONT, I, PRE, SMALL, STRIKE, STRONG, TT, U, WBR.
  • Heading elements: H1, H2, H3, H4, H5, H6, HGROUP.
  • Links: A.
  • Lists: DD, DL, DT, LI, OL, UL.
  • Media elements: AUDIO, CANVAS, EMBED, FIGCAPTION, FIGURE, IMG, MAP, PICTURE, SOURCE, TRACK, VIDEO.
  • Paragraph: BR, P.
  • Phrasing elements: ABBR, AREA, BDI, BDO, CITE, DATA, DEL, DFN, INS, MARK, OUTPUT, Q, SAMP, SUB, SUP, TIME, VAR.
  • Sectioning elements: ARTICLE, ASIDE, HR, NAV, SECTION.
  • Templating elements: SLOT.
  • Text decoration: RUBY.
  • Deprecated elements: ACRONYM, BLINK, CENTER, DIR, ISINDEX, KEYGEN, LISTING, MULTICOL, NEXTID, PARAM, SPACER.

Supported markup

Some kinds of non-normative HTML involve reconstruction of formatting elements and re-parenting of mis-nested elements. For example, a DIV tag found inside a TABLE may in fact belong before the table in the DOM. If the HTML Processor encounters such a case it will stop processing.

The following list specifies HTML markup that is supported:

  • Markup involving only those tags listed above.
  • Fully-balanced and non-overlapping tags.
  • HTML with unexpected tag closers.
  • Some unbalanced or overlapping tags.
  • P tags after unclosed P tags.
  • BUTTON tags after unclosed BUTTON tags.
  • A tags after unclosed A tags that don’t involve any active formatting elements.

See also

Methods

NameDescription
WP_HTML_Processor::__constructConstructor.
WP_HTML_Processor::add_classAdds a new class name to the currently matched tag.
WP_HTML_Processor::bookmark_tagCreates a new bookmark for the currently-matched tag and returns the generated name.
WP_HTML_Processor::bookmark_tokenCreates a new bookmark for the currently-matched token and returns the generated name.
WP_HTML_Processor::class_listGenerator for a foreach loop to step through each class name for the matched tag.
WP_HTML_Processor::close_a_p_elementCloses a P element.
WP_HTML_Processor::create_fragmentCreates an HTML processor in the fragment parsing mode.
WP_HTML_Processor::expects_closerIndicates if the currently-matched node expects a closing token, or if it will self-close on the next step.
WP_HTML_Processor::generate_implied_end_tagsCloses elements that have implied end tags.
WP_HTML_Processor::generate_implied_end_tags_thoroughlyCloses elements that have implied end tags, thoroughly.
WP_HTML_Processor::get_attributeReturns the value of a requested attribute from a matched tag opener if that attribute exists.
WP_HTML_Processor::get_attribute_names_with_prefixGets lowercase names of all attributes matching a given prefix in the current tag.
WP_HTML_Processor::get_breadcrumbsComputes the HTML breadcrumbs for the currently-matched node, if matched.
WP_HTML_Processor::get_comment_typeIndicates what kind of comment produced the comment node.
WP_HTML_Processor::get_current_depthReturns the nesting depth of the current location in the document.
WP_HTML_Processor::get_last_errorReturns the last error, if any.
WP_HTML_Processor::get_modifiable_textReturns the modifiable text for a matched token, or an empty string.
WP_HTML_Processor::get_tagReturns the uppercase name of the matched tag.
WP_HTML_Processor::get_token_nameReturns the node name represented by the token.
WP_HTML_Processor::get_token_typeIndicates the kind of matched token, if any.
WP_HTML_Processor::has_bookmarkChecks whether a bookmark with the given name exists.
WP_HTML_Processor::has_classReturns if a matched tag contains the given ASCII case-insensitive class name.
WP_HTML_Processor::has_self_closing_flagIndicates if the currently matched tag contains the self-closing flag.
WP_HTML_Processor::insert_html_elementInserts an HTML element on the stack of open elements.
WP_HTML_Processor::is_specialReturns whether an element of a given name is in the HTML special category.
WP_HTML_Processor::is_tag_closerIndicates if the current tag token is a tag closer.
WP_HTML_Processor::is_virtualIndicates if the currently-matched token is virtual, created by a stack operation while processing HTML, rather than a token found in the HTML text itself.
WP_HTML_Processor::is_voidReturns whether a given element is an HTML Void Element
WP_HTML_Processor::matches_breadcrumbsIndicates if the currently-matched tag matches the given breadcrumbs.
WP_HTML_Processor::next_tagFinds the next tag matching the $query.
WP_HTML_Processor::next_tokenEnsures internal accounting is maintained for HTML semantic rules while the underlying Tag Processor class is seeking to a bookmark.
WP_HTML_Processor::reconstruct_active_formatting_elementsReconstructs the active formatting elements.
WP_HTML_Processor::release_bookmarkRemoves a bookmark that is no longer needed.
WP_HTML_Processor::remove_attributeRemove an attribute from the currently-matched tag.
WP_HTML_Processor::remove_classRemoves a class name from the currently matched tag.
WP_HTML_Processor::run_adoption_agency_algorithmRuns the adoption agency algorithm.
WP_HTML_Processor::seekMoves the internal cursor in the HTML Processor to a given bookmark’s location.
WP_HTML_Processor::set_attributeUpdates or creates a new attribute on the currently matched tag with the passed value.
WP_HTML_Processor::set_bookmarkSets a bookmark in the HTML document.
WP_HTML_Processor::stepSteps through the HTML document and stop at the next tag, if any.
WP_HTML_Processor::step_in_bodyParses next element in the ‘in body’ insertion mode.

Source

 *
 * @see WP_HTML_Tag_Processor
 * @see https://html.spec.whatwg.org/
 */
class WP_HTML_Processor extends WP_HTML_Tag_Processor {
	/**
	 * The maximum number of bookmarks allowed to exist at any given time.
	 *
	 * HTML processing requires more bookmarks than basic tag processing,
	 * so this class constant from the Tag Processor is overwritten.
	 *
	 * @since 6.4.0
	 *
	 * @var int
	 */
	const MAX_BOOKMARKS = 100;

	/**
	 * Holds the working state of the parser, including the stack of
	 * open elements and the stack of active formatting elements.
	 *
	 * Initialized in the constructor.
	 *
	 * @since 6.4.0
	 *
	 * @var WP_HTML_Processor_State
	 */
	private $state;

	/**
	 * Used to create unique bookmark names.
	 *
	 * This class sets a bookmark for every tag in the HTML document that it encounters.
	 * The bookmark name is auto-generated and increments, starting with `1`. These are
	 * internal bookmarks and are automatically released when the referring WP_HTML_Token
	 * goes out of scope and is garbage-collected.
	 *
	 * @since 6.4.0
	 *
	 * @see WP_HTML_Processor::$release_internal_bookmark_on_destruct
	 *
	 * @var int
	 */
	private $bookmark_counter = 0;

	/**
	 * Stores an explanation for why something failed, if it did.
	 *
	 * @see self::get_last_error
	 *
	 * @since 6.4.0
	 *
	 * @var string|null
	 */
	private $last_error = null;

	/**
	 * Stores context for why the parser bailed on unsupported HTML, if it did.
	 *
	 * @see self::get_unsupported_exception
	 *
	 * @since 6.7.0
	 *
	 * @var WP_HTML_Unsupported_Exception|null
	 */
	private $unsupported_exception = null;

	/**
	 * Releases a bookmark when PHP garbage-collects its wrapping WP_HTML_Token instance.
	 *
	 * This function is created inside the class constructor so that it can be passed to
	 * the stack of open elements and the stack of active formatting elements without
	 * exposing it as a public method on the class.
	 *
	 * @since 6.4.0
	 *
	 * @var Closure|null
	 */
	private $release_internal_bookmark_on_destruct = null;

	/**
	 * Stores stack events which arise during parsing of the
	 * HTML document, which will then supply the "match" events.
	 *
	 * @since 6.6.0
	 *
	 * @var WP_HTML_Stack_Event[]
	 */
	private $element_queue = array();

	/**
	 * Stores the current breadcrumbs.
	 *
	 * @since 6.7.0
	 *
	 * @var string[]
	 */
	private $breadcrumbs = array();

	/**
	 * Current stack event, if set, representing a matched token.
	 *
	 * Because the parser may internally point to a place further along in a document
	 * than the nodes which have already been processed (some "virtual" nodes may have
	 * appeared while scanning the HTML document), this will point at the "current" node
	 * being processed. It comes from the front of the element queue.
	 *
	 * @since 6.6.0
	 *
	 * @var WP_HTML_Stack_Event|null
	 */
	private $current_element = null;

	/**
	 * Context node if created as a fragment parser.
	 *
	 * @var WP_HTML_Token|null
	 */
	private $context_node = null;

	/*
	 * Public Interface Functions
	 */

	/**
	 * Creates an HTML processor in the fragment parsing mode.
	 *
	 * Use this for cases where you are processing chunks of HTML that
	 * will be found within a bigger HTML document, such as rendered
	 * block output that exists within a post, `the_content` inside a
	 * rendered site layout.
	 *
	 * Fragment parsing occurs within a context, which is an HTML element
	 * that the document will eventually be placed in. It becomes important
	 * when special elements have different rules than others, such as inside
	 * a TEXTAREA or a TITLE tag where things that look like tags are text,
	 * or inside a SCRIPT tag where things that look like HTML syntax are JS.
	 *
	 * The context value should be a representation of the tag into which the
	 * HTML is found. For most cases this will be the body element. The HTML
	 * form is provided because a context element may have attributes that
	 * impact the parse, such as with a SCRIPT tag and its `type` attribute.
	 *
	 * ## Current HTML Support
	 *
	 *  - The only supported context is `<body>`, which is the default value.
	 *  - The only supported document encoding is `UTF-8`, which is the default value.
	 *
	 * @since 6.4.0
	 * @since 6.6.0 Returns `static` instead of `self` so it can create subclass instances.
	 *
	 * @param string $html     Input HTML fragment to process.
	 * @param string $context  Context element for the fragment, must be default of `<body>`.
	 * @param string $encoding Text encoding of the document; must be default of 'UTF-8'.
	 * @return static|null The created processor if successful, otherwise null.
	 */
	public static function create_fragment( $html, $context = '<body>', $encoding = 'UTF-8' ) {
		if ( '<body>' !== $context || 'UTF-8' !== $encoding ) {
			return null;
		}

		$processor                             = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
		$processor->state->context_node        = array( 'BODY', array() );
		$processor->state->insertion_mode      = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
		$processor->state->encoding            = $encoding;
		$processor->state->encoding_confidence = 'certain';

		// @todo Create "fake" bookmarks for non-existent but implied nodes.
		$processor->bookmarks['root-node']    = new WP_HTML_Span( 0, 0 );
		$processor->bookmarks['context-node'] = new WP_HTML_Span( 0, 0 );

		$root_node = new WP_HTML_Token(
			'root-node',
			'HTML',
			false
		);

		$processor->state->stack_of_open_elements->push( $root_node );

		$context_node = new WP_HTML_Token(
			'context-node',
			$processor->state->context_node[0],
			false
		);

		$processor->context_node = $context_node;
		$processor->breadcrumbs  = array( 'HTML', $context_node->node_name );

		return $processor;
	}

	/**
	 * Creates an HTML processor in the full parsing mode.
	 *
	 * It's likely that a fragment parser is more appropriate, unless sending an
	 * entire HTML document from start to finish. Consider a fragment parser with
	 * a context node of `<body>`.
	 *
	 * Since UTF-8 is the only currently-accepted charset, if working with a
	 * document that isn't UTF-8, it's important to convert the document before
	 * creating the processor: pass in the converted HTML.
	 *
	 * @param string      $html                    Input HTML document to process.
	 * @param string|null $known_definite_encoding Optional. If provided, specifies the charset used
	 *                                             in the input byte stream. Currently must be UTF-8.
	 * @return static|null The created processor if successful, otherwise null.
	 */
	public static function create_full_parser( $html, $known_definite_encoding = 'UTF-8' ) {
		if ( 'UTF-8' !== $known_definite_encoding ) {
			return null;
		}

		$processor                             = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
		$processor->state->encoding            = $known_definite_encoding;
		$processor->state->encoding_confidence = 'certain';

		return $processor;
	}

	/**
	 * Constructor.
	 *
	 * Do not use this method. Use the static creator methods instead.
	 *
	 * @access private
	 *
	 * @since 6.4.0
	 *
	 * @see WP_HTML_Processor::create_fragment()
	 *
	 * @param string      $html                                  HTML to process.
	 * @param string|null $use_the_static_create_methods_instead This constructor should not be called manually.
	 */
	public function __construct( $html, $use_the_static_create_methods_instead = null ) {
		parent::__construct( $html );

		if ( self::CONSTRUCTOR_UNLOCK_CODE !== $use_the_static_create_methods_instead ) {
			_doing_it_wrong(
				__METHOD__,
				sprintf(
					/* translators: %s: WP_HTML_Processor::create_fragment(). */
					__( 'Call %s to create an HTML Processor instead of calling the constructor directly.' ),
					'<code>WP_HTML_Processor::create_fragment()</code>'
				),
				'6.4.0'
			);
		}

		$this->state = new WP_HTML_Processor_State();

		$this->state->stack_of_open_elements->set_push_handler(
			function ( WP_HTML_Token $token ): void {
				$is_virtual            = ! isset( $this->state->current_token ) || $this->is_tag_closer();
				$same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
				$provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
				$this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::PUSH, $provenance );

				$this->change_parsing_namespace( $token->integration_node_type ? 'html' : $token->namespace );
			}
		);

		$this->state->stack_of_open_elements->set_pop_handler(
			function ( WP_HTML_Token $token ): void {
				$is_virtual            = ! isset( $this->state->current_token ) || ! $this->is_tag_closer();
				$same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
				$provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
				$this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::POP, $provenance );

				$adjusted_current_node = $this->get_adjusted_current_node();

				if ( $adjusted_current_node ) {
					$this->change_parsing_namespace( $adjusted_current_node->integration_node_type ? 'html' : $adjusted_current_node->namespace );
				} else {
					$this->change_parsing_namespace( 'html' );
				}
			}
		);

		/*
		 * Create this wrapper so that it's possible to pass
		 * a private method into WP_HTML_Token classes without
		 * exposing it to any public API.
		 */
		$this->release_internal_bookmark_on_destruct = function ( string $name ): void {
			parent::release_bookmark( $name );
		};
	}

	/**
	 * Stops the parser and terminates its execution when encountering unsupported markup.
	 *
	 * @throws WP_HTML_Unsupported_Exception Halts execution of the parser.
	 *
	 * @since 6.7.0
	 *
	 * @param string $message Explains support is missing in order to parse the current node.
	 */
	private function bail( string $message ) {
		$here  = $this->bookmarks[ $this->state->current_token->bookmark_name ];
		$token = substr( $this->html, $here->start, $here->length );

		$open_elements = array();
		foreach ( $this->state->stack_of_open_elements->stack as $item ) {
			$open_elements[] = $item->node_name;
		}

		$active_formats = array();
		foreach ( $this->state->active_formatting_elements->walk_down() as $item ) {
			$active_formats[] = $item->node_name;
		}

		$this->last_error = self::ERROR_UNSUPPORTED;

		$this->unsupported_exception = new WP_HTML_Unsupported_Exception(
			$message,
			$this->state->current_token->node_name,
			$here->start,
			$token,
			$open_elements,
			$active_formats
		);

		throw $this->unsupported_exception;
	}

	/**
	 * Returns the last error, if any.
	 *
	 * Various situations lead to parsing failure but this class will
	 * return `false` in all those cases. To determine why something
	 * failed it's possible to request the last error. This can be
	 * helpful to know to distinguish whether a given tag couldn't
	 * be found or if content in the document caused the processor
	 * to give up and abort processing.
	 *
	 * Example
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<template><strong><button><em><p><em>' );
	 *     false === $processor->next_tag();
	 *     WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error();
	 *
	 * @since 6.4.0
	 *
	 * @see self::ERROR_UNSUPPORTED
	 * @see self::ERROR_EXCEEDED_MAX_BOOKMARKS
	 *
	 * @return string|null The last error, if one exists, otherwise null.
	 */
	public function get_last_error(): ?string {
		return $this->last_error;
	}

	/**
	 * Returns context for why the parser aborted due to unsupported HTML, if it did.
	 *
	 * This is meant for debugging purposes, not for production use.
	 *
	 * @since 6.7.0
	 *
	 * @see self::$unsupported_exception
	 *
	 * @return WP_HTML_Unsupported_Exception|null
	 */
	public function get_unsupported_exception() {
		return $this->unsupported_exception;
	}

	/**
	 * Finds the next tag matching the $query.
	 *
	 * @todo Support matching the class name and tag name.
	 *
	 * @since 6.4.0
	 * @since 6.6.0 Visits all tokens, including virtual ones.
	 *
	 * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
	 *
	 * @param array|string|null $query {
	 *     Optional. Which tag name to find, having which class, etc. Default is to find any tag.
	 *
	 *     @type string|null $tag_name     Which tag to find, or `null` for "any tag."
	 *     @type string      $tag_closers  'visit' to pause at tag closers, 'skip' or unset to only visit openers.
	 *     @type int|null    $match_offset Find the Nth tag matching all search criteria.
	 *                                     1 for "first" tag, 3 for "third," etc.
	 *                                     Defaults to first tag.
	 *     @type string|null $class_name   Tag must contain this whole class name to match.
	 *     @type string[]    $breadcrumbs  DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
	 *                                     May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
	 * }
	 * @return bool Whether a tag was matched.
	 */
	public function next_tag( $query = null ): bool {
		$visit_closers = isset( $query['tag_closers'] ) && 'visit' === $query['tag_closers'];

		if ( null === $query ) {
			while ( $this->next_token() ) {
				if ( '#tag' !== $this->get_token_type() ) {
					continue;
				}

				if ( ! $this->is_tag_closer() || $visit_closers ) {
					return true;
				}
			}

			return false;
		}

		if ( is_string( $query ) ) {
			$query = array( 'breadcrumbs' => array( $query ) );
		}

		if ( ! is_array( $query ) ) {
			_doing_it_wrong(
				__METHOD__,
				__( 'Please pass a query array to this function.' ),
				'6.4.0'
			);
			return false;
		}

		$needs_class = ( isset( $query['class_name'] ) && is_string( $query['class_name'] ) )
			? $query['class_name']
			: null;

		if ( ! ( array_key_exists( 'breadcrumbs', $query ) && is_array( $query['breadcrumbs'] ) ) ) {
			while ( $this->next_token() ) {
				if ( '#tag' !== $this->get_token_type() ) {
					continue;
				}

				if ( isset( $query['tag_name'] ) && $query['tag_name'] !== $this->get_token_name() ) {
					continue;
				}

				if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
					continue;
				}

				if ( ! $this->is_tag_closer() || $visit_closers ) {
					return true;
				}
			}

			return false;
		}

		$breadcrumbs  = $query['breadcrumbs'];
		$match_offset = isset( $query['match_offset'] ) ? (int) $query['match_offset'] : 1;

		while ( $match_offset > 0 && $this->next_token() ) {
			if ( '#tag' !== $this->get_token_type() || $this->is_tag_closer() ) {
				continue;
			}

			if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
				continue;
			}

			if ( $this->matches_breadcrumbs( $breadcrumbs ) && 0 === --$match_offset ) {
				return true;
			}
		}

		return false;
	}

	/**
	 * Ensures internal accounting is maintained for HTML semantic rules while
	 * the underlying Tag Processor class is seeking to a bookmark.
	 *
	 * This doesn't currently have a way to represent non-tags and doesn't process
	 * semantic rules for text nodes. For access to the raw tokens consider using
	 * WP_HTML_Tag_Processor instead.
	 *
	 * @since 6.5.0 Added for internal support; do not use.
	 *
	 * @access private
	 *
	 * @return bool
	 */
	public function next_token(): bool {
		$this->current_element = null;

		if ( isset( $this->last_error ) ) {
			return false;
		}

		/*
		 * Prime the events if there are none.
		 *
		 * @todo In some cases, probably related to the adoption agency
		 *       algorithm, this call to step() doesn't create any new
		 *       events. Calling it again creates them. Figure out why
		 *       this is and if it's inherent or if it's a bug. Looping
		 *       until there are events or until there are no more
		 *       tokens works in the meantime and isn't obviously wrong.
		 */
		if ( empty( $this->element_queue ) && $this->step() ) {
			return $this->next_token();
		}

		// Process the next event on the queue.
		$this->current_element = array_shift( $this->element_queue );
		if ( ! isset( $this->current_element ) ) {
			// There are no tokens left, so close all remaining open elements.
			while ( $this->state->stack_of_open_elements->pop() ) {
				continue;
			}

			return empty( $this->element_queue ) ? false : $this->next_token();
		}

		$is_pop = WP_HTML_Stack_Event::POP === $this->current_element->operation;

		/*
		 * The root node only exists in the fragment parser, and closing it
		 * indicates that the parse is complete. Stop before popping it from
		 * the breadcrumbs.
		 */
		if ( 'root-node' === $this->current_element->token->bookmark_name ) {
			return $this->next_token();
		}

		// Adjust the breadcrumbs for this event.
		if ( $is_pop ) {
			array_pop( $this->breadcrumbs );
		} else {
			$this->breadcrumbs[] = $this->current_element->token->node_name;
		}

		// Avoid sending close events for elements which don't expect a closing.
		if ( $is_pop && ! $this->expects_closer( $this->current_element->token ) ) {
			return $this->next_token();
		}

		return true;
	}

	/**
	 * Indicates if the current tag token is a tag closer.
	 *
	 * Example:
	 *
	 *     $p = WP_HTML_Processor::create_fragment( '<div></div>' );
	 *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
	 *     $p->is_tag_closer() === false;
	 *
	 *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
	 *     $p->is_tag_closer() === true;
	 *
	 * @since 6.6.0 Subclassed for HTML Processor.
	 *
	 * @return bool Whether the current tag is a tag closer.
	 */
	public function is_tag_closer(): bool {
		return $this->is_virtual()
			? ( WP_HTML_Stack_Event::POP === $this->current_element->operation && '#tag' === $this->get_token_type() )
			: parent::is_tag_closer();
	}

	/**
	 * Indicates if the currently-matched token is virtual, created by a stack operation
	 * while processing HTML, rather than a token found in the HTML text itself.
	 *
	 * @since 6.6.0
	 *
	 * @return bool Whether the current token is virtual.
	 */
	private function is_virtual(): bool {
		return (
			isset( $this->current_element->provenance ) &&
			'virtual' === $this->current_element->provenance
		);
	}

	/**
	 * Indicates if the currently-matched tag matches the given breadcrumbs.
	 *
	 * A "*" represents a single tag wildcard, where any tag matches, but not no tags.
	 *
	 * At some point this function _may_ support a `**` syntax for matching any number
	 * of unspecified tags in the breadcrumb stack. This has been intentionally left
	 * out, however, to keep this function simple and to avoid introducing backtracking,
	 * which could open up surprising performance breakdowns.
	 *
	 * Example:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<div><span><figure><img></figure></span></div>' );
	 *     $processor->next_tag( 'img' );
	 *     true  === $processor->matches_breadcrumbs( array( 'figure', 'img' ) );
	 *     true  === $processor->matches_breadcrumbs( array( 'span', 'figure', 'img' ) );
	 *     false === $processor->matches_breadcrumbs( array( 'span', 'img' ) );
	 *     true  === $processor->matches_breadcrumbs( array( 'span', '*', 'img' ) );
	 *
	 * @since 6.4.0
	 *
	 * @param string[] $breadcrumbs DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
	 *                              May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
	 * @return bool Whether the currently-matched tag is found at the given nested structure.
	 */
	public function matches_breadcrumbs( $breadcrumbs ): bool {
		// Everything matches when there are zero constraints.
		if ( 0 === count( $breadcrumbs ) ) {
			return true;
		}

		// Start at the last crumb.
		$crumb = end( $breadcrumbs );

		if ( '*' !== $crumb && $this->get_tag() !== strtoupper( $crumb ) ) {
			return false;
		}

		for ( $i = count( $this->breadcrumbs ) - 1; $i >= 0; $i-- ) {
			$node  = $this->breadcrumbs[ $i ];
			$crumb = strtoupper( current( $breadcrumbs ) );

			if ( '*' !== $crumb && $node !== $crumb ) {
				return false;
			}

			if ( false === prev( $breadcrumbs ) ) {
				return true;
			}
		}

		return false;
	}

	/**
	 * Indicates if the currently-matched node expects a closing
	 * token, or if it will self-close on the next step.
	 *
	 * Most HTML elements expect a closer, such as a P element or
	 * a DIV element. Others, like an IMG element are void and don't
	 * have a closing tag. Special elements, such as SCRIPT and STYLE,
	 * are treated just like void tags. Text nodes and self-closing
	 * foreign content will also act just like a void tag, immediately
	 * closing as soon as the processor advances to the next token.
	 *
	 * @since 6.6.0
	 *
	 * @param WP_HTML_Token|null $node Optional. Node to examine, if provided.
	 *                                 Default is to examine current node.
	 * @return bool|null Whether to expect a closer for the currently-matched node,
	 *                   or `null` if not matched on any token.
	 */
	public function expects_closer( ?WP_HTML_Token $node = null ): ?bool {
		$token_name = $node->node_name ?? $this->get_token_name();

		if ( ! isset( $token_name ) ) {
			return null;
		}

		$token_namespace        = $node->namespace ?? $this->get_namespace();
		$token_has_self_closing = $node->has_self_closing_flag ?? $this->has_self_closing_flag();

		return ! (
			// Comments, text nodes, and other atomic tokens.
			'#' === $token_name[0] ||
			// Doctype declarations.
			'html' === $token_name ||
			// Void elements.
			self::is_void( $token_name ) ||
			// Special atomic elements.
			( 'html' === $token_namespace && in_array( $token_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) ||
			// Self-closing elements in foreign content.
			( 'html' !== $token_namespace && $token_has_self_closing )
		);
	}

	/**
	 * Steps through the HTML document and stop at the next tag, if any.
	 *
	 * @since 6.4.0
	 *
	 * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
	 *
	 * @see self::PROCESS_NEXT_NODE
	 * @see self::REPROCESS_CURRENT_NODE
	 *
	 * @param string $node_to_process Whether to parse the next node or reprocess the current node.
	 * @return bool Whether a tag was matched.
	 */
	public function step( $node_to_process = self::PROCESS_NEXT_NODE ): bool {
		// Refuse to proceed if there was a previous error.
		if ( null !== $this->last_error ) {
			return false;
		}

		if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
			/*
			 * Void elements still hop onto the stack of open elements even though
			 * there's no corresponding closing tag. This is important for managing
			 * stack-based operations such as "navigate to parent node" or checking
			 * on an element's breadcrumbs.
			 *
			 * When moving on to the next node, therefore, if the bottom-most element
			 * on the stack is a void element, it must be closed.
			 */
			$top_node = $this->state->stack_of_open_elements->current_node();
			if ( isset( $top_node ) && ! $this->expects_closer( $top_node ) ) {
				$this->state->stack_of_open_elements->pop();
			}
		}

		if ( self::PROCESS_NEXT_NODE === $node_to_process ) {
			parent::next_token();
			if ( WP_HTML_Tag_Processor::STATE_TEXT_NODE === $this->parser_state ) {
				parent::subdivide_text_appropriately();
			}
		}

		// Finish stepping when there are no more tokens in the document.
		if (
			WP_HTML_Tag_Processor::STATE_INCOMPLETE_INPUT === $this->parser_state ||
			WP_HTML_Tag_Processor::STATE_COMPLETE === $this->parser_state
		) {
			return false;
		}

		$adjusted_current_node = $this->get_adjusted_current_node();
		$is_closer             = $this->is_tag_closer();
		$is_start_tag          = WP_HTML_Tag_Processor::STATE_MATCHED_TAG === $this->parser_state && ! $is_closer;
		$token_name            = $this->get_token_name();

		if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
			$this->state->current_token = new WP_HTML_Token(
				$this->bookmark_token(),
				$token_name,
				$this->has_self_closing_flag(),
				$this->release_internal_bookmark_on_destruct
			);
		}

		$parse_in_current_insertion_mode = (
			0 === $this->state->stack_of_open_elements->count() ||
			'html' === $adjusted_current_node->namespace ||
			(
				'math' === $adjusted_current_node->integration_node_type &&
				(
					( $is_start_tag && ! in_array( $token_name, array( 'MGLYPH', 'MALIGNMARK' ), true ) ) ||
					'#text' === $token_name
				)
			) ||
			(
				'math' === $adjusted_current_node->namespace &&
				'ANNOTATION-XML' === $adjusted_current_node->node_name &&
				$is_start_tag && 'SVG' === $token_name
			) ||
			(
				'html' === $adjusted_current_node->integration_node_type &&
				( $is_start_tag || '#text' === $token_name )
			)
		);

		try {
			if ( ! $parse_in_current_insertion_mode ) {
				return $this->step_in_foreign_content();
			}

			switch ( $this->state->insertion_mode ) {
				case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
					return $this->step_initial();

				case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
					return $this->step_before_html();

				case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
					return $this->step_before_head();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
					return $this->step_in_head();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
					return $this->step_in_head_noscript();

				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
					return $this->step_after_head();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
					return $this->step_in_body();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
					return $this->step_in_table();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
					return $this->step_in_table_text();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
					return $this->step_in_caption();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
					return $this->step_in_column_group();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
					return $this->step_in_table_body();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
					return $this->step_in_row();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
					return $this->step_in_cell();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
					return $this->step_in_select();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
					return $this->step_in_select_in_table();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
					return $this->step_in_template();

				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
					return $this->step_after_body();

				case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
					return $this->step_in_frameset();

				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
					return $this->step_after_frameset();

				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
					return $this->step_after_after_body();

				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
					return $this->step_after_after_frameset();

				// This should be unreachable but PHP doesn't have total type checking on switch.
				default:
					$this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
			}
		} catch ( WP_HTML_Unsupported_Exception $e ) {
			/*
			 * Exceptions are used in this class to escape deep call stacks that
			 * otherwise might involve messier calling and return conventions.
			 */
			return false;
		}
	}

	/**
	 * Computes the HTML breadcrumbs for the currently-matched node, if matched.
	 *
	 * Breadcrumbs start at the outermost parent and descend toward the matched element.
	 * They always include the entire path from the root HTML node to the matched element.
	 *
	 * @todo It could be more efficient to expose a generator-based version of this function
	 *       to avoid creating the array copy on tag iteration. If this is done, it would likely
	 *       be more useful to walk up the stack when yielding instead of starting at the top.
	 *
	 * Example
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<p><strong><em><img></em></strong></p>' );
	 *     $processor->next_tag( 'IMG' );
	 *     $processor->get_breadcrumbs() === array( 'HTML', 'BODY', 'P', 'STRONG', 'EM', 'IMG' );
	 *
	 * @since 6.4.0
	 *
	 * @return string[]|null Array of tag names representing path to matched node, if matched, otherwise NULL.
	 */
	public function get_breadcrumbs(): ?array {
		return $this->breadcrumbs;
	}

	/**
	 * Returns the nesting depth of the current location in the document.
	 *
	 * Example:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<div><p></p></div>' );
	 *     // The processor starts in the BODY context, meaning it has depth from the start: HTML > BODY.
	 *     2 === $processor->get_current_depth();
	 *
	 *     // Opening the DIV element increases the depth.
	 *     $processor->next_token();
	 *     3 === $processor->get_current_depth();
	 *
	 *     // Opening the P element increases the depth.
	 *     $processor->next_token();
	 *     4 === $processor->get_current_depth();
	 *
	 *     // The P element is closed during `next_token()` so the depth is decreased to reflect that.
	 *     $processor->next_token();
	 *     3 === $processor->get_current_depth();
	 *
	 * @since 6.6.0
	 *
	 * @return int Nesting-depth of current location in the document.
	 */
	public function get_current_depth(): int {
		return count( $this->breadcrumbs );
	}

	/**
	 * Normalizes an HTML fragment by serializing it.
	 *
	 * This method assumes that the given HTML snippet is found in BODY context.
	 * For normalizing full documents or fragments found in other contexts, create
	 * a new processor using WP_HTML_Processor::create_fragment or
	 * WP_HTML_Processor::create_full_parser and call WP_HTML_Processor::serialize
	 * on the created instances.
	 *
	 * Many aspects of an input HTML fragment may be changed during normalization.
	 *
	 *  - Attribute values will be double-quoted.
	 *  - Duplicate attributes will be removed.
	 *  - Omitted tags will be added.
	 *  - Tag and attribute name casing will be lower-cased,
	 *    except for specific SVG and MathML tags or attributes.
	 *  - Text will be re-encoded, null bytes handled,
	 *    and invalid UTF-8 replaced with U+FFFD.
	 *  - Any incomplete syntax trailing at the end will be omitted,
	 *    for example, an unclosed comment opener will be removed.
	 *
	 * Example:
	 *
	 *     echo WP_HTML_Processor::normalize( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
	 *     // <a href="#anchor" v="5" enabled>One</a>
	 *
	 *     echo WP_HTML_Processor::normalize( '<div></p>fun<table><td>cell</div>' );
	 *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
	 *
	 *     echo WP_HTML_Processor::normalize( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
	 *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
	 *
	 * @since 6.7.0
	 *
	 * @param string $html Input HTML to normalize.
	 *
	 * @return string|null Normalized output, or `null` if unable to normalize.
	 */
	public static function normalize( string $html ): ?string {
		return static::create_fragment( $html )->serialize();
	}

	/**
	 * Returns normalized HTML for a fragment by serializing it.
	 *
	 * This differs from WP_HTML_Processor::normalize in that it starts with
	 * a specific HTML Processor, which _must_ not have already started scanning;
	 * it must be in the initial ready state and will be in the completed state once
	 * serialization is complete.
	 *
	 * Many aspects of an input HTML fragment may be changed during normalization.
	 *
	 *  - Attribute values will be double-quoted.
	 *  - Duplicate attributes will be removed.
	 *  - Omitted tags will be added.
	 *  - Tag and attribute name casing will be lower-cased,
	 *    except for specific SVG and MathML tags or attributes.
	 *  - Text will be re-encoded, null bytes handled,
	 *    and invalid UTF-8 replaced with U+FFFD.
	 *  - Any incomplete syntax trailing at the end will be omitted,
	 *    for example, an unclosed comment opener will be removed.
	 *
	 * Example:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
	 *     echo $processor->serialize();
	 *     // <a href="#anchor" v="5" enabled>One</a>
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<div></p>fun<table><td>cell</div>' );
	 *     echo $processor->serialize();
	 *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
	 *     echo $processor->serialize();
	 *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
	 *
	 * @since 6.7.0
	 *
	 * @return string|null Normalized HTML markup represented by processor,
	 *                     or `null` if unable to generate serialization.
	 */
	public function serialize(): ?string {
		if ( WP_HTML_Tag_Processor::STATE_READY !== $this->parser_state ) {
			wp_trigger_error(
				__METHOD__,
				'An HTML Processor which has already started processing cannot serialize its contents. Serialize immediately after creating the instance.',
				E_USER_WARNING
			);
			return null;
		}

		$html = '';
		while ( $this->next_token() ) {
			$html .= $this->serialize_token();
		}

		if ( null !== $this->get_last_error() ) {
			wp_trigger_error(
				__METHOD__,
				"Cannot serialize HTML Processor with parsing error: {$this->get_last_error()}.",
				E_USER_WARNING
			);
			return null;
		}

		return $html;
	}

	/**
	 * Serializes the currently-matched token.
	 *
	 * This method produces a fully-normative HTML string for the currently-matched token,
	 * if able. If not matched at any token or if the token doesn't correspond to any HTML
	 * it will return an empty string (for example, presumptuous end tags are ignored).
	 *
	 * @see static::serialize()
	 *
	 * @since 6.7.0
	 *
	 * @return string Serialization of token, or empty string if no serialization exists.
	 */
	protected function serialize_token(): string {
		$html       = '';
		$token_type = $this->get_token_type();

		switch ( $token_type ) {
			case '#text':
				$html .= htmlspecialchars( $this->get_modifiable_text(), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
				break;

			// Unlike the `<>` which is interpreted as plaintext, this is ignored entirely.
			case '#presumptuous-tag':
				break;

			case '#funky-comment':
			case '#comment':
				$html .= "<!--{$this->get_full_comment_text()}-->";
				break;

			case '#cdata-section':
				$html .= "<![CDATA[{$this->get_modifiable_text()}]]>";
				break;

			case 'html':
				$html .= '<!DOCTYPE html>';
				break;
		}

		if ( '#tag' !== $token_type ) {
			return $html;
		}

		$tag_name       = str_replace( "\x00", "\u{FFFD}", $this->get_tag() );
		$in_html        = 'html' === $this->get_namespace();
		$qualified_name = $in_html ? strtolower( $tag_name ) : $this->get_qualified_tag_name();

		if ( $this->is_tag_closer() ) {
			$html .= "</{$qualified_name}>";
			return $html;
		}

		$attribute_names = $this->get_attribute_names_with_prefix( '' );
		if ( ! isset( $attribute_names ) ) {
			$html .= "<{$qualified_name}>";
			return $html;
		}

		$html .= "<{$qualified_name}";
		foreach ( $attribute_names as $attribute_name ) {
			$html .= " {$this->get_qualified_attribute_name( $attribute_name )}";
			$value = $this->get_attribute( $attribute_name );

			if ( is_string( $value ) ) {
				$html .= '="' . htmlspecialchars( $value, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5 ) . '"';
			}

			$html = str_replace( "\x00", "\u{FFFD}", $html );
		}

		if ( ! $in_html && $this->has_self_closing_flag() ) {
			$html .= ' /';
		}

		$html .= '>';

		// Flush out self-contained elements.
		if ( $in_html && in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) {
			$text = $this->get_modifiable_text();

			switch ( $tag_name ) {
				case 'IFRAME':
				case 'NOEMBED':
				case 'NOFRAMES':
					$text = '';
					break;

				case 'SCRIPT':
				case 'STYLE':
					break;

				default:
					$text = htmlspecialchars( $text, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
			}

			$html .= "{$text}</{$qualified_name}>";
		}

		return $html;
	}

	/**
	 * Parses next element in the 'initial' insertion mode.
	 *
	 * This internal function performs the 'initial' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#the-initial-insertion-mode
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_initial(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			/*
			 * > A character token that is one of U+0009 CHARACTER TABULATION,
			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
			 *
			 * Parse error: ignore the token.
			 */
			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					return $this->step();
				}
				goto initial_anything_else;
				break;

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				$doctype = $this->get_doctype_info();
				if ( null !== $doctype && 'quirks' === $doctype->indicated_compatability_mode ) {
					$this->compat_mode = WP_HTML_Tag_Processor::QUIRKS_MODE;
				}

				/*
				 * > Then, switch the insertion mode to "before html".
				 */
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
				$this->insert_html_element( $this->state->current_token );
				return true;
		}

		/*
		 * > Anything else
		 */
		initial_anything_else:
		$this->compat_mode           = WP_HTML_Tag_Processor::QUIRKS_MODE;
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'before html' insertion mode.
	 *
	 * This internal function performs the 'before html' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#the-before-html-insertion-mode
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_before_html(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A character token that is one of U+0009 CHARACTER TABULATION,
			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
			 *
			 * Parse error: ignore the token.
			 */
			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					return $this->step();
				}
				goto before_html_anything_else;
				break;

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				$this->insert_html_element( $this->state->current_token );
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
				return true;

			/*
			 * > An end tag whose tag name is one of: "head", "body", "html", "br"
			 *
			 * Closing BR tags are always reported by the Tag Processor as opening tags.
			 */
			case '-HEAD':
			case '-BODY':
			case '-HTML':
				/*
				 * > Act as described in the "anything else" entry below.
				 */
				goto before_html_anything_else;
				break;
		}

		/*
		 * > Any other end tag
		 */
		if ( $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else.
		 *
		 * > Create an html element whose node document is the Document object.
		 * > Append it to the Document object. Put this element in the stack of open elements.
		 * > Switch the insertion mode to "before head", then reprocess the token.
		 */
		before_html_anything_else:
		$this->insert_virtual_node( 'HTML' );
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'before head' insertion mode.
	 *
	 * This internal function performs the 'before head' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0 Stub implementation.
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#the-before-head-insertion-mode
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_before_head(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			/*
			 * > A character token that is one of U+0009 CHARACTER TABULATION,
			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
			 *
			 * Parse error: ignore the token.
			 */
			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					return $this->step();
				}
				goto before_head_anything_else;
				break;

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > A start tag whose tag name is "head"
			 */
			case '+HEAD':
				$this->insert_html_element( $this->state->current_token );
				$this->state->head_element   = $this->state->current_token;
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
				return true;

			/*
			 * > An end tag whose tag name is one of: "head", "body", "html", "br"
			 * > Act as described in the "anything else" entry below.
			 *
			 * Closing BR tags are always reported by the Tag Processor as opening tags.
			 */
			case '-HEAD':
			case '-BODY':
			case '-HTML':
				goto before_head_anything_else;
				break;
		}

		if ( $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 *
		 * > Insert an HTML element for a "head" start tag token with no attributes.
		 */
		before_head_anything_else:
		$this->state->head_element   = $this->insert_virtual_node( 'HEAD' );
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'in head' insertion mode.
	 *
	 * This internal function performs the 'in head' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inhead
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_in_head(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			case '#text':
				/*
				 * > A character token that is one of U+0009 CHARACTER TABULATION,
				 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
				 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
				 */
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					// Insert the character.
					$this->insert_html_element( $this->state->current_token );
					return true;
				}

				goto in_head_anything_else;
				break;

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link"
			 */
			case '+BASE':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is "meta"
			 */
			case '+META':
				$this->insert_html_element( $this->state->current_token );

				/*
				 * > If the active speculative HTML parser is null, then:
				 * >   - If the element has a charset attribute, and getting an encoding from
				 * >     its value results in an encoding, and the confidence is currently
				 * >     tentative, then change the encoding to the resulting encoding.
				 */
				$charset = $this->get_attribute( 'charset' );
				if ( is_string( $charset ) && 'tentative' === $this->state->encoding_confidence ) {
					$this->bail( 'Cannot yet process META tags with charset to determine encoding.' );
				}

				/*
				 * >   - Otherwise, if the element has an http-equiv attribute whose value is
				 * >     an ASCII case-insensitive match for the string "Content-Type", and
				 * >     the element has a content attribute, and applying the algorithm for
				 * >     extracting a character encoding from a meta element to that attribute's
				 * >     value returns an encoding, and the confidence is currently tentative,
				 * >     then change the encoding to the extracted encoding.
				 */
				$http_equiv = $this->get_attribute( 'http-equiv' );
				$content    = $this->get_attribute( 'content' );
				if (
					is_string( $http_equiv ) &&
					is_string( $content ) &&
					0 === strcasecmp( $http_equiv, 'Content-Type' ) &&
					'tentative' === $this->state->encoding_confidence
				) {
					$this->bail( 'Cannot yet process META tags with http-equiv Content-Type to determine encoding.' );
				}

				return true;

			/*
			 * > A start tag whose tag name is "title"
			 */
			case '+TITLE':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is "noscript", if the scripting flag is enabled
			 * > A start tag whose tag name is one of: "noframes", "style"
			 *
			 * The scripting flag is never enabled in this parser.
			 */
			case '+NOFRAMES':
			case '+STYLE':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is "noscript", if the scripting flag is disabled
			 */
			case '+NOSCRIPT':
				$this->insert_html_element( $this->state->current_token );
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT;
				return true;

			/*
			 * > A start tag whose tag name is "script"
			 *
			 * @todo Could the adjusted insertion location be anything other than the current location?
			 */
			case '+SCRIPT':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > An end tag whose tag name is "head"
			 */
			case '-HEAD':
				$this->state->stack_of_open_elements->pop();
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
				return true;

			/*
			 * > An end tag whose tag name is one of: "body", "html", "br"
			 *
			 * BR tags are always reported by the Tag Processor as opening tags.
			 */
			case '-BODY':
			case '-HTML':
				/*
				 * > Act as described in the "anything else" entry below.
				 */
				goto in_head_anything_else;
				break;

			/*
			 * > A start tag whose tag name is "template"
			 *
			 * @todo Could the adjusted insertion location be anything other than the current location?
			 */
			case '+TEMPLATE':
				$this->state->active_formatting_elements->insert_marker();
				$this->state->frameset_ok = false;

				$this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
				$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;

				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > An end tag whose tag name is "template"
			 */
			case '-TEMPLATE':
				if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
					// @todo Indicate a parse error once it's possible.
					return $this->step();
				}

				$this->generate_implied_end_tags_thoroughly();
				if ( ! $this->state->stack_of_open_elements->current_node_is( 'TEMPLATE' ) ) {
					// @todo Indicate a parse error once it's possible.
				}

				$this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
				$this->state->active_formatting_elements->clear_up_to_last_marker();
				array_pop( $this->state->stack_of_template_insertion_modes );
				$this->reset_insertion_mode_appropriately();
				return true;
		}

		/*
		 * > A start tag whose tag name is "head"
		 * > Any other end tag
		 */
		if ( '+HEAD' === $op || $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 */
		in_head_anything_else:
		$this->state->stack_of_open_elements->pop();
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'in head noscript' insertion mode.
	 *
	 * This internal function performs the 'in head noscript' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0 Stub implementation.
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#parsing-main-inheadnoscript
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_in_head_noscript(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			/*
			 * > A character token that is one of U+0009 CHARACTER TABULATION,
			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
			 *
			 * Parse error: ignore the token.
			 */
			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					return $this->step_in_head();
				}

				goto in_head_noscript_anything_else;
				break;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > An end tag whose tag name is "noscript"
			 */
			case '-NOSCRIPT':
				$this->state->stack_of_open_elements->pop();
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
				return true;

			/*
			 * > A comment token
			 * >
			 * > A start tag whose tag name is one of: "basefont", "bgsound",
			 * > "link", "meta", "noframes", "style"
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
			case '+META':
			case '+NOFRAMES':
			case '+STYLE':
				return $this->step_in_head();

			/*
			 * > An end tag whose tag name is "br"
			 *
			 * This should never happen, as the Tag Processor prevents showing a BR closing tag.
			 */
		}

		/*
		 * > A start tag whose tag name is one of: "head", "noscript"
		 * > Any other end tag
		 */
		if ( '+HEAD' === $op || '+NOSCRIPT' === $op || $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 *
		 * Anything here is a parse error.
		 */
		in_head_noscript_anything_else:
		$this->state->stack_of_open_elements->pop();
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'after head' insertion mode.
	 *
	 * This internal function performs the 'after head' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0 Stub implementation.
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#the-after-head-insertion-mode
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_after_head(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			/*
			 * > A character token that is one of U+0009 CHARACTER TABULATION,
			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
			 */
			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					// Insert the character.
					$this->insert_html_element( $this->state->current_token );
					return true;
				}
				goto after_head_anything_else;
				break;

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > A start tag whose tag name is "body"
			 */
			case '+BODY':
				$this->insert_html_element( $this->state->current_token );
				$this->state->frameset_ok    = false;
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
				return true;

			/*
			 * > A start tag whose tag name is "frameset"
			 */
			case '+FRAMESET':
				$this->insert_html_element( $this->state->current_token );
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
				return true;

			/*
			 * > A start tag whose tag name is one of: "base", "basefont", "bgsound",
			 * > "link", "meta", "noframes", "script", "style", "template", "title"
			 *
			 * Anything here is a parse error.
			 */
			case '+BASE':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
			case '+META':
			case '+NOFRAMES':
			case '+SCRIPT':
			case '+STYLE':
			case '+TEMPLATE':
			case '+TITLE':
				/*
				 * > Push the node pointed to by the head element pointer onto the stack of open elements.
				 * > Process the token using the rules for the "in head" insertion mode.
				 * > Remove the node pointed to by the head element pointer from the stack of open elements. (It might not be the current node at this point.)
				 */
				$this->bail( 'Cannot process elements after HEAD which reopen the HEAD element.' );
				/*
				 * Do not leave this break in when adding support; it's here to prevent
				 * WPCS from getting confused at the switch structure without a return,
				 * because it doesn't know that `bail()` always throws.
				 */
				break;

			/*
			 * > An end tag whose tag name is "template"
			 */
			case '-TEMPLATE':
				return $this->step_in_head();

			/*
			 * > An end tag whose tag name is one of: "body", "html", "br"
			 *
			 * Closing BR tags are always reported by the Tag Processor as opening tags.
			 */
			case '-BODY':
			case '-HTML':
				/*
				 * > Act as described in the "anything else" entry below.
				 */
				goto after_head_anything_else;
				break;
		}

		/*
		 * > A start tag whose tag name is "head"
		 * > Any other end tag
		 */
		if ( '+HEAD' === $op || $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 * > Insert an HTML element for a "body" start tag token with no attributes.
		 */
		after_head_anything_else:
		$this->insert_virtual_node( 'BODY' );
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'in body' insertion mode.
	 *
	 * This internal function performs the 'in body' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.4.0
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#parsing-main-inbody
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_in_body(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			case '#text':
				/*
				 * > A character token that is U+0000 NULL
				 *
				 * Any successive sequence of NULL bytes is ignored and won't
				 * trigger active format reconstruction. Therefore, if the text
				 * only comprises NULL bytes then the token should be ignored
				 * here, but if there are any other characters in the stream
				 * the active formats should be reconstructed.
				 */
				if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				$this->reconstruct_active_formatting_elements();

				/*
				 * Whitespace-only text does not affect the frameset-ok flag.
				 * It is probably inter-element whitespace, but it may also
				 * contain character references which decode only to whitespace.
				 */
				if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
					$this->state->frameset_ok = false;
				}

				$this->insert_html_element( $this->state->current_token );
				return true;

			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 * > Parse error. Ignore the token.
			 */
			case 'html':
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
					/*
					 * > Otherwise, for each attribute on the token, check to see if the attribute
					 * > is already present on the top element of the stack of open elements. If
					 * > it is not, add the attribute and its corresponding value to that element.
					 *
					 * This parser does not currently support this behavior: ignore the token.
					 */
				}

				// Ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
			 * > "meta", "noframes", "script", "style", "template", "title"
			 * >
			 * > An end tag whose tag name is "template"
			 */
			case '+BASE':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
			case '+META':
			case '+NOFRAMES':
			case '+SCRIPT':
			case '+STYLE':
			case '+TEMPLATE':
			case '+TITLE':
			case '-TEMPLATE':
				return $this->step_in_head();

			/*
			 * > A start tag whose tag name is "body"
			 *
			 * This tag in the IN BODY insertion mode is a parse error.
			 */
			case '+BODY':
				if (
					1 === $this->state->stack_of_open_elements->count() ||
					'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
					$this->state->stack_of_open_elements->contains( 'TEMPLATE' )
				) {
					// Ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, set the frameset-ok flag to "not ok"; then, for each attribute
				 * > on the token, check to see if the attribute is already present on the body
				 * > element (the second element) on the stack of open elements, and if it is
				 * > not, add the attribute and its corresponding value to that element.
				 *
				 * This parser does not currently support this behavior: ignore the token.
				 */
				$this->state->frameset_ok = false;
				return $this->step();

			/*
			 * > A start tag whose tag name is "frameset"
			 *
			 * This tag in the IN BODY insertion mode is a parse error.
			 */
			case '+FRAMESET':
				if (
					1 === $this->state->stack_of_open_elements->count() ||
					'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
					false === $this->state->frameset_ok
				) {
					// Ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, run the following steps:
				 */
				$this->bail( 'Cannot process non-ignored FRAMESET tags.' );
				break;

			/*
			 * > An end tag whose tag name is "body"
			 */
			case '-BODY':
				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, if there is a node in the stack of open elements that is not either a
				 * > dd element, a dt element, an li element, an optgroup element, an option element,
				 * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
				 * > element, a td element, a tfoot element, a th element, a thread element, a tr
				 * > element, the body element, or the html element, then this is a parse error.
				 *
				 * There is nothing to do for this parse error, so don't check for it.
				 */

				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
				return true;

			/*
			 * > An end tag whose tag name is "html"
			 */
			case '-HTML':
				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, if there is a node in the stack of open elements that is not either a
				 * > dd element, a dt element, an li element, an optgroup element, an option element,
				 * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
				 * > element, a td element, a tfoot element, a th element, a thread element, a tr
				 * > element, the body element, or the html element, then this is a parse error.
				 *
				 * There is nothing to do for this parse error, so don't check for it.
				 */

				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
				return $this->step( self::REPROCESS_CURRENT_NODE );

			/*
			 * > A start tag whose tag name is one of: "address", "article", "aside",
			 * > "blockquote", "center", "details", "dialog", "dir", "div", "dl",
			 * > "fieldset", "figcaption", "figure", "footer", "header", "hgroup",
			 * > "main", "menu", "nav", "ol", "p", "search", "section", "summary", "ul"
			 */
			case '+ADDRESS':
			case '+ARTICLE':
			case '+ASIDE':
			case '+BLOCKQUOTE':
			case '+CENTER':
			case '+DETAILS':
			case '+DIALOG':
			case '+DIR':
			case '+DIV':
			case '+DL':
			case '+FIELDSET':
			case '+FIGCAPTION':
			case '+FIGURE':
			case '+FOOTER':
			case '+HEADER':
			case '+HGROUP':
			case '+MAIN':
			case '+MENU':
			case '+NAV':
			case '+OL':
			case '+P':
			case '+SEARCH':
			case '+SECTION':
			case '+SUMMARY':
			case '+UL':
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
			 */
			case '+H1':
			case '+H2':
			case '+H3':
			case '+H4':
			case '+H5':
			case '+H6':
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				if (
					in_array(
						$this->state->stack_of_open_elements->current_node()->node_name,
						array( 'H1', 'H2', 'H3', 'H4', 'H5', 'H6' ),
						true
					)
				) {
					// @todo Indicate a parse error once it's possible.
					$this->state->stack_of_open_elements->pop();
				}

				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is one of: "pre", "listing"
			 */
			case '+PRE':
			case '+LISTING':
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				/*
				 * > If the next token is a U+000A LINE FEED (LF) character token,
				 * > then ignore that token and move on to the next one. (Newlines
				 * > at the start of pre blocks are ignored as an authoring convenience.)
				 *
				 * This is handled in `get_modifiable_text()`.
				 */

				$this->insert_html_element( $this->state->current_token );
				$this->state->frameset_ok = false;
				return true;

			/*
			 * > A start tag whose tag name is "form"
			 */
			case '+FORM':
				$stack_contains_template = $this->state->stack_of_open_elements->contains( 'TEMPLATE' );

				if ( isset( $this->state->form_element ) && ! $stack_contains_template ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				$this->insert_html_element( $this->state->current_token );
				if ( ! $stack_contains_template ) {
					$this->state->form_element = $this->state->current_token;
				}

				return true;

			/*
			 * > A start tag whose tag name is "li"
			 * > A start tag whose tag name is one of: "dd", "dt"
			 */
			case '+DD':
			case '+DT':
			case '+LI':
				$this->state->frameset_ok = false;
				$node                     = $this->state->stack_of_open_elements->current_node();
				$is_li                    = 'LI' === $token_name;

				in_body_list_loop:
				/*
				 * The logic for LI and DT/DD is the same except for one point: LI elements _only_
				 * close other LI elements, but a DT or DD element closes _any_ open DT or DD element.
				 */
				if ( $is_li ? 'LI' === $node->node_name : ( 'DD' === $node->node_name || 'DT' === $node->node_name ) ) {
					$node_name = $is_li ? 'LI' : $node->node_name;
					$this->generate_implied_end_tags( $node_name );
					if ( ! $this->state->stack_of_open_elements->current_node_is( $node_name ) ) {
						// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
					}

					$this->state->stack_of_open_elements->pop_until( $node_name );
					goto in_body_list_done;
				}

				if (
					'ADDRESS' !== $node->node_name &&
					'DIV' !== $node->node_name &&
					'P' !== $node->node_name &&
					self::is_special( $node )
				) {
					/*
					 * > If node is in the special category, but is not an address, div,
					 * > or p element, then jump to the step labeled done below.
					 */
					goto in_body_list_done;
				} else {
					/*
					 * > Otherwise, set node to the previous entry in the stack of open elements
					 * > and return to the step labeled loop.
					 */
					foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
						$node = $item;
						break;
					}
					goto in_body_list_loop;
				}

				in_body_list_done:
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				$this->insert_html_element( $this->state->current_token );
				return true;

			case '+PLAINTEXT':
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				/*
				 * @todo This may need to be handled in the Tag Processor and turn into
				 *       a single self-contained tag like TEXTAREA, whose modifiable text
				 *       is the rest of the input document as plaintext.
				 */
				$this->bail( 'Cannot process PLAINTEXT elements.' );
				break;

			/*
			 * > A start tag whose tag name is "button"
			 */
			case '+BUTTON':
				if ( $this->state->stack_of_open_elements->has_element_in_scope( 'BUTTON' ) ) {
					// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
					$this->generate_implied_end_tags();
					$this->state->stack_of_open_elements->pop_until( 'BUTTON' );
				}

				$this->reconstruct_active_formatting_elements();
				$this->insert_html_element( $this->state->current_token );
				$this->state->frameset_ok = false;

				return true;

			/*
			 * > An end tag whose tag name is one of: "address", "article", "aside", "blockquote",
			 * > "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
			 * > "figcaption", "figure", "footer", "header", "hgroup", "listing", "main",
			 * > "menu", "nav", "ol", "pre", "search", "section", "summary", "ul"
			 */
			case '-ADDRESS':
			case '-ARTICLE':
			case '-ASIDE':
			case '-BLOCKQUOTE':
			case '-BUTTON':
			case '-CENTER':
			case '-DETAILS':
			case '-DIALOG':
			case '-DIR':
			case '-DIV':
			case '-DL':
			case '-FIELDSET':
			case '-FIGCAPTION':
			case '-FIGURE':
			case '-FOOTER':
			case '-HEADER':
			case '-HGROUP':
			case '-LISTING':
			case '-MAIN':
			case '-MENU':
			case '-NAV':
			case '-OL':
			case '-PRE':
			case '-SEARCH':
			case '-SECTION':
			case '-SUMMARY':
			case '-UL':
				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
					// @todo Report parse error.
					// Ignore the token.
					return $this->step();
				}

				$this->generate_implied_end_tags();
				if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
					// @todo Record parse error: this error doesn't impact parsing.
				}
				$this->state->stack_of_open_elements->pop_until( $token_name );
				return true;

			/*
			 * > An end tag whose tag name is "form"
			 */
			case '-FORM':
				if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
					$node                      = $this->state->form_element;
					$this->state->form_element = null;

					/*
					 * > If node is null or if the stack of open elements does not have node
					 * > in scope, then this is a parse error; return and ignore the token.
					 *
					 * @todo It's necessary to check if the form token itself is in scope, not
					 *       simply whether any FORM is in scope.
					 */
					if (
						null === $node ||
						! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' )
					) {
						// Parse error: ignore the token.
						return $this->step();
					}

					$this->generate_implied_end_tags();
					if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
						// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
						$this->bail( 'Cannot close a FORM when other elements remain open as this would throw off the breadcrumbs for the following tokens.' );
					}

					$this->state->stack_of_open_elements->remove_node( $node );
					return true;
				} else {
					/*
					 * > If the stack of open elements does not have a form element in scope,
					 * > then this is a parse error; return and ignore the token.
					 *
					 * Note that unlike in the clause above, this is checking for any FORM in scope.
					 */
					if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' ) ) {
						// Parse error: ignore the token.
						return $this->step();
					}

					$this->generate_implied_end_tags();

					if ( ! $this->state->stack_of_open_elements->current_node_is( 'FORM' ) ) {
						// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
					}

					$this->state->stack_of_open_elements->pop_until( 'FORM' );
					return true;
				}
				break;

			/*
			 * > An end tag whose tag name is "p"
			 */
			case '-P':
				if ( ! $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->insert_html_element( $this->state->current_token );
				}

				$this->close_a_p_element();
				return true;

			/*
			 * > An end tag whose tag name is "li"

Changelog

VersionDescription
6.4.0Introduced.

User Contributed Notes

You must log in before being able to contribute a note or feedback.