Opened 5 weeks ago
Closed 5 weeks ago
#64513 closed defect (bug) (invalid)
HTML_Processor gets wrong breadcrumbs for elements in <head>
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | normal |
| Severity: | normal | Version: | 6.9 |
| Component: | HTML API | Keywords: | |
| Focuses: | | Cc: | |
Description
If you visit elements that typically appear in the <head>, like META or LINK, the HTML_Processor returns a breadcrumbs array with BODY instead of HEAD, e.g.
Array
(
[0] => HTML
[1] => BODY
[2] => META
)
How to reproduce:
- call $processor = \WP_HTML_Processor::create_fragment($html) on a full HTML page, e.g. by hooking into wp_template_enhancement_output_buffer
- get $processor->next_tag('META') or some other tag that lives in <head>
- get $processor->get_breadcrumbs() and check the resulting array, e.g. with print_r() or by doing an array_diff() with the expected result ['HTML', 'HEAD', 'META']
WordPress: 6.9 (php8.2-apache docker)
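The reproduction steps above could be sketched roughly as follows. This is a minimal illustration, not the reporter's actual code: it assumes it runs inside a WordPress 6.9 install (for example from a small plugin), and the use of error_log() to dump the result is an arbitrary choice made here.

```php
<?php
// Sketch only: assumes WordPress 6.9 is loaded so that WP_HTML_Processor
// and the wp_template_enhancement_output_buffer filter are available.
add_filter(
	'wp_template_enhancement_output_buffer',
	static function ( $html ) {
		// create_fragment() treats $html as inner HTML of a context
		// element, which defaults to BODY.
		$processor = WP_HTML_Processor::create_fragment( $html );

		if ( null !== $processor && $processor->next_tag( 'META' ) ) {
			// The ticket reports HTML, BODY, META for this call.
			error_log( print_r( $processor->get_breadcrumbs(), true ) );
		}

		// Return the page output unchanged.
		return $html;
	}
);
```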
Change History (2)
#2
@
5 weeks ago
- Keywords close removed
- Milestone Awaiting Review deleted
- Resolution set to invalid
- Status changed from new to closed
Oops, I was prepping my response while @westonruter was responding. To add to what he wrote: the difference between those methods is that create_fragment() is specifically designed to operate within the context of inner HTML inside a specified element, the default being BODY.
Use this for cases where you are processing chunks of HTML that will be found within a bigger HTML document, such as rendered block output that exists within a post, or the_content inside a rendered site layout.
If you use create_full_parser() it will assume that you are providing the full HTML for a page from start to finish and the initial META elements will appear within a HEAD. However, as per the HTML spec, when parsing a META tag, an element is to be created in the current element regardless of the parser’s current insertion mode.
You can see this demonstrated using your browser’s interpretation of the HTML, where META remains inside the BODY.
@vicobot this is because you're parsing in BODY mode. You need to use WP_HTML_Processor::create_full_parser() instead of WP_HTML_Processor::create_fragment().
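The suggestion above might look like the following in practice. This is a hedged sketch rather than code from the ticket: the sample $html string is made up for illustration, and it assumes WordPress 6.9 is loaded as in the report.

```php
<?php
// Sketch only: assumes WordPress is loaded so WP_HTML_Processor exists.
// The document below is a hypothetical stand-in for a full page.
$html = '<!DOCTYPE html><html><head><meta charset="utf-8"><title>Demo</title></head>'
	. '<body><p>Hello</p></body></html>';

// create_full_parser() treats the input as a complete document, so
// elements in <head> are tracked under HEAD rather than BODY.
$processor = WP_HTML_Processor::create_full_parser( $html );

if ( null !== $processor && $processor->next_tag( 'META' ) ) {
	// Per the discussion, the breadcrumbs here are expected to be
	// HTML, HEAD, META.
	print_r( $processor->get_breadcrumbs() );
}
```

Both factory methods can return null, so checking the result before calling next_tag() avoids a fatal error on unsupported input.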