Clarify how PLAINTEXT elements may contain child nodes. #10540
Resolves whatwg#8009

All major HTML parsers reconstruct active formatting elements when inserting a new PLAINTEXT element, leaving formatting elements as children of the PLAINTEXT element. However, the spec implies that this should not happen, because it doesn't instruct reconstruction. The implication in the spec is that a PLAINTEXT element may contain no children other than the plaintext content of the remainder of the HTML document.

> Once a start tag with the tag name "plaintext" has been seen, that
> will be the last token ever seen other than character tokens
> (and the end-of-file token), because there is no way to switch out
> of the PLAINTEXT state.

This patch updates the spec to conform to the existing implementations by adding the mention to trigger reconstruction.

See #8009 (comment)

thanks @zcorpan - I have updated the patch and included screenshots of the changed section. I think that explicitly calling out that active format reconstruction may take place, and that PLAINTEXT elements may have child nodes, would be a worthwhile addition to the note.
Co-authored-by: Simon Pieters <[email protected]>

Thanks @zcorpan!
Co-authored-by: Anne van Kesteren <[email protected]>

@dmsnell you will also need to make your membership of the "automattic" GitHub organization public to satisfy the IPR bot.
Co-authored-by: Anne van Kesteren <[email protected]>

Thanks again @annevk. I've lower-cased the
Resolves #8009
When the list of active formatting elements is non-empty at the point where a start tag named PLAINTEXT is encountered, later character tokens may reconstruct those formatting elements. The spec implies that this should not happen, because PLAINTEXT effectively disables HTML parsing for the rest of the document.
This happens because, while the tokenizer remains in the PLAINTEXT state, the tree builder continues to apply the normal rules for its current insertion mode, and those rules are where reconstruction of the active formatting elements may be triggered.
The behavior seems to contradict the purpose of the PLAINTEXT element, but all major browsers follow it, and a clarified note in the spec could help implementors avoid misreading it (as I did).
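
To make the interaction concrete, here is a minimal sketch of a toy tree builder in Python. It is not the spec algorithm: the names `Node`, `build`, and `outline` are hypothetical, the input is assumed to be already tokenized, end tags are handled by simply popping the stack, and the list of active formatting elements has no markers or Noah's Ark clause. It only illustrates how a character token after `<plaintext>` can re-open a formatting element that was closed implicitly, for input along the lines of `<p><b></p><plaintext>hi </b>`.

```python
from dataclasses import dataclass, field

FORMATTING = {"b", "i", "em", "strong"}  # small illustrative subset

@dataclass(eq=False)  # eq=False: nodes compare by identity
class Node:
    name: str
    children: list = field(default_factory=list)

    def outline(self):
        # Compact structural dump, e.g. body(p(b()), plaintext(b('x')))
        parts = [repr(c) if isinstance(c, str) else c.outline()
                 for c in self.children]
        return f"{self.name}({', '.join(parts)})"

def build(tokens):
    """Toy tree builder. Tokens: ("start", name), ("end", name), ("chars", text)."""
    body = Node("body")
    stack = [body]   # stack of open elements
    active = []      # simplified list of active formatting elements

    def insert(name):
        node = Node(name)
        stack[-1].children.append(node)
        stack.append(node)
        return node

    def reconstruct():
        # Re-open formatting elements that are listed but no longer open.
        for i, entry in enumerate(active):
            if entry not in stack:
                active[i] = insert(entry.name)

    for kind, value in tokens:
        if kind == "start" and value in FORMATTING:
            reconstruct()
            active.append(insert(value))
        elif kind == "start":
            insert(value)  # includes "plaintext"; no reconstruction here
        elif kind == "end" and any(n.name == value for n in stack[1:]):
            while True:
                node = stack.pop()
                if node.name == value:
                    break
            if node in active:
                active.remove(node)
        elif kind == "chars":
            reconstruct()  # the step that puts <b> inside <plaintext>
            stack[-1].children.append(value)
    return body

# After <plaintext>, the tokenizer only ever emits character tokens,
# so the trailing "</b>" arrives as text rather than as an end tag.
tree = build([
    ("start", "p"), ("start", "b"), ("end", "p"),
    ("start", "plaintext"), ("chars", "hi </b>"),
])
print(tree.outline())  # body(p(b()), plaintext(b('hi </b>')))
```

In this sketch the `</p>` end tag pops `b` off the stack of open elements but leaves it in the active formatting list, so the first character token after `<plaintext>` re-creates a `b` element as a child of the PLAINTEXT element.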
Before

After

/parsing.html ( diff )