[3.12] gh-135661: Fix parsing start and end tags in HTMLParser according to the HTML5 standard (GH-135930) (GH-136268)
* Whitespaces no longer accepted between `</` and the tag name.
E.g. `</ script>` does not end the script section.
* Vertical tabulation (`\v`) and non-ASCII whitespaces no longer recognized
as whitespaces. The only whitespaces are `\t\n\r\f `.
* Null character (U+0000) no longer ends the tag name.
* Attributes and slashes after the tag name in end tags are now ignored,
instead of terminating after the first `>` in quoted attribute value.
E.g. `</script/foo=">"/>`.
* Multiple slashes and whitespaces between the last attribute and closing `>`
are now ignored in both start and end tags. E.g. `<a foo=bar/ //>`.
* Multiple `=` between attribute name and value are no longer collapsed.
E.g. `<a foo==bar>` produces attribute "foo" with value "=bar".
* Whitespaces between the `=` separator and attribute name or value are no
longer ignored. E.g. `<a foo =bar>` produces two attributes "foo" and
"=bar", both with value None; `<a foo= bar>` produces two attributes:
"foo" with value "" and "bar" with value None.
* Fix data loss after unclosed script or style tag (gh-86155).
Also backport test.support.subTests() (gh-135120).
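
The behaviour changes listed above can be checked locally with a small script. This is only a sketch: `EventRecorder`, `parse_events`, and `SAMPLES` are hypothetical names introduced for illustration and are not part of the change itself. The inputs come from the examples in the list, and the recorded events will differ depending on whether the interpreter includes this fix, so no expected output is shown.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: collects parser callbacks as (kind, ...) tuples."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse_events(markup):
    """Feed one snippet to a fresh parser and return the recorded events."""
    parser = EventRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# Inputs taken from the commit message; results vary by interpreter version.
SAMPLES = [
    "<a foo==bar>",
    "<a foo =bar>",
    "<a foo= bar>",
    "<a foo=bar/ //>",
    "<script>x</ script>y</script>",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), "->", parse_events(sample))
```

Running the script on an interpreter with and without the fix, and comparing the printed events, should show the differences described in each bullet.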
---------
(cherry picked from commit 0243f97)
Co-authored-by: Ezio Melotti <[email protected]>
Co-authored-by: Waylan Limberg <[email protected]>
Lib/html/parser.py
70 additions & 75 deletions
@@ -29,15 +29,43 @@
 piclose = re.compile('>')
 commentclose = re.compile(r'--\s*>')
 # Note:
-# 1) if you change tagfind/attrfind remember to update locatestarttagend too;
-# 2) if you change tagfind/attrfind and/or locatestarttagend the parser will
+# 1) if you change tagfind/attrfind remember to update locatetagend too;
+# 2) if you change tagfind/attrfind and/or locatetagend the parser will
 # explode, so don't do it.
-# see http://www.w3.org/TR/html5/tokenization.html#tag-open-state
-# and http://www.w3.org/TR/html5/tokenization.html#tag-name-state