HTML Parsing Hall of Fame

For text issues that aren’t related to parsing or decoding HTML, check out the Text Hall of Fame.

The title for a book review contains an <i> tag around the author’s name, intended to italicize it, but instead the tag was rendered verbatim on the page.
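One plausible cause, which is an assumption here and not something the screenshot confirms, is that the title was escaped as plain text even though it already contained markup. The sketch below uses Python's standard `html` module to show how escaping turns the tag into visible text, and how a plain-text version of the title could be produced instead. The title string and the `TextOnly` and `strip_tags` names are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Hypothetical helper: keeps text content and drops every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup: str) -> str:
    """Return the text of `markup` with tags removed."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


# Made-up title containing intentional markup.
title = "A review of <i>Jane Doe</i> and her new novel"

# Escaping the whole title makes the <i> tag show up verbatim in the browser.
escaped_title = escape(title)

# For a plain-text slot (a <title> element, a notification), strip the tag.
plain_title = strip_tags(title)

print(escaped_title)
print(plain_title)
```

Which of the two is right depends on where the title ends up: an HTML context that should honor the italics needs the markup passed through (after sanitizing), while a plain-text context needs the tag stripped.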

The search results on DuckDuckGo’s mobile view show a Citibank page whose summary is the syntax for an HTML <link> tag with a specific stylesheet referenced.

ChatGPT created a code example with HTML embedded inside a PHP snippet, then parsed the output incorrectly and split the code snippet where it shouldn’t have.

Props to @artemiomorales

The online menu for a sake bar in Toyama didn’t bother to decode the character references in the source of its menu item descriptions, leaving “&quot;” instead of the actual double quote character, and “&#39;” instead of an apostrophe.
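As a minimal sketch of the missing decoding step, Python's standard `html.unescape` handles both the named and the numeric reference from this example. The menu text is invented for illustration, and nothing here reflects how the actual site is built.

```python
from html import unescape

# Invented menu description carrying the two references from the screenshot.
raw_description = "Served &quot;warm&quot; in the brewer&#39;s own style"

# Decode once, at the point where the text leaves the HTML source.
description = unescape(raw_description)
print(description)

# Decoding is not idempotent: double-encoded text needs exactly as many
# decoding passes as it had encoding passes.
double_encoded = "&amp;quot;"
once = unescape(double_encoded)
twice = unescape(once)
```

The second half hints at a common way this bug arises: text that was encoded twice and decoded once still shows a leftover reference.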


Many Slack bots I’ve seen fail to properly handle encoded HTML text. This one, from the WordPress.org Slack instance, failed to decode the content, leaving named character references like “&nbsp;”.
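A bot that forwards HTML-encoded text into a plain-text chat message could decode it first. The sketch below is only an illustration in Python, with an invented message, and says nothing about how this particular bot works. It also replaces the non-breaking space that “&nbsp;” decodes to, on the assumption that a regular space reads better in chat.

```python
from html import unescape

# Invented message as it might arrive from an HTML source.
message = "Fixed&nbsp;in&nbsp;trunk &amp; ready for review"

# "&nbsp;" decodes to U+00A0 (non-breaking space), "&amp;" to "&".
decoded_message = unescape(message)

# Assumption: the chat client should show ordinary spaces.
plain_message = decoded_message.replace("\u00a0", " ")
print(plain_message)
```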
