Closed
Description
Provide the steps to reproduce
- Run LH on https://www.yfood.eu/ with JSON output
What is the current behavior?
The label audit for the .c-regionswitch__select--footer element has a badly-truncated nodeLabel:
{
"node": {
"type": "node",
"selector": ".c-regionswitch__select--footer",
"path": "1,HTML,1,BODY,1,DIV,4,DIV,0,FOOTER,1,DIV,0,DIV,0,SELECT",
"snippet": "<select class=\"c-regionswitch__select c-regionswitch__select--footer\">",
"explanation": "Fix any of the following:\n aria-label attribute does not exist or is empty\n aria-labelledby attribute does not exist, references elements that do not exist or references elements that are empty\n Form element does not have an implicit (wrapped) <label>\n Form element does not have an explicit <label>\n Element has no title attribute or the title attribute is empty",
"nodeLabel": "🇩🇪 Deutschland\n🇬🇧 United Kingdom\n🇵🇱 Polska\n🇳🇱 Nederland\n🇫🇷 France\n🇨\ud83c…"
}
}
Not all JSON parsers can parse this correctly. For example, PHP's json_decode function fails with the error "Single unpaired UTF-16 surrogate".
Edit: Go's JSON unmarshalling also has problems.
What is the expected behavior?
Unicode surrogate pairs should be kept intact when truncating strings, so the output never contains a lone surrogate and stays compatible with stricter JSON parsers.
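To illustrate the expected behavior, here is a minimal sketch of a surrogate-aware truncation. The name `truncateSafely` and its parameters are hypothetical and are not taken from the Lighthouse codebase. It assumes the limit is measured in UTF-16 code units and that a single-character ellipsis is appended.

```typescript
// Hypothetical helper, not Lighthouse's actual implementation.
// Truncates `text` to at most `maxLength` UTF-16 code units (ellipsis included)
// without leaving a lone surrogate at the cut point.
function truncateSafely(text: string, maxLength: number, ellipsis: string = '…'): string {
  if (text.length <= maxLength) return text;

  // Reserve room for the ellipsis.
  let end = Math.max(0, maxLength - ellipsis.length);

  // If the cut falls between a high surrogate and its low surrogate,
  // back up one code unit so the whole pair is dropped.
  if (end > 0 && end < text.length) {
    const prev = text.charCodeAt(end - 1);
    const next = text.charCodeAt(end);
    const prevIsHigh = prev >= 0xd800 && prev <= 0xdbff;
    const nextIsLow = next >= 0xdc00 && next <= 0xdfff;
    if (prevIsHigh && nextIsLow) end -= 1;
  }

  return text.slice(0, end) + ellipsis;
}
```

This only guarantees well-formed UTF-16. A flag emoji is made of two regional indicator symbols (two surrogate pairs), so a cut can still leave half a flag; keeping whole grapheme clusters would need a segmenter and is a separate concern.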
Environment Information
- Affected Channels: CLI
- Lighthouse version: 6.4.1
- Operating System: Ubuntu 20.10 (Linux 5.8.0)