How would I parse character references as literal bytes and not codepoints?

I have an element like this:
```xml
<element>&#240;&#159;&#152;&#131;</element>
```

If those character references are interpreted literally as bytes, they form the byte sequence `f0 9f 98 83`, which is the UTF-8 encoding of U+1F603, or `😃`. Instead, each reference is expanded as a code point, so the text comes out as `c3 b0 c2 9f c2 98 c2 83` (this sequence is not printable, but you can inspect it [here](https://unicode.link/inspect/utf8:c3.b0.c2.9f.c2.98.c2.83)).
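To make the mismatch concrete, here is a minimal Rust sketch using only the standard library. It assumes the parser has already expanded the references into the code points U+00F0, U+009F, U+0098 and U+0083; the helper name `reinterpret_as_utf8` is hypothetical and not part of any XML library.

```rust
/// Hypothetical helper: treats every char of `s` as one byte (code points
/// U+0000..=U+00FF only) and decodes the resulting bytes as UTF-8.
/// Returns None if a char is above U+00FF or the bytes are not valid UTF-8.
fn reinterpret_as_utf8(s: &str) -> Option<String> {
    let bytes: Option<Vec<u8>> = s
        .chars()
        .map(|c| {
            let cp = c as u32;
            // Only code points that fit in a single byte can be reinterpreted.
            if cp <= 0xFF { Some(cp as u8) } else { None }
        })
        .collect();
    String::from_utf8(bytes?).ok()
}
```

Called on the expanded text `"\u{f0}\u{9f}\u{98}\u{83}"`, this should give back `😃`, since the four code points map to the bytes `f0 9f 98 83`.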

I'm aware that this is exactly how character references are meant to work. Unfortunately, the decision to encode the data this way wasn't made by me, nor is it something I control. So I'd like to know whether there is an obvious way to change how escapes are resolved, without having to iterate through the bytes returned by a Text event myself.
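For reference, the manual fallback I'd like to avoid looks roughly like this. It is a stdlib-only sketch that assumes the raw, still-escaped text of the element is available as a `&str`; `decode_refs_as_bytes` is a hypothetical name, only numeric references (`&#N;` and `&#xH;`) are handled, and named entities such as `&amp;` are copied through unchanged.

```rust
/// Hypothetical helper: decodes numeric character references in raw
/// (still-escaped) text as single bytes rather than code points.
/// Returns None on a malformed reference, a value above 255, or if the
/// final byte sequence is not valid UTF-8.
fn decode_refs_as_bytes(raw: &str) -> Option<String> {
    let mut out: Vec<u8> = Vec::with_capacity(raw.len());
    let mut rest = raw;
    while let Some(start) = rest.find("&#") {
        // Copy the literal text before the reference unchanged.
        out.extend_from_slice(rest[..start].as_bytes());
        let after = &rest[start + 2..];
        let end = after.find(';')?;
        let digits = &after[..end];
        // `&#x..;` is hexadecimal, `&#..;` is decimal.
        let value = match digits.strip_prefix('x') {
            Some(hex) => u8::from_str_radix(hex, 16).ok()?,
            None => digits.parse::<u8>().ok()?,
        };
        out.push(value);
        rest = &after[end + 1..];
    }
    // Copy whatever follows the last reference.
    out.extend_from_slice(rest.as_bytes());
    String::from_utf8(out).ok()
}
```

With the element content above, `decode_refs_as_bytes("&#240;&#159;&#152;&#131;")` should produce `😃`.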

How would I parse character references as literal bytes and not codepoints? #667