…ne Handling" section of XML 1.1 spec
Also implement EOL normalization for HTML as described in "normalize newlines" section of HTML spec
https://www.w3.org/TR/xml11/#sec-line-ends
https://infra.spec.whatwg.org/#normalize-newlines
src/encoding.rs
    match bytes {
        Cow::Borrowed(bytes) => {
            let text = self.decode(bytes)?;
            match normalize_html_eols(&text) {
If the normalization function is the only difference between html_content() and xml_content(), as appears to be the case, you could avoid duplicating the function body by writing a single utility function and passing the normalization function in as an argument.
Good idea, will do.
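The suggestion above could be sketched roughly as follows. This is only an illustration and not the code from the PR: `content_with` and the free-standing `decode` are hypothetical names, `decode` is a plain UTF-8 check standing in for the real decoder, and the body of `normalize_html_eols` is an assumed implementation of the HTML rules.

```rust
use std::borrow::Cow;
use std::str::Utf8Error;

/// Hypothetical stand-in for the decoder: plain UTF-8 validation.
fn decode(bytes: &[u8]) -> Result<Cow<'_, str>, Utf8Error> {
    std::str::from_utf8(bytes).map(Cow::Borrowed)
}

/// Assumed HTML-style normalization: "\r\n" and lone "\r" both become "\n".
fn normalize_html_eols(text: &str) -> Cow<'_, str> {
    if text.contains('\r') {
        Cow::Owned(text.replace("\r\n", "\n").replace('\r', "\n"))
    } else {
        Cow::Borrowed(text)
    }
}

/// Shared body: decode, then apply whichever normalizer the caller passes in.
fn content_with<'a, F>(bytes: &'a [u8], normalize: F) -> Result<Cow<'a, str>, Utf8Error>
where
    F: for<'t> Fn(&'t str) -> Cow<'t, str>,
{
    match decode(bytes)? {
        // Decoding borrowed the input, so the normalizer may borrow it too.
        Cow::Borrowed(text) => Ok(normalize(text)),
        // Decoding allocated: keep that allocation unless normalization
        // produced a new string.
        Cow::Owned(text) => {
            let changed = match normalize(&text) {
                Cow::Borrowed(_) => None,
                Cow::Owned(new_text) => Some(new_text),
            };
            Ok(Cow::Owned(changed.unwrap_or(text)))
        }
    }
}

/// The HTML flavour becomes a one-line wrapper; an XML flavour would pass
/// a different normalizer to the same helper.
fn html_content(bytes: &[u8]) -> Result<Cow<'_, str>, Utf8Error> {
    content_with(bytes, normalize_html_eols)
}
```

With this shape, the borrowed/owned handling lives in one place and each public method only chooses its normalization function.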
…sCData` and `BytesRef`
…not in attributes yet) Use `xml_content` instead of `decode` in serde deserializer and tests
085c142 to 38b44d4
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #884      +/-   ##
==========================================
- Coverage   60.74%   55.52%   -5.22%
==========================================
  Files          41       42       +1
  Lines       16044    15511     -533
==========================================
- Hits         9746     8613    -1133
- Misses       6298     6898     +600
Because we support both XML and HTML parsing, and the rules for EOL normalization differ between them, this PR introduces two new methods for `BytesText`, `BytesCData` and `BytesRef` in addition to `decode`:
- `xml_content()`
- `html_content()`

XML rules: https://www.w3.org/TR/xml11/#sec-line-ends
HTML rules: https://infra.spec.whatwg.org/#normalize-newlines
The new methods do not apply to attribute value normalization; that is left for 379.
Closes #806 (when using `xml_content()`)
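To make the difference between the two rule sets concrete, here is a minimal sketch of both normalizations as free functions. It follows my reading of the two linked specs and is not the PR's implementation; the function names and bodies are assumptions for illustration. Under the XML 1.1 rules, `\r\n`, `\r\u{85}`, a lone `\r`, `\u{85}` and `\u{2028}` all become `\n`, while the HTML rules only touch `\r\n` and lone `\r`.

```rust
use std::borrow::Cow;

/// XML 1.1 line-end handling (assumed implementation):
/// "\r\n", "\r\u{85}", lone "\r", "\u{85}" and "\u{2028}" all become "\n".
fn normalize_xml_eols(text: &str) -> Cow<'_, str> {
    // Fast path: nothing to normalize, so borrow the input unchanged.
    if !text.contains(|c: char| matches!(c, '\r' | '\u{85}' | '\u{2028}')) {
        return Cow::Borrowed(text);
    }
    let mut out = String::with_capacity(text.len());
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\r' => {
                // Swallow the second character of a two-character line end.
                if matches!(chars.peek().copied(), Some('\n') | Some('\u{85}')) {
                    chars.next();
                }
                out.push('\n');
            }
            '\u{85}' | '\u{2028}' => out.push('\n'),
            other => out.push(other),
        }
    }
    Cow::Owned(out)
}

/// HTML (Infra "normalize newlines", assumed implementation):
/// replace every "\r\n" with "\n", then every remaining "\r" with "\n".
fn normalize_html_eols(text: &str) -> Cow<'_, str> {
    if text.contains('\r') {
        Cow::Owned(text.replace("\r\n", "\n").replace('\r', "\n"))
    } else {
        Cow::Borrowed(text)
    }
}
```

Given the same input containing `\r\u{85}`, the XML variant collapses the pair into a single `\n`, whereas the HTML variant only replaces the `\r` and leaves `\u{85}` in place. Both return `Cow::Borrowed` when the input has no characters to normalize, which avoids an allocation in the common case.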