Skip to content

Comments

Proper attribute value normalization#943

Merged
dralley merged 5 commits intotafia:masterfrom
Mingun:attr-val-normalization
Feb 21, 2026
Merged

Proper attribute value normalization#943
dralley merged 5 commits intotafia:masterfrom
Mingun:attr-val-normalization

Conversation

@Mingun
Copy link
Collaborator

@Mingun Mingun commented Feb 20, 2026

Rebased and finished #379.

Added

  • Attribute::normalized_value()
  • Attribute::normalized_value_with()
  • Attribute::decoded_and_normalized_value()
  • Attribute::decoded_and_normalized_value_with()

which ought to be used in place of deprecated

  • Attribute::unescape_value()
  • Attribute::unescape_value_with()
  • Attribute::decode_and_unescape_value()
  • Attribute::decode_and_unescape_value_with()

Deserializer now also uses normalized value of attributes, which fixes #674.

Closes #371, closes #843

@Mingun Mingun requested a review from dralley February 20, 2026 18:03
/// Attribute normalization includes expanding of general entities (`&entity;`)
/// which replacement text also could contain entities, which is also must be expanded.
/// If more than 128 entities would be expanded, this error is returned.
TooManyNestedEntities,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on automatic loop detection? (note from previous PR)

Also I've noticed that some implementations detect an entity loop immediately instead of processing until the recursion limit is reached. Should that be two separate errors in your opinion, or one error?

Obviously it would not need to be added in this PR. Just, as a general concept, should an issue be filed for it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just created #945

@dralley
Copy link
Collaborator

dralley commented Feb 21, 2026

Some build / test failures, obviously


match attr.decoded_and_normalized_value(V1_0, Decoder::utf8()) {
// TODO: is this divergence between range behavior of UnterminatedEntity
// and UnrecognizedEntity appropriate? existing unescape code behaves the same. (see: start index)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my comment obviously, just checking if you saw it and whether that can be dismissed or if it is something to resolve later.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, I also do not like the current situation and I agree with you here. Will need to think about it later, probably when dealing with #625

dralley and others added 4 commits February 21, 2026 18:27
- `escaped` to `is_attr` because it is set to true only for attributes
- `from_part()` to `from_attr()` for the same reason
… err

failures:
  serde-issues:
    issue674

  serde-de:
    xml_version::v1_0_explicit
    xml_version::v1_0_implicit
    xml_version::v1_1
Fixed all tests:
  serde-de:
    xml_version::v1_0_explicit
    xml_version::v1_0_implicit
    xml_version::v1_1

  serde-issues:
    issue674
@Mingun Mingun force-pushed the attr-val-normalization branch from a0ec487 to a6fbb5c Compare February 21, 2026 13:58
@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 80.13136% with 121 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.96%. Comparing base (2b21d40) to head (a6fbb5c).
⚠️ Report is 17 commits behind head on master.

Files with missing lines Patch % Lines
benches/microbenches.rs 0.00% 35 Missing ⚠️
src/events/attributes.rs 77.86% 29 Missing ⚠️
benches/macrobenches.rs 0.00% 16 Missing ⚠️
src/escape.rs 95.52% 15 Missing ⚠️
src/de/attributes.rs 0.00% 11 Missing ⚠️
examples/custom_entities.rs 0.00% 8 Missing ⚠️
examples/read_nodes.rs 0.00% 5 Missing ⚠️
src/lib.rs 90.00% 2 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #943      +/-   ##
==========================================
+ Coverage   55.00%   55.96%   +0.96%     
==========================================
  Files          44       44              
  Lines       16816    17402     +586     
==========================================
+ Hits         9249     9739     +490     
- Misses       7567     7663      +96     
Flag Coverage Δ
unittests 55.96% <80.13%> (+0.96%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Mingun
Copy link
Collaborator Author

Mingun commented Feb 21, 2026

@dralley, I fixed build failures (as usual, forgot about benchmarks) and other places you pointed.

@dralley dralley merged commit 3018141 into tafia:master Feb 21, 2026
7 checks passed
@Mingun Mingun deleted the attr-val-normalization branch February 21, 2026 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Attribute list parsing with serde does not treat \t as separator Attribute values aren't normalized properly

3 participants