Page MenuHomePhabricator

Bug 1963612 - Re-work configuration handling and sanitize-core. r?emilio
ClosedPublic

Authored by tschuster on Apr 30 2025, 2:33 PM.
Referenced Files
Unknown Object (File)
Mon, Apr 13, 5:05 AM
Unknown Object (File)
Thu, Apr 9, 10:31 PM
Unknown Object (File)
Thu, Apr 9, 6:11 AM
Unknown Object (File)
Tue, Apr 7, 2:46 AM
Unknown Object (File)
Tue, Apr 7, 12:04 AM
Unknown Object (File)
Mon, Apr 6, 10:18 PM
Unknown Object (File)
Mon, Apr 6, 10:17 PM
Unknown Object (File)
Mon, Apr 6, 9:39 PM
Subscribers

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
tschuster updated this revision to Diff 1101771.
tschuster updated this revision to Diff 1123007.
tschuster retitled this revision from WIP: Bug 1963612 - Re-work configuration handling and sanitize-core. to Bug 1963612 - Re-work configuration handling and sanitize-core. r?emilio.
tschuster added a reviewer: emilio.
smaug requested changes to this revision.Sep 18 2025, 2:52 PM
smaug added a subscriber: smaug.
smaug added inline comments.
dom/security/sanitizer/Sanitizer.cpp
110

Nit, move the . right after configuration

233–234

So the spec doesn't make sense? Or does the spec in some cases pass some dictionary to canonicalize-a-sanitizer-name algorithm where there namespace property doesn't have a default value? It has default value for elements and attributes.

274–275

Also here, 'namespace' exists always, no? Is that a spec issue?

508

It would be nice to use the concrete type here, since it isn't obvious to the reader whether name is just the localname or CanonicalName

522

Same here.

564

Isn't mDataAttributes now a Maybe and this is just a check that it isn't Nothing?

580

Here you do different kind of "If dataAttributes is true" check.

922

Hmm, so there is one IsDataAttribute method here, and the other one in CanonicalName. That is perhaps a bit surprising.
What if there was a variant of the method here which takes CanonicalName as an argument and compares name and namespace?

1647

How do we know that mDataAttributes isn't Nothing here?

1711–1712

How do we know that mDataAttributes isn't Nothing here? In other words, why is it safe to use * operator here?

dom/security/sanitizer/SanitizerTypes.cpp
57

There are of course many ways to fix this. BloomFilter and fall back to slow loop. Hashset. Tree...

dom/security/sanitizer/SanitizerTypes.h
23

So nothing checks that it is really an attribute. One could have element using data-foo naming. Worth to add a comment

This revision now requires changes to proceed.Sep 18 2025, 2:52 PM

Thank you Olli!

dom/security/sanitizer/Sanitizer.cpp
233–234

I think the spec here doesn't really make sense.

I thought I had raised this as an issue at some point in the repo, but I can't find it anymore so I probably didn't.

I should probably raise a new issue for this?

508

I think it's fine, but I will change it for you.

564

Arg, you got me. I really need to find a way to remove that footgun. Maybe we could disallow the bool operator for Maybe<bool>?

922

Do you mean two overloads, one that takes nsAtom/namespaceid and one that takes CanonicalName? Creating a CanonicalName from nsAtom/namespaceid would be a bit inefficient and I actually tried quite a bit to avoid it in the IsDefaultConfig case.

1418

Note to self, update this.

1647

Sanitizer::CanonicalizeConfiguration ensures that mDataAttributes always exists when mAttributes exists. dataAttributes is basically an "extension" of attributes.

1711–1712

For the default config we also always initialize mDataAttributes, because it conceptually has "attributes". (See Sanitizer::SetDefaultConfig)

dom/security/sanitizer/SanitizerTypes.cpp
57

I had opened bug 1989215 to look into an ordered hash set (and maybe a map).

dom/security/sanitizer/SanitizerTypes.h
23

Good point. Maybe it would be useful to make the type hierarchy better here instead? e.g. have CanonicalElement extend CanonicalName, CanonicalAttribute extend CanonicalName, and CanonicalElementWithAttributes extend CanonicalElement.

dom/security/sanitizer/Sanitizer.cpp
922

And I tried to keep mLocalName and mNamespace as protected.

tschuster updated this revision to Diff 1125192.

Am I looking at some wrong version of the spec or is the implementation just different to https://wicg.github.io/sanitizer-api/#sanitizerconfig-allow-an-element ?

dom/security/sanitizer/Sanitizer.cpp
564

How do we know mDataAttributes isn't Nothing() here? Since if it is, *mDataAttributes has a failing assertion.

580

Also this, why is this ok?

629

...since here we anyhow check that the value isSome

762

Why do we need MaybeMaterializeDefaultConfig here when the spec doesn't have such?
It is not enough to call it in constructor or something?

770

Step 2 in the spec is "If configuration["elements"] exists:"

smaug requested changes to this revision.Sep 22 2025, 9:52 PM

Sorry, takes a bit time to get through this all.

This revision now requires changes to proceed.Sep 22 2025, 9:52 PM
dom/security/sanitizer/Sanitizer.cpp
973–979

Why is step 3 before step 4 ?

973–979

Er, why isn't step 3 before step 4 :)

1133

Why is this here? Makes it hard to follow the algorithm when it doesn't follow the spec.

1356

Why is this check ok if mComments is Nothing()?

dom/security/sanitizer/Sanitizer.cpp
564

As as I said above, this a property of the Sanitizer. We are in the config[attributes] branch, so config[dataAttributes] must exist.

https://wicg.github.io/sanitizer-api/#configuration-canonicalize

If configuration["attributes"] exists and configuration["dataAttributes"] does not exist, then set configuration["dataAttributes"] to allowCommentsAndDataAttributes.

580

Same argument. Part of the canonical form.

629

This is the config[removeAttributes] branch, which is not supposed to have config[dataAttributes]. (So it's not added by the canonicalization)

762

The default config is not real or "materialized", it's basically constant data. If someone wants to modify a Sanitizer with the default config, we have to actually create the lists that can be modified.

770
973–979

To make the std::move work. The order is unobservable.

1133

This is again to make std::move work. It might actually be possible to just switch the order in the spec. (The difference here is obviously not observable)

1356

mComments is always initialized. I could probably just change it to a boolean to be honest.

tschuster updated this revision to Diff 1126050.

Maybe usage does make some code a bit hard to reason about.

r+

And I guess there will be another update to have the implementation follow the very latest spec.

dom/security/sanitizer/Sanitizer.cpp
762

And we don't materialize it in the constructor because of performance or what?
It is a bit hard for a random reader to know in which cases it should be called and in which case it isn't needed.

This revision is now accepted and ready to land.Sep 23 2025, 3:17 PM

This revision requires a Testing Policy Project Tag to be set before landing. Please apply one of testing-approved, testing-exception-unchanged, testing-exception-ui, testing-exception-elsewhere, testing-exception-other. Tip: this Firefox add-on makes it easy!

dom/security/sanitizer/Sanitizer.cpp
762

The main reason for having a lazy default config is performance and memory usage. The code in SanitizeChildren is templated for the default config case and can omit some of the complexity. When creating a Sanitizer with a default config after the first time we don't need to keep creating the huge lists with elements and attributes. This sharing of the config is both faster and more memory efficient.