Page MenuHomePhabricator

Bug 1845095: Bloom filter for fast-rejecting `:has()`. r=emilio,#layout-reviewers
ClosedPublic

Authored by dshin on Jul 25 2023, 10:34 PM.
Referenced Files
Unknown Object (File)
Tue, Apr 14, 10:16 AM
Unknown Object (File)
Tue, Apr 14, 7:28 AM
Unknown Object (File)
Tue, Apr 14, 6:33 AM
Unknown Object (File)
Tue, Apr 14, 2:01 AM
Unknown Object (File)
Tue, Apr 14, 12:38 AM
Unknown Object (File)
Mon, Apr 13, 9:35 PM
Unknown Object (File)
Mon, Apr 13, 5:52 PM
Unknown Object (File)
Mon, Apr 13, 7:04 AM
Subscribers

Event Timeline

phab-bot changed the visibility from "Custom Policy" to "Public (No Login Required)".
phab-bot changed the edit policy from "Custom Policy" to "Restricted Project (Project)".
phab-bot removed a project: secure-revision.
emilio requested changes to this revision.Jul 26 2023, 9:54 AM

Some thoughts, and a couple simplifications.

I thought part of this cache would be to also check it while we're traversing descendants, to avoid traversing parts of the tree. But maybe that's not the case?

I'm a bit confused about caching the subtree hashes only after one lookup. I mean, I /think/ it makes sense, because in the common case we check :has(), and the second time through we have a cached result. But when don't we?

servo/components/selectors/context.rs
148

I think keeping the name SelectorCaches is probably fine, wdyt? You can think of the bloom filter like an opportunistic cache of sorts.

SelectorCachesFilters looks odd, sounds like filters for the caches or something? But maybe it's me not being a native speaker :)

servo/components/selectors/parser.rs
526–527

So I don't think we have a use case for having Iter != InnerIter unless I'm missing something so probably:

  • Make this pub (crate).
  • Remove the extra InnerIter param?
622

You can just do AncestorIter rather than |iter| AncestorIter(iter) fwiw.

servo/components/selectors/relative_selector/filter.rs
23

So, the other thing we should probably store here is the ElementState flags... That'd allow fast-rejecting for :has(:hover), :has(:focus), etc.

But I guess that might require some more plumbing... Wdyt? If it seems worth it, maybe file a follow-up and reference it from here?

49

I think struct Key(OpaqueElement, TraversalKind) is probably good enough?

80

Same, I think just SingleCompoundIter instead of |iter| SingleCompoundIter would do.

112

I'm confused, so is the idea that we only do this when matching the selector twice? The first time would be Lookup then we return a filter?

That seems a bit weird, why is that? Is it just to avoid the traversal for selectors that are only matched once against an element? Or for selectors that match and thus miss the match cache?

129

Consider return false; here, then deindent the rest of the function?

171

Is this needed at all? I guess we could just return iter and rely on SelectorIter returning None at combinators.

This revision now requires changes to proceed.Jul 26 2023, 9:54 AM
dshin requested review of this revision.Jul 26 2023, 2:01 PM
dshin updated this revision to Diff 746954.
dshin marked 6 inline comments as done.
dshin added inline comments.
servo/components/selectors/context.rs
148

Hmmm - ok. I agree that there's a room for misunderstanding here, and your explanation makes sense. Thanks!

servo/components/selectors/relative_selector/filter.rs
23

Hm - That could be good. Though, yeah. Probably going to have to think on that a bit. Let me file a follow-up.

49

That's... true. Generally don't like referring to things by tuple index, but we don't really do that here..

112

Ahh.. This is kind of a leftover from when I tried to save the subtree portion of sibling subtrees, but I think the usefulness still stands.

Since the cache keys on selector + element, and the filter element + their children/subtree (And then runs the selector hashes against the filter), we are basically looking for subtree & children that gets looked at twice.

e.g. Given .a + .b + .c ... + .z with parent .par, if we had :has(.b), :has(.c), ..., :has(.z) or .par:has( <subtree lookup>) and :has(~ .b > <subtree lookup>)

I'll document this.

Is it just to avoid the traversal for selectors that are only matched once against an element?

Basically this, since bloom filters measure in the order of KiBs.

129

True

171

Ah - right.

emilio removed a reviewer: layout-reviewers.
emilio added inline comments.
servo/components/selectors/relative_selector/filter.rs
79

We might want to store these directly in the RelativeSelector, so that we only compute them once, wdyt? Follow-up is fine, but file and reference from here if you agree?

112

Basically this, since bloom filters measure in the order of KiBs.

FWIW if you care about this you need to Box<BloomFilter>, otherwise you're allocating the bloom filter space regardless.

This revision is now accepted and ready to land.Jul 27 2023, 9:23 AM
servo/components/selectors/relative_selector/filter.rs
112

Right - of course. thanks!

This revision is now accepted and ready to land.Jul 27 2023, 6:34 PM
This revision is now accepted and ready to land.Jul 28 2023, 4:11 PM