[ty] ensure IntersectionBuilder does not have multiple identical InnerIntersectionBuilders#22055
[ty] ensure IntersectionBuilder does not have multiple identical InnerIntersectionBuilders#22055mtshiba wants to merge 3 commits intoastral-sh:mainfrom
IntersectionBuilder does not have multiple identical InnerIntersectionBuilders#22055Conversation
Diagnostic diff on typing conformance testsNo changes detected when running ty on typing conformance tests ✅ |
|
|
| Lint rule | Added | Removed | Changed |
|---|---|---|---|
invalid-argument-type |
0 | 2 | 2 |
invalid-return-type |
0 | 0 | 3 |
| Total | 0 | 2 | 5 |
CodSpeed Performance ReportMerging #22055 will not alter performanceComparing Summary
Footnotes
|
| // create a union of intersections. | ||
| intersections: Vec<InnerIntersectionBuilder<'db>>, | ||
| /// Stores hash values of `intersections` to prevent adding identical `InnerIntersectionBuilder`s. | ||
| intersection_hashes: FxHashSet<u64>, |
There was a problem hiding this comment.
It seems risky to rely solely on the hash value, given the possibility of hash collisions.
There was a problem hiding this comment.
Especially since my understanding is that FxHasher is optimized for speed, not for collision-resistance?
There was a problem hiding this comment.
If the hash function were ideal, the probability of collisions would be almost negligible in this case (where the number of elements is only about 100).
The collision probability is calculated using the same formula as the birthday paradox. If the hash space is
Since
The problem is that fxhash is not an ideal hash function. In this case, the actual effective hash space may be much smaller.
There was a problem hiding this comment.
Using a cryptographic hash function seems too expensive, so it's better to simply use an equality check to remove duplicates.
There was a problem hiding this comment.
Could intersections be a FxIndexMap ir is the issue that we keep updating the entries?
There was a problem hiding this comment.
We can't use hashmaps because we're using intersections as a mutable iterator.
|
It looks like pydantic has a huge performance improvement here, but most other benchmarks regress: https://codspeed.io/astral-sh/ruff/branches/mtshiba%3Aintersection-hash?utm_source=github&utm_medium=comment&utm_content=header. I'd be interested in seeing if we still see the huge performance improvement on pydantic after #22044 has been merged. |
|
Can you rebase on |
a3cacc3 to
ffe7422
Compare
Hmm, it seems that the performance improvement on pydantic has disappeared after merging #22044.
|
659334e to
2dd406c
Compare
|
This change no longer appears to have significant standalone benefits, so I'm sort-of inclined to say it should again just be part of #17371 rather than being a standalone PR |
Summary
From #17371
See the comments in #17371 for the motivation for this change.
Test Plan
N/A