-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[jit] Optimize alias analysis #20899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f19d577 to
d33d47c
Compare
|
This looks good, but the PR is pretty big. Seems like a good opportunity to check out ghstack, our emulation of phabricator diff stacking. Can you put each of items 1-4 on their own stacked PR? |
suo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above comment. Also remember to clang-format :)
d33d47c to
380dc05
Compare
a4582fc to
c3bf65d
Compare
c3bf65d to
4be1efb
Compare
suo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far! Few comments inline.
2547097 to
f38ea25
Compare
|
@pytorchbot rebase this please |
|
I just realized - wouldn't this be a good use of bloom filters? The primary bottleneck is memory allocation, which bloom filters solve. Intersection/union can also be implemented as bit operations (which would be even faster than what we have now, since they could be sense). The only issue is that bloom filters are only probabilistically correct for membership queries. If an element is in the set it's guaranteed to return true. If it's not, then with arbitrarily high probability (depending on memory usage) it'll return false. Also, we never need to remove elements from the set. We've mentioned that we might want to remove edges, but regardless, we'd need to retraverse our DAG |
It could be, but I'd prefer to bottom out on how well we can do without going probabilistic first. |
…to optimizeAliasAnalysis
5f6f5c1 to
4549d42
Compare
7c3f3be to
1272912
Compare
5798c2d to
99674d1
Compare
suo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm except for some name bikeshedding
010543f to
04484fc
Compare
|
@pytorchbot rebase this please. |
fc993fa to
b044739
Compare
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Chillee is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Overall Improvements
unordered_setto sparse bitset.flat_hash_mapinstead ofunordered_mapin some places.Benchmarks (somewhat approximate, best of a couple runs)
I use the
sparse bitsettaken from https://llvm.org/doxygen/SparseBitVector_8h_source.html. I had to make some modifications to use__builtin_popcountland instructions like that instead of other transitive clang dependencies.Some notes on our graph topologies
In general, our graphs are very sparse, and most of the components aren't connected. For GoogleNet, we have 200k nodes, we do 2k
mayAliasqueries, and the sum of magnitudes of sets at each node is 500k (ie: every node, on average, reaches 2.5 leaves).PS: Holy crap macbooks throttle an insane amount with the default fan settings.