[jit] Optimize alias analysis #20899

Chillee · 2019-05-24T06:39:15Z

Overall Improvements

1. Switched from using unordered_set to sparse bitset.
2. Prevent some excessive memory allocations (thanks to @resistor).
3. Take advantage of the sparse bitset operations.
4. Switch to flat_hash_map instead of unordered_map in some places.
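
To illustrate why a sparse bitset helps here, the following is a minimal sketch and not the code from this PR. The class name `SparseBitSet` and its layout (a sorted map from word index to a 64-bit word) are assumptions made for illustration; LLVM's SparseBitVector uses a different internal layout. The point it shows is that union and intersection work a whole word at a time and only touch words that contain at least one set bit.

```cpp
#include <cstdint>
#include <map>

// Hypothetical simplified sparse bitset: only 64-bit words that contain at
// least one set bit are stored, keyed by word index.
class SparseBitSet {
 public:
  void set(uint32_t bit) {
    words_[bit / 64] |= (uint64_t{1} << (bit % 64));
  }

  bool test(uint32_t bit) const {
    auto it = words_.find(bit / 64);
    return it != words_.end() && ((it->second >> (bit % 64)) & 1u) != 0;
  }

  // In-place union: OR each stored word of `other` into this set.
  SparseBitSet& operator|=(const SparseBitSet& other) {
    for (const auto& kv : other.words_) {
      words_[kv.first] |= kv.second;
    }
    return *this;
  }

  // True if the two sets share at least one bit. Walks both sorted maps in
  // lockstep, so the cost is linear in the number of stored words.
  bool intersects(const SparseBitSet& other) const {
    auto a = words_.begin();
    auto b = other.words_.begin();
    while (a != words_.end() && b != other.words_.end()) {
      if (a->first < b->first) {
        ++a;
      } else if (b->first < a->first) {
        ++b;
      } else {
        if ((a->second & b->second) != 0) {
          return true;
        }
        ++a;
        ++b;
      }
    }
    return false;
  }

 private:
  std::map<uint32_t, uint64_t> words_;
};
```

Compared with a hash set of integers, a set with bits 3 and 1000000 costs two stored words here, and testing two such sets for overlap compares at most a handful of words.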

Benchmarks (somewhat approximate, best of a couple runs)

InceptionNet (load + one forward pass): 19.8 -> 13.3
GoogleNet (load + one forward pass): 10.0 -> 7.24
DenseNet (only load): 7.3 -> 5.3

I use the sparse bitset taken from https://llvm.org/doxygen/SparseBitVector_8h_source.html. I had to modify it to call builtins such as __builtin_popcountl directly instead of pulling in its other transitive clang dependencies.
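
As a rough sketch of the kind of modification described above (not the code that landed), a bit-counting helper can call the compiler builtin when it is available and fall back to a portable loop otherwise. The name `countSetBits` is hypothetical. The sketch uses `__builtin_popcountll` rather than `__builtin_popcountl` so that the argument is 64 bits wide even on platforms where `long` is 32 bits.

```cpp
#include <cstdint>

// Hypothetical helper: count the set bits in a 64-bit word.
inline unsigned countSetBits(uint64_t word) {
#if defined(__GNUC__) || defined(__clang__)
  // GCC and Clang builtin; typically compiles to a single popcount instruction.
  return static_cast<unsigned>(__builtin_popcountll(word));
#else
  // Portable fallback: each iteration clears the lowest set bit.
  unsigned count = 0;
  while (word != 0) {
    word &= word - 1;
    ++count;
  }
  return count;
#endif
}
```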

Some notes on our graph topologies

In general, our graphs are very sparse, and most of the components aren't connected. For GoogleNet, we have 200k nodes, we do 2k mayAlias queries, and the sizes of the sets at each node sum to 500k (i.e., every node reaches 2.5 leaves on average: 500k / 200k = 2.5).
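
The numbers above can be read against a toy model of the query. The sketch below is an assumption about the general shape of such an analysis, not the MemoryDAG implementation: every node has a set of reachable leaves, and two nodes may alias when those sets overlap. `Graph`, `reachableLeaves`, and this standalone `mayAlias` are hypothetical names.

```cpp
#include <cstddef>
#include <queue>
#include <set>
#include <vector>

// Hypothetical miniature of a points-to DAG: pointsTo[i] lists the nodes that
// node i may point to. A node with no outgoing edges is a leaf.
using Graph = std::vector<std::vector<std::size_t>>;

// Collect every leaf reachable from `start` with a breadth-first search.
std::set<std::size_t> reachableLeaves(const Graph& pointsTo, std::size_t start) {
  std::set<std::size_t> leaves;
  std::vector<bool> seen(pointsTo.size(), false);
  std::queue<std::size_t> pending;
  pending.push(start);
  seen[start] = true;
  while (!pending.empty()) {
    std::size_t node = pending.front();
    pending.pop();
    if (pointsTo[node].empty()) {
      leaves.insert(node);
      continue;
    }
    for (std::size_t next : pointsTo[node]) {
      if (!seen[next]) {
        seen[next] = true;
        pending.push(next);
      }
    }
  }
  return leaves;
}

// Two nodes may alias when their reachable leaf sets share an element.
bool mayAlias(const Graph& pointsTo, std::size_t a, std::size_t b) {
  const std::set<std::size_t> leavesA = reachableLeaves(pointsTo, a);
  const std::set<std::size_t> leavesB = reachableLeaves(pointsTo, b);
  for (std::size_t leaf : leavesA) {
    if (leavesB.count(leaf) != 0) {
      return true;
    }
  }
  return false;
}
```

With an average of 2.5 leaves per node, each per-node set is tiny, which is why the per-set allocation overhead of a general-purpose hash set can dominate the useful work.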

PS: Holy crap macbooks throttle an insane amount with the default fan settings.

suo · 2019-05-24T16:19:31Z

This looks good, but the PR is pretty big. Seems like a good opportunity to check out ghstack, our emulation of Phabricator diff stacking. Can you put each of items 1-4 in its own stacked PR?

suo

see above comment. Also remember to clang-format :)

suo

Looks good so far! Few comments inline.

torch/csrc/jit/passes/utils/memory_dag.h

torch/csrc/jit/passes/utils/memory_dag.cpp

Chillee · 2019-05-27T18:01:09Z

@pytorchbot rebase this please

torch/csrc/jit/passes/alias_analysis.cpp

torch/csrc/jit/passes/utils/memory_dag.cpp

c10/util/sparse_bitset.h

torch/csrc/jit/passes/utils/memory_dag.cpp

Chillee · 2019-05-28T16:57:21Z

I just realized - wouldn't this be a good use of bloom filters? The primary bottleneck is memory allocation, which bloom filters solve. Intersection/union can also be implemented as bit operations (which would be even faster than what we have now, since they could be dense).

The only issue is that bloom filters are only probabilistically correct for membership queries. If an element is in the set it's guaranteed to return true. If it's not, then with arbitrarily high probability (depending on memory usage) it'll return false.

Also, we never need to remove elements from the set. We've mentioned that we might want to remove edges, but regardless, we'd need to retraverse our DAG.
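
A minimal sketch of the Bloom filter idea discussed in this comment, for illustration only. The class name `BloomFilter`, the fixed size of 1024 bits, and the two hash functions are arbitrary assumptions, and nothing like this was part of the PR. It shows the properties mentioned above: a fixed-size dense bit array with no per-element allocation, union and intersection as plain bitwise operations, and membership queries that can give false positives but never false negatives.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical fixed-size Bloom filter over integer ids.
class BloomFilter {
 public:
  static constexpr std::size_t kBits = 1024;

  void insert(uint64_t id) {
    bits_.set(slotA(id));
    bits_.set(slotB(id));
  }

  // Never returns false for an inserted id; may return true for an id that
  // was never inserted (a false positive).
  bool mayContain(uint64_t id) const {
    return bits_.test(slotA(id)) && bits_.test(slotB(id));
  }

  // Union is a single dense bitwise OR over the whole array.
  BloomFilter& operator|=(const BloomFilter& other) {
    bits_ |= other.bits_;
    return *this;
  }

  // Bitwise AND keeps every bit that belongs to an id present in both
  // inputs, so ids in the true intersection still test as present. The
  // result can also contain spurious ids.
  BloomFilter& operator&=(const BloomFilter& other) {
    bits_ &= other.bits_;
    return *this;
  }

 private:
  // Two illustrative hash functions; a real filter would tune these.
  static std::size_t slotA(uint64_t id) {
    return std::hash<uint64_t>{}(id) % kBits;
  }
  static std::size_t slotB(uint64_t id) {
    return static_cast<std::size_t>(((id * 0x9E3779B97F4A7C15ull) >> 7) % kBits);
  }

  std::bitset<kBits> bits_;
};
```

For a may-alias style query, a false positive would only turn a "no" into a "maybe", which is the conservative direction, but it would also make the analysis less precise. That trade-off is the "going probabilistic" concern raised in the reply.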

resistor · 2019-05-28T17:32:25Z

> I just realized - wouldn't this be a good use of bloom filters?

It could be, but I'd prefer to bottom out on how well we can do without going probabilistic first.

c10/util/sparse_bitset.h

suo

lgtm except for some name bikeshedding

torch/csrc/jit/passes/utils/memory_dag.cpp

Chillee · 2019-05-30T00:47:16Z

@pytorchbot rebase this please.

facebook-github-bot

@Chillee is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2019-05-31T01:22:13Z

@Chillee merged this pull request in 4163576.

Chillee requested review from jamesr66a, resistor and suo May 24, 2019 06:39

pytorchbot added the oncall: jit and module: internals labels May 24, 2019

Chillee force-pushed the optimizeAliasAnalysis branch 2 times, most recently from f19d577 to d33d47c Compare May 24, 2019 06:51

Chillee changed the title from "[WIP] Optimize alias analysis" to "[jit] Optimize alias analysis" May 24, 2019

suo requested changes May 24, 2019

Started converting to sparse bitsets

d193e1b

Chillee force-pushed the optimizeAliasAnalysis branch from d33d47c to 380dc05 Compare May 24, 2019 17:37

Changed set to bit operations

b27345c

Chillee force-pushed the optimizeAliasAnalysis branch 3 times, most recently from a4582fc to c3bf65d Compare May 24, 2019 18:11

Simplified bfs

4be1efb

Chillee force-pushed the optimizeAliasAnalysis branch from c3bf65d to 4be1efb Compare May 24, 2019 18:31

suo requested changes May 24, 2019

suo reviewed May 24, 2019

torch/csrc/jit/passes/utils/memory_dag.cpp (resolved)

Chillee added 3 commits May 24, 2019 12:12

Responded to comments

e6687c9

Switched to unsigned

48f9e5f

Fixed misunderstanding of test_and_set

f38ea25

Chillee force-pushed the optimizeAliasAnalysis branch from 2547097 to f38ea25 Compare May 24, 2019 23:10

Merge branch 'master' of https://github.com/pytorch/pytorch into optimizeAliasAnalysis

4d8409e

pytorchbot and others added 2 commits May 27, 2019 18:01

Merge remote-tracking branch 'origin/master' into HEAD

49602ce

Tried reverting back to std hashset

d0297c5

resistor reviewed May 28, 2019

torch/csrc/jit/passes/utils/memory_dag.cpp (outdated, resolved)

resistor reviewed May 28, 2019

c10/util/sparse_bitset.h (outdated, resolved)

Merge branch 'optimizeAliasAnalysis' of github.com:Chillee/pytorch into optimizeAliasAnalysis

4549d42

Chillee force-pushed the optimizeAliasAnalysis branch from 5f6f5c1 to 4549d42 Compare May 28, 2019 17:40

Added llvm math library for msvc support

1272912

Chillee force-pushed the optimizeAliasAnalysis branch from 7c3f3be to 1272912 Compare May 28, 2019 18:37

Chillee added 3 commits May 28, 2019 15:15

Changed back to ska::flat_hash_set

29e7b97

responded to review

9bb9a7e

Responded to more suggestions

99674d1

Chillee force-pushed the optimizeAliasAnalysis branch from 5798c2d to 99674d1 Compare May 29, 2019 02:23

suo approved these changes May 29, 2019

torch/csrc/jit/passes/utils/memory_dag.cpp (outdated, resolved)

torch/csrc/jit/passes/utils/memory_dag.cpp (outdated, resolved)

torch/csrc/jit/passes/utils/memory_dag.cpp (outdated, resolved)

Chillee force-pushed the optimizeAliasAnalysis branch 2 times, most recently from 010543f to 04484fc Compare May 30, 2019 00:40

Did some naming changes in response to review

b044739

Chillee force-pushed the optimizeAliasAnalysis branch from fc993fa to b044739 Compare May 30, 2019 01:03

facebook-github-bot reviewed May 30, 2019

facebook-github-bot closed this in 4163576 May 30, 2019

facebook-github-bot added the merged label May 31, 2019

mruberry added the Merged label Oct 28, 2020

6 participants
