-
Notifications
You must be signed in to change notification settings - Fork 26.3k
Better hashing for constant pool #27733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
|
Cool! Can you paste of a snippet of a graph that would be affected by excessive hash collisions? Also, have you tried this PR on some of the models with really long compile times? |
c36eb9c to
d8a58b3
Compare
facebook-github-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
suo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, overall this looks fine to me.
|
@houseroad merged this pull request in eb52223. |
Summary: Some models many contain thousands constants (like list of ints) and Constant Pooling and CSE pass will move the constant around and update the constant pooling. However our existing hash function only consider the node type + input type + output node (https://bddppq.github.io/codebrowser/pytorch/pytorch/torch/csrc/jit/node_hashing.cpp.html#_ZNK5torch3jit8HashNodeclEPKNS0_4NodeE), which will have a lot of conflicts... I have profiled, one insert may take as long as about 0.2 second... And loading the model will take 200 second, which is insane. So we should fix this performance issue by considering the constant value as well to avoid the conflict. Pull Request resolved: pytorch#27733 Reviewed By: bddppq Differential Revision: D17873733 Pulled By: houseroad fbshipit-source-id: 2338d7bf67174a8e56caa19a30401199f68b592a
Some models many contain thousands constants (like list of ints) and Constant Pooling and CSE pass will move the constant around and update the constant pooling.
However our existing hash function only consider the node type + input type + output node (https://bddppq.github.io/codebrowser/pytorch/pytorch/torch/csrc/jit/node_hashing.cpp.html#_ZNK5torch3jit8HashNodeclEPKNS0_4NodeE), which will have a lot of conflicts... I have profiled, one insert may take as long as about 0.2 second... And loading the model will take 200 second, which is insane.
So we should fix this performance issue by considering the constant value as well to avoid the conflict.