Skip to content

Conversation

@apaszke
Copy link
Contributor

@apaszke apaszke commented Aug 13, 2018

This is the first of two changes that are supposed to improve how we handle RNNs in the JIT. They still get traced as PythonOps, but now it will be much easier to actually expose them to the JIT as e.g. aten::lstm, and ignore the Python interpreter entirely. This needs some symbolic adjustments that will be part of a second PR.

Even when we fix symbolics, there will still be a bit of a problem with statefulness of the cuDNN API (we need a mutable cache for the dropout state, but our IR has no way of representing that).

@zdevito @ezyang

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

@ezyang
Copy link
Contributor

ezyang commented Aug 14, 2018

Haven't finished yet. If you want to merge this AOT and fix up residual comments later I'm OK with that.

@apaszke
Copy link
Contributor Author

apaszke commented Aug 14, 2018

Thanks for a quick review, but I'll wait. I can find unrelated tasks to work on in the meantime.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apaszke has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apaszke has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Aug 15, 2018
Summary:
This is the first of two changes that are supposed to improve how we handle RNNs in the JIT. They still get traced as `PythonOp`s, but now it will be much easier to actually expose them to the JIT as e.g. `aten::lstm`, and ignore the Python interpreter entirely. This needs some symbolic adjustments that will be part of a second PR.

Even when we fix symbolics, there will still be a bit of a problem with statefulness of the cuDNN API (we need a mutable cache for the dropout state, but our IR has no way of representing that).

zdevito ezyang
Pull Request resolved: pytorch/pytorch#10481

Reviewed By: ezyang

Differential Revision: D9341113

Pulled By: apaszke

fbshipit-source-id: 0ae30ead72a1b12044b7c12369d11e5ca8ec30b5
@apaszke apaszke deleted the rnn_move branch August 17, 2018 15:54
@goldsborough goldsborough mentioned this pull request Aug 21, 2018
facebook-github-bot pushed a commit that referenced this pull request Aug 22, 2018
Summary:
The optimized code for `linear()` which uses `addmm` when a bias is given was duplicated three times in the ATen and the C++ API. Let's just have `at::linear` and use that everywhere.

apaszke ezyang (who mentioned this in #10481)
Pull Request resolved: #10755

Differential Revision: D9443881

Pulled By: goldsborough

fbshipit-source-id: a64862d1649b5961043d58401625ec267d97d9f3
PenghuiCheng pushed a commit to PenghuiCheng/pytorch that referenced this pull request Sep 11, 2018
Summary:
The optimized code for `linear()` which uses `addmm` when a bias is given was duplicated three times in the ATen and the C++ API. Let's just have `at::linear` and use that everywhere.

apaszke ezyang (who mentioned this in pytorch#10481)
Pull Request resolved: pytorch#10755

Differential Revision: D9443881

Pulled By: goldsborough

fbshipit-source-id: a64862d1649b5961043d58401625ec267d97d9f3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants