@szagoruyko
Contributor

fix #586

@colesbury
Member

Looks good. Can you add a test to test_nn.py?

@soumith soumith merged commit 2d01f38 into pytorch:master Jan 25, 2017
@soumith
Contributor

soumith commented Jan 25, 2017

merging it as it's a critical fix, hoping that tests will come later :)

ashishfarmer pushed a commit to ashishfarmer/pytorch that referenced this pull request Mar 16, 2020
mrshenli pushed a commit to mrshenli/pytorch that referenced this pull request Apr 11, 2020
Update seq2seq_translation_tutorial.py
jjsjann123 pushed a commit to jjsjann123/pytorch that referenced this pull request Apr 11, 2021
* Separate interface for all exprs in a fusion as we should only be working on exprs required to produce outputs.

* Dropped off a nolint flag.

* Add some todos

* Remove mutate on fusion as it's never used and was accessing all exprs/vals in the fusion instead of those used to produce registered outputs.

* Remove traversing exprs in fusion not used to produce registered outputs in IterVisitor.

* Update tests.

* Minor cleanup.

* Re-enable printing all exprs in fusion.

* Minor cleanup.

* Cleanup IterVisitor.

* Refactor val origin so it's a member in Val.

* Move origin, is_output, and is_input to val member function, return nullptr origin if is_input.

* Refactor is_input/output to is_fusion_input/is_fusion_output.

* Refactor uses to be a member of Val instead of Fusion.

* Clear dead Exprs from TV->uses.

* Move fusion copy to a function that can return the ir_cloner used.

* Manual example with multiple kernels.

* Merge fixes.

* Basic mechanism to divide fusion into segments based on simple rule.

* Minor cleanup.

* Convert segments back to fusions.

* A lot of cleanup, running segmented fusion still in progress.

* Runtime for segmented fusion WIP.

* rename segment file

* refactor scheduler interface

* refactor; use heuristics matching;

* fix fusion logic and multifusion runtime

* add scheduler registry

* add normalization detection

* integrate scheduler registry in fusionSegRT

* use scheduling matching for fusion segment(WIP)

* cleanup and bug fix

* minor cleanup

* merge fix

* segment fusion only if orig fusion cannot schedule

* clang-format & clang-tidy

* clang-tidy

* clang-tidy

* clang-tidy

* rename;comment;refactor caching;

* minor cleanups

* minor cleanup

* allow mismatched broadcast normalizationSchedule

* allow mismatched broadcast in normalization

* rework red and norm canSchedule; minor fix

* style fix; unify debug print

* clang-tidy

* clang format

* clang format

* clang-tidy

* minor cleanup

Co-authored-by: shmsong <[email protected]>
KyleCZH pushed a commit to KyleCZH/pytorch that referenced this pull request Sep 20, 2021

Development

Successfully merging this pull request may close these issues.

CUDNN batchnorm backprop doesn't work properly in evaluation mode

3 participants