
Conversation

@jjsjann123 (Collaborator) commented May 25, 2020

  1. The torch.jit.fuser(str) context manager facilitates switching between backend fusers (see the usage sketch below):
    'fuser0' enables only the legacy fuser;
    'fuser1' enables only NNC;
    'fuser2' enables only nvFuser.
  2. Clean up and update the Python tests.
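
A minimal usage sketch of the new context manager (the scripted function and tensor shapes are made up for illustration; 'fuser2' requires a CUDA device):

```python
import torch

@torch.jit.script
def fn(x, y):
    return torch.relu(x + y * 2.0)

x = torch.randn(4, 4)
y = torch.randn(4, 4)

# The previous fuser configuration is restored when the block exits.
with torch.jit.fuser('fuser1'):   # NNC only
    out = fn(x, y)

with torch.jit.fuser('fuser2'):   # nvFuser only; needs CUDA tensors
    out = fn(x.cuda(), y.cuda())
```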

@jjsjann123 jjsjann123 requested a review from csarofeen May 25, 2020 21:58
@jjsjann123 jjsjann123 requested a review from apaszke as a code owner May 25, 2020 21:58
@facebook-github-bot facebook-github-bot added the 'oncall: jit' label May 25, 2020
@dr-ci (bot) commented May 25, 2020

💊 CI failures summary and remediations

As of commit 9cefa09 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your pull requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.


@ezyang ezyang self-requested a review May 26, 2020 15:55
@ezyang ezyang added the 'triaged' label May 26, 2020
@ezyang ezyang requested review from soumith and removed request for ezyang May 26, 2020 15:55
@soumith (Contributor) left a comment


I don't want to see torch.jit.nvFuser.

The public details or names of our different fuser backends shouldn't be exposed to the users.

One option is to rename torch.jit.nvFuser to torch.jit.fuser('fuser0'), where fuser0 is the current default, fuser1 is nvFuser, and fuser2 is NNC.

Alternatively, if you just want this for development / testing purposes, have something like with torch.jit._fuser('nv').
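
A minimal sketch of the two spellings suggested above (the 'fuser0'/'fuser1'/'fuser2' mapping and the `_fuser('nv')` name are the reviewer's proposal at this point in the thread, not necessarily the merged API):

```python
import torch

# Option 1: a public switch keyed by generic backend names, so backend
# implementation names are not exposed to users.
with torch.jit.fuser('fuser0'):
    pass  # run scripted code under the selected fuser

# Option 2: a private, development/testing-only switch (hypothetical
# spelling quoted from the comment above; not a public API).
# with torch.jit._fuser('nv'):
#     pass
```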

@jjsjann123 jjsjann123 changed the title from "[nvFuser] nvFuser context manager" to "[nvFuser] add torch.jit.fuser context manager" May 28, 2020
@jjsjann123 (Collaborator, Author) commented

Merging master again, hoping the ROCm build error will go away.

@jjsjann123 jjsjann123 requested a review from soumith May 29, 2020 08:42
@soumith (Contributor) left a comment


One small change: I see you are using _nv_fuser; can you make it _nvfuser? It is one word.

@jjsjann123 jjsjann123 requested a review from soumith May 29, 2020 20:49
@jjsjann123 (Collaborator, Author) commented

The ROCm failure doesn't look related to me, and it doesn't want to go away after merging master. 😢

@facebook-github-bot (Contributor) left a comment


@soumith has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented

@soumith merged this pull request in 07518e1.

@ezyang (Contributor) commented Jun 3, 2020

FYI for future reference, the context manager was missing a docblock. Please make sure every added public API has a docblock.
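
A rough sketch of the kind of docblock being asked for on the new public context manager (the body is elided here; only the string-to-backend mapping comes from this PR):

```python
from contextlib import contextmanager

@contextmanager
def fuser(name):
    """Context manager that selects which JIT fusion backend is active.

    Args:
        name (str): 'fuser0' enables only the legacy fuser,
            'fuser1' enables only NNC,
            'fuser2' enables only nvFuser.
    """
    ...  # save the current fuser flags and enable the requested backend
    try:
        yield
    finally:
        ...  # restore the previous fuser flags
```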

jjsjann123 added a commit to csarofeen/pytorch that referenced this pull request Jun 4, 2020
Summary:
1. The `torch.jit.fuser(str)` context manager facilitates switching between backend fusers:
  'fuser0' enables only the legacy fuser;
  'fuser1' enables only NNC;
  'fuser2' enables only nvFuser.
2. Clean up and update the Python tests.
Pull Request resolved: pytorch#38993

Reviewed By: nairbv, pbelevich

Differential Revision: D21800620

Pulled By: soumith

fbshipit-source-id: 7fe855f5a5b97368e5e84c98c28d04b2e1276c85
jjsjann123 added a commit to csarofeen/pytorch that referenced this pull request Jun 5, 2020
csarofeen pushed a commit to csarofeen/pytorch that referenced this pull request Jun 12, 2020
csarofeen added a commit to csarofeen/pytorch that referenced this pull request Jun 18, 2020
* Simplify a few test cases

Replace custom exception checks with ASSERT_THROW macros.

* ExpressionEvaluator

* Stricter EvaluationContext binding rules

1. Don't allow overwriting concrete values
2. Don't allow binding values to expression results

* Fix clang-format errors

* Switch to Int::ScalarType

The expression evaluator is now using Int::ScalarType instead
of plain int.

* Avoid a fight with clang-tidy

* Check the numbers of kernel input and output parameters

* Add an optional arc from TensorView to its root domain

This is generated for detail_level >= DetailLevel::Explicit

* Checks kernel arguments

* Prefer pointers over references

* Bug fix

* Fix accidental construction of IValue

* Use noReduction

* Add const to const pointer

* Make an integer tensor an error as it is not yet supported

* clang-tidy

* Incorporate review feedback

* added lerp support in parser

* add missing addcmul parser and tests

* clang_format

* Return TensorView* from binary/compound/ternary ops

* clang-format

* Use TensorView* param in reductionOp and sum

* Prefer as instead of static_cast

* Transform replay refactor (#53)

The goal of this work is to make the transformation history specific to IterDomains instead of TensorDomains. This should make it a lot easier to match up IterDomains during replay, which can be complicated when reduction axes, rfactors, and broadcast axes are taken into consideration.

Co-authored-by: Jie <[email protected]>
Co-authored-by: Kevin Stephano <[email protected]>

* python test fixes (#52)

Fix Python test failures:

1. Put Fusion inside CudaKernel to facilitate runtime argument checks.
2. Relax the rank check for broadcast support in integration.
3. Add shape propagation for the newly added operations [addcmul, lerp].
4. Add a utility function to create a FusionGuard from a CudaKernel directly.

* [nvFuser] add torch.jit.fuser context manager (pytorch#38993) (#54)

Summary:
1. The `torch.jit.fuser(str)` context manager facilitates switching between backend fusers:
  'fuser0' enables only the legacy fuser;
  'fuser1' enables only NNC;
  'fuser2' enables only nvFuser.
2. Clean up and update the Python tests.
Pull Request resolved: pytorch#38993

Reviewed By: nairbv, pbelevich

Differential Revision: D21800620

Pulled By: soumith

fbshipit-source-id: 7fe855f5a5b97368e5e84c98c28d04b2e1276c85

* Add another reduction example, change fusion printMath.

* Small test fix.

* Change Reduction4 test to use TIDx.x

* Minor cleanup.

* Clean up some noexcepts.

* More cleanup.

* Refactor computeAt, get first broadcast example working.

* Validate first non-trivial broadcast kernel.

* Fix replay when broadcast is merged with non-broadcast dim.

* Add constness in replay and index compute.

* Add another broadcast test. Rework index computation for producers, based on consumer-computed indices.

* Val isConst fix.

* Add dot product gemm example.

* Clang.

* Minor bug fixes.

* Format and add comments to GEMM test.

* WIP: Fix for enabling broadcast after reduction plus a Softmax test. (#66)

* Fix for enabling broadcast after reduction plus a Softmax test.

* Cleaner way of fixing checks for matching non-broadcast dims to non-reduction dims.

* Clang.

Co-authored-by: Kevin Stephano <[email protected]>
Co-authored-by: Christian Sarofeen <[email protected]>

* Backout bad merge conflict resolutions.

* More post rebase cleanup.

* Refix a few tests. Some from a bad rebase.

* Address comments.

* Missed some review comments.

* tmp

Co-authored-by: Lemo <[email protected]>
Co-authored-by: Naoya Maruyama <[email protected]>
Co-authored-by: Jie <[email protected]>
Co-authored-by: Kevin Stephano <[email protected]>
Co-authored-by: Kevin Stephano <[email protected]>
kevinstephano added a commit to csarofeen/pytorch that referenced this pull request Jul 6, 2020

Labels

Merged · oncall: jit · open source · triaged
