
Conversation

@wanchaol (Collaborator) commented May 1, 2019

Stack from ghstack:

Summary:
This PR makes the addmm decomposition happen later (after custom fusion, before batchmm), which lets custom fusion passes see high-level operator information (such as addmm and linear, instead of mm + add).
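For illustration, a minimal sketch of the intended pass ordering. The pipeline shape and the CustomFusion name are assumptions for this example, not code from the PR; DecomposeOps and BatchMM are the passes the summary refers to.

#include <memory>

struct Graph;  // stand-in for torch::jit::Graph in this sketch

// Hypothetical pass declarations (CustomFusion is an assumed name).
void CustomFusion(std::shared_ptr<Graph>& graph);
void DecomposeOps(std::shared_ptr<Graph>& graph);
void BatchMM(std::shared_ptr<Graph>& graph);

// The ordering this PR enables: decomposition runs after custom fusion
// (which can still see aten::addmm / aten::linear) and before BatchMM
// (which then operates on the resulting aten::mm nodes).
void runPasses(std::shared_ptr<Graph>& graph) {
  CustomFusion(graph);
  DecomposeOps(graph);
  BatchMM(graph);
}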

Test plan:
Test that the decomposition pass works.

Differential Revision: D15190355

split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
@wanchaol requested review from apaszke, bddppq, bwasti and zdevito May 2, 2019 01:08
@wanchaol changed the title split canonicalize_ops, make a decompose pass → [jit] split canonicalize_ops, make a decompose pass May 2, 2019
DecomposeOps(graph);
// The autodiff and decomposition passes will replace some parts of the graph with new graphs;
// these new graphs usually miss shape information on nodes, so we propagate shapes.
// PropagateInputShapes(graph);
Contributor

Why is this commented out?

@wanchaol (Collaborator, Author)
There are some issues with nn.Dropout when this is enabled; I'll need to look into that and will bring the pass back.

wanchaol added 5 commits May 5, 2019 21:15
…ake a decompose pass"

split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
…pass"

split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
@zdevito (Contributor) left a comment

I think the pass changes are good. Can you address the comments about using a compilation unit to load the code?

namespace torch {
namespace jit {

Value* decomposeOp(
Contributor

What is this supposed to do? If it is supposed to execute only once, it doesn't work: it allocates a std::once_flag each time through the function.

refactor:

static std::shared_ptr<Graph> addmm_graph = createAddMMGraph();
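For context, a hedged sketch of why the original pattern fails and why the one-liner above works. createAddMMGraph and the shape of the broken code are assumptions, not quotes from the diff:

#include <memory>
#include <mutex>

struct Graph;  // stand-in for torch::jit::Graph in this sketch
std::shared_ptr<Graph> createAddMMGraph();  // assumed factory

// Broken: the once_flag is a fresh stack object on every call, so
// std::call_once never sees the same flag twice and the initializer
// runs on each invocation.
std::shared_ptr<Graph> getAddMMGraphBroken() {
  std::once_flag flag;  // re-created per call; "once" never holds
  static std::shared_ptr<Graph> g;
  std::call_once(flag, [] { g = createAddMMGraph(); });
  return g;
}

// Fixed: a function-local static is initialized exactly once, and
// thread-safely, on first use (C++11 "magic static").
std::shared_ptr<Graph> getAddMMGraphFixed() {
  static std::shared_ptr<Graph> g = createAddMMGraph();
  return g;
}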

Contributor

I would also prefer that we add a constructor to CompilationUnit so this can just be written directly. I don't want it to end up being the case that everyone copy/pastes this pattern.

static script::CompilationUnit addmm("....");
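A sketch of what that could look like once such a constructor exists. The TorchScript body below is an illustrative addmm decomposition (valid when beta == 1 and alpha == 1), assumed for this example:

// Assumed: CompilationUnit gains a constructor that compiles the
// given TorchScript source once, at static-initialization time.
static script::CompilationUnit decompose_funcs(R"SCRIPT(
def addmm(self, mat1, mat2, beta: number = 1.0, alpha: number = 1.0):
    return self + mat1.mm(mat2)
)SCRIPT");

// The decomposition graph could then be looked up by name
// (get_function is an assumed accessor):
// auto& fn = decompose_funcs.get_function("addmm");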

// these new graphs usually consist of control flow and miss shape information on nodes, so we run shape
// prop and constant prop, as well as loop unrolling in some cases
PropagateInputShapes(graph);
ConstantPropagation(graph);
@eellison (Contributor) commented May 7, 2019

This should call peephole optimization as well, I think. And is there a reason not to just call into runOptimization?

@wanchaol (Collaborator, Author)

runOptimization has a bunch of other passes that we don't need to run; I just want to minimize the passes that need to be run. Plus, it does not run PropagateInputShapes, so I made a separate group of passes.

Contributor

Could you call PropagateInputShapes and then call into runOptimization? The downside of the current approach is that we're now maintaining two separate sets of optimizations, and you might miss optimizations that you think you don't need (as happened with prim::is_cuda because PeepholeOptimize wasn't called).

@wanchaol (Collaborator, Author)

Sounds legit. The only concern I had is that this will increase our compilation time quite a bit; that's why I wanted to separate the passes in the first place. But given that we are potentially missing a bunch of optimizations, I will change it to just run the full optimization :)
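In other words, the agreed resolution is roughly the following (a sketch; runOptimization's exact call site and signature are assumed here):

// After decomposition/autodiff rewrites, propagate shapes first, then
// reuse the shared optimization pipeline rather than a hand-picked
// subset, so passes like PeepholeOptimize are not accidentally skipped.
PropagateInputShapes(graph);
runOptimization(graph);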

wanchaol added 3 commits May 7, 2019 19:25
…ass"

split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
…it] split canonicalize_ops, make a decompose pass"

split canonicalize_ops, make a decompose pass

gh-metadata: pytorch pytorch 19988 gh/wanchaol/2/head
@wanchaol requested a review from zdevito May 8, 2019 17:45
@zou3519 deleted the gh/wanchaol/2/head branch May 9, 2019 00:24
@facebook-github-bot (Contributor)
@wanchaol merged this pull request in 4d676d5.


Labels

module: build (Build system issues) · oncall: jit (Add this issue/PR to JIT oncall triage queue)


7 participants