Call .wait_tensor() in compiled region for dist.Work created in eager region #2485

Closed

Microve wants to merge 1 commit into meta-pytorch:main from Microve:export-D64275115

Contributor

Microve commented Oct 13, 2024

Summary:
In a compiled region, instead of calling dist.Work.wait(), we call torch.ops._c10d_functional.wait_tensor() on the dist.Work's output tensor. This lets the wait_tensor() op be captured in the torch.compile graph (rather than causing a graph break on dist.Work.wait()), so the tensor is waited on properly within the graph.

This diff also depends on pytorch/pytorch#137763 to function properly.

Differential Revision: D64275115
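The dispatch described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `wait_on_collective`, `FakeWork`, and the two `_default_*` helpers are hypothetical names, and the defaults assume an installed PyTorch that provides `torch.compiler.is_compiling()` and `torch.ops._c10d_functional.wait_tensor()`. The sketch does not reproduce whatever pytorch/pytorch#137763 adds on the PyTorch side.

```python
from typing import Any, Callable


class FakeWork:
    """Hypothetical stand-in for dist.Work, used only to exercise the dispatch."""

    def __init__(self) -> None:
        self.waited = False

    def wait(self) -> bool:
        self.waited = True
        return True


def _default_is_compiling() -> bool:
    # Assumption: torch.compiler.is_compiling() is available in the installed PyTorch.
    import torch

    return torch.compiler.is_compiling()


def _default_wait_tensor(tensor: Any) -> Any:
    # Assumption: the functional-collectives op named in the summary is available.
    import torch

    return torch.ops._c10d_functional.wait_tensor(tensor)


def wait_on_collective(
    work: Any,
    output: Any,
    *,
    is_compiling: Callable[[], bool] = _default_is_compiling,
    wait_tensor: Callable[[Any], Any] = _default_wait_tensor,
) -> Any:
    """Wait for an async collective whose Work handle was created in eager mode."""
    if is_compiling():
        # Compiled region: wait on the output tensor with a traceable op
        # instead of calling work.wait(), which would graph-break.
        return wait_tensor(output)
    # Eager region: the usual blocking wait on the Work handle.
    work.wait()
    return output
```

The two callables are injectable so the branch logic can be checked with stubs, without a process group or a PyTorch install; in real use the defaults would be left in place.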

facebook-github-bot added the CLA Signed label

Contributor

facebook-github-bot commented Oct 13, 2024

This pull request was exported from Phabricator. Differential Revision: D64275115

facebook-github-bot added the fb-exported label

Microve force-pushed the export-D64275115 branch from a7df96c to 5c54383 on October 22, 2024 21:31

Microve pushed a commit to Microve/torchrec that referenced this pull request

5c54383: Call .wait_tensor() in compiled region for dist.Work created in eager region (meta-pytorch#2485)

Summary:

In a compiled region, instead of calling `dist.Work.wait()`, we call `torch.ops._c10d_functional.wait_tensor()` on the `dist.Work`'s output tensor. This lets the `wait_tensor()` op be captured in the torch.compile graph (rather than causing a graph break on `dist.Work.wait()`), so the tensor is waited on properly within the graph.

This diff also depends on pytorch/pytorch#137763 to function properly.

Reviewed By: Microve

Differential Revision: D64275115

Microve force-pushed the export-D64275115 branch from 5c54383 to c5137f6 on October 26, 2024 19:04

Microve force-pushed the export-D64275115 branch from c5137f6 to 31c90e3 on November 1, 2024 04:53

Microve force-pushed the export-D64275115 branch from 31c90e3 to fed098a on November 2, 2024 08:54

TroyGarden closed this

Labels

CLA Signed fb-exported