Conversation

@bohnstingl
Collaborator

This is part of a series of PRs to improve the functionality of associative_scan. This specific PR fixes issues with the current vmap implementation. It has been derived from #129307.

@ydwu4 @Chillee @zou3519

@pytorch-bot

pytorch-bot bot commented Aug 8, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133013

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 2b29abd with merge base 2ba60a1:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@zou3519 zou3519 requested review from Chillee, ydwu4 and zou3519 August 9, 2024 12:32
@zou3519 zou3519 added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Aug 9, 2024

with torch._dynamo.utils.disable_cache_limit():
associative_scan1 = torch.compile(
torch.vmap(associative_scan_fct, in_dims=0), fullgraph=True
Contributor


Currently, this seems to be testing torch.compile x vmap x associative_scan vs vmap x associative_scan. We should probably also test a simpler version, vmap x associative_scan vs associative_scan?

Collaborator Author


Yes, this is correct. I split this into two test cases: one for torch.compile x vmap x associative_scan vs associative_scan, and one for vmap x associative_scan vs associative_scan. However, there is a problem when using vmap x associative_scan:

torch._dynamo.exc.Unsupported: torch.func.vmap(fn) requires the function to be inlined by dynamo

Do you have any suggestions how to handle this?


res = associative_scan_op(combine_fn, input_unwrapped, dim + 1)
with interpreter.lower():
res = associative_scan_op(combine_fn, input_unwrapped, dim + 1)
Contributor


How are vmap's in_dims and out_dims handled? Can you add some tests to convince me this is correct?

Collaborator Author


There is now a requirement that the scan dim and the in_dims of vmap need to be different. When vmapping, the batch dimension is moved to the 0-th dimension and thus the associative_scan is performed on dim + 1. The result is then concatenated together and the 0-th dimension is moved to the out_dims specified for the vmap. I added a dedicated test case for this behavior. Does this convince you?
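To make that concrete, here is a minimal, self-contained sketch of the dimension handling; torch.cumsum stands in for the real associative_scan_op, and the function name is made up for illustration:

import torch

def scan_batch_rule_sketch(x, bdim, scan_dim):
    # Illustrative only: move the vmapped (batch) dim to the front, run the
    # scan on scan_dim + 1, and report 0 as the new batch dim so that vmap
    # can move it to whatever out_dims requests.
    x = torch.movedim(x, bdim, 0)
    res = torch.cumsum(x, dim=scan_dim + 1)  # cumsum stands in for associative_scan_op
    return res, 0

x = torch.arange(12.0).reshape(3, 4)
out, out_bdim = scan_batch_rule_sketch(x, bdim=1, scan_dim=0)
print(out.shape, out_bdim)  # torch.Size([4, 3]) 0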

@bohnstingl bohnstingl requested a review from ydwu4 August 15, 2024 20:18
@unittest.skipIf(not SM70OrLater, "triton")
@unittest.skipIf(not torch.cuda.is_available(), "Test requires CUDA.")
@parametrize("device", [torch.device("cuda")])
def test_pointwise_associative_scan_vmap(self, device):

Comment on lines 208 to 209
if dim in input_bdims:
    raise ValueError("Vmap in_dim may not coincide with dim of associative_scan")
Contributor


Is it ever possible for a user to hit this error?

Collaborator Author


Yes, the user can hit this error if the call is invoked as

def associative_scan_fct(x):
    return associative_scan(add, x, 0, reverse=reverse)

associative_scan1 = torch.vmap(associative_scan_fct, in_dims=0, out_dims=0)

The point here is that the user has full control over which dimension is the scan dimension and which is the vmap dimension. The two may coincide, which is what is prevented here.

Contributor

@zou3519 zou3519 Aug 19, 2024


The dimension passed to scan inside the vmap is different from the in_dims in vmap. Inside the vmap, dimension 0 actually means "the 0th dimension not including the vmapped dimension". Based on that I am not sure we can actually hit the assertion -- the vmapped dimension should always be different from the scan dimension.
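For example (with torch.cumsum standing in for associative_scan, since the dimension semantics are the same):

import torch

x = torch.arange(12.0).reshape(3, 4)

def f(row):
    # Inside vmap the batched dim is hidden: `row` is 1-D even though x is 2-D,
    # so dim=0 here refers to the length-4 dimension, not the vmapped one.
    assert row.dim() == 1
    return torch.cumsum(row, dim=0)

out = torch.vmap(f, in_dims=0)(x)
print(out.shape)  # torch.Size([3, 4]); the scan ran along dim 1 of x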

Collaborator Author


Hmm, then I think something may be off here. This code snippet

x = torch.tile(
    torch.unsqueeze(
        torch.arange(
            0, 10, device=device, dtype=torch.float32, requires_grad=True
        ),
        0,
    ),
    (4, 1),
)
torch.compiler.reset()

def associative_scan_fct(x):
    return associative_scan(add, x, 0, reverse=reverse)

associative_scan1 = torch.compile(
    torch.vmap(associative_scan_fct, in_dims=1, out_dims=1), fullgraph=True
)

result1 = associative_scan1(x)

calls vmap with the input x of shape 4x10, in_dims=1, and a scan dim of 0. The result is that input_bdims=[1] in associative_scan_batch_rule. Now, when I change to in_dims=0, then input_bdims=[0] and the scan dim is 0.

Contributor

@zou3519 zou3519 left a comment


At a high level this looks reasonable to me. I suggested additional test cases.

@bohnstingl
Collaborator Author

@zou3519 thank you for looking at the code. I will add the additional test cases as you suggested.
However, I have two questions I would love to get your opinion on:

  1. There is an issue when only using vmap with associative_scan. For example, the test case test_pointwise_associative_scan_vmap fails with the error
torch._dynamo.exc.Unsupported: If you are reaching here, it means dynamo failed for one of the following reasons:
- Calling torch.func.vmap(compiled_fn) function from eager mode is not supported. Ensure that torch.func.vmap is also wrapped within a torch.compile function. For more information, see PyTorch issue #128711.
- torch.func.vmap(fn) requires the function to be inlined by dynamo
  2. I integrated the reverse flag from this PR and I stumbled over the following issue. When using vmap with associative_scan, e.g., in the test case test_pointwise_associative_scan_vmap_comp, the flip operation in the case of reverse=True causes the shape of the leaves to change, and the bdim is not the same as for reverse=False.
    For example, for the test case test_pointwise_associative_scan_vmap_comp, inside associative_scan_batch_rule:
    the input shape is 4x10 and the bdim is 1 for reverse=False, while for reverse=True the shape is 10x4 and the bdim is 0. If I just remove the flip operation and execute leaves = [elem for elem in leaves], the bdim does not change.

@zou3519
Contributor

zou3519 commented Aug 20, 2024

For (2) -- there's no guarantee that the bdim passed to the operator is the same, even for the same operator. The value of the bdim is an implementation detail of vmap. Because of this, many vmap rules will permute the bdim to the front of the tensor.
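Roughly, that normalization looks like this (a sketch, not functorch's actual helper):

import torch

def bdim_to_front(x, bdim):
    # Sketch of the common batch-rule normalization: whatever bdim vmap
    # happened to pass, move it to dimension 0 and report 0 as the new bdim,
    # so the rest of the rule never depends on vmap's choice.
    if bdim is None:
        return x, None
    return torch.movedim(x, bdim, 0), 0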

@zou3519
Contributor

zou3519 commented Aug 20, 2024

For (1): this is a known issue (#134000). For the additional test cases I suggested, we could refactor opinfo_vmap_test to apply compile. I'm not sure how to resolve this; it needs more thinking.

@bohnstingl
Collaborator Author

Thank you @zou3519 for your comments.

For (2) -- there's no guarantee that the bdim passed to the operator is the same, even for the same operator. The value of the bdim is an implementation detail of vmap. Because of this, many vmap rules will permute the bdim to the front of the tensor

If for vmap there is no guarantee that the same bdim is used, then the flip operation would be problematic, as it relies on dim always being the same. So if with vmap x associative_scan, associative_scan is invoked once with 4x10, bdim=1 and once with 10x4, bdim=0, then the flip with a fixed dim wouldn't work?
Is there a way to detect the vmap call in associative_scan, for example with is_batchedtensor? Then the flip could be deferred to associative_scan_batch_rule, where the bdim is always moved to the 0-th dimension. However, this currently fails with
torch._dynamo.exc.Unsupported: torch.* op returned non-Tensor bool call_function <built-in method is_batchedtensor of PyCapsule object at 0x7fb6a0a25f80>
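For reference, the deferral I have in mind would look roughly like this (a sketch only; torch.cumsum stands in for associative_scan_op and the function name is made up):

import torch

def reversed_scan_batch_rule_sketch(x, bdim, scan_dim):
    # Rough sketch: once the batch dim is moved to 0, the flip/scan dimension
    # is always scan_dim + 1, independent of which bdim vmap picked.
    x = torch.movedim(x, bdim, 0)
    x = torch.flip(x, dims=[scan_dim + 1])
    res = torch.cumsum(x, dim=scan_dim + 1)  # stand-in for associative_scan_op
    return torch.flip(res, dims=[scan_dim + 1]), 0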

The second problem I am facing is that test_vmap.py raises:
torch._dynamo.exc.UserError: Dynamic control flow is not supported at the moment. Please use functorch.experimental.control_flow.cond to explicitly capture the control flow. For more information about this error, see: https://pytorch.org/docs/main/generated/exportdb/index.html#cond-operands

In general, I am struggling quite a bit with test_vmap.py, as it runs thousands of tests and it is difficult for me to isolate and debug my newly added tests. Is there a way to specifically run only the associative_scan part, to better figure out what's wrong?

@bohnstingl bohnstingl requested a review from zou3519 August 22, 2024 13:36
@zou3519
Contributor

zou3519 commented Aug 26, 2024

In general, I am struggling quite a bit with test_vmap.py, as it runs thousands of tests and it is difficult for me to isolate and debug my newly added tests. Is there a way to specifically run only the associative_scan part, to better figure out what's wrong?

python test/test_vmap.py -v -k "your_test_name_here"

@bohnstingl
Collaborator Author

@zou3519 since my understanding is still that vmap cannot guarantee that the dimensions are not "shuffled", and I cannot detect the vmap case in associative_scan using is_batchedtensor, I don't know how to handle the reverse flag. Thus, I marked the tests that involve reverse and vmap as skipped.

Furthermore, I have some questions regarding test_vmap.py. I tried to adjust opinfo_vmap_test to account for torch.compile and I run python test_vmap.py -v -k test_vmap_exhaustive_associative_scan_cuda_float32. However, I was unsuccessful in implementing the test properly, as this part

vmapvmap_output = torch.compile(vmap(
        vmap(f, inner_in_dims, out_dims=out_dim), outer_in_dims, out_dims=out_dim
    ))(dummy, *batched_args, **kwarg_values)

causes issues for the associative_scan, which I am not sure how to resolve. In particular:

torch._dynamo.exc.TorchRuntimeError: Failed running call_function <built-in method _remove_batch_dim of PyCapsule object at 0x7fd605232d00>(*(BatchedTensor(lvl=1, bdim=0, value=
    FakeTensor(..., device='cuda:0', size=(2, s1, s2, s3))
), 2, 1, 0), **{}):
Cannot call sizes() on tensor with symbolic sizes/strides
Exception raised from throw_cannot_call_with_symbolic at /data_malta3_ssd/pytorch_git/c10/core/TensorImpl.cpp:298 (most recent call first):

Using

vmapvmap_output = torch.compile(vmap(
        torch.compile(vmap(f, inner_in_dims, out_dims=out_dim)), outer_in_dims, out_dims=out_dim
    ))(dummy, *batched_args, **kwarg_values)

raises a shape mismatch, because when associative_scan_batch_rule is called, we don't explicitly add a dimension in case no bdim is given to vmap.
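The kind of fix I imagine (a sketch only, with made-up names) would explicitly materialize a batch dim for unbatched leaves so every input agrees on where the batch dim lives:

import torch

def maybe_add_batch_dim(x, bdim, batch_size):
    # Sketch: leaves that vmap did not batch (bdim is None) get an explicit
    # batch dim of the right size at the front; batched leaves have their
    # bdim moved to the front. Signature and names are illustrative only.
    if bdim is None:
        return x.unsqueeze(0).expand(batch_size, *x.shape)
    return torch.movedim(x, bdim, 0)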

Could you please take a look and help me out?

@github-actions
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Oct 27, 2024
@github-actions github-actions bot closed this Nov 26, 2024
