
Conversation

@ysiraichi
Collaborator

@ysiraichi ysiraichi commented Nov 11, 2024

Stack from ghstack (oldest at bottom):

Tracking issue: #138399

This PR changes the `pow` C++ implementation, making its C++ meta kernel consistent with
its Python ref implementation. The following example shows the inconsistency between the
two:

```python
def run(device):
    S = (5,)
    a = torch.rand(S, device=device, dtype=torch.float32)
    b = 2
    out = torch.empty(S, device=device, dtype=torch.float64)
    return torch.pow(a, b, out=out)

>>> run("cpu")
Traceback (most recent call last):
  File "test.py", line 34, in run
    return torch.pow(a, b, out=out)
RuntimeError: Found dtype Double but expected Float

>>> run("meta")
tensor(..., device='meta', size=(5,), dtype=torch.float64)
```
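To make the divergence concrete, it can be modeled in plain Python (a hypothetical sketch, not the actual PyTorch source): the CPU path required the `out` dtype to match the computed dtype exactly, while the old meta kernel performed no such check.

```python
# Hypothetical sketch (not the PyTorch source): modeling the divergent
# dtype checks that produced the behavior above.

def cpu_check(computed_dtype, out_dtype):
    # CPU kernel: requires an exact dtype match with the `out` tensor
    if computed_dtype != out_dtype:
        raise RuntimeError(
            f"Found dtype {out_dtype} but expected {computed_dtype}")

def old_meta_check(computed_dtype, out_dtype):
    # old meta kernel: accepted any `out` dtype without complaint
    pass

# float32 tensor ** int scalar promotes to float32, but `out` is float64:
try:
    cpu_check("Float", "Double")
    cpu_raised = False
except RuntimeError:
    cpu_raised = True

old_meta_check("Float", "Double")  # silently accepted
print(cpu_raised)  # -> True
```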

**~Update:~**

~Note that this happens only for `pow.Tensor_Scalar` overloads. Therefore, this PR needed
two further modifications:~

  • ~Split the `pow` ref implementation, making `pow.Tensor_Scalar` error on mismatching
    output dtypes~
  • ~Create a dispatch for `pow` when `_refs.pow()` is called~

Update:

Changing the `TensorIteratorConfig` for `pow.Tensor_Scalar` was easier and,
after the discussion below, more correct. The solution was to change the
`TensorIteratorBase::build_output_borrowing_argument_owning_unary_op` function,
setting:

  • `cast_common_dtype_to_outputs`; and
  • `enforce_safe_casting_to_output`.
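For intuition, the effect of those two flags can be modeled in plain Python (a rough sketch under my own naming, not the C++ `TensorIterator` code): the op runs in the common dtype, and the result is then cast into the output's dtype only when the cast does not drop to a lower "type kind".

```python
# Rough Python model (hypothetical, not the C++ implementation) of the two
# TensorIteratorConfig flags:
#   cast_common_dtype_to_outputs  -> compute in the common dtype, then cast
#   enforce_safe_casting_to_output -> reject casts to a lower "type kind"

TYPE_KIND = {"bool": 0, "int64": 1, "float32": 2, "float64": 2, "complex64": 3}

def can_safe_cast(src_dtype, dst_dtype):
    # a cast is "safe" when the destination kind is not lower than the source
    return TYPE_KIND[dst_dtype] >= TYPE_KIND[src_dtype]

def pow_tensor_scalar(value, exponent, common_dtype, out_dtype):
    if not can_safe_cast(common_dtype, out_dtype):
        raise RuntimeError(
            f"result type {common_dtype} can't be cast to {out_dtype}")
    result = value ** exponent        # computed in common_dtype
    return result, out_dtype          # then safe-copied into `out`

# float32 -> float64 is a same-kind widening, so it is now accepted:
res, dtype = pow_tensor_scalar(0.5, 2, "float32", "float64")

# float32 -> int64 would lower the type kind, so it is rejected:
try:
    pow_tensor_scalar(0.5, 2, "float32", "int64")
    rejected = False
except RuntimeError:
    rejected = True
```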

@ysiraichi ysiraichi requested a review from mruberry as a code owner November 11, 2024 18:22
@pytorch-bot

pytorch-bot bot commented Nov 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140287

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 0a69f95 with merge base f4ce9ac:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

ysiraichi added a commit that referenced this pull request Nov 12, 2024
Tracking issue: #138399

This PR changes the `pow` ref implementation, making its meta kernel consistent with its
CPU implementation. The following example shows the inconsistency between the two:

```python
def run(device):
    S = (5,)
    a = torch.rand(S, device=device, dtype=torch.float32)
    b = 2
    out = torch.empty(S, device=device, dtype=torch.float64)
    return torch.pow(a, b, out=out)

>>> run("cpu")
Traceback (most recent call last):
  File "test.py", line 34, in run
    return torch.pow(a, b, out=out)
RuntimeError: Found dtype Double but expected Float

>>> run("meta")
tensor(..., device='meta', size=(5,), dtype=torch.float64)
```

ghstack-source-id: 5371f04
Pull Request resolved: #140287
@ysiraichi
Collaborator Author

While the old version of this PR did make the meta implementation of `pow.Tensor_Scalar` overloads consistent with its C++ version, I didn't think that was the right way to go.

According to the developer FAQ:

> For operations that do not participate in type promotion the device and dtype of the source and destination tensors must match. For operations that do participate in type promotion the copy can be to a different dtype, but the destination of the copy cannot be a lower "type kind" than the source.

This means that, since `pow.Tensor_Scalar` does participate in dtype promotion (it calls `at::result_type(...)`), we can safely copy to an output tensor of a different dtype (though not of a lower kind). It follows that the C++ meta function should be changed (the current state of this PR).

Note: while this implementation works, we are running the whole computation in the dtype of the output argument, instead of just safe-copying the results. Technically, I think this does not reflect the developer FAQ specification. For that, we would need to change `TensorIterator` a bit.
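The numerical difference between the two strategies can be shown without torch (plain-Python sketch; `to_f32` is a helper introduced here to round a double to float32 precision): squaring a float32 value in float64 keeps low-order bits that a float32 computation rounds away.

```python
# Plain-Python illustration (no torch): computing in the wider out dtype vs
# computing in the input dtype and then safe-copying are not bit-identical.
import struct

def to_f32(x):
    # helper (introduced here): round a Python float (float64) to float32
    return struct.unpack("f", struct.pack("f", x))[0]

a = to_f32(1.0 / 3.0)        # a float32 input value
in_out_dtype = a * a         # whole computation done in float64 (out dtype)
safe_copied = to_f32(a * a)  # computed in float32, then copied into float64

assert in_out_dtype != safe_copied  # the strategies differ in low-order bits
```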

@ezyang @amjames Any thoughts on this?

@amjames
Collaborator

amjames commented Nov 12, 2024

I agree that the behavior of running the entire op in the wider dtype does not seem to match the spec outlined in the developer FAQ; however, I suppose it does preserve the user-visible behavior. On the other hand, modifying `TensorIterator` to handle this situation more correctly would be a fairly invasive change and may have many side effects.

I suppose we could try making that modification to TensorIterator here and see what happens, but what you have here looks okay to me.

ysiraichi added a commit that referenced this pull request Nov 12, 2024
ghstack-source-id: 67a3ec5
Pull Request resolved: #140287
@ysiraichi
Collaborator Author

I think I figured out how to change `TensorIterator` without being invasive! As far as I understand, the `build_output_borrowing_argument_owning_unary_op` function is used only for `pow`, which is very convenient. So, I just had to tweak it so that it would do a safe copy to the output tensor.

Contributor

@ezyang ezyang left a comment


beh looks like this is a semantics change though :P

ysiraichi added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: b902a55
Pull Request resolved: #140287
@ysiraichi
Collaborator Author

While it does change the semantics, in the sense that we no longer expect the output tensor to be of an exact dtype, I think it brings us closer to the `out=` specification.

@ysiraichi
Collaborator Author

The CI failure is unrelated to this PR.

@ysiraichi
Collaborator Author

@pytorchbot merge -i

@pytorch-bot

pytorch-bot bot commented Nov 19, 2024

This PR needs to be approved by an authorized maintainer before merge.

@ysiraichi
Collaborator Author

@ezyang Could you take a look at this PR? I have commented on the semantic change. I don't think this would break anything, though, since the change only makes `pow` less strict regarding the output tensor's dtype.

@ezyang
Contributor

ezyang commented Nov 20, 2024

@pytorchbot merge -r

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 20, 2024
@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased gh/ysiraichi/71/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/140287)

pytorchmergebot pushed a commit that referenced this pull request Nov 20, 2024
ghstack-source-id: ce865dd
Pull Request resolved: #140287
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@ysiraichi
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
Pull Request resolved: pytorch#140287
Approved by: https://github.com/ezyang
@github-actions github-actions bot deleted the gh/ysiraichi/71/head branch December 21, 2024 02:05