
Conversation

@bobrenjc93 (Contributor) commented Oct 22, 2024

Stack from ghstack (oldest at bottom):

As discussed with @ezyang, this set of diffs extracts fixes for problems discovered while flipping `specialize_float=False` in #137782. Since these codepaths are exercised by existing tests, I'm going to bias towards shipping speed and put these up with the global CI as the primary test plan. These code paths are all covered by existing tests when `specialize_float=False`, and it feels a bit wonky to add more gated tests that only exercise behavior behind this flag, especially since these code paths are already covered. That said, I'm happy to add individual tests if reviewers insist or have a different POV.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @rec
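
For concreteness, a minimal sketch of what one of those gated tests could look like (hypothetical test body; `f` is made up for illustration, and `torch._dynamo.config.patch` flips the flag for just the decorated function):

import torch
import torch._dynamo.config

# Hypothetical gated test: only meaningful with specialize_float=False.
@torch._dynamo.config.patch(specialize_float=False)
def check_unspecialized_float():
    def f(x):
        return x * 2.0

    cf = torch.compile(backend="eager", fullgraph=True)(f)
    # With specialize_float=False the float input is traced symbolically
    # instead of being burned into the graph as a constant.
    assert f(1.5) == cf(1.5)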

@pytorch-bot (bot) commented Oct 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138598

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 9c706af with merge base 2e48788:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

bobrenjc93 added a commit that referenced this pull request Oct 22, 2024
@bobrenjc93 added the topic: not user facing label Oct 22, 2024
@bobrenjc93 marked this pull request as ready for review October 22, 2024 19:35
@ezyang (Contributor) left a comment:

Test too!

     tx, torch.scalar_tensor
-).call_function(tx, [self], {})
+).call_function(
+    tx, [self], {"dtype": VariableTracker.build(tx, torch.float64)}
Contributor:

You can't do it this way, right? You need to handle int and bool too.

Contributor Author:

I must have misunderstood your comment here: https://github.com/pytorch/pytorch/pull/137782/files#r1809259927

I thought you meant this code path would only be exercised by symfloats, so the conditional was unnecessary. Now I'm curious: what did you mean by the comment? Did you mean to enumerate all the possible dtypes?

I'll push up a new commit with the conditionals added back and add a test.

Contributor:

In your old PR you specifically tested for float and only passed the dtype arg when it was a float. My suggestion was that you should always compute the right dtype (whether it is a float or not) and always pass it in, rather than ever relying on type inference.
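
In plain Python, the suggestion amounts to something like the sketch below (a rough illustration only; the real fix goes through Dynamo's VariableTracker machinery, and scalar_to_tensor is a made-up helper name):

import torch

def scalar_to_tensor(value):
    # Always derive an explicit dtype from the Python type and pass it
    # through, instead of relying on torch.scalar_tensor's inference.
    # bool must be checked before int, since bool is a subclass of int.
    if isinstance(value, bool):
        dtype = torch.bool
    elif isinstance(value, int):
        dtype = torch.int64
    elif isinstance(value, float):
        dtype = torch.float64
    else:
        raise TypeError(f"unsupported scalar type: {type(value)}")
    return torch.scalar_tensor(value, dtype=dtype)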

Contributor Author:

Gotcha, that makes sense. I've updated the PR accordingly.

cf = torch.compile(backend="eager", fullgraph=True)(f)
x = 1.1234567891234568
y = 1.1234567891234569
self.assertAlmostEqual(f(x, y), cf(x, y))
Contributor Author:

I was shocked to find that if I used assertEqual here instead of assertAlmostEqual, the test would erroneously pass without my fix. I still haven't found a good explanation for why, since it seems like under the hood assertEqual should use ==.

If you print(f(x, y), cf(x, y)), you will see that they differ.

If you assertTrue(f(x, y) == cf(x, y)), it will fail.

But if you assertEqual(f(x, y), cf(x, y)), it will pass...

Reviewers, if you happen to know and could enlighten me on what's going on, that'd be much appreciated!

Contributor:

assertEqual has built-in rtol/atol. If you want to test bitwise equivalence, use frexp.
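
A minimal sketch of that trick (assert_bitwise_equal is a made-up helper, not a PyTorch API; it assumes finite floats):

import math

def assert_bitwise_equal(a: float, b: float) -> None:
    # math.frexp splits a float into (mantissa, exponent); comparing the
    # tuples with plain == sidesteps the rtol/atol tolerance that torch's
    # assertEqual applies to raw floats.
    assert math.frexp(a) == math.frexp(b), f"{a!r} != {b!r}"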

@bobrenjc93 requested a review from ezyang October 23, 2024 02:01
bobrenjc93 added a commit that referenced this pull request Oct 23, 2024
@ezyang (Contributor) left a comment:

sure

@bobrenjc93 (Contributor Author):

@pytorchbot merge

@pytorch-bot (bot) added the ciflow/trunk label Oct 24, 2024
@pytorchmergebot (Collaborator):

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator):

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@bobrenjc93 (Contributor Author):

@pytorchbot merge -f "ci failures unrelated"

@pytorchmergebot (Collaborator):

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as a last resort; instead, consider -i/--ignore-current to continue the merge while ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

wdvr added a commit that referenced this pull request Oct 27, 2024
@github-actions (bot) deleted the gh/bobrenjc93/77/head branch November 25, 2024 02:10
