
Conversation

@ezyang
Contributor

@ezyang ezyang commented Sep 8, 2024

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented Sep 8, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135429

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 5cbd75f with merge base 94e341c:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added ciflow/inductor release notes: fx release notes category labels Sep 8, 2024
ezyang added a commit that referenced this pull request Sep 8, 2024
Signed-off-by: Edward Z. Yang <[email protected]>

ghstack-source-id: 2dc16a9
Pull Request resolved: #135429
@albanD albanD removed their request for review September 13, 2024 19:38
isuruf pushed a commit to isuruf/pytorch that referenced this pull request Sep 23, 2024
Signed-off-by: Edward Z. Yang <[email protected]>

ghstack-source-id: 2dc16a9
Pull Request resolved: pytorch#135429
[ghstack-poisoned]
ezyang added a commit that referenced this pull request Sep 25, 2024
Signed-off-by: Edward Z. Yang <[email protected]>

ghstack-source-id: eee20ea
Pull Request resolved: #135429
@ezyang ezyang added topic: not user facing topic category ciflow/trunk Trigger trunk jobs on your pull request labels Sep 25, 2024
@ezyang
Contributor Author

ezyang commented Sep 27, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@malfet
Contributor

malfet commented Sep 27, 2024

@ezyang do you want to merge -f it if you are confident it'll fix the problem? (I'm still not sure if it was caused by this or previous PR in the stack)

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: inductor-periodic / cuda12.1-py3.10-gcc9-sm80 / test (inductor_torchbench_smoketest_perf, 1, 1, linux.gcp.a100)

Details for Dev Infra team Raised by workflow job

@ezyang
Contributor Author

ezyang commented Sep 27, 2024

I confirmed the previous commit on the stack broke that particular test case, but I'm not in any particular hurry for this one, so sure, let's wait for validation again.

@ezyang
Contributor Author

ezyang commented Sep 27, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens when two merge requests are issued for the same PR, or when the merge job has been waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@ezyang
Contributor Author

ezyang commented Sep 28, 2024

@pytorchbot merge -f "looks fine"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.


@ezyang
Contributor Author

ezyang commented Sep 30, 2024

@pytorchbot revert -c nosignal -m "apparently this breaks executorch"

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.

@pytorchmergebot
Collaborator

Reverting PR 135429 failed

Reason: Command git -C /home/runner/work/pytorch/pytorch revert --no-edit 1d6e0412f5205b1cd709e034526d7f21d6f2d56f returned non-zero exit code 1

Auto-merging test/dynamo/test_misc.py
Auto-merging torch/fx/experimental/symbolic_shapes.py
CONFLICT (content): Merge conflict in torch/fx/experimental/symbolic_shapes.py
error: could not revert 1d6e0412f5... Don't uselessly recompute axiom dict every static eval call (#135429)
hint: After resolving the conflicts, mark them with
hint: "git add/rm <pathspec>", then run
hint: "git revert --continue".
hint: You can instead skip this commit with "git revert --skip".
hint: To abort and get back to the state before "git revert",
hint: run "git revert --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Details for Dev Infra team Raised by workflow job
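For anyone who hits the same state, the hint block above can be walked through end to end in a scratch repository. This is a minimal sketch, not the PyTorch history: the file name, commit messages, and identity below are invented for illustration.

```shell
#!/bin/sh
# Reproduce a conflicting `git revert` and resolve it the way the hints above describe.
# Everything runs in a throwaway repo created by mktemp.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "[email protected]"
git config user.name "Example"

echo "line A" > f.txt; git add f.txt; git commit -qm "base"
echo "line B" > f.txt; git commit -qam "change to be reverted"
echo "line C" > f.txt; git commit -qam "later conflicting change"

# Reverting the middle commit conflicts with the later change to the same line:
git revert --no-edit HEAD~1 || echo "revert conflicted, as expected"

# Resolve the conflict, mark it resolved, then continue the revert:
echo "line A" > f.txt
git add f.txt
GIT_EDITOR=true git revert --continue

git log --oneline -1   # the newest commit is the completed revert
```

Running `git revert --abort` at the conflict point would instead restore the pre-revert state, and `git revert --skip` would drop the commit from the sequence, as the hints note.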

@ezyang
Contributor Author

ezyang commented Sep 30, 2024

@pytorchbot revert -c nosignal -m "try again"

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.

@pytorchmergebot
Collaborator

@ezyang your PR has been successfully reverted.

@ezyang
Contributor Author

ezyang commented Oct 2, 2024

Notes on the executorch failure: It's whack. This is what the logs should be:

INFO:torch.fx.experimental.symbolic_shapes:create_symbol s1 = 2 for __meta_utils_unknown_tensor105.size()[0] [2, int_oo] (executorch/exir/pass_base.py:238 in make_val), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s1"
INFO:torch.fx.experimental.symbolic_shapes:create_symbol s2 = 3 for __meta_utils_unknown_tensor105.size()[2] [2, int_oo] (executorch/exir/pass_base.py:238 in make_val), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s2"
INFO:torch.fx.experimental.symbolic_shapes:set_replacement s1 = 2 (range_refined_to_singleton) VR[2, 2]
INFO:torch.fx.experimental.symbolic_shapes:eval Eq(2, s1) [guard added] (_subclasses/fake_impls.py:715 in conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Eq(2, s1)"
INFO:torch.fx.experimental.symbolic_shapes:eval 6 >= s2 [guard added] (_subclasses/fake_impls.py:715 in conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="6 >= s2"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(7 - s2, 1) [guard added] (_prims_common/__init__.py:457 in compute_elementwise_output_logical_to_physical_perm), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(7 - s2, 1)"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(-2*s0*s2 + 14*s0, 0) [guard added] (_prims_common/__init__.py:454 in compute_elementwise_output_logical_to_physical_perm), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(-2*s0*s2 + 14*s0, 0)"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(7 - s2, 1) [guard added] (_prims/__init__.py:1285 in _broadcast_in_dim_meta), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(7 - s2, 1)"
INFO:torch.fx.experimental.symbolic_shapes:eval 9 - s2 >= s2 [guard added] (_subclasses/fake_impls.py:715 in conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="9 - s2 >= s2"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(-2*s0*s2 + 14*s0, 0) [guard added] (_subclasses/fake_impls.py:715 in conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(-2*s0*s2 + 14*s0, 0)"
INFO:torch.fx.experimental.symbolic_shapes:runtime_assert 10 - 2*s2 >= 0 [guard added] (_refs/__init__.py:4758 in new_empty), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="10 - 2*s2 >= 0"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(10 - 2*s2, 0) [guard added] (_meta_registrations.py:2273 in meta_conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(10 - 2*s2, 0)"
INFO:torch.fx.experimental.symbolic_shapes:eval 1 < 20 - 4*s2 [guard added] (_meta_registrations.py:2273 in meta_conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="1 < 20 - 4*s2"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(-4*s0*s2 + 20*s0, 0) [guard added] (_prims_common/__init__.py:454 in compute_elementwise_output_logical_to_physical_perm), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(-4*s0*s2 + 20*s0, 0)"
INFO:torch.fx.experimental.symbolic_shapes:runtime_assert Eq(7 - s2, 10 - 2*s2) [guard added] (_subclasses/fake_impls.py:817 in infer_size), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Eq(7 - s2, 10 - 2*s2)"
INFO:torch.fx.experimental.symbolic_shapes:set_replacement s2 = 3 (range_refined_to_singleton) VR[3, 3]
INFO:torch.fx.experimental.symbolic_shapes:eval Eq(s2, 3) [guard added] (executorch/exir/pass_base.py:678 in migrate_meta_val), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Eq(s2, 3)"
E 00:00:17.357053 executorch:method.cpp:941] Output 0 is memory planned, or is a constant. Cannot override the existing data pointer.

After this PR, the logs become:

INFO:torch.fx.experimental.symbolic_shapes:create_symbol s0 = 8 for __meta_utils_unknown_tensor12.size()[0] [2, int_oo] (utils/_pytree.py:803 in unflatten), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s0"
INFO:torch.fx.experimental.symbolic_shapes:create_symbol s1 = 16 for __meta_utils_unknown_tensor12.size()[1] [2, int_oo] (utils/_pytree.py:803 in unflatten), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s1"
INFO:torch.fx.experimental.symbolic_shapes:create_symbol s2 = 32 for __meta_utils_unknown_tensor12.size()[2] [2, int_oo] (utils/_pytree.py:803 in unflatten), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s2"
INFO:torch.fx.experimental.symbolic_shapes:create_symbol s3 = 64 for __meta_utils_unknown_tensor12.size()[3] [2, int_oo] (utils/_pytree.py:803 in unflatten), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s3"
INFO:torch.fx.experimental.symbolic_shapes:set_replacement s2 = 32 (range_refined_to_singleton) VR[32, 32]
INFO:torch.fx.experimental.symbolic_shapes:eval Eq(32, s2) [guard added] (<eval_with_key>.560:9 in forward), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Eq(32, s2)"
INFO:torch.fx.experimental.symbolic_shapes:create_symbol s1 = 2 for __meta_utils_unknown_tensor105.size()[0] [2, int_oo] (executorch/exir/pass_base.py:238 in make_val), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s1"
INFO:torch.fx.experimental.symbolic_shapes:create_symbol s2 = 3 for __meta_utils_unknown_tensor105.size()[2] [2, int_oo] (executorch/exir/pass_base.py:238 in make_val), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_CREATE_SYMBOL="s2"
INFO:torch.fx.experimental.symbolic_shapes:set_replacement s1 = 2 (range_refined_to_singleton) VR[2, 2]
INFO:torch.fx.experimental.symbolic_shapes:eval Eq(2, s1) [guard added] (_subclasses/fake_impls.py:715 in conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Eq(2, s1)"
INFO:torch.fx.experimental.symbolic_shapes:eval 6 >= s2 [guard added] (_subclasses/fake_impls.py:715 in conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="6 >= s2"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(7 - s2, 1) [guard added] (_prims_common/__init__.py:457 in compute_elementwise_output_logical_to_physical_perm), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(7 - s2, 1)"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(-2*s0*s2 + 14*s0, 0) [guard added] (_prims_common/__init__.py:454 in compute_elementwise_output_logical_to_physical_perm), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(-2*s0*s2 + 14*s0, 0)"
INFO:torch.fx.experimental.symbolic_shapes:eval 9 - s2 >= s2 [guard added] (_subclasses/fake_impls.py:715 in conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="9 - s2 >= s2"
INFO:torch.fx.experimental.symbolic_shapes:runtime_assert 10 - 2*s2 >= 0 [guard added] (_refs/__init__.py:4758 in new_empty), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="10 - 2*s2 >= 0"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(10 - 2*s2, 0) [guard added] (_meta_registrations.py:2273 in meta_conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(10 - 2*s2, 0)"
INFO:torch.fx.experimental.symbolic_shapes:eval 1 < 20 - 4*s2 [guard added] (_meta_registrations.py:2273 in meta_conv), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="1 < 20 - 4*s2"
INFO:torch.fx.experimental.symbolic_shapes:eval Ne(-4*s0*s2 + 20*s0, 0) [guard added] (_prims_common/__init__.py:454 in compute_elementwise_output_logical_to_physical_perm), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Ne(-4*s0*s2 + 20*s0, 0)"
INFO:torch.fx.experimental.symbolic_shapes:runtime_assert Eq(7 - s2, 10 - 2*s2) [guard added] (_subclasses/fake_impls.py:817 in infer_size), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="Eq(7 - s2, 10 - 2*s2)"
INFO:torch.fx.experimental.symbolic_shapes:eval -2*s0*s2 + 14*s0 >= 2 [guard added] (_prims_common/__init__.py:236 in is_contiguous), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="-2*s0*s2 + 14*s0 >= 2"
INFO:torch.fx.experimental.symbolic_shapes:eval -4*s0*s2 + 20*s0 >= 2 [guard added] (_prims_common/__init__.py:236 in is_contiguous), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="-4*s0*s2 + 20*s0 >= 2"
INFO:torch.fx.experimental.symbolic_shapes:eval 20 - 4*s2 > 2 [guard added] (_prims_common/__init__.py:479 in should_swap), for more info run with TORCHDYNAMO_EXTENDED_DEBUG_GUARD_ADDED="20 - 4*s2 > 2"

and then it fails because the replacement for s2 never gets applied. Strange, strange...

https://fb.workplace.com/groups/pytorch.edge.users/permalink/1605864123617208/
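To make the failure mode concrete, here is a toy sketch of what `set_replacement ... (range_refined_to_singleton)` is doing in the logs above. This is not the actual ShapeEnv code; the class and method names are invented. The idea: once a guard pins a symbol's value range to a single value, the symbol can be replaced by that constant in later expressions, and the failure is consistent with that replacement never being applied to s2.

```python
# Toy sketch (invented names, NOT torch.fx.experimental.symbolic_shapes)
# of singleton range refinement driving a symbol -> constant replacement.
import sympy

class ToyShapeEnv:
    def __init__(self):
        self.var_ranges = {}     # symbol -> (lo, hi) value range, cf. VR[lo, hi]
        self.replacements = {}   # symbol -> concrete value, cf. set_replacement

    def create_symbol(self, name, lo=2, hi=sympy.oo):
        # Fresh size symbol with a default range of [2, int_oo], as in the logs.
        s = sympy.Symbol(name, integer=True, positive=True)
        self.var_ranges[s] = (lo, hi)
        return s

    def guard_eq(self, sym, value):
        # A guard like Eq(s2, 3) refines the range to the singleton [3, 3],
        # so the symbol becomes a known constant and we record a replacement.
        self.var_ranges[sym] = (value, value)
        self.replacements[sym] = sympy.Integer(value)

    def simplify(self, expr):
        # Apply all known replacements; a pinned symbol drops out entirely.
        return sympy.sympify(expr).xreplace(self.replacements)

env = ToyShapeEnv()
s2 = env.create_symbol("s2")
print(env.simplify(10 - 2 * s2))   # still symbolic: no guard yet
env.guard_eq(s2, 3)                # cf. eval Eq(s2, 3) [guard added]
print(env.simplify(10 - 2 * s2))   # now fully static
```

If the replacement step is skipped (as the executorch trace suggests happens for s2), later passes keep seeing expressions like `10 - 2*s2` that should have become plain integers.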

@ezyang
Contributor Author

ezyang commented Oct 8, 2024

Gonna chat with @angelayi about the high-level pass structure.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames rec

[ghstack-poisoned]
laithsakka added a commit that referenced this pull request Oct 26, 2024
Signed-off-by: Edward Z. Yang <[email protected]>

ghstack-source-id: 81ee6b3
Pull Request resolved: #135429
@laithsakka
Contributor

looking at this now

@ezyang ezyang closed this Oct 28, 2024
@github-actions github-actions bot deleted the gh/ezyang/2926/head branch November 28, 2024 02:13

Labels

ciflow/inductor, ciflow/trunk, Merged, module: dynamo, release notes: fx, Reverted, topic: not user facing
