[CUDA 12] Autograd engine follow up #94929

Closed

Aidyn-A wants to merge 1 commit into pytorch:master from Aidyn-A:cuda12_autograd_engine_follow_up

Conversation

Collaborator

@Aidyn-A Aidyn-A commented Feb 15, 2023

This PR is a follow-up to #91191.
In my latest local builds, USE_CUDA is not defined by default, so the following piece of code never takes the guarded branch:

#if defined(USE_CUDA)
if (at::detail::getCUDAHooks().hasPrimaryContext(device)) {
set_device(device);
}
#else
set_device(device);
#endif

This PR defines USE_CUDA for engine.cpp by applying the compile flag explicitly.

@pytorch-bot

pytorch-bot bot commented Feb 15, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/94929

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7ae5ba6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Collaborator

@albanD albanD left a comment


cc @malfet

)
set_source_files_properties(${TORCH_SRC_DIR}/csrc/jit/passes/frozen_conv_add_relu_fusion.cpp PROPERTIES COMPILE_FLAGS "-DUSE_CUDA=1")
set_source_files_properties(${TORCH_SRC_DIR}/csrc/jit/codegen/cuda/interface.cpp PROPERTIES COMPILE_FLAGS "-DUSE_CUDA=1")
set_source_files_properties(${TORCH_SRC_DIR}/csrc/autograd/engine.cpp PROPERTIES COMPILE_FLAGS "-DUSE_CUDA=1")
Collaborator


Why isn't this set for all files already? That sounds like a bug if it is not.

Collaborator Author


Apparently USE_CUDA was not set. Or could it be that the defined operator in #if defined(USE_CUDA) is misbehaving? All I see is that the code never checks for the existence of a primary context.

Collaborator Author


Confirmed, USE_CUDA is just not being defined.

Collaborator


@ngimel do you know how this is setup by any chance?

Collaborator


No, I don't, sorry. There was a dream that we could have device-agnostic pieces of code like engine.cpp, and thus not have any ifdefs in them, but apparently this doesn't work.
This would also need corresponding changes in the internal builds, cc @dagitses

@ngimel ngimel added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Feb 16, 2023
@Aidyn-A Aidyn-A closed this Mar 1, 2023
Collaborator

albanD commented Mar 1, 2023

Should we just wait until we have CUDA 12 CI before fixing this?

Collaborator Author

Aidyn-A commented Mar 1, 2023

@albanD I believe one of these PRs, #94929 or #92354, should be merged sooner rather than later, since there are users who build manually with CUDA 12.
Both PRs target the same goal; which do you think is the safest?

Collaborator

albanD commented Mar 1, 2023

But until we have CI, all of these will be pretty flaky, as we can't actually test that they do anything.


Labels

open source · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
