some_tensor.to("cpu", non_blocking=True) becomes sync under PT2 while async in eager mode

### 🐛 Describe the bug

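From an internal report: `some_tensor.to("cpu", non_blocking=True)` is asynchronous in eager mode but becomes synchronous under PT2. In eager mode, the profiler trace shows `Memcpy DtoH (Device -> Pinned)`; under PT2, it shows `Memcpy DtoH (Device -> Pageable)`. A device-to-host copy into pageable memory does not return to the host until the copy has completed, which matches the synchronous behavior we observe.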
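The report does not include a repro script, so the following is only a minimal sketch of how one might compare the two paths. The helper names `copy_to_cpu` and `memcpy_event_names`, the tensor size, and the use of `torch.compile` with default settings are assumptions made for illustration; the original workload may differ, and the sketch may not reproduce the issue in every environment.

```python
import torch
from torch.profiler import ProfilerActivity, profile


def copy_to_cpu(x: torch.Tensor) -> torch.Tensor:
    # The call from the report: request an asynchronous device-to-host copy.
    return x.to("cpu", non_blocking=True)


def memcpy_event_names(fn, x: torch.Tensor) -> list[str]:
    """Run fn(x) under the profiler and return the distinct memcpy event names."""
    fn(x)  # warm-up; for a compiled fn this also triggers compilation
    torch.cuda.synchronize()
    with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA]) as prof:
        fn(x)
        torch.cuda.synchronize()  # make sure the copy lands inside the trace
    return sorted({evt.name for evt in prof.events() if "Memcpy" in evt.name})


if torch.cuda.is_available():
    # Hypothetical input; the size is arbitrary.
    x = torch.randn(1 << 20, device="cuda")
    print("eager:", memcpy_event_names(copy_to_cpu, x))
    print("PT2:  ", memcpy_event_names(torch.compile(copy_to_cpu), x))
else:
    print("CUDA not available; skipping the comparison")
```

If the behavior matches the report, the eager line would list a `Device -> Pinned` copy and the PT2 line a `Device -> Pageable` copy.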

### Versions

main

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @aakhundov

some_tensor.to("cpu", non_blocking=True) becomes sync under PT2 while async in eager mode #155121
