[ROCm] [Flex attention] Memory access fault on nested_tensor UT #139754

@jataylo

🐛 Describe the bug

#136792 accidentally disabled the flex attention UTs on ROCm. We are re-enabling them with #139632, but one unit test hits a memory access fault.

We will skip the failing unit test so the ROCm flex attention UTs can start running again, and we are opening this issue to track the fault; a minimal skip sketch follows below.
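
For reference, this kind of skip is typically expressed with the `skipIfRocm` decorator from PyTorch's test utilities. The snippet below is only an illustrative sketch, not the actual change in #139632: the class name and the placeholder test body are made up here, and the real test lives in test_nestedtensor.py.

```python
# Illustrative sketch only -- not the actual patch in #139632.
# Shows how a single failing case can be skipped on ROCm while the
# remaining flex attention UTs keep running.
import torch
from torch.testing._internal.common_utils import TestCase, run_tests, skipIfRocm


class FlexAttentionSkipExample(TestCase):  # placeholder class name
    @skipIfRocm  # memory access fault on ROCm, tracked in this issue
    def test_flex_attention_noncontig_with_holes(self):
        # Placeholder body; the real UT exercises flex attention over NJTs.
        x = torch.ones(2, 2)
        self.assertEqual(x.sum().item(), 4.0)


if __name__ == "__main__":
    run_tests()
```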

cc: @drisspg @jbschlosser

https://hud.pytorch.org/pr/pytorch/pytorch/139632#32490576539

2024-11-04T18:11:36.8788561Z test_nestedtensor.py::TestNestedTensorSubclassCUDA::test_flex_attention_noncontig_with_holes_False_cuda_float32 Memory exception on virtual address 0x7fa191005000, node id 3 : Address does not belong to a known buffer
2024-11-04T18:11:36.8789792Z Memory access fault by GPU node-3 (Agent handle: 0xaf07060) on address 0x7fa191005000. Reason: Unknown.
2024-11-04T18:11:36.8790324Z Fatal Python error: Aborted
2024-11-04T18:11:36.8790497Z 
2024-11-04T18:11:36.8790643Z Thread 0x00007fa189cff700 (most recent call first):
2024-11-04T18:11:36.8790978Z   <no Python frame>
2024-11-04T18:11:36.8791118Z 
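
The failing test exercises flex attention over a jagged nested tensor (NJT). The sketch below only illustrates that general usage pattern under stated assumptions: the sequence lengths, head counts, and the `make_njt` helper are invented for illustration, and it assumes the documented `flex_attention` / `create_nested_block_mask` APIs from `torch.nn.attention.flex_attention`; it is not the body of the failing UT.

```python
# Rough, hypothetical sketch of the kind of workload the failing UT exercises:
# flex attention over a jagged (NJT) query/key/value. All shapes and the
# make_njt helper are illustrative, not taken from the actual test.
import torch
from torch.nn.attention.flex_attention import create_nested_block_mask, flex_attention

device, dtype = "cuda", torch.float32
n_heads, head_dim = 4, 16
seq_lens = (3, 7, 5)  # per-sequence (ragged) lengths, chosen arbitrarily


def make_njt():
    # Build a (B, S*, H, D) jagged nested tensor, then transpose to the
    # (B, H, S*, D) layout flex attention expects.
    return torch.nested.nested_tensor(
        [torch.randn(s, n_heads, head_dim, device=device, dtype=dtype) for s in seq_lens],
        layout=torch.jagged,
    ).transpose(1, 2)


def causal(b, h, q_idx, kv_idx):
    return q_idx >= kv_idx


q = make_njt()
# Self-attention with a shared NJT so q/k/v have the same ragged structure.
block_mask = create_nested_block_mask(causal, 1, 1, q)
out = torch.compile(flex_attention)(q, q, q, block_mask=block_mask)
print(out.shape)
```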

Will be reassessed with Triton 3.2 (#139175).

Versions

CI

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @hongxiayang @naromero77amd @ezyang @chauhang @penguinwu @zou3519 @ydwu4 @bdhirsh @yf225 @Chillee @drisspg @yanboliang @BoyuanFeng

Labels

- module: flex attention
- module: higher order operators (torch.cond and similar)
- module: pt2-dispatcher (PT2 dispatcher-related issues, e.g. aotdispatch, functionalization, faketensor, custom-op)
- module: rocm (AMD GPU support for PyTorch)
- oncall: pt2
- triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
