Skip to content

Comments

Fixes SWDEV-483388#74

Merged
TedThemistokleous merged 3 commits intorocm6.3_internal_testingfrom
xinyazhang/SWDEV-483388-internal_testing
Nov 11, 2024
Merged

Fixes SWDEV-483388#74
TedThemistokleous merged 3 commits intorocm6.3_internal_testingfrom
xinyazhang/SWDEV-483388-internal_testing

Conversation

@xinyazhang
Copy link

GPU memory copy should use the provided stream.

} else {
// copy from other CPU memory to GPU, this is blocking
HIP_CALL_THROW(hipMemcpy(dst_data, src_data, bytes, hipMemcpyHostToDevice));
HIP_CALL_THROW(hipMemcpyWithStream(dst_data, src_data, bytes, hipMemcpyHostToDevice, static_cast<hipStream_t>(stream.GetHandle())));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix lint. Also, I thought we wanted to block in this case, but I suppose this allows only MIGraphX EP to block on stream now since this gpu_data_transfer isn't shared between both EPs.

auto output_tensor = ctx.GetOutput(i, ort_shape.data(), ort_shape.size());
void* output_data = output_tensor.GetTensorMutableRawData();
HIP_CALL_THROW(hipMemcpy(output_data, gpu_res.data(), res_shape.bytes(), hipMemcpyDeviceToDevice));
HIP_CALL_THROW(hipMemcpyWithStream(output_data,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With MIGraphX we perform the run_async as part of the eval() which streams execution. Were you seeing stale data somehow with this memcopy for networks with many outputs? Tried this with Bert and didn't see a change in perf but curious as to why this changed.

@TedThemistokleous TedThemistokleous merged commit d9f0ef8 into rocm6.3_internal_testing Nov 11, 2024
@TedThemistokleous
Copy link
Collaborator

merging this as this was put in upstream

TedThemistokleous pushed a commit that referenced this pull request Jan 3, 2025
* Fix SWDEV-483388: mistakenly invoke hipMemcpy when ORT_ENABLE_STREAM=True

* Fix another hipMemcpy

* Remove MIGRAPHX_STREAM_SYNC guard which does not exists in current version.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants