Python Bindings need a way for AOT code to manage GPU buffers #6868

@steven-johnson

Description

From #6846, a number of issues:

  • The .py.cpp code that we produce for AOT-compiled Python extensions operates solely on the Python Buffer protocol, which is CPU-memory only (it has no provision for a buffer's memory to live on a device). As such, it should always call copy_to_host() on all output buffers to ensure results are flushed properly, but it does not. (This should be an easy fix.)

  • More problematically, there is currently no way to avoid needless copy-to-host calls, since the Python Buffer protocol supports only host memory and our Python bindings have no equivalent of Halide::Runtime::Buffer. We could add such an equivalent, but it would be vastly preferable to adopt an existing solution already in use by other GPU-accelerated Python libraries. DLPack (https://dmlc.github.io/dlpack/latest/index.html) appears to be a likely candidate, but investigation is needed.
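
To make the second point concrete, here is a rough sketch of the kind of decision a DLPack-aware binding could make. It is not existing Halide code. It relies only on the `__dlpack_device__` method from the DLPack Python protocol, which returns a `(device_type, device_id)` pair, and on the fact that `kDLCPU` is 1 in DLPack's device-type enum. The names `needs_copy_to_host` and `FakeDeviceArray` are hypothetical and exist only for illustration.

```python
KDL_CPU = 1  # value of kDLCPU in DLPack's DLDeviceType enum


class FakeDeviceArray:
    """Hypothetical stand-in for a GPU-backed array; not a Halide or DLPack class."""

    def __init__(self, device_type, device_id=0):
        self._device = (device_type, device_id)

    def __dlpack_device__(self):
        # The DLPack protocol reports where the memory lives as
        # a (device_type, device_id) pair.
        return self._device


def needs_copy_to_host(output):
    """Return True if `output` can only hold host memory.

    Hypothetical helper: an AOT wrapper would have to call copy_to_host()
    before returning whenever this is True.
    """
    device_fn = getattr(output, "__dlpack_device__", None)
    if device_fn is None:
        # No DLPack support: fall back to the Python Buffer protocol,
        # which is host-only. memoryview() raises TypeError if the
        # object does not expose a buffer at all.
        memoryview(output)
        return True
    device_type, _device_id = device_fn()
    return int(device_type) == KDL_CPU


if __name__ == "__main__":
    print(needs_copy_to_host(bytearray(16)))        # plain host buffer
    print(needs_copy_to_host(FakeDeviceArray(2)))   # 2 is kDLCUDA
```

With only the Buffer protocol available, every output falls into the first branch, which is why the copy cannot currently be skipped.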
