@mgharbi mgharbi commented Jan 27, 2022

This PR does two things:

  1. Split the helper function that wraps PyTorch tensors into Halide buffers in two: one for CPU tensors and one for GPU tensors. Before this PR, the wrapper could fail on CPU-only machines because halide_cuda_device_interface was missing.
  2. Add a default __user_context = nullptr in CPU-only ops. The auto-scheduler can create intermediate functions that take a __user_context input. The CodeGen did not handle this case, so compilation would fail. We now create a default null context instead.
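
For readers unfamiliar with the wrapper, the two changes can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: FakeTensor, FakeBuffer, wrap_cpu, wrap_cuda, scale_pipeline, scale_op_cpu, and the SKETCH_WITH_CUDA macro are all hypothetical stand-ins for the PyTorch tensor type, the Halide buffer type, and the generated op code. The sketch only assumes that the GPU path is the sole place that needs CUDA symbols, and that the pipeline takes a context pointer as its first argument.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a PyTorch tensor (not at::Tensor).
struct FakeTensor {
    std::vector<float> data;
    bool on_gpu = false;
};

// Hypothetical stand-in for a Halide buffer (not Halide::Runtime::Buffer).
struct FakeBuffer {
    float *host = nullptr;     // host pointer, set for CPU tensors
    std::uint64_t device = 0;  // opaque device handle, set for GPU tensors
    int size = 0;
};

// CPU wrapper: references no CUDA symbol, so it compiles and links
// on a machine without any CUDA runtime.
inline FakeBuffer wrap_cpu(FakeTensor &t) {
    if (t.on_gpu) {
        throw std::invalid_argument("wrap_cpu expects a CPU tensor");
    }
    FakeBuffer b;
    b.host = t.data.data();
    b.size = static_cast<int>(t.data.size());
    return b;
}

#ifdef SKETCH_WITH_CUDA  // hypothetical macro, defined only for GPU builds
// GPU wrapper: in a real build this is the only function that would
// touch the CUDA device interface.
inline FakeBuffer wrap_cuda(FakeTensor &t) {
    if (!t.on_gpu) {
        throw std::invalid_argument("wrap_cuda expects a GPU tensor");
    }
    FakeBuffer b;
    b.device = reinterpret_cast<std::uintptr_t>(t.data.data());
    b.size = static_cast<int>(t.data.size());
    return b;
}
#endif

// Hypothetical pipeline whose first argument is a user context,
// as an auto-scheduled intermediate function might require.
inline int scale_pipeline(void *user_context, const FakeBuffer &in, FakeBuffer &out) {
    (void)user_context;  // unused in this sketch
    if (in.size != out.size) {
        return -1;
    }
    assert(in.size == 0 || (in.host != nullptr && out.host != nullptr));
    for (int i = 0; i < in.size; ++i) {
        out.host[i] = 2.0f * in.host[i];
    }
    return 0;
}

// CPU-only op: it has no real context to forward, so it passes a
// default null one (the PR names this variable __user_context).
inline int scale_op_cpu(FakeTensor &in, FakeTensor &out) {
    void *user_context = nullptr;
    FakeBuffer in_buf = wrap_cpu(in);
    FakeBuffer out_buf = wrap_cpu(out);
    return scale_pipeline(user_context, in_buf, out_buf);
}
```

Calling scale_op_cpu on two CPU tensors of equal length doubles each element and returns 0. Because wrap_cuda sits behind the build-time macro, a CPU-only build never references a device symbol, which is the failure mode item 1 describes.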

@steven-johnson steven-johnson left a comment

LGTM pending green, with nits

@steven-johnson

Please also use the run-clang-format.sh script, or manually fix errors if you prefer

@steven-johnson steven-johnson merged commit c450bf4 into halide:master Jan 27, 2022
jrprice pushed a commit to jrprice/Halide that referenced this pull request Feb 4, 2022
* fixes pytorch op compilation for CPU only machines, adds default user context for auto-scheduled-ops

* rm redundant declarations

* fix spacing

Co-authored-by: Michael Gharbi <[email protected]>