Skip to content

Conversation

@abadams
Copy link
Member

@abadams abadams commented Feb 13, 2023

Pre-vectorization, the code in the included test lowers to:

      let t16 = f0[((f1.s3.r6$x.r6$x*16) + f1.s3.r6$x.r7) + 2] + f0[((f1.s3.r6$x.r6$x*16) + f1.s3.r6$x.r7) + 1]
      let t17 = (((f1.s3.r6$x.r6$x*16) + f1.s3.r6$x.r7) - f1.0.min.0) + 2
      atomic {
        f1.0[t17] = t16
      }
      let t18 = ((f1.s3.r6$x.r6$x*16) + f1.s3.r6$x.r7) - f1.0.min.0
      let t19 = f0[((f1.s3.r6$x.r6$x*16) + f1.s3.r6$x.r7) + 2] + f0[((f1.s3.r6$x.r6$x*16) + f1.s3.r6$x.r7) + 1]
      let t20 = (((f1.s3.r6$x.r6$x*16) + f1.s3.r6$x.r7) - f1.0.min.0) + 2
      atomic {
        f1.1[t20] = f1.0[t18] + t19
      }

which then scalarizes, even though those atomic nodes could be vectorized just fine, because each atomic operation is non-recursive. This PR enables vectorization of such atomic nodes.

Note that the code in the test uses an unsafe override of the associativity/commutativity check, and while it's safe to vectorize as written, if you change it a little bit, e.g. by flipping the order of the tuple components, it's not.

@steven-johnson
Copy link
Contributor

which then scalarizes

Related: IMHO we should consider emitting warnings when code scalarizes; at this point it should be pretty uncommon to happen, but when it does it's a gigantic time sink and the cause won't be obvious to the casual coder.

@abadams abadams requested a review from alexreinking February 14, 2023 17:25
@abadams abadams merged commit 18eb7d8 into main Feb 15, 2023
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* Vectorization of non-recursive atomic operations

* Remove dead Vars
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants