Member

@abadams abadams commented Dec 18, 2021

40% faster, and I fixed the issue where the low bits of our noise have a low period. Combined with #6506, this makes it possible to generate low-bit-width noise very cheaply for things like dithering.
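
For readers unfamiliar with the low-period problem, here is a minimal sketch of one general way it can arise. It is not taken from this PR and does not claim to be the cause in Halide's generator. For any integer polynomial evaluated modulo 2^32, the low k bits of the output depend only on the low k bits of the input, so they repeat with a period of at most 2^k. The function `poly` and its coefficients are hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical quadratic modulo 2^32 (unsigned arithmetic wraps by definition).
// The coefficients are arbitrary and are NOT Halide's actual constants.
uint32_t poly(uint32_t x) {
    return 0x9E3779B1u * x * x + 0x85EBCA6Bu * x + 0xC2B2AE35u;
}

// Returns true if the low k bits of poly(x) equal the low k bits of
// poly(x + 2^k) for every x in [0, n). Requires 0 < k < 32.
bool low_bits_repeat(unsigned k, uint32_t n) {
    const uint32_t period = 1u << k;
    const uint32_t mask = period - 1u;
    for (uint32_t x = 0; x < n; ++x) {
        if ((poly(x) & mask) != (poly(x + period) & mask)) {
            return false;
        }
    }
    return true;
}
```

Calling `low_bits_repeat(1, 10000)` checks that the lowest bit alternates with period at most 2, which is why taking only the low bits of such a value gives poor dithering noise.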

@abadams abadams changed the title from "Make random 2x faster by putting the innermost var last" to "Make random faster by putting the innermost var last" Dec 19, 2021
By pulling constant additions outside of quadratics, we can shave off a
few add instructions in the inner loop for random number generation,
which uses a quadratic modulo 2^32.
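
As a rough illustration of the idea (not the actual simplifier rules added here), the sketch below expands a(x + c)^2 + b(x + c) + d into a*x^2 + (2ac + b)*x + (ac^2 + bc + d), so the constant offset is folded once instead of being added on every evaluation. The names `quadratic_naive`, `fold_offset`, `quadratic_hoisted` and the coefficients `A`, `B`, `D` are hypothetical. Unsigned 32-bit arithmetic stands in for "modulo 2^32".

```cpp
#include <cstdint>

// Hypothetical coefficients, chosen only for illustration.
constexpr uint32_t A = 0x9E3779B1u;
constexpr uint32_t B = 0x85EBCA6Bu;
constexpr uint32_t D = 0xC2B2AE35u;

// Naive form: the constant offset c is added inside the quadratic,
// so every evaluation pays for the extra add.
uint32_t quadratic_naive(uint32_t x, uint32_t c) {
    const uint32_t t = x + c;
    return A * t * t + B * t + D;
}

// Coefficients of the equivalent quadratic in x alone.
struct Folded {
    uint32_t a, b, d;
};

// Fold the offset into the coefficients once, outside the inner loop.
// All arithmetic wraps modulo 2^32, so the identity holds exactly.
constexpr Folded fold_offset(uint32_t c) {
    return Folded{A, 2u * A * c + B, A * c * c + B * c + D};
}

// Hoisted form: no per-evaluation addition of c.
uint32_t quadratic_hoisted(uint32_t x, Folded f) {
    return f.a * x * x + f.b * x + f.d;
}
```

Because the ring of integers modulo 2^32 is closed under these operations, `quadratic_naive(x, c)` and `quadratic_hoisted(x, fold_offset(c))` agree for every `x`, including values that wrap.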

I also removed the !overflows predicates, because rules already fail to
match if a fold overflows.
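
To make the "rules fail to match if a fold overflows" behavior concrete, here is a minimal standalone sketch, assuming C++17. It does not use Halide's simplifier internals; `checked_mul_i32`, `Rewritten`, and `try_distribute` are hypothetical names, and the rule shown, (x + c0) * c1 -> x * c1 + fold(c0 * c1), is only an example shape.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Fold c0 * c1 for int32 constants by computing in 64 bits.
// Returns std::nullopt if the result does not fit in int32.
std::optional<int32_t> checked_mul_i32(int32_t c0, int32_t c1) {
    const int64_t wide = static_cast<int64_t>(c0) * static_cast<int64_t>(c1);
    if (wide < std::numeric_limits<int32_t>::min() ||
        wide > std::numeric_limits<int32_t>::max()) {
        return std::nullopt;
    }
    return static_cast<int32_t>(wide);
}

// Result of rewriting (x + c0) * c1 as x * scale + offset.
struct Rewritten {
    int32_t scale;
    int32_t offset;
};

// Hypothetical rule application: if folding c0 * c1 overflows, the rule
// simply does not match, so no separate overflow predicate is needed.
std::optional<Rewritten> try_distribute(int32_t c0, int32_t c1) {
    const std::optional<int32_t> folded = checked_mul_i32(c0, c1);
    if (!folded) {
        return std::nullopt;  // fold overflowed: leave the expression alone
    }
    return Rewritten{c1, *folded};
}
```

With this shape, `try_distribute(3, 4)` yields a rewrite with offset 12, while `try_distribute(INT32_MAX, 2)` yields no rewrite at all.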

New rules formally verified.
@abadams abadams requested a review from rootjalex January 3, 2022 18:16
Member Author

abadams commented Jan 4, 2022

The new simplifier rules have been formally verified.

@steven-johnson steven-johnson self-requested a review January 4, 2022 16:31
@abadams abadams merged commit 0021165 into master Jan 4, 2022
Contributor

FWIW, the new simplification rules now cause errors of the form "Signed integer overflow occurred during constant-folding. Signed integer overflow for int32 and int64 is undefined behavior in Halide" in code that has never failed in this way before -- I'd like to revert this change pending further investigation.
