Skip to content

Reimplemented pow5Factor using modular inverse.#188

Merged
ulfjack merged 4 commits intoulfjack:masterfrom
cassioneri:master
Jan 27, 2021
Merged

Reimplemented pow5Factor using modular inverse.#188
ulfjack merged 4 commits intoulfjack:masterfrom
cassioneri:master

Conversation

@cassioneri
Copy link
Contributor

@cassioneri cassioneri commented Nov 26, 2020

  1. All tests pass;
  2. Overall benchmark gets slightly faster than the original; (An isolated non scientific benchmark is here https://quick-bench.com/q/eKYe8GR1GAIVBtPjy5iROUquuuw)
  3. Emitted code gets shorter. (See https://godbolt.org/z/3eG8K6.)

@cassioneri cassioneri marked this pull request as draft November 26, 2020 21:30
@cassioneri cassioneri marked this pull request as ready for review November 26, 2020 21:55
@ulfjack
Copy link
Owner

ulfjack commented Dec 7, 2020

This PR did pass all tests on Travis, but apparently, Travis didn't post the status here. Not sure why.

@ulfjack
Copy link
Owner

ulfjack commented Dec 7, 2020

I started to read the article (https://accu.org/journals/overload/27/154/overload154.pdf#page=13), but I still don't completely follow what's going on. I'd like to better understand it before I merge it.

@cassioneri
Copy link
Contributor Author

cassioneri commented Dec 7, 2020

Here is a quick explanation (further details in the article mentioned above). Since 5 is odd:

  1. There exists an integer m in [0, 2^64[ such that m * 5 = 1. (This equality must be read in mod 2^64 arithmetic.)
  2. There exists an integer M in [0, 2^64[ such that x is multiple of 5, if and only if, m * x <= M. (Again, m * x in mod 2^64 arithmetic).

Now suppose x = 5^p * y where gcd(y, 5) = 1. We want to show that the loop (in the new implementation of pow5Factor) completes exactly p times, that is, "++count" is executed exactly p times.

If p = 0, then x is not multiple of 5. Hence, (2) gives m * x > M and the loop breaks immediately (++count is not executed).

Suppose that p > 0 and, by induction, that for x' = 5^{p - 1} * y the loop completes exactly p - 1 times. Then, x = 5 * x' is a multiple of 5 and, again from (2), we get m * x <= M, that is, the loop doesn't break and "++count" is executed a first time. Moreover, x is updated to m * x = m * 5 * x', which from (1) equals to x'. Hence, the loop will still complete another p - 1 times and the result follows.

I've calculated m and M which, in the code, are denoted m_inv_5 and m_div_5.

I reckon the new implementation needs a dedicated unit test which I'm happy to implement before the PR is accepted.

@ulfjack
Copy link
Owner

ulfjack commented Dec 9, 2020

Where does the second statement come from?

@ulfjack
Copy link
Owner

ulfjack commented Dec 9, 2020

I found this other page, which has an explanation:
http://duriansoftware.com/joe/Optimizing-is-multiple-checks-with-modular-arithmetic.html

I also fixed the Travis integration, so we get presubmit results now.

I think the comment needs to be updated, and I'm happy to merge this afterwards.

@cassioneri
Copy link
Contributor Author

Where does the second statement come from?

I was about to reply :-) will post for the record...

It is a (not-so-)straightforward consequence of (1). It works like this. (Equalities below are in mod 2^64 arithmetic.)

Let m be the modular inverse (mod 2^64) of 5, that is, 5 * m = 1. Set f(x) = m * x and let's show that f is injective (it's actually a bijection). If f(x) = f(y), then

m * x = m * y  ==>  5 * m * x = 5 * m * y  ==>  x = y.

Let M be the largest integer such that 5 * M < 2^64. Hence, the multiples of 5 in [0, 2^64[ are 5 * 0, 5 * 1, ..., 5 * M. Since f(5 * n) = m * 5 * n = n, multiples of 5 are mapped into 0, 1, ..., M, that is, into [0, M].

Reciprocally, since f is injective, non-multiples of 5 must be mapped into ]M, 2^64[.

FWIW:

  1. A more general form of this argument is in my article (last paragraph of page 12) but, I believe, it first appeared in "Division by invariant integers using multiplication" by Torbjorn Granlund and Peter Lawrence Montgomery, ACM SIGPLAN Notices June 1994. (See https://gmplib.org/~tege/divcnst-pldi94.pdf section 9.)

  2. Divisibility checks based on modular inverse have been implemented recently (~1.5y) by clanc and gcc. (https://godbolt.org/z/fW3GdG) Notice they use the same magic numbers as me, except that clang evaluates m * x < M + 1 and gcc evaluates m * x <= M to check whether x is multiple of 5 whereas I evaluate m * x > M to check whether x is not a multiple of 5. Potayto, potahto!

  3. Despite point 2, compilers are not clever enough to extend the argument to powers of 5 as I did in my previous post and which my implementation of pow5Factor is based on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants