ENH: special: Extend Riemann Zeta function to complex inputs by steppi · Pull Request #21744 · scipy/scipy

steppi · 2024-10-23T18:30:56Z

Reference issue

What does this implement/fix?

This PR extends the Riemann Zeta function to complex inputs. It follows a Python reference implementation written by @mdhaber and posted in a Colab notebook here. There are a couple of minor tweaks I'm aware of which could improve accuracy slightly, but I will leave those for a follow-up to make this easier to review. As can be seen in the linked notebook, results are sufficiently accurate for this to get in as is.

The primary ways the implementation here differs from the reference implementation are that

Coefficients for expansions are stored in lookup tables.
Some factors (well, logs of factors) in the Euler-Maclaurin sum are updated incrementally in each iteration, rather than recomputed from scratch.
A slight algebraic manipulation has been made to the Borwein series. It can now handle z with arbitrarily large real part, so there is no need for a special case for z with large real part.
The logsumexp calculation for the log of sinpi must be done manually, since logsumexp is not currently available in xsf. I added a simple (private) implementation of complex exppi ($\exp(\pi z)$) for use here.
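
For readers unfamiliar with the Borwein series mentioned in the list, here is a minimal Python sketch of the textbook form of Borwein's "Algorithm 2". It is not the manipulated variant used in this PR, and the name `borwein_zeta_sketch` and the default `n=30` are illustrative assumptions. It also recomputes the coefficients on every call, whereas the PR stores them in lookup tables.

```python
from fractions import Fraction
from math import factorial


def borwein_zeta_sketch(s, n=30):
    """Textbook Borwein Algorithm 2 (hypothetical helper, not the xsf code).

    zeta(s) ~ -1 / (d_n * (1 - 2**(1 - s)))
              * sum_{k=0}^{n-1} (-1)**k * (d_k - d_n) / (k + 1)**s
    with d_k = n * sum_{i=0}^{k} (n + i - 1)! * 4**i / ((n - i)! * (2i)!).
    Undefined at s = 1, where 1 - 2**(1 - s) vanishes.
    """
    s = complex(s)
    # Build d_0 .. d_n exactly with rational arithmetic.
    d = []
    acc = Fraction(0)
    for i in range(n + 1):
        acc += Fraction(n * factorial(n + i - 1) * 4**i,
                        factorial(n - i) * factorial(2 * i))
        d.append(acc)
    total = 0j
    for k in range(n):
        # Normalized weight (d_k - d_n) / d_n lies in [-1, 0).
        weight = float((d[k] - d[n]) / d[n])
        total += (-1) ** k * weight * (k + 1) ** -s
    return -total / (1 - 2 ** (1 - s))
```

Dividing each coefficient by `d[n]` before summing keeps the weights bounded, which is one reason a precomputed table of normalized coefficients is convenient.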

I've added test cases comparing with the mpmath reference implementation.

Additional information

Some things that may need attention:

The Riemann zeta function and the Hurwitz zeta function are currently combined into a single function zeta, which dispatches to the ufunc _riemann_zeta when q=None and otherwise dispatches to the ufunc _zeta for the Hurwitz zeta function. I've only added complex support for _riemann_zeta, so zeta will error out when called on complex input with q not None. Having the accepted input types depend on a keyword argument isn't ideal, but I think it's acceptable here, and complex support for Hurwitz zeta is on the way (@mdhaber has written a reference implementation for this too). We could, however, change this so that complex inputs are still accepted when q is not None, but NaN is returned when off the real line. This is similar to what currently happens for inputs x < 1: they are supported only for q = None, not for any other q.
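
The dispatch behavior described here can be sketched in plain Python. Everything below is a stand-in: `zeta_sketch`, `_tail_corrected_sum`, and the choice of `TypeError` are assumptions for illustration, and the crude truncated sum (valid only for real part greater than 1) is not the algorithm used by either ufunc.

```python
def _tail_corrected_sum(s, q, terms=10_000):
    """Crude stand-in for the real kernels: sum_{k>=0} (k + q)**-s.

    Uses a truncated sum plus the two leading Euler-Maclaurin tail terms.
    Only meaningful for Re(s) > 1 and q > 0.
    """
    head = sum((k + q) ** -s for k in range(terms))
    n = terms + q
    return head + n ** (1 - s) / (s - 1) + 0.5 * n ** -s


def zeta_sketch(x, q=None):
    """Hypothetical wrapper mirroring the q-dependent dispatch."""
    if q is None:
        # Riemann branch: complex input is accepted.
        return _tail_corrected_sum(complex(x), 1.0)
    if isinstance(x, complex):
        # Hurwitz branch: complex input is rejected (error type assumed).
        raise TypeError("complex x is only supported when q is None")
    return _tail_corrected_sum(float(x), float(q))
```

The alternative floated in the description would replace the `raise` with returning NaN for inputs off the real line.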

Is there a better place for the Python reference implementation than a colab notebook? Perhaps we could add reference implementations to the xsf repo? I think it's fine not to worry about this now though.

mdhaber

Great! High level comments to start; I'll take a look at the details this afternoon.

Co-authored-by: Matt Haberland <[email protected]>

mdhaber

Math looks pretty good except that the $\log \sin$ part, the asymptotic approximation of $\frac{|B_{2k}|}{(2k)!}$, the last term of the EM formula, and the EM estimate are much harder to follow than the rest. It's probably best if I ask for your help there, since I imagine that future readers would have trouble, too.


mdhaber · 2024-10-25T19:06:22Z

+        #     reference = complex(mp.zeta(t))
+        #     cases.append((complex(t), reference, default_rtol))
+
+        # # Very large imaginary part


How long do these take for you to run? (I know it doesn't really matter since it's all offline, but it is taking forever.)

Extremely fast

In [2]: %timeit run zeta_cases.py
23.1 ms ± 136 μs per loop (mean ± std. dev. of 7 runs, 1 loop each)

Not sure what's going on. My mpmath version is 1.3.0 and I just use the default dps.

steppi · 2024-10-25T19:55:53Z

For the failing test cases, I think it's fine to punt on the Intel oneAPI failures, and I've attempted to xfail those. I also loosened the rtol for a couple of test cases.


mdhaber

OK, I wrote before:

Math looks pretty good except that the $\log \sin$ part, the asymptotic approximation of $\frac{|B_{2k}|}{(2k)!}$, the last term of the EM formula, and the EM estimate are much harder to follow than the rest.

I've checked the $\log \sin$ part (easier as a separate function), the asymptotic approximation of $\frac{|B_{2k}|}{(2k)!}$ (much easier now that I know the equation you were starting with), and the error estimate (easier with the indexing adjusted to match the reference, but I'll want to take another look after the corrections).
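
As background on why the $\log \sin$ part needs care: for large imaginary part, $\sin(\pi z)$ overflows even though its logarithm is modest. A minimal Python sketch of one way to avoid that is below. The name `log_sinpi_sketch` is hypothetical and this is not the xsf implementation. The result is a valid logarithm of $\sin(\pi z)$, but its imaginary part may differ from the principal branch by a multiple of $2\pi$.

```python
import cmath
import math


def log_sinpi_sketch(z):
    """Overflow-free log(sin(pi*z)) (illustrative, not the xsf function).

    For Im(z) >= 0, with w = exp(2j*pi*z) so that |w| <= 1:
        sin(pi*z) = (i/2) * exp(-1j*pi*z) * (1 - w)
    hence
        log sin(pi*z) = -1j*pi*z + log(1 - w) + log(i/2).
    Raises ValueError at real integers, where sin(pi*z) == 0.
    """
    z = complex(z)
    if z.imag < 0:
        # sin(pi * conj(z)) == conj(sin(pi * z))
        return log_sinpi_sketch(z.conjugate()).conjugate()
    w = cmath.exp(2j * math.pi * z)  # underflows harmlessly to 0 for large Im(z)
    log_i_over_2 = complex(-math.log(2.0), math.pi / 2)
    return -1j * math.pi * z + cmath.log(1 - w) + log_i_over_2
```

Factoring out the dominant exponential by hand plays the role that logsumexp would play if it were available.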

I tried checking the $\sum T_{k, n}$ part but it is still challenging to cross the i's and dot the t's. If there's anything else you're willing to do to make it look more like the reference, I'd appreciate it. Every little bit would help, like using -z * log(n) in log_factor instead of log(b) and changing the name of the loop variable. Whatever happens, next time I sit down I'm sure I'll be able to verify it. I also need to verify some of the new mpmath test cases at my home computer, since they were running too slowly on Colab.
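
For orientation, here is a minimal Python sketch of an Euler-Maclaurin evaluation in which each correction term is obtained from the previous one by a cheap update. It is a simplification under stated assumptions: it updates the factors directly rather than their logs, it hard-codes six Bernoulli numbers and a fixed cutoff `n=20`, and the names `zeta_em_sketch` and `factor` are illustrative and do not match the PR or the reference notebook.

```python
import math

# B_2, B_4, ..., B_12
BERNOULLI_2K = [1 / 6, -1 / 30, 1 / 42, -1 / 30, 5 / 66, -691 / 2730]


def zeta_em_sketch(s, n=20):
    """Euler-Maclaurin approximation of zeta(s), s != 1 (illustrative only).

    zeta(s) ~ sum_{k=1}^{n-1} k**-s + n**(1-s)/(s-1) + n**-s / 2
              + sum_{k>=1} B_{2k}/(2k)! * s(s+1)...(s+2k-2) * n**(-s-2k+1)
    """
    s = complex(s)
    total = sum(k ** -s for k in range(1, n))
    total += n ** (1 - s) / (s - 1) + 0.5 * n ** -s
    # factor = s(s+1)...(s+2k-2) * n**(-s-2k+1); for k = 1 this is s * n**(-s-1)
    factor = s * n ** (-s - 1)
    for k, b2k in enumerate(BERNOULLI_2K, start=1):
        total += b2k / math.factorial(2 * k) * factor
        # Incremental update from k to k + 1: two more rising-factorial
        # terms and two more powers of 1/n.
        factor *= (s + 2 * k - 1) * (s + 2 * k) / n**2
    return total
```

Working with logs of these factors, as the PR does, follows the same recurrence with products replaced by sums of logarithms.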

Then I'll probably play with this locally a bit and do one more superficial pass, and I'll probably be happy with it.

Do you also want someone more familiar with the special/C++-specific stuff to review from that perspective, or are you comfortable with that as long as I've checked the algorithmic stuff?

steppi · 2024-10-25T23:21:48Z

I tried checking the $\sum T_{k, n}$ part but it is still challenging to cross the i's and dot the t's. If there's anything else you're willing to do to make it look more like the reference, I'd appreciate it. Every little bit would help, like using -z * log(n) in log_factor instead of log(b) and changing the name of the loop variable.

Noted. Things should be more clear now.

Do you also want someone more familiar with the special/C++-specific stuff to review from that perspective, or are you comfortable with that as long as I've checked the algorithmic stuff?

Feel free to merge based on your algorithmic review. This is all pretty straightforward from a C++ perspective.


mdhaber · 2024-10-28T22:40:32Z

Playing around with this, I think it's pretty good. I think you already knew that 0 + 0j is a problem spot.

(The positive axis labels are something like 16 + np.log10(diff), where diff is the distance from the center point.)

z = (1e-15 + 1e-15j)
res = complex(scipy_zeta(z))
ref = complex(mpmath_zeta(z))
print(res, ref, abs(res-ref)/abs(ref))
# (-0.5097353193036964+0.010130110463593661j) (-0.5000000000000009-9.189385332046748e-16j) 0.02809950746540121
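
As a sanity check on the reference value above, the first-order Taylor expansion about the origin, using the known values $\zeta(0) = -\frac{1}{2}$ and $\zeta'(0) = -\frac{1}{2}\log(2\pi)$, reproduces it. The sketch below is only that check; the variable names are illustrative, and it is not a proposal for how the fix should be implemented.

```python
import math

z = 1e-15 + 1e-15j
# zeta(z) ~ zeta(0) + zeta'(0) * z  for small |z|
approx = -0.5 - 0.5 * math.log(2 * math.pi) * z
# mpmath reference value quoted in the comment above
ref = complex(-0.5000000000000009, -9.189385332046748e-16)
rel_err = abs(approx - ref) / abs(ref)
```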

Similar story at 1 + 0j.

z = (1+1e-15 + 1e-15j)
res = complex(scipy_zeta(z))
ref = complex(mpmath_zeta(z))
print(res, ref, abs(res-ref)/abs(ref))
# (496745969635765.9-443048778321942.06j) (497279149540589.06-447909238514022.3j) 0.007306002718971802
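
Here too the reference value can be cross-checked against a known expansion: near the pole, $\zeta(z) = \frac{1}{z - 1} + \gamma + O(z - 1)$, where $\gamma$ is the Euler-Mascheroni constant. The sketch below is just that check, with illustrative variable names. Note that `1 + 1e-15` is not exactly representable, so `z - 1` is the rounded offset actually seen by both implementations.

```python
EULER_GAMMA = 0.5772156649015329

z = 1 + 1e-15 + 1e-15j
# Leading Laurent terms of zeta about its pole at z = 1
approx = 1 / (z - 1) + EULER_GAMMA
# mpmath reference value quoted in the comment above
ref = complex(497279149540589.06, -447909238514022.3)
rel_err = abs(approx - ref) / abs(ref)
```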

At negative even integers (trivial zeros), there is only a problem on the real axis (at least for imaginary offsets greater than 1e-16).

z = (-2+1e-15)
res = complex(scipy_zeta(z))
ref = complex(mpmath_zeta(z))
print(res, ref, abs(res-ref)/abs(ref))
# (-3.6806848447762256e-17+0j) (-3.3804578090538615e-17+0j) 0.08881253743746413

But the absolute error is OK, and this is a problem with the existing real implementation, not this PR. It might be worth a second look at whether we should just use the complex implementation even when the input is real. IIRC, there were problems at the trivial zeros in my implementation, so I replaced it on the entire real axis, but maybe we should just special-case the trivial zeros.
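
One possible shape for special-casing a trivial zero is a first-order Taylor expansion about it. The sketch below checks that idea at $-2$ only, using the known identity $\zeta'(-2) = -\frac{\zeta(3)}{4\pi^2}$. It is an illustration of the suggestion, not something implemented in this PR, and the names are hypothetical.

```python
import math

ZETA_3 = 1.2020569031595942  # Apery's constant, zeta(3)
DZETA_AT_MINUS_2 = -ZETA_3 / (4 * math.pi**2)  # zeta'(-2)

z = -2 + 1e-15
eps = z + 2.0  # rounded offset from the zero, computed exactly
# zeta(-2 + eps) ~ zeta'(-2) * eps, since zeta(-2) == 0
approx = DZETA_AT_MINUS_2 * eps
# mpmath reference value quoted in the comment above
ref = -3.3804578090538615e-17
rel_err = abs(approx - ref) / abs(ref)
```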

When we zoom out, though, the error is typically quite good. Consider white to mean essentially zero error: on the left, both mpmath and scipy return an infinite result, and on the right, the result of both is essentially just 1.0.

Looking near the critical strip, we get some inaccuracy near the transition between algorithms.

But typically the error near the critical strip is not bad.

It seems fine very close to the real axis.

There is a transition near 50 + 50j that looks fine.

The transition at 50 - 50j looks about the same, and the transitions at the negative reals are invisible, probably because the error is all due to use of the log-reflection (but it's on the order of 1e-13, so nothing to worry about).

In the critical strip with imaginary part close to 1e9, the error is around 1e-7. It takes quite a long time where EM is used here, but probably not so long that people would think it hangs. You've added some protection against long execution times for real part less than 2.5, but it can start taking a long time and get inaccurate when the real and imaginary parts are greater, e.g.

z = 2.51 + 1e13*1j
res = complex(scipy_zeta(z))
ref = complex(mpmath_zeta(z))
print(res, ref, abs(res-ref)/abs(ref))
# (0.9381747827300801-0.21060417060191572j) (0.938250825152661-0.2106505908517201j) 9.26485094426602e-05

But I think this PR is about the main algorithms, not the fine tuning or special cases. We know where some additional work may be needed, but let's merge the main implementation. Thanks @steppi! (I'll merge later tonight if you're happy, too.)

h-vetinari · 2024-11-12T23:53:53Z

Was just updating the release notes for the factorial changes, and noticed that this doesn't have an entry yet.

steppi · 2024-11-13T00:26:20Z

Was just updating the release notes for the factorial changes, and noticed that this doesn't have an entry yet.

Thanks @h-vetinari. There are a few other things I need to add release notes for too. I'll try to get them all done by the end of the month.

steppi added 10 commits October 23, 2024 11:45

Add complex riemann zeta function

9822e96

Plug complex zeta into infrastructure

2c67d76

Document complex valued zeta function

30b14fc

Add complex doctest example for zeta

e05e994

Add complex zeta test

5d524a6

Fix bug in log em coeff calculation

2787cb3

Add link for python reference implementation

7869c3f

Tighten test tolerance

b0833d7

Return NaN when euler-maclaurin would take too long

35da884

Add note about unsupported region

4796fe7

steppi requested a review from person142 as a code owner October 23, 2024 18:30

github-actions Bot added the scipy.special, C/C++, and enhancement labels Oct 23, 2024

steppi requested review from mdhaber and removed request for person142 October 23, 2024 18:38

mdhaber reviewed Oct 23, 2024


Apply suggestions from code review

cf9cfcf

Co-authored-by: Matt Haberland <[email protected]>

mdhaber reviewed Oct 23, 2024


Comment thread scipy/special/xsf/zeta.h Outdated

mdhaber reviewed Oct 23, 2024


steppi added 10 commits October 24, 2024 10:58

Only consider partial termination of zeta partial sum when convergent

96755d7

Add comment on early termination

3c6d8ee

Adjust indexing of euler-maclaurin coeffs

5b60a68

Fix bug, incorrect check for pole

5803d6c

Adjust region where don't even try em sum

dbaa40a

Simplify borwein implementation

ae0b8b7

Add comment with script to generate Borwein algorithm 2 coefficients

bc72c7c

Add comment with script for generating log abs em coeffs

af6054f

Add more explanatory comments

b518689

Split logsinpi into separate function

d16972c