FYI: ERFA wrapper factorization #3176
Conversation
Just some additional thoughts: it might be interesting to take the factorized part out of the template and keep it under version control, so that, if we have additional fixes to implement, they appear in the git history. And we are talking about only ~200 lines of code or so.
Hah, I was right in the middle of working on something similar to this, though my approach doesn't use any templates at all. Still, this is helpful to look at.
In general, though, I'm glad to see we're in agreement that most of the code for the wrapper functions could be generalized (as you did in
There's already a function like this here:
https://github.com/astropy/astropy/blob/master/astropy/modeling/utils.py#L25
Maybe it would be worth starting some module in astropy.utils for Numpy-related utilities and move it there, then this could use it.
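For reference, the core of a shape-only broadcast check fits in a few lines. The sketch below is an illustrative reimplementation of NumPy's broadcasting rule from scratch, not the actual code in astropy.modeling.utils; the name broadcast_shape is made up here:

```python
def broadcast_shape(*shapes):
    """Return the shape resulting from broadcasting arrays with the given
    shapes together, without creating the arrays themselves.

    Illustrative sketch only -- not astropy.modeling.utils.check_broadcast.
    """
    ndim = max(len(s) for s in shapes)
    # Left-pad every shape with 1s so they all have the same length,
    # mirroring how NumPy aligns shapes from the right.
    padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
    result = []
    for dims in zip(*padded):
        # Along each axis, all non-unit lengths must agree.
        non_unit = {d for d in dims if d != 1}
        if len(non_unit) > 1:
            raise ValueError("shapes are not broadcastable: %r" % (shapes,))
        result.append(non_unit.pop() if non_unit else 1)
    return tuple(result)
```

This takes shape tuples rather than arrays, which is the key difference from np.broadcast discussed below.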
I'm a little confused why np.broadcast is not used (both here and in modeling.utils). Since it is written in C, it should nominally be faster, and that seems to be confirmed by a simple test:

```python
In [1]: import numpy as np

In [2]: from astropy.modeling.utils import check_broadcast

In [3]: a = np.arange(125).reshape(5, 5, 5)

In [4]: b = np.arange(5).reshape(1, 5, 1)

In [5]: np.broadcast(a, b)
Out[5]: <numpy.broadcast at 0x3651dc0>

In [6]: %timeit np.broadcast(a, b).shape
1000000 loops, best of 3: 481 ns per loop

In [7]: %timeit check_broadcast(a.shape, b.shape)
100000 loops, best of 3: 4.14 µs per loop
```
In the case of modeling I can tell you exactly: for whatever reason, np.broadcast only works on existing NumPy arrays. I needed a utility to calculate broadcast shapes without necessarily having created all the arrays involved in the operation yet. In other words, I needed a utility that could compute the shape of the resulting np.broadcast given just a list of array shapes, not the arrays themselves.
I think maybe the same thing was needed here.
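As a historical aside (postdating this discussion): later NumPy releases added np.broadcast_shapes (available since NumPy 1.20), which does exactly what was needed here, operating on shape tuples rather than arrays. The contrast with np.broadcast is easy to show:

```python
import numpy as np

# np.broadcast consumes actual arrays, so the arrays must already exist:
a = np.arange(125).reshape(5, 5, 5)
b = np.arange(5).reshape(1, 5, 1)
shape_from_arrays = np.broadcast(a, b).shape  # (5, 5, 5)

# np.broadcast_shapes (NumPy >= 1.20) works directly on shape tuples,
# with no arrays allocated at all:
shape_from_shapes = np.broadcast_shapes((5, 5, 5), (1, 5, 1))

assert shape_from_arrays == shape_from_shapes == (5, 5, 5)
```

At the time of this thread that function did not exist, which is why a hand-written check_broadcast was needed.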
The inner dimensions needed by ERFA are removed before the call to broadcast_array (in setup_iter). If one wants to use np.broadcast, the arrays passed in need to be indexed so as to remove these consumed inner dimensions, using something like [(Ellipsis,) + (0,) * len(args_shape[i])].
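The indexing trick above can be illustrated with a toy example. The (4, 1, 3) and (5, 3) input shapes and the trailing (3,) "consumed" dimensions here are invented for illustration; they stand in for, e.g., per-element position vectors that an ERFA routine consumes whole:

```python
import numpy as np

# Two inputs whose trailing (3,) axis is consumed by the low-level call
# (hypothetical shapes, chosen only to demonstrate the indexing):
args = [np.zeros((4, 1, 3)), np.zeros((5, 3))]
arg_shapes = [(3,), (3,)]  # inner dims consumed per argument

# Index away the inner dims so np.broadcast sees only the leading dims:
leading = [a[(Ellipsis,) + (0,) * len(s)] for a, s in zip(args, arg_shapes)]
# leading[0].shape == (4, 1); leading[1].shape == (5,)

out_shape = np.broadcast(*leading).shape
assert out_shape == (4, 5)
```

The `(Ellipsis,) + (0,) * n` tuple selects element 0 along each of the last n axes, leaving an array with only the broadcastable leading dimensions.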
OK, sounds like the two cases are indeed different, both from np.broadcast and from each other. Given that np.broadcast is so much faster it may still be worth using, but I think that is much better left for later optimization.
make_output_scalar -> make_outputs_scalar
Force-pushed from d5f3641 to 21643dd
This should fix the numpy <= 1.7 issue. I also rebased to get one single wrapper update at the end.
Force-pushed from 21643dd to 8152fd9
Force-pushed from 8152fd9 to 7c29bcd
@jwoillez Thanks -- that fixed it. I'm almost finished with some changes on top of your changes in this PR that would remove the generation of Python code altogether, as has been discussed in other threads. I don't know whether this will look good to anyone else, but I'm almost ready to put it out there. Since my tweaks are built on top of this PR, would you consider a PR to your PR, or should I just make a separate one? (Or maybe I can just point you to the branch when it's ready and you can tell me what you think.) Thanks!
@embray - for me it would be easier to review if you submitted a separate PR that is based on the work here, i.e., includes its commits. EDIT -- actually not so much review as testing: for a direct PR to astropy I know how to get the branch down to my repository, but if I have to grab different pieces from different places, I have a perhaps unfounded fear that it'll become a mess.
Whether I submit a PR to @jwoillez's fork or a separate PR, it would include all relevant commits, and checking out the branch for testing is the same.
@embray - Please go for a separate PR, as I will not be able to do anything for the next few weeks.
I should really read all comments before replying -- looks like the optimization can be done already!
I just used my check_broadcast utility. The version in #3181 is here:
This started from your code but, I think, makes a few simplifications (mainly, it requires fewer loops over the arguments).
No wonder I'm confused. I thought this PR was already merged...
Looks like this PR is superseded by #3181: closing.
This is an attempt to factorize most of the code on the Python side of the ERFA wrapper. This follows what @mdboom has done in #3164. It reduces the memory footprint, but may have a small performance penalty per ERFA call; I have no precise quantification of either aspect.
PS: I will be out of touch for the next four weeks. Thought I would push this out, in case somebody wants to look into it. This is a very low priority in comparison to all the other ERFA related PRs...
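As a rough illustration of the factorization idea discussed in this PR (one shared helper that broadcasts leading dimensions and loops the low-level call, instead of near-identical generated Python code per ERFA function), here is a hypothetical sketch. make_wrapper, its arguments, and all names below are invented for illustration and do not correspond to astropy's actual API:

```python
import numpy as np

def make_wrapper(core_func, n_in, inner_shapes):
    """Build a broadcasting wrapper around a low-level scalar function.

    Hypothetical sketch only: core_func consumes, per call, one element
    of each input after its "inner" dims (inner_shapes) are stripped.
    """
    def wrapper(*args):
        arrays = [np.asarray(a, dtype=float) for a in args[:n_in]]
        # Strip the inner dims consumed by core_func, then broadcast the
        # remaining leading dims together (the trick from the discussion).
        leading = [a[(Ellipsis,) + (0,) * len(s)]
                   for a, s in zip(arrays, inner_shapes)]
        out_shape = np.broadcast(*leading).shape
        out = np.empty(out_shape)
        # Naive element-by-element loop; a real implementation would use
        # np.nditer or vectorized C code instead.
        for idx in np.ndindex(out_shape):
            out[idx] = core_func(*(np.broadcast_to(a, out_shape + s)[idx]
                                   for a, s in zip(arrays, inner_shapes)))
        return out
    return wrapper

# Usage: wrap a toy scalar "core" function with no inner dims.
add = make_wrapper(lambda x, y: x + y, 2, [(), ()])
result = add(np.arange(2.0).reshape(2, 1), np.arange(3.0))
assert result.shape == (2, 3)
```

The point of the sketch is only that the broadcasting and looping logic is written once and shared, which is what keeps the per-function generated (or templated) code small.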