Skip to content

Conversation

@jwoillez
Copy link
Member

@jwoillez jwoillez commented Dec 3, 2014

This is an attempt to factorize most of the code on the python side of the ERFA wrapper. This follows what @mdboom has done in #3164. It reduces the memory footprint, but may have a small performance penalty per ERFA call. I have no precise quantification of these two aspects.

PS: I will be out of touch for the next four weeks. Thought I would push this out, in case somebody wants to look into it. This is a very low priority in comparison to all the other ERFA related PRs...

@mdboom
Copy link
Contributor

mdboom commented Dec 3, 2014

FWIW: On my machine from astropy import erfa uses about 300k less with this PR. There is a slight performance penalty, but maybe it doesn't matter given the improved readability of this approach.

c_vs_cython

@jwoillez
Copy link
Member Author

jwoillez commented Dec 3, 2014

Just some additional thoughts. It might be interesting to take the factorized part out of the template, and keep it under version control. So that, if we have additional fixes to implement, they appear in the git history. And we are talking about only ~200 lines of code or so.

@embray
Copy link
Member

embray commented Dec 3, 2014

Hah, I was right in the middle of working something similar to this, though my approach doesn't use any templates at all. Still, this is helpful to look at.

@embray
Copy link
Member

embray commented Dec 3, 2014

In general though I'm glad to see we're in agreement that most of the code for the wrapper functions could be generalized (as you did in setup_iter).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already a function like this here:
https://github.com/astropy/astropy/blob/master/astropy/modeling/utils.py#L25

Maybe it would be worth starting some module in astropy.utils for Numpy-related utilities and move it there, then this could use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused why np.broadcast is not used (both here and in modeling.utils). As that is written in C, it should nominally be faster, and that seems to be confirmed by a simple test::

In [1]: import numpy as np

In [2]: from astropy.modeling.utils import check_broadcast

In [3]: a = np.arange(125).reshape(5,5,5)

In [4]: b = np.arange(5).reshape(1,5,1)

In [5]: np.broadcast(a,b)
Out[5]: <numpy.broadcast at 0x3651dc0>

In [6]: %timeit np.broadcast(a, b).shape
1000000 loops, best of 3: 481 ns per loop

In [7]: %timeit check_broadcast(a.shape, b.shape)
100000 loops, best of 3: 4.14 µs per loop

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of modeling I can tell you exactly-- For whatever reason np.broadcast only works on existing Numpy arrays. I needed a utility to calculate broadcast shapes without necessarily having created all the arrays involved in the operation yet. In other words, I needed a utility that could compute the shape of the resulting np.broadcast just given a list of array shapes, not the arrays themselves.

I think maybe the same thing was needed here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The inner dimensions needed by ERFA are removed before the call to broadcast_array (in setup_iter). If one wants to use np.broadcast, you need to index the arrays passed in, in order to removed these consumed inner dimensions, using something like [(Ellipsis,)+(0,)*len(args_shape[i])].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, sounds like both cases are indeed different, both from np.broadcast and each other. Given that np.broadcast is so much faster it may still be worth using, but I think that is much better left for later optimization.

make_output_scalar -> make_outputs_scalar
@jwoillez
Copy link
Member Author

jwoillez commented Dec 5, 2014

This should fix the numpy <= 1.7 issue. I also rebased to get one single wrapper update at the end.

@embray
Copy link
Member

embray commented Dec 5, 2014

@jwoillez Thanks--that fixed it.

I'm almost finished with some changes on top of your changes in this PR that would remove the generation of Python code altogether, as has been discussed in other threads. I don't know if this will look good to anyone else or not but I'm almost ready to put it out there.

Since my tweaks are built on top of this PR would you consider a PR to your PR, or should I just make a separate one? (Or maybe I can just point you to the branch when it's ready and you can tell me what you think). Thanks!

@mhvk
Copy link
Contributor

mhvk commented Dec 5, 2014

@embray - for me it would be easier to review if you submitted a separate PR that is based on the work here, i.e., includes its commits.

EDIT -- actually not as much review, but test: for a direct PR to astropy I know how to get the branch down on my repository, but if I have to grab different pieces in different places, I have a perhaps unbased fear that it'll become a mess.

@embray
Copy link
Member

embray commented Dec 5, 2014

Whether I submit a PR to @jwoillez's fork, or a separate PR, it would include all relevant commits and checking out the branch for testing is the same.

@jwoillez
Copy link
Member Author

jwoillez commented Dec 5, 2014

@embray - Please go for a separate PR, as I will not be able to do anything for the next few weeks.

@mhvk
Copy link
Contributor

mhvk commented Dec 5, 2014

OK, unbased fear... I guess the only disadvantage then is that any comments would be under @jwoillez's github account rather than @astropy's.
...
anyway, seems like a separate PR is better independently.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe replace the 4 lines above by:

outer_args = [args[i][(Ellipsis,)+(0,)*len(args_shape[i])] for i in range(len(args))]
iter_shape = numpy.broadcast(*outer_args).shape

and remove def calculate_broadcast() above.

@embray - To use in #3181?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should really read all comments before replying -- looks like the optimization can be done already!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just used my check_broadcast utility. The version in #3181 is here:

https://github.com/embray/astropy/blob/erfa-factorization/astropy/erfa/erfa.py#L174

This started from your code but, I think, made a few simplifications (it requires fewer loops over the arguments, mainly).

@astrofrog astrofrog added the erfa label Jan 13, 2015
@embray
Copy link
Member

embray commented Jan 13, 2015

No wonder I'm confused. I thought this PR was already merged...

@jwoillez
Copy link
Member Author

Looks like this PR is superseded by #3181: closing.

@jwoillez jwoillez closed this Mar 12, 2015
@embray
Copy link
Member

embray commented Mar 12, 2015

@jwoillez That's not really how I felt about it--there are still some performance issues I need to resolve in #3181 before it can be of use and I have higher priorities at the moment.

@jwoillez jwoillez deleted the erfa-factorization branch August 20, 2015 07:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants