-
-
Notifications
You must be signed in to change notification settings - Fork 2k
Don't use Cython for ERFA wrappers #3164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This is |
|
@mdboom - thanks! I suspect the biggest differences we will see (if any) are going to be for the fast ERFA functions. |
|
This is very much subjective, but my preference is for Cython over C if there's not a signficiant performance difference. My reasoning is that contributors who want to work on this and don't know C very well can read and edit Cython quite easily (it looks like python). But using the raw C API can be challenging and therefore error-prone for the uninitiated. Of course the definition of "significant" matters a lot here... |
|
hmm, 750kb of memory usage is not insignificant. That said, I agree with @eteq that it helps the uninitiated like me if most of the code is in python. I don't want to ask for needless effort, but would it be easier if this followed @jwoillez's split, and had python parts for ensuring that all the arrays could be broadcast together (and undo any broadcasting of the results) and C only for the actual iteration? It does seem that that is a very natural division, and in that case it would be much less likely the C code template would ever need to be edited. |
|
The difference is more pronounced with Like I said, for me it's more a matter of preference -- I almost never find Cython to be easier in the long run, because it hides so much of what is really going on. It works well when it works, but when something goes wrong it's a major pain without proper debugging support. My experience definitely doesn't back up @eteq's assertion that the C API is more challenging than Cython when doing this kind of low-level C wrapping work (writing algorithms from scratch, such as astropy's convolution routines is another matter). But that's obviously fairly subjective. @mhvk: Your suggestion to separate out along the lines of @jwoillez's split certainly makes sense. This PR is a hack I began before that split was made. I suspect the timing would be pretty close to master, particularly based on the two benchmarks above. The inner loop in both cases is pretty optimal: it's the one-time overhead of setting everything up where there is a speed difference, and by not doing that in C, you'd lose that advantage here. (All that based on a hunch, of course). So, I think it comes down to:
|
|
@mdboom - I like the way you factored out the setup part of each wrapper (broadcasts, shape checks, iterator configurations...). This might explain part of the reduced memory footprint. In any case this is certainly something to keep, whatever the direction is chosen. I know that I would not have been able to come up with a pure C version of the wrapper without some significant time overhead. Yet, after looking at what the C code generated by cython looks like, I feel like converting the cython leftover to pure C might not be a bad idea. |
|
Well, likewise, it would have taken a lot longer for me to write the C If we stick with the Cython approach currently on master, I agree that On Tue Dec 02 2014 at 2:23:51 AM Julien Woillez [email protected]
|
|
@mdboom - I think it's mostly a question of familiarity. You're certainly right that this is easier to follow than the Cython-generated C for someone already comfortable with the python and numpy C APIs. But the side I'm coming from is that most of the user base are likely to know python well but are not going to be as experienced with the C API (that includes me). Having them able to provide changes to this is good, and for them Cython is simpler to understand and change. Regardless, I agree with you that performance/compile time/memory is a more important concern. Do you think if this were split into python/C chunks like what is now in master would also singificantly shrink down Also, I think it might make sense to hold off on this a bit until some of the other ERFA PRs are in - we certainly don't want to lead down @mdboom down a wild goose chase of rebasing if we don't actually end up using this... |
|
The generated C file here is roughly the same size (mainly because it includes the docstrings in a verbose but MSVC-compatible format, not because of greater code complexity). It is an improvement in runtime, though. I don't really care much about the size of the generated code matters much except insofar as it affects compile-time speed and memory usage. |
|
I think I agree with @eteq that python is easier to read, especially the input/output/broadcasting stuff, which becomes nearly unreadable in C -- indeed, it is this saner handling of I/O that attracted me to python in the first place (from fortran, but C is no better). However, the inner loops are easy to understand in both C and python, so I think for those we can just decide based on the benefits. Furthermore, those are unlikely ever to need changing. Hence, my preference would be to keep the setup in python, as in the current implementation. For the inner loop, I would slightly prefer a pure C option, as it is cleaner and uses less memory, but it is not a big deal. As @mdboom said, this may not give any speed-up anymore, though I do think that the python setup could still be optimized a bit further (which would obviously benefit both approaches). |
|
Ok. I'll probably wait until the additional improvements to the ERFA wrappers die down and then rewrite this to only to the inner loops -- should be quite straightforward. |
|
Just a remark about C vs. Cython:
|
|
When I look more closely at the current master, I can see the point that if only the inner loop is in C, then dropping the Cython probably makes sense, as the inner loop is much more straightforward.
I kind of want to add the COBOL suggestion as an issue for "Future" now 😉 |
|
On Dec 4, 2014, at 4:29 PM, Erik Tollerud wrote:
Could I suggest INTERCAL or something of that ilk? No point in half measures...= |
|
I'm reminded of this bizarre message on the python-dev mailing list last month. It's bad they reached out in the wrong direction because I'm sure if they asked in the right place they'd have learned a lot, but alas the typical antipathy of development mailinglists to off-topic messages. So on that note, I think we should we should write this in "the lowest level of computer programming" and ship Turing machines to all Astropy users. |
|
Closed in favor of #3521. |

This work is much less necessary now that #3141 is merged, but I still thought I'd share it. The measurable differences are fairly small, so I think it ultimately comes down to maintainability and elegance. I, personally, don't care much for Cython-based wrappers as it can be very difficult to optimize and debug due to the extra level of indirection, but I know there are those that prefer it to C.
This is basically a direct port of the @jwoillez' Cython wrappers to pure C. It shaves 3 seconds off of the compile time vs. current master (after the Cython/Python splitting of the ERFA wrappers).
from astropy import erfauses about 750k less memory in this version.As for speed, there is a speed advantage to this, particularly for small arrays (with larger arrays the time of ERFA itself dominates):