-
-
Notifications
You must be signed in to change notification settings - Fork 2k
Speed up Column.__getitem__ in general #4075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The one thing I don't like about this is the duplicate code in |
Column.__getitem__ speedup branch of astropy#4075, and made adjustments to take advantage of it in the indices code. This included getting rid of the `def get_item(...)` hacks, which I think are no longer necessary--the base Column.__getitem__ is now almost exactly as fast as the stock implementation in the basic case of single item scalar access. In the process I also refactored the Cython code a bit so that it could be less repetitive. For now, tests that failed due to the change in behavior (indicies now *are* copied by default when slicing) are commented out; but all other tests pass. It should also that the 'copy_on_getitem' index mode is now a no-op since that's the default behavior. We could still change this so that it's not the default behavior, if we prefer, or add a no_copy_on_getitem mode instead.
|
This looks good to me, and hurts my brain less, as long as it's fast! |
|
I'll try to get some performance results up on this PR. From my rough testing though this adds < 1% overhead compared to the base |
… base classes. This eliminates most of the overhead associated with calling pure-Python __getitem__, and speeds up the body of the method as well.
…er we need access to in this code is 'nd')--this allows us to avoid a global variable lookup of 'ndarray' when calling 'ndarray.tp_as_mapping.mp_subscript'. Now, in the common case of ndim==1, Column (but not MaskedColumn) adds virtually no overhead compared to the built-in __getitem__.
…n this branch also fix that issue.
Column.__getitem__ speedup branch of astropy#4075, and made adjustments to take advantage of it in the indices code. This included getting rid of the `def get_item(...)` hacks, which I think are no longer necessary--the base Column.__getitem__ is now almost exactly as fast as the stock implementation in the basic case of single item scalar access. In the process I also refactored the Cython code a bit so that it could be less repetitive. For now, tests that failed due to the change in behavior (indicies now *are* copied by default when slicing) are commented out; but all other tests pass. It should also that the 'copy_on_getitem' index mode is now a no-op since that's the default behavior. We could still change this so that it's not the default behavior, if we prefer, or add a no_copy_on_getitem mode instead.
9f7d6fb to
1d68540
Compare
|
Updated to incorporate #3915 / #4202. As I noted previously, this makes the 'copy_on_getitem' index mode is now a no-op since that's the default behavior. We could still change this so that it's not the default behavior, if we prefer, or add a no_copy_on_getitem mode instead. @taldcroft ? @mdmueller ? |
|
(To clarify, we could still make the default behavior be to not copy the indices on a slice. I wasn't sure which way is preferable, and if the current non-copy default was just chosen to not slow down |
|
@embray - the current non-copy default was more chosen to not slow down |
|
Okay, I wondered about that. In that case I'll restore the original default. This PR should still speed things up in either case since it does away with the penalty of having a |
|
Sounds good, thanks. |
…evel classes / functions. Put module docstring before imports.
…is *not* the default (and re-enabled the tests that I previously disabled before I was sure what we were going to do here). This required going back to using a __class__ switcheroo to enable the index copying inside the index mode context manager. Originally I just wanted to add a flag to enable or disable it, but since that flag would have to be accessible from Python that just added too much slowdown to the general case. So this ends up adopting a middle ground between my original version of this code and @mdmueller's version.
|
Okay, I restored the original default behavior. In order to get this working right without sacrificing performance I went back on some of @mdmueller's original implementation, or at least something that looks like it (but still without any need for the |
|
Apparently I suck at testing on Python 2... |
|
@embray - I don't have time this weekend for detailed review on this, but I'm trusting that the table testing is sufficiently complete. Have you done some simple %timeit performance testing comparing this to master? |
|
@embray - this looks good, and cleaner than before (though I cannot judge the cython very well). What seems to be missing, though, are some comments that explain why we are doing this at all. The original shim class had relevant comments; maybe put those either in the |
|
p.s. I even wonder if we should have a section in the developer docs with "tips & tricks", where we summarize these types of speed-ups and their rationale. Indeed, my shape-changing mixin would be another good addition... |
|
I'll add some better explanatory comments before merging. @taldcroft Concerning timing this adds a tiny slowdown compared to the stock |
|
@embray - that's definitely in the noise, so no problem! Thanks for working on this. |
|
@embray - could you add the comments in the next hour or so? It would be nice to get this in 1.1 if possible. |
|
I'm going to branch v1.1, but we can consider merging this in after since it's just a performance fix. |
|
Oh, this is still open... let me add those aforementioned comments and then merge. |
|
👍 |
Speed up Column.__getitem__ in general
Speed up Column.__getitem__ in general
Column.__getitem__ speedup branch of astropy#4075, and made adjustments to take advantage of it in the indices code. This included getting rid of the `def get_item(...)` hacks, which I think are no longer necessary--the base Column.__getitem__ is now almost exactly as fast as the stock implementation in the basic case of single item scalar access. In the process I also refactored the Cython code a bit so that it could be less repetitive. For now, tests that failed due to the change in behavior (indicies now *are* copied by default when slicing) are commented out; but all other tests pass. It should also that the 'copy_on_getitem' index mode is now a no-op since that's the default behavior. We could still change this so that it's not the default behavior, if we prefer, or add a no_copy_on_getitem mode instead.
Column.__getitem__ speedup branch of astropy#4075, and made adjustments to take advantage of it in the indices code. This included getting rid of the `def get_item(...)` hacks, which I think are no longer necessary--the base Column.__getitem__ is now almost exactly as fast as the stock implementation in the basic case of single item scalar access. In the process I also refactored the Cython code a bit so that it could be less repetitive. For now, tests that failed due to the change in behavior (indicies now *are* copied by default when slicing) are commented out; but all other tests pass. It should also that the 'copy_on_getitem' index mode is now a no-op since that's the default behavior. We could still change this so that it's not the default behavior, if we prefer, or add a no_copy_on_getitem mode instead.
Alternative to #3930--speed up
Column.__getitem__using Cython base classes. This eliminates most of the overhead associated with calling pure-Python__getitem__, and speeds up the body of the method as well.This is a more general solution than the one in #3930--I think this could be useful for #3915. Checking whether the indices need to be updated adds only a trivial amount of overhead, so we can perform that check in the actual
__getitem__without incurring additional overhead. So unless we still want to be able to usetable[slice]without incurring index adjustment overhead, theget_itemmethods won't be necessary.