-
-
Notifications
You must be signed in to change notification settings - Fork 26.5k
Cython ball tree #316
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cython ball tree #316
Conversation
|
After a quick overview this looks really good. I like the |
|
I have checked the memory issue: see |
|
I was thinking about this while on my bicycle (it's when I get my best thinking done) and had a flash of insight: if we adjust the definition of |
|
For the common routines (copy, sorting, searching), it is also possible to use the Numpy C API [0], thus avoiding the python overhead. I'm thinking of PyArray_ArgSort, PyArray_Max, PyArray_Copy, etc ... |
|
Interesting thought, but I'm not sure if that would lead to faster code. In the case of sorting distances and indices, for example, it seems faster to use the current simultaneous in-place quicksort of the two arrays, rather than using |
|
I think @fabianp wanted to emphasize code reuse (for easier maintenance) rather than performance. Indeed using the numpy C-API might induce a bit of overhead compared with your impl so I am ok not to reuse it in this case if you think this is not worth it. |
|
Ah, I see. I also would put more stock in speed over code reuse in this situation. Fabian's suggestion leads to an interesting thought, though: the main reason I switched from numpy arrays to raw array pointers was the overhead of numpy array slicing. Using the numpy C-API directly to more quickly create array views/slices without the python overhead would have been another possible route. At this point, though, unless there's a very compelling reason, I'm not sure it's worth investing in a full rewrite of the module. |
|
Indeed, at this point it would be too much work and the code is good enough. I checked performance on some synthetic dataset, and get impressive improvements: with this version with the old one: |
|
Merged |
|
\o/ |
|
However, I can still not pickelize the BallTree object, here is a test case: |
|
It needs |
|
Why so? Is there no way to make it work with default protocol? |
|
As far as I can tell, because cython classes don't expose their |
|
Also, pickling seems not to work at all when BallTree is imported in certain ways, even with protocol 2! It's strange: the pickling doctest passed, but Gael's test doesn't. I'm not sure what's going on here. I seem to remember reading that pickling can behave strangely at times with nested imports. That may be part of the issue. |
|
I've got pickling working with all protocols: the code is at http://github.com/jakevdp/pyTree |
|
I think I've found the source of the problem: The import isn't recognizing the full path to This problem didn't come up in the pickling doc-test that I wrote, probably because that test happens at a local level. Any ideas about how to fix this? |
|
Good catch! I wonder if the lack of introspectable path is related to the |
|
Could be... I'm going to be out camping until the middle of next week. I'll try to track down the problem once I'm back! |
|
Ok. FYI I have tried to hack a |
|
I opened an issue to keep trac of this: #323 |
This is a complete re-write of
BallTree, as recently discussed on the mailing list.Main advantages:
dynamically allocated C arrays. This allows a constructed
BallTree to be pickled and unpickled without the need to rebuild the tree.
efficiently use any minkowski p-distance (similar to
scipy.spatial.cKDTree). Because the distance functions are written in
cython, there's potential to more easily add support for other metrics.
faster by a factor of 5-8 for building the tree, and about 30-50% for
querying the tree, depending on the type of input data
Disadvantages:
class-based approach is much more intuitive. Pseudo-code for a class-based
implementation is included in the implementation notes.
too much call overhead. This makes the code more difficult to modify,
and I couldn't use numpy routines for things like searching and sorting
arrays - I wrote fast cython routines for a few of these basic algorithms.
Because they're purpose-built for this implementation, they perform well.
flexibility. The old C++ implementation would easily allow extensions
to different construction methods, inline data addition and subtraction, etc. Adding
these sorts of extensions to the new module would be much more difficult.
Memory allocation is not exact(fixed in commit below)As noted on the mailing list, I think the advantages far outweigh the disadvantages. Currently all tests pass: I think it's ready to go, barring any further input. For quick comparison of execution times between this implementation and the old implementation, see http://github.com/jakevdp/pyTree