BUG: Solves errors when calling series methods in DataFrame.query with numexpr#43301
BUG: Solves errors when calling series methods in DataFrame.query with numexpr#43301jreback merged 10 commits intopandas-dev:masterfrom AlexisMignon:master
Conversation
…call with the numexpr engine
|
The condition on numexpr is may be too strong, may be we should do the trick only for unhashable results and when using numexpr ? |
|
Hello @AlexisMignon! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-09-24 08:09:00 UTC |
pandas/core/computation/expr.py
Outdated
| kwargs[key.arg] = self.visit(key.value).value | ||
|
|
||
| return self.const_type(res(*new_args, **kwargs), self.env) | ||
| if self.engine == "numexpr": |
There was a problem hiding this comment.
can we just always do this?
There was a problem hiding this comment.
Indeed, This was one of my previous comments. It could depend on the type of the result returned by the function. But since the result has to be embedded in an expression string passed to numexpr.evaluate() it's probably safer to use variable names instead of string representations.
Unless it has an interest for some very specific cases like some literals of base types.
There was a problem hiding this comment.
ok if you can try that would be great
There was a problem hiding this comment.
I've done some tests with numexpr.
For instance this works:
numexpr.evaluate("2 * a + b", {"a": np.array([0.0, 0.0]), "b": [1, 2]})while this doesn't:
numexpr.evaluate("2 * a + [1, 2]", {"a": np.array([0.0, 0.0])})So I guess the number of cases where it's a bad idea to use variables instead of literals is quite reduced.
It would basically work only for integer literals (doing it for floats would lead to truncation). Is it worth making such an exception knowing that it will work using variables anyway ?
There was a problem hiding this comment.
no, try to make the change to just fix this (of course if you can additional tests would be great)
There was a problem hiding this comment.
Sorry, I'm not sure to understand what additional changes you have in mind. As it is, the PR does the job.
As for the additional tests, while I agree that the proposed tests are more functional non regression tests than unit tests, I'm a bit short on the way to make proper unit tests. If you have suggestions, I would be happy to implement them.
There was a problem hiding this comment.
no what i mean is that i would like to have your patch handle all of the cases and not special case (which i think it works).
jreback
left a comment
There was a problem hiding this comment.
can you add a whatsnew note, in 1.4. bug fixes, indexing is ok.
…in all cases not only when using numexpr.
|
very nice @AlexisMignon thanks! |
The initial issue #22435 (check that there is no conflicting names failing) was actually hiding a deeper one leading to a failure when calling
numexpr.evaluate().The commit proposed here solves both issues by adding a temporary variable to the scope for the results of function / method calls when using numexpr engine.