-
-
Notifications
You must be signed in to change notification settings - Fork 2k
Conceptual issue with float conversion when quantity is dimensionless #669
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conceptual issue with float conversion when quantity is dimensionless #669
Conversation
|
The more I think about it, the more I think it only makes sense for float/int etc to work for dimensionless quantities. The attached code leads to the following behavior, which I think it clearer and less ambiguous: Of course, the tests will fail and the docs need updating, but I'd be interested in getting feedback on this. |
|
This looks like what you want: @mdboom might know better, but I think this was the compromise between wanting to keep dimensionless quantities like m/kpc intact by default, but still have an easy way to decompose that. I wouldn't place high priority on it for now, but it might be nice if it didn't do this just in the case where you have the same unit just with different prefixes (such as your case with nm and m) but don't automatically decompose if you're mixing two different units like m and pc (even if they represent the same physical unit). |
|
(Note: The explicit |
|
I think I agree with @astrofrog here. The code @iguananaut shows is indeed the right way as it works right now, but I agree it's counter-intuitive (to me) that the example didn't work. It's unique to the case where a value is dimensionless, because, when you write out the math, normally you don't worry about units for dimensionless quantities. So we have to be extra-careful, because no one even expects units to matter at all in that situation. I'm ok with the solution presented here because it just makes people use Another possibility might be to chang the |
|
I think this is taking away too much functionality that I would expect to "just work". The entire point of the Instead, of raising a type error, why not just decompose the units if they're dimensionless, and if not leave them alone? That is, I think that would cover the two most general cases in a way that's most expected in most cases. It would break, for example, the m/kpc case but that can just be an example of a special case maybe? |
|
As @eteq said, I'm indeed only suggesting we change the behavior of the @iguananaut - I think that what you are suggesting would be confusing, because What I'm essentially suggesting is that those functions act more intuitively. One cannot conceptually do because the units can't be converted, and it's a good fail-safe, because actually, in most cases, you're probably doing something wrong if you are taking the exp of something with units (log is a special case where people do do it, but again I think they should be careful about the units the quantity is in rather than trusting the result blindly). So long story short, I think that it would be much better to only allow implicit float conversion in cases where the quantity has no units, and use the Note that I think we could try and define a custom behavior for @mdboom @taldcroft - do you have any thoughts on this? |
|
I think I'm with @astrofrog on this. I'd further be fine with no implicit float conversion at all, but I think @astrofrog is suggesting a nice middle ground that avoids the large potential for confusion and shooting one's foot. |
|
@astrofrog @mdboom Let me try again. This has nothing to do in my mind with what is or isn't correct with respect to units. The entire point of this functionality in the first place was for interoperability of Say for example I have a bunch of 3-tuples representing velocities and I want to normalize them. With normal ints or floats I can just do this: This "just works" because as long as all the elements in a tuple can be converted to an array type Numpy will do that automatically. True you get back an array instead of a new tuple, but that's fine. Maybe I'm going to pass it on to some visualization code anyways. Now say I'm working with Astropy quantities, which I prefer to because they contain more information and all these things. Right now I can do the exact same thing as in the previous example without any change to the code other than that the tuple is now a tuple of Quantities: Sure the returned value loses the unit information (though this will be less of an issue once we can start carrying that around in NDData arrays). So better still, it also works (aside from #679 for which I've implemented a temporary fix for the sake of this example) to do: Now you do get back a Quantity with the units attached which is nice, and this object should continue to "just work" fairly deep into third-party library code since The short version of this is that, I think it's a big win to say "You can start using Quantitys in your code Right Now and you don't have to change anything else or even make any special conversions." If we do this then the best we can say is, "Well you can use Quantities with your existing code but only in special cases, the rest of the time you have to do [q.value for q in v] if you want to use your Quantity objects outside Astropy code. In that case you're throwing away information about the units anyways, and it's the coder's responsibility to be sure that those values are in the correct units. |
|
I think supporting the second example is fine, because there is only one unit involved: However, what about this: If we could, it should convert everything to the same unit before performing the operation. But without monkey-patching Numpy, I don't know how we could, it you're just likely to get confusion. I really think it's better to disallow this. If someone has to say q.value they're understanding more that that the value is just a scale on top of a scale in the unit and not some universal value that can be used with other values. |
|
In the latter case is probably a bad example. I think that if one wants to have a collection of quantities like a vector or matrix and ensure that they're all the same unit one should use the quantity array support. I agree that mixing together a bunch of quantities with the same units but different scale factors could be dangerous if done carelessly. But that's a danger one can get into in either case. If a user is passing a bunch of Quantity objects to a function that's expecting values in m/s and they're carelessly doing And besides, maybe there are cases where one would want a collection of some coordinates, for example, that have different units. |
|
I'd further add, that if this feature were to only work in a special case, I'd argue we should take it out altogether--an implicit special case that only sometimes works is just user-hostile even if it's documented. |
|
@iguananaut - I agree that it's possible for the user to make the mistakes with Silently works but is almost certainly wrong. That said, I see your point that supporting this for arrays and not floats is problematic, and as @mdboom said, the We could have it be a warning instead of a failure. But I'm slightly against that because it seems like we should just pick one side and accept the consequences (I just don't know which is the right side). Regardless, though, I think @astrofrog's original case of doing |
|
I'm not clear on how directly using |
|
Actually I'd be totally happy with a warning. |
That's why I suggested in the first place that in the dimensionless case use to |
|
I need to think about this more, but one could actually revert to the position held by some originally that In any case, while of course using If we want to be really safe, then we could:
This still allows some pragmatism in that the ideal view would be to get rid of |
|
(of course, if people agreed to switch over entirely to using |
|
@astrofrog: That actually makes a lot of sense. I think "power users" may still want the ability to just "get the !@#$ value", but Note it seems to me that |
|
Saying "nothing is lost" in the dimensionless case is not quite right either. What if you really have a use case where you need the value to be a ratio of, say, meters to nanometers ( a trivial case since that's just a base-10 scale factor, but less trivial if we're talking say lightyears per kpc ). Now granted I'm not the intended audience so I don't claim to know right from wrong as far as what would be most useful. But I have already been using this framework in my own work and I've occasionally needed to represent ratios like that unambiguously. I completely agree that the case you found that started this issue is surprising and dangerous. That said, such dimensionless ratios are probably not the most common use case, and they can also be restored if really needed. For example: Which also leads to the question: What's the difference between But again, I don't know what the best use case is. |
|
@iguananaut - the problem is that in a lot of cases, the units are not going to be as simple as Since this is controversial, I wonder whether this means that we should, for 0.2, remove some of these features until we agree on them? We could remove the automatic float conversion altogether, and the @mdboom - in my mind, |
|
I get the And that raises the question of, if quantities don't have a default value in any units (and really they do--it's whatever value/unit was passed in when the quantity was initialized) how do you represent what quantity the Quantity object represents? Like, when one does a |
|
@iguananaut the difference is the following: From an idealistic point of view (not what I am necessarily advocating), having But in the end, I am fine with pragmatic, hence my proposed 'middle-ground' solution of just keeping |
|
Also, I would argue that if one is combining pre-defined constants with other quantities and their not sure what units that constant is in, to always use |
|
I'd be fine with removing the automatic |
|
@iguananaut - I do agree that the dimensionless case only really makes more sense to me because that's what I ran into, so maybe that is indeed an argument for just removing implicit float conversion. It would be a nice feature, but the potential for confusion/error is large... I think we should think about this over the weekend and see what conclusion we reach by Monday. What I meant before was that maybe we should choose to be deliberately conservative for 0.2, and turn off automatic float conversion, even if it's a bit of a pain for users. However, I do think that a lot of use cases will actually not involve float conversion at all. What about (for 0.2):
? |
|
@astrofrog @iguananaut - I would prefer not to have That said, I think I agree with dropping the automatic float conversion for now (e.g. 0.2). It's really not that onerous to have to use And I actually think the dimensionless case is a very important special case, because it's the only mathematically valid argument to |
|
Is it possible to at least get Finally, I also agree on deferring a decision on |
|
I'm with @eteq on this one: same number of characters in So I'm for keeping |
|
Still not convinced that telling users "just use As for If we go this route Quantities need to be completely changed to not have an implicit unit at all. It would have an internal _unit to keep track of what units the quantity was originally specified in, but otherwise it doesn't have a unit unless you ask for a value in specific units. But this is annoying because if I make a bunch of Quantities in 'm/s' and I know the only units I'm going to care about in my application are 'm/s' it's going to get really annoying really fast to have to write So to avoid having to feel responsible if someone crashes the spaceship, why not just produce a warning when getting an implicit value from an unreduced dimensionless unit? |
|
One offer for a possible middle ground: By default go with something like I described in my last comment, where a Quantity has no implicit value or unit (just an internal value and unit used to define it). To get a value for that quantity use However, add a new optional The biggest question in my mind with this idea is how to propagate the |
|
Seems fine to change it if everyone was able to come to rapid agreement. |
|
So yeah, is this good to go at last? |
|
@iguananaut - no, #701 needs to be merged first (but once it is, then yes) |
|
Then are we changing #701 to 0.2? |
|
@iguananaut - I think so, but #701 should now be trivial to finalize (there was a small issue with the docs that @mdboom needs to fix) |
|
@iguananaut @mdboom @eteq - I've removed #701 from this branch, and instead have defined my own |
|
And while I agree that the test I am doing may be overly strict, it's only used to raise a warning, so the warning is always raised, except in a very special case where the unit is |
|
As I mentioned in #701, I'm not crazy about the new definition of |
|
@mdboom - hmm, so just to make sure I understand, you are saying that if the units are dimensionless, and the scale of the decomposed unit is 1, then we shouldn't have to raise a warning, because the units all cancel out and there is no scale? I guess I'm fine with that - the only reason I did it this way was to be on the absolute safe side - but if you don't foresee any issues, I'll change it. I'll still keep it separate from #701 since we don't know in what order they will be merged. |
|
@astrofrog: Yes, exactly. And the new |
|
@iguananaut @eteq - I'll merge this evening if there are no objections. |
Conceptual issue with float conversion when quantity is dimensionless
|
LOL |
Conceptual issue with float conversion when quantity is dimensionless
|
@astrofrog - think this is solved by #929 too! |
Conceptual issue with float conversion when quantity is dimensionless

I was using
astropy.unitsto check some answers in a Physics tutorial, and one thing I tried instinctively was something like:of course, the real answer is something like 2.55, because a / d = 0.93, but by default, float just uses the
valueof the quantity, and:so P ends up being zero, which is bad. I'm not exactly sure how we should deal with cases like this - should Quantities always check if they are actually dimensionless, and if so, automagically simplify themselves? So then we'd have:
but then, sometimes people might actually want to keep a Quantity in e.g.
m/kpc, so this doesn't seem like a good solution.This makes me wonder whether maybe it is too dangerous to have automatic float conversion work for everything, and whether we should make it so automatic float conversion only works for dimensionless quantities? (with an exception raised if not). It does mean that for a value in
m/kpc,q.valuewill be different fromfloat(q), but maybe that's ok since then they have different purposes?cc @mdboom @eteq @iguananaut @taldcroft