Dimensionless unit looks the same regardless of scale #701

mdboom · 2013-01-30T15:13:31Z

This comes out of a discussion in #669.

Right now, the following behavior is in astropy:

>>> from astropy import units as u
>>> u1 = (u.km / u.m).decompose()
>>> u2 = (u.m / u.m).decompose()
>>> u1
Unit(dimensionless)
>>> u2
Unit(dimensionless)
>>> u1 == u.dimensionless
False
>>> u2 == u.dimensionless
True
>>> u1.is_dimensionless()
True
>>> u2.is_dimensionless()
True

This is counter-intuitive, because two things that say they are "dimensionless" give different behavior when compared to u.dimensionless.

This is "correct" behavior, because two units with different scale factors are different, but the fact that they both appear to be the same is problematic.

cc @mdboom

mdboom · 2013-01-29T23:32:28Z

Would updating the __repr__ to display the scale be enough, do you think?

The (u.km / u.m) != u.dimensionless is correct behavior, IMHO, but perhaps u.dimensionless should be renamed? Note that (u.km / u.m).is_equivalent(u.dimensionless) is True, which is also correct. Perhaps just making the __repr__ of u.dimensionless be Unit(1 dimensionless) will be sufficient to alleviate confusion.

embray · 2013-01-29T23:44:58Z

I agree that (u.km / u.m) != u.dimensionless is correct. But (u.km / u.m).decompose() != u.dimensionless is what confused me. The reason for it is because they both have different scales.

I think displaying the scale in the __repr__ would help.

eteq · 2013-01-29T23:53:28Z

Definitely adding the scale to __repr__ makes sense. I would say it's even fine to do Unit(dimensionless) if the scale is 1, but display the scale for all other cases.

Going that route, then leaving dimensionless with that name is fine. I think the natural assumption is that "dimensionless" also means "without scale"... in fact, I think the naturalness of that assumption is what causes the confusion about this in the first place!

I think it might make sense to add a paragraph or two in the docs about this, though. There's already a section called "The dimensionless unit", so stating there that dimensionless also means without scale would help, as would and including a few examples like the one I have at the top of this issue.

astrofrog · 2013-01-30T08:18:24Z

@mdboom - this is what confused me:

In [2]: (u.km / u.m).is_dimensionless()
Out[2]: True  # this is confusing
In [3]: (u.km / u.m) == u.dimensionless
Out[3]: False

to me, both are syntactically the same, so as a user, I don't understand why they are different.

mdboom · 2013-01-30T14:46:36Z

@astrofrog: Your confusion suggest to me that one or the other should probably be renamed. But to what? u.dimensionless_and_unscaled? (Not a really suggestion, given how verbose it is).

embray · 2013-01-30T16:24:42Z

What about is_unity?

mdboom · 2013-01-30T16:46:16Z

I would expect is_unity to check for both unscaled and dimensionless units, whereas the current behavior of is_dimensionless is only about the units.

astrofrog · 2013-01-30T17:40:06Z

What about having a variable u.none instead of u.dimensionless? (faster to type than u.dimensionless_and_unscaled.

eteq · 2013-01-30T19:44:03Z

I don't like none because it's too close to None.

This is a slightly more substantial suggestion, but seems less ambiguous to me:

leave dimensionless exactly as is
remove the is_dimensionless method
change UnitBase.physical_type to give 'dimensionless' instead of the current result, which is 'unknown' (this would be good to do regardless)
replace all uses of <something>.is_dimensionless() with <something>.physical_type == 'dimensionless'

That's really what we mean with is_dimensionless, right? That the type of the unit be dimensionless, regarless of the scale? Then the dimensionless object is dimensionless and scale free, but the @astrofrog's syntactic ambiguity with is_dimensionless goes away.

mdboom · 2013-01-31T01:57:23Z

@eteq: That's not a bad suggestion, but I wonder if it's wise to remove the fast path for the is_dimensionless check, as shown by @astrofrog's benchmarks in #669? I think we still need something to specifically serve that purpose, under some name or other.

I do agree that physical_type should be updated to do the right thing with dimensionless units in any event.

astrofrog · 2013-01-31T07:44:33Z

We could have a check is_unity() which checks that bases==[] and scale==1? (since that means it's the same as Unit(1). Then is_dimensionless() can stay the same?

astrofrog · 2013-01-31T17:16:17Z

This looks a bit confusing:

In [2]: (u.km / u.m).decompose()
Out[2]: Unit(1000.0 dimensionless)

I wonder whether for dimensionless, the output should be

In [2]: (u.km / u.m).decompose()
Out[2]: Unit(dimensionless with scale 1000.0)

while for any other units, what you have is fine, e.g.

In [3]: (u.km).decompose()
Out[3]: Unit("1.000000e+03 m")

mdboom · 2013-01-31T18:21:31Z

@astrofrog: That makes sense -- to put the scale at the end. I'll make that change -- and I guess you'll want to merge or cherry-pick the new commit.

astrofrog · 2013-01-31T18:34:06Z

docs/units/standard_units.rst

This example will need to be updated

I used to, before the last commit. I thought it was more helpful to print it than not.

Sorry, I meant - shouldn't it say

Unit(dimensionless with a scale of 1000.0)

?

astrofrog · 2013-01-31T23:55:19Z

Apart from my small comments above, this will be ready to go. I've re-tagged as 0.2 since it's needed fo #669 which we also want in 0.2.

eteq · 2013-02-01T06:16:18Z

I think we still want to get rid of is_dimensionless. I say this mainly because without doing this, it doesn't address the original problem that @astrofrog pointed out; that is_dimensionless and == u.dimensionless ( or possibly is u.dimensionless) don't mean the same thing.

I have a fairly comprehensive set of timings below (taken from this PR's branch). The only case where is_dimensionless is faster that .physical_type == 'dimensionless' is for an irreducible (or prefix) unit. That might be an important enough case that we need to work around it somehow, though. Perhaps the physical type should be cached?

In [42]: kmm = u.km/u.m

In [43]: mm = u.m/u.m



In [44]: kmm.is_dimensionless()
Out[44]: True

In [45]: timeit kmm.is_dimensionless()
10000 loops, best of 3: 47 us per loop

In [63]: kmm.physical_type == 'dimensionless'
Out[63]: True

In [48]: timeit kmm.physical_type == 'dimensionless'
10000 loops, best of 3: 53 us per loop

In [49]: kmm == u.dimensionless
Out[49]: False

In [50]: timeit kmm == u.dimensionless
1000 loops, best of 3: 164 us per loop

In [51]: kmm is u.dimensionless
Out[51]: False

In [52]: timeit kmm is u.dimensionless
10000000 loops, best of 3: 168 ns per loop



In [53]: mm.is_dimensionless()
Out[53]: True

In [54]: timeit mm.is_dimensionless()
100000 loops, best of 3: 9.14 us per loop

In [57]: mm.physical_type == 'dimensionless'
Out[57]: True

In [58]: timeit mm.physical_type == 'dimensionless'
100000 loops, best of 3: 16.5 us per loop

In [59]: mm == u.dimensionless
Out[59]: True

In [60]: timeit mm == u.dimensionless
10000 loops, best of 3: 157 us per loop

In [61]: mm is u.dimensionless
Out[61]: False

In [62]: timeit mm is u.dimensionless
1000000 loops, best of 3: 205 ns per loop

In [84]: timeit u.km.physical_type=='dimensionless'
10000 loops, best of 3: 45.6 us per loop

In [85]: timeit u.km.is_dimensionless()
1000000 loops, best of 3: 566 ns per loop

eteq · 2013-02-01T06:24:38Z

I did some further tests (which I won't copy here to not fill up the comments even further) that show that mm.physical_type == 'dimensionless' and mm.decompose().scale==1 and similar are within a factor of 2 (sometimes slower, sometimes faster) as is_unity (except for the irreducible/prefix case, which I assume is due to the physical_type thing I noted above).

So I'm not sure we even need is_unity; the above seems more explicit (which is good here, I think).

And even if you disagree and want to keep is_unity, I think the name needs to be changed. This class is one that I think users should be able to always just look at the code and know what's happening. The term "unity" is somewhat ambiguous, and I don't think I would grasp the fact that it has two pieces (dimensionlessness and no scale) if I hadn't been reading this PR.

astrofrog · 2013-02-01T07:19:44Z

@eteq - if we are going to use

mm.physical_type == 'dimensionless' and mm.decompose().scale==1

instead of

is_unity

then we could just as well use

mm.bases and mm.scale == 1

as it's more concise. But it'd be nice to have a convenience method, so I think we should still keep is_unity, though I'm fine with trying to find a better name. I'll think about it.

astrofrog · 2013-02-01T07:28:15Z

Looking into it more, I think the current implementation of is_unity is not exactly what I had in mind yet. For example, if I have (silly) units of mm * m / micron / km then in #669 I don't want to treat that as being dimensionless, but is_unity will return True - I really only want the cases where the unit is Unit(1) but without having to instantiate Unit(1). So I think the name is_unity is fine, but I think I'd rather have it test that bases is empty and scale is one. @eteq - do you agree the name would be ok with such an implementation?

In order to get #669 moving, I'll simply define a convenience function _is_unity() there for now, so that we don't have to rush this discussion for 0.2.0 if we don't reach a consensus.

astrofrog · 2013-02-01T07:59:36Z

I'm going to +1 @eteq's suggestion from earlier to remove is_dimensionless. I think all we really need is:

some_unit.physical_type == 'dimensionless'

which is testing that the overall type is not length, mass, etc. What I also wanted, but will not insist on a convenience method that is equivalent to

some_unit.bases == [] and some_unit.scale == 1.

and that is what I would call is_unity. However, I would also be happy with is_empty, or is_one for example. And if there's no consensus, we don't have to have a user convenience method for this.

astrofrog · 2013-02-01T09:49:54Z

Since #669 doesn't depend on this PR anymore, shall we re-tag it as 0.2.1 or 0.3? Or do we want to fix it for 0.2?

mdboom · 2013-02-01T13:55:55Z

@eteq: Those benchmarks are helpful, but keep in mind many are doing different things. u.is_dimensionless() and u.physical_type == 'dimensionless' are essentially equivalent, but the others are going to give different results, particularly because they aren't working with decomposed units. (And x is u.dimensionless is not useful here).

So I think we need to first decide what we're trying to achieve, then we can decide what to call it, and then optimize if necessary.

@astrofrog: I don't understand your rationale for not wanting to decompose the unit first before checking whether it is unity. I think it's very unlikely that the user would create a unity quantity directly (which is the only case under which what you suggest would work). It's much more likely that the user would do something like:

Quantity(5, u.m) / Quantity(2, u.m)

and this should be allowed to implicitly convert to a float in the context of what we're discussing for #669.

I think some_unit.bases == [] and some_unit.scale == 1 is just wrong for what we want there -- I know that's what I originally suggested -- but the unit really should be decomposed first. And I think we should have some sort of convenience function for this purpose (even if it's private) as this PR illustrates how difficult getting it right can be.

mdboom · 2013-02-01T13:59:27Z

Also, I think I'm fine with removing is_dimensionless and steering people to using physical_type == 'dimensionless' in the docs.

astrofrog · 2013-02-01T14:07:11Z

@mdboom - regarding #669, I've replied directly there. I think removing is_dimensionless is the right thing to do.

Regarding is_unity, I really can't think of a better name - that is exactly what it's checking - that the unit is basically 1. I agree is_dimensionless was ambiguous, but I don't think users will expect m / km for example to pass is_unity. It's got a sufficiently unusual name that they can look at the docs (whereas is_dimensionless was more dangerous in that respect).

astrofrog · 2013-02-01T14:10:32Z

@iguananaut - do you have any opinions about is_unity?

astrofrog · 2013-02-01T14:15:26Z

@mdboom - by the way, I think my definition of unity did cater for cases such as:

Quantity(5, u.m) / Quantity(2, u.m)

because:

In [9]: (u.m / u.m).bases
Out[9]: []

In [10]: (u.m / u.m).scale
Out[10]: 1

But I've updated #669 to bring is_unity in line with this PR.

mdboom · 2013-02-01T14:26:53Z

Yes, but here's the problem (and come to think of it, we should probably add a test for this):

>>> x = (u.Unit(1000.0 * u.m) / u.km)
>>> x.bases
[Unit("m"), Unit("km")]
>>> x.scale
1000.0
>>> y = x.decompose()
>>> y.bases
[]
>>> y.scale
1.0

astrofrog · 2013-02-01T14:38:10Z

Ah, I see - ok, so now I see that this makes sense!

astrofrog · 2013-02-04T21:07:41Z

Well, I think u.dimensionless would be better as u.unity - @eteq - do you also oppose that name, since it would no longer be a method, so users would likely need to read the docs to find out about it? (u.<tab> isn't very helpful)

eteq · 2013-02-04T21:43:15Z

@astrofrog - In my previous posts I was actually referring to the dimensionless object, not the (current) is_unity() method. In the inline comment I was suggesting ditching is_unity (or whatever it's named), or making it private on the grounds that it appeared to me (and @astrofrog) that in the current version, u.m / u.m == u.dimensionless seemed identical to is_unity(u.m / u.m). So that's why I was only talking about the attribute. Sorry that wasn't very clear.

With that in mind, yes, I think u.unity is still ambiguous. In my head, unity is basically another word for "1" (as long as we're not in group theory...), so u.unit is the same as u.Unit(1). And that's exactly what the whole reason was for creating the dimensionless object; Unit(1) had different interpretations to different people.

I see @mdboom's point on scale_and_dimensionless, though. I'd be ok with unscaled_and_dimensionless. Or dimensionless_unscaled/unscaled_dimensionless (slightly shorter, still gets the point across).

It doesn't really bother me that much that this is a long name. If it starts with dimensionless..., u.dim<tab> gets me what I want in IPython, and when writing code meant for others' consumption it's still less work than the comment I would inevitably have to write to explain what I mean by either dimensionless being unscaled or the fact that Unit(1) means dimensionless.

Sorry to be a pain, but I think this carries quite a bit of potential for confusion because most other aspects of unit are so easy to use... And this is just an inherently complicated case, it seems.

astrofrog · 2013-02-04T21:49:59Z

I agree we can do u.dimensionless_and_unscaled. It is long, but hopefully doesn't need to be used much, and if it does, it will be unambiguous. I think in this case, clarity trumps conciseness. If in future we think of how to say it more concisely, we can always add a shorter alias. @mdboom - what do you think?

mdboom · 2013-02-04T22:42:03Z

I think we're reaching consensus here. is_unity() goes away. u.dimensionless is changed to dimensionless_unscaled (the and isn't strictly necessary to understand it, as @eteq points out).

astrofrog · 2013-02-04T22:48:45Z

Ok, that sounds good to me. Then, just for information, should one do my_unit.decompose() == u.dimensionless_unscaled to do what is_unity currently does?

eteq · 2013-02-04T22:50:03Z

👍 to @mdboom's plan here.

mdboom · 2013-02-04T22:59:48Z

@astrofrog: Yes.

astrofrog · 2013-02-04T23:02:27Z

Great - let's go with that!

eteq · 2013-02-05T00:22:06Z

The travis build passed without complaint. Looks good to me!

Should we get rid of the _is_unity from #669, or leave it in? (either here or a later PR)

astrofrog · 2013-02-05T09:43:47Z

@eteq - I think we can just fix it in a separate PR in future - it's a private method, so no rush.

astrofrog · 2013-02-05T09:44:23Z

Looks good to me too!

Dimensionless unit looks the same regardless of scale

embray · 2013-02-06T18:43:07Z

I've completely lost track of this at this point--but this should be backported for 0.2 final right?

mdboom · 2013-02-06T19:26:00Z

I would think so. It's pretty low risk.

astrofrog · 2013-02-06T22:20:24Z

I agree, it should be backported.

Dimensionless unit looks the same regardless of scale

eteq mentioned this pull request Jan 29, 2013

Conceptual issue with float conversion when quantity is dimensionless #669

Merged

astrofrog reviewed Jan 31, 2013
View reviewed changes

mdboom added 10 commits February 4, 2013 18:07

Add the scale to the __repr__ of a dimensionless unit.

4c3e2d7

Add dimensionless support to Unit.physical_type

d6569b9

Add is_unity method

ed70d2b

Put scale at end, aka Unit('dimensionless with a scale of 1000.0')

2805e04

Update docs to reflect new __repr__

d245c3c

Remove is_dimensionless

e1fcaa3

Add tests for is_unity

ca227d5

Remove reference to is_dimensionless in is_unity docstring

7160411

Latest round of renaming

f5b3d38

Privatize dimensionless constant

f005789

mdboom added a commit that referenced this pull request Feb 5, 2013

Merge pull request #701 from mdboom/unit/dimensionless_repr

4a9044e

Dimensionless unit looks the same regardless of scale

mdboom merged commit 4a9044e into astropy:master Feb 5, 2013

mdboom added a commit that referenced this pull request Feb 8, 2013

Merge pull request #701 from mdboom/unit/dimensionless_repr

61b90ee

Dimensionless unit looks the same regardless of scale

keflavich pushed a commit to keflavich/astropy that referenced this pull request Oct 9, 2013

Merge pull request astropy#701 from mdboom/unit/dimensionless_repr

b66229d

Dimensionless unit looks the same regardless of scale

mdboom deleted the unit/dimensionless_repr branch May 21, 2014 23:57

Uh oh!

Dimensionless unit looks the same regardless of scale #701

Dimensionless unit looks the same regardless of scale #701

Uh oh!

Conversation

mdboom commented Jan 30, 2013

Uh oh!

mdboom commented Jan 29, 2013

Uh oh!

embray commented Jan 29, 2013

Uh oh!

eteq commented Jan 29, 2013

Uh oh!

astrofrog commented Jan 30, 2013

Uh oh!

mdboom commented Jan 30, 2013

Uh oh!

embray commented Jan 30, 2013

Uh oh!

mdboom commented Jan 30, 2013

Uh oh!

astrofrog commented Jan 30, 2013

Uh oh!

eteq commented Jan 30, 2013

Uh oh!

mdboom commented Jan 31, 2013

Uh oh!

astrofrog commented Jan 31, 2013

Uh oh!

astrofrog commented Jan 31, 2013

Uh oh!

mdboom commented Jan 31, 2013

Uh oh!

astrofrog Jan 31, 2013

Choose a reason for hiding this comment

Uh oh!

mdboom Jan 31, 2013

Choose a reason for hiding this comment

Uh oh!

astrofrog Jan 31, 2013

Choose a reason for hiding this comment

Uh oh!

astrofrog commented Jan 31, 2013

Uh oh!

eteq commented Feb 1, 2013

Uh oh!

eteq commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

mdboom commented Feb 1, 2013

Uh oh!

mdboom commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

mdboom commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 1, 2013

Uh oh!

astrofrog commented Feb 4, 2013

Uh oh!

eteq commented Feb 4, 2013

Uh oh!

astrofrog commented Feb 4, 2013

Uh oh!

mdboom commented Feb 4, 2013

Uh oh!

astrofrog commented Feb 4, 2013

Uh oh!