Skip to content

Conversation

@taldcroft
Copy link
Member

This implements caching to resolve #4309. The footprint got a little bigger than I had hoped, in part because I expanded the scope to include format caching as well. Because if this I want to get some review before going ahead with doc updates.

The changes in __add__ and __sub__ highlight the subtle problems that can come along with caching. Basically you need to be careful when messing around with _time internals.

Parts of the precision testing were taken out because they relied on changing the internal time object state in a way that is not allowed in normal use.

@taldcroft
Copy link
Member Author

FYI:

In [9]: t = Time(np.arange(1e7), format='cxcsec')

In [10]: time t.unix
CPU times: user 6.2 s, sys: 70.8 ms, total: 6.28 s
Wall time: 6.27 s
Out[10]: 
array([  8.83612737e+08,   8.83612738e+08,   8.83612739e+08, ...,
         8.93612734e+08,   8.93612735e+08,   8.93612736e+08])

In [11]: time t.unix
CPU times: user 19 µs, sys: 0 ns, total: 19 µs
Wall time: 24.1 µs
Out[11]: 
array([  8.83612737e+08,   8.83612738e+08,   8.83612739e+08, ...,
         8.93612734e+08,   8.93612735e+08,   8.93612736e+08])

In [12]: time t.tai
CPU times: user 114 ms, sys: 6.31 ms, total: 121 ms
Wall time: 119 ms
Out[12]: 
<Time object: scale='tai' format='cxcsec' value=[  0.00000000e+00   1.00000000e+00   2.00000000e+00 ...,   9.99999700e+06
   9.99999800e+06   9.99999900e+06]>

In [13]: time t.tai
CPU times: user 26 µs, sys: 0 ns, total: 26 µs
Wall time: 29.8 µs
Out[13]: 
<Time object: scale='tai' format='cxcsec' value=[  0.00000000e+00   1.00000000e+00   2.00000000e+00 ...,   9.99999700e+06
   9.99999800e+06   9.99999900e+06]>

In [14]: t.cache
Out[14]: 
defaultdict(<type 'dict'>, {u'scale': {'tai': <Time object: scale='tai' format='cxcsec' value=[  0.00000000e+00   1.00000000e+00   2.00000000e+00 ...,   9.99999700e+06
   9.99999800e+06   9.99999900e+06]>}, u'format': {'unix': array([  8.83612737e+08,   8.83612738e+08,   8.83612739e+08, ...,
         8.93612734e+08,   8.93612735e+08,   8.93612736e+08])}})

In [15]: t2 = Time(t)

In [16]: t2.cache
Out[16]: defaultdict(<type 'dict'>, {})

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replicate here is not needed any more, since it is done above (a leftover?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change (and the moving up of out = self.replicate()) has me confused: why is it necessary, given that changing scale already implies a replicate?

    def __getattr__(self, attr):
        if attr in self.SCALES and self.scale is not None:
            tm = self.replicate()
            tm._set_scale(attr)
            return tm

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the subtle issue that I referred to earlier, which took me a bit to track down. When you do out = getattr(self, 'tai') here then out references the same object which gets cached. Later on when out._time.jd1 (and jd2) gets set, this of course updates the cached object, so the next time you do getattr(self, 'tai') you get the arithmetic result!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And the new version of __getattr__ doesn't replicate if the scale doesn't change.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And just to make sure I wasn't going crazy, I reverted to the original version for this code and replicated the original test failure that drove this change:

    def test_add_vector(self):
        """Check time arithmetic as well as properly keeping track of whether
            a time is a scalar or a vector"""
        t = Time(0.0, format='mjd', scale='utc')
        t2 = Time([0.0, 1.0], format='mjd', scale='utc')
        dt = TimeDelta(100.0, format='jd')
        dt2 = TimeDelta([100.0, 200.0], format='jd')

        out = t + dt
        assert allclose_jd(out.mjd, 100.0)
        assert out.isscalar

        out = t + dt2
>       assert allclose_jd(out.mjd, [100.0, 200.0])
E       assert allclose_jd(100.0, [100.0, 200.0])
E        +  where 100.0 = <Time object: scale='utc' format='mjd' value=100.0>.mjd

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can see that this took some time to track down! Maybe in this case the code would be clearer if instead of using getattr we just did out._set_scale(...)?

Quick PR to your branch forthcoming...

@taldcroft
Copy link
Member Author

@mhvk - what do you think of fa0ce53?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this was already here, but why isinstance(val, self.__class__) and not just isinstance(val, Time)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, it should be Time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 4eee963. (Skipped CI tests because there are more changes in this branch coming).

@mhvk
Copy link
Contributor

mhvk commented Jan 4, 2016

@taldcroft - sorry for the delay in further review -- I do like having precision, etc., on the time format only; much cleaner!

See my PR for two small improvements (I think), and the comment above about how to have defaults defined. With that, this would seem ready to go in (although like @embray I noticed that at least core.py could use some cleanup; e.g., it still defines _astropy_column_attrs; anyway, for another PR...)

@mhvk
Copy link
Contributor

mhvk commented Jan 4, 2016

@taldcroft - I think this all looks good as is, but wonder whether there should be something in the release notes? And possibly even in the documentation? After all, it is a new public property...

@taldcroft
Copy link
Member Author

@mhvk - yes, I was planning some narrative docs, see 51f3b00. Perhaps release notes can wait until closer to the release.

@mhvk
Copy link
Contributor

mhvk commented Jan 6, 2016

@taldcroft - OK, I like the addition to the documentation, and meant the change log, not the release notes, so this seems all OK.

mhvk added a commit that referenced this pull request Jan 6, 2016
Implement caching for Time scale and format
@mhvk mhvk merged commit 0d71f89 into astropy:master Jan 6, 2016
@taldcroft taldcroft deleted the time-cache branch February 25, 2019 20:23
@mhvk mhvk mentioned this pull request Oct 22, 2023
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve efficiency of getattr(tm, scale), particularly access to jd1/jd2

3 participants