Implement caching for Time scale and format #4422

taldcroft · 2015-12-29T18:33:24Z

This implements caching to resolve #4309. The footprint got a little bigger than I had hoped, in part because I expanded the scope to include format caching as well. Because if this I want to get some review before going ahead with doc updates.

The changes in __add__ and __sub__ highlight the subtle problems that can come along with caching. Basically you need to be careful when messing around with _time internals.

Parts of the precision testing were taken out because they relied on changing the internal time object state in a way that is not allowed in normal use.

taldcroft · 2015-12-29T21:15:41Z

FYI:

In [9]: t = Time(np.arange(1e7), format='cxcsec')

In [10]: time t.unix
CPU times: user 6.2 s, sys: 70.8 ms, total: 6.28 s
Wall time: 6.27 s
Out[10]: 
array([  8.83612737e+08,   8.83612738e+08,   8.83612739e+08, ...,
         8.93612734e+08,   8.93612735e+08,   8.93612736e+08])

In [11]: time t.unix
CPU times: user 19 µs, sys: 0 ns, total: 19 µs
Wall time: 24.1 µs
Out[11]: 
array([  8.83612737e+08,   8.83612738e+08,   8.83612739e+08, ...,
         8.93612734e+08,   8.93612735e+08,   8.93612736e+08])

In [12]: time t.tai
CPU times: user 114 ms, sys: 6.31 ms, total: 121 ms
Wall time: 119 ms
Out[12]: 
<Time object: scale='tai' format='cxcsec' value=[  0.00000000e+00   1.00000000e+00   2.00000000e+00 ...,   9.99999700e+06
   9.99999800e+06   9.99999900e+06]>

In [13]: time t.tai
CPU times: user 26 µs, sys: 0 ns, total: 26 µs
Wall time: 29.8 µs
Out[13]: 
<Time object: scale='tai' format='cxcsec' value=[  0.00000000e+00   1.00000000e+00   2.00000000e+00 ...,   9.99999700e+06
   9.99999800e+06   9.99999900e+06]>

In [14]: t.cache
Out[14]: 
defaultdict(<type 'dict'>, {u'scale': {'tai': <Time object: scale='tai' format='cxcsec' value=[  0.00000000e+00   1.00000000e+00   2.00000000e+00 ...,   9.99999700e+06
   9.99999800e+06   9.99999900e+06]>}, u'format': {'unix': array([  8.83612737e+08,   8.83612738e+08,   8.83612739e+08, ...,
         8.93612734e+08,   8.93612735e+08,   8.93612736e+08])}})

In [15]: t2 = Time(t)

In [16]: t2.cache
Out[16]: defaultdict(<type 'dict'>, {})

mhvk · 2015-12-30T15:27:26Z

astropy/time/core.py

The replicate here is not needed any more, since it is done above (a leftover?)

mhvk · 2015-12-30T22:07:33Z

astropy/time/core.py

This change (and the moving up of out = self.replicate()) has me confused: why is it necessary, given that changing scale already implies a replicate?

def __getattr__(self, attr): if attr in self.SCALES and self.scale is not None: tm = self.replicate() tm._set_scale(attr) return tm

This was the subtle issue that I referred to earlier, which took me a bit to track down. When you do out = getattr(self, 'tai') here then out references the same object which gets cached. Later on when out._time.jd1 (and jd2) gets set, this of course updates the cached object, so the next time you do getattr(self, 'tai') you get the arithmetic result!

(And the new version of __getattr__ doesn't replicate if the scale doesn't change.)

And just to make sure I wasn't going crazy, I reverted to the original version for this code and replicated the original test failure that drove this change:

def test_add_vector(self): """Check time arithmetic as well as properly keeping track of whether a time is a scalar or a vector""" t = Time(0.0, format='mjd', scale='utc') t2 = Time([0.0, 1.0], format='mjd', scale='utc') dt = TimeDelta(100.0, format='jd') dt2 = TimeDelta([100.0, 200.0], format='jd') out = t + dt assert allclose_jd(out.mjd, 100.0) assert out.isscalar out = t + dt2 > assert allclose_jd(out.mjd, [100.0, 200.0]) E assert allclose_jd(100.0, [100.0, 200.0]) E + where 100.0 = <Time object: scale='utc' format='mjd' value=100.0>.mjd

Yes, I can see that this took some time to track down! Maybe in this case the code would be clearer if instead of using getattr we just did out._set_scale(...)?

Quick PR to your branch forthcoming...

… ci]

taldcroft · 2015-12-30T23:34:10Z

@mhvk - what do you think of fa0ce53?

embray · 2016-01-04T16:48:26Z

astropy/time/core.py

I realize this was already here, but why isinstance(val, self.__class__) and not just isinstance(val, Time)?

Good catch, it should be Time.

Fixed in 4eee963. (Skipped CI tests because there are more changes in this branch coming).

mhvk · 2016-01-04T18:29:36Z

@taldcroft - sorry for the delay in further review -- I do like having precision, etc., on the time format only; much cleaner!

See my PR for two small improvements (I think), and the comment above about how to have defaults defined. With that, this would seem ready to go in (although like @embray I noticed that at least core.py could use some cleanup; e.g., it still defines _astropy_column_attrs; anyway, for another PR...)

Lazyproperty for cache / easier scale setting in __add__, __sub__

mhvk · 2016-01-04T22:32:01Z

@taldcroft - I think this all looks good as is, but wonder whether there should be something in the release notes? And possibly even in the documentation? After all, it is a new public property...

taldcroft · 2016-01-05T00:15:34Z

@mhvk - yes, I was planning some narrative docs, see 51f3b00. Perhaps release notes can wait until closer to the release.

mhvk · 2016-01-06T01:43:41Z

@taldcroft - OK, I like the addition to the documentation, and meant the change log, not the release notes, so this seems all OK.

Implement caching for Time scale and format

Implement caching for Time scale and format

b3da852

taldcroft added time Affects-release Performance Enhancement labels Dec 29, 2015

taldcroft added this to the v1.2.0 milestone Dec 29, 2015

taldcroft assigned mhvk Dec 29, 2015

mhvk reviewed Dec 30, 2015
View reviewed changes

astropy/time/core.py Outdated

Copy link

Contributor

mhvk Dec 30, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replicate here is not needed any more, since it is done above (a leftover?)

Implement suggestions from @mhvk

f54382d

mhvk reviewed Dec 30, 2015
View reviewed changes

Make precision, in_subfmt, out_subfmt attrs reflect _time attrs [skip…

fa0ce53

… ci]

embray reviewed Jan 4, 2016
View reviewed changes

mhvk added 2 commits January 4, 2016 13:09

User lazyproperty for cache.

c5adc31

Use _set_scale in __add__ and __sub__

a9f825c

taldcroft added 3 commits January 4, 2016 14:48

Allow init of Time subclass with a base class instance [skip ci]

4eee963

Merge pull request #13 from mhvk/pr4422

15329ee

Lazyproperty for cache / easier scale setting in __add__, __sub__

Update CHANGES.rst

768a0d3

Update time docs for caching

51f3b00

mhvk added a commit that referenced this pull request Jan 6, 2016

Merge pull request #4422 from taldcroft/time-cache

0d71f89

Implement caching for Time scale and format

mhvk merged commit 0d71f89 into astropy:master Jan 6, 2016

mhvk mentioned this pull request Jan 12, 2016

lazyproperty: Documentation misleading #4471

Closed

taldcroft deleted the time-cache branch February 25, 2019 20:23

mhvk mentioned this pull request Oct 22, 2023

Using Masked inside Time #15231

Merged

4 tasks

Uh oh!

Implement caching for Time scale and format #4422

Implement caching for Time scale and format #4422

Uh oh!

Conversation

taldcroft commented Dec 29, 2015

Uh oh!

taldcroft commented Dec 29, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

taldcroft commented Dec 30, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mhvk commented Jan 4, 2016

Uh oh!

mhvk commented Jan 4, 2016

Uh oh!

taldcroft commented Jan 5, 2016

Uh oh!

mhvk commented Jan 6, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants