Skip to content

Conversation

@dmopalmer
Copy link
Contributor

The cache is invalidated and updated when a frame is added to the frame_transform graph. This doubles the speed of creating a new SkyCoord and speeds getattr and some other operations by more than an order of magnitude.

Fixes #5022 (based on discussion in its duplicate #5698).

are registered in the frame_transform_graph).
if self._cached_frame_attrnames_set is None:
result = set()
for frame_cls in self.frame_set:
for attr in frame_cls.get_frame_attr_names().keys():
Copy link
Contributor

@MSeifert04 MSeifert04 Jan 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just result.update(frame_cls.get_frame_attr_names()) instead of the manual loop with result.add?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should work (more precisely result.update(frame_cls.get_frame_attr_names().keys()) ). but I was relocating known-good code rather than rewriting.

Since this is typically called only once over the life of a program, the performance hit is negligible, although .update does improve readability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure but is frame_cls.get_frame_attr_names() a dictionary? If so then result.update(frame_cls.get_frame_attr_names()) can also avoid re-hashing the "keys". On the other hand .keys() creates a list-like-iterator where set.update needs to hash the values again.

If it's not a dictionary then: Forget what I was saying above 😄

But yes, keep it as-is, there is no actual (performance) reason to change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class method get_frame_attr_names() doesn't get the frame attr names, instead it builds a dictionary of the names and attributes: going through the list of attribute names and getattr'ing each one.

The documentation says that it returns the names of attributes and their default values (which is not what you would expect from the method name).

HOWEVER: almost everywhere it is used in astropy, it is used as a list/set of attribute names. There may be worth in defining get_frame_attr_nameset() that just returns a set of names, instead of an OrderedDict of names + values.

@mhvk
Copy link
Contributor

mhvk commented Jan 16, 2017

This looks good to me. I'm happy with merging as is, or using update. In principle, if going for the latter, I'd be tempted to do it as well for other properties (such as frameset), and use a frozenset there too, since that makes it unnecessary to copy the output... But my guess is that that is better done in a separate PR...

What should be done, though, is to add an entry in the change log (under "Other changes and additions") since this is quite a remarkable improvement in speed. I guess it will have to be under 2.0, since this does not fix a bug...

@pllim
Copy link
Member

pllim commented Jan 17, 2017

Maybe @eteq wants to have a look before merge?

Anyone did a timing test on this? If the speed up is significant, would be nice to advertise in the appropriate change log.

@pllim
Copy link
Member

pllim commented Jan 17, 2017

Ah, nevermind, I found the timing info here. IMHO, change log would be nice but I'll let subpackage maintainer decide.

Update: Found more timing info here.

@mhvk
Copy link
Contributor

mhvk commented Jan 29, 2017

Note that I included this in #5751.

@bsipocz
Copy link
Member

bsipocz commented Jan 31, 2017

@mhvk @eteq - This is milestoned as 2.0 while PRs building on this, e.g. #5751 is as 1.3.1. Would it be possible to resolve this milestone conflict before merging, either way, so it won't cause headache when doing a bugfix release?

@mhvk
Copy link
Contributor

mhvk commented Jan 31, 2017

If we do #5751, this should be closed as I included the relevant commit in it (with some small changes to take account of a rebase, but still ensuring the commit credits @dmopalmer).

@dmopalmer
Copy link
Contributor Author

dmopalmer commented Feb 27, 2017

This branch has been updated because I needed the changes to do actual work. (Processing time went from 97.1 s to 5.9 s). I probably should have made a branch off the branch, but I didn't, so if #5751 is dropped and you go back to this PR, you may have to work off a previous commit.

FAILURE: it sped up so much because it started raising exceptions instead of actually doing the work. With the latest fix, my program works and takes 1/4 the time.

@bsipocz
Copy link
Member

bsipocz commented Mar 22, 2017

Should this be closed now that #5751 is merged?

@mhvk
Copy link
Contributor

mhvk commented Mar 22, 2017

Yes, indeed. closing. Thanks again, @dmopalmer, for identifying that horrible waste of computing time!

@mhvk mhvk closed this Mar 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants