-
-
Notifications
You must be signed in to change notification settings - Fork 2k
Make Unit and Angle string parsing thread-safe #11227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The lexer and parser generation is all collected together in one place (a new astropy.utils.parsing module), with the side benefit that a lot of code duplication is eliminated. This new shared code uses appropriate locking to ensure that both parser creation and use is thread-safe. Fixes astropy#11225 (when combined with astropy#11224).
|
I haven't added a new module before, so I'm not sure if there are any associated steps that need to be done e.g. to hook it in to the documentation. I'm also not sure if it should be mentioned under "API Changes" in the changelog, given that it is new API albeit not really intended to external consumption. |
mhvk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bmerry - this looks fantastic, thanks! The locking seems simple and good and I really like the cleanup! Only a few small comments inside.
On the larger one: for most stuff in utils, we keep the information to the minimum but we do expose it, by adding an .. automodapi:: ... to docs/utils/index.rst. There is a general warning in that file that utils is mostly for our own use.
For now, I'll put the milestone at 4.3 - am not sure what the thought would be about backporting to the LTS release (or whether that's needed since the files produced are shipped with astropy).
|
The code coverage I think cannot easily be helped, since the parser code does not get re-generated in testing. |
- Add utils.parsing package to documentation, and list it in CHANGES.rst as new API. - Make TAB_HEADER and ThreadSafeParser public. - Some rewording of docstrings. - Use double-backticks to refer to function arguments (otherwise Sphinx was printing warnings) and to refer to undocumented PLY functions.
One could write a test for this module that deleted and regenerated a dummy lexer+parser. It would end up making modification in the installation tree, which feels a bit icky to me. Let me know if you'd like me to have a go at that. |
|
Agreed that changing the installation during testing is not worth it. In any case, for this PR probably best to stick to your (very nice) reorganization. |
bsipocz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, and quite like that all those duplicates are gone and condensed into their separate module.
Some minor nitpicks, and maybe a test to cover the file generation, but none of them are big deal so I'm happy to see this merged without those.
|
As for backporting to LTS: I'm a bit worried to backport this much (not this PR, but generally all the threadsafe fixes) to LST. My intuitions may be wrong here, but I would rather not risk changing behaviour in for LTS. |
|
@bsipocz - yes, I also worried this was getting too much, but wanted your thoughts. Let's stick with 4.3, as a nice new feature! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! "Allowed failure" failures are expected and unrelated to this PR.
I also agree with @bsipocz that a test should be added to cover the missing coverage in parsing.py, specifically the _add_tab_header function. A simple test using tmpdir and a dummy template generated on-the-fly in tmpdir should do, I think? As for testing lex and yacc, I am not sure how easy it would be because it invokes __file__ and all that, so that one needs some thinking.
My review does not address the internals of the parsing logic, which I defer to @mhvk 's expertise.
CHANGES.rst
Outdated
|
|
||
| - Removed usage of deprecated ``ipython`` stream in ``utils.console``. [#10942] | ||
|
|
||
| - Add new ``utils.parsing`` module to with helper wrappers around ``ply``. [#11227] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is better suited for "New Features" section, rather than "API change".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| # You can then commit the changes to this file. | ||
| """ | ||
| import astropy.utils.parsing as parsing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: This is more consistent with the style of the existing imports.
| import astropy.utils.parsing as parsing | |
| from astropy.utils import parsing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (I merged it into the same line as another from astropy.utils import ...).
astropy/units/format/cds.py
Outdated
| from astropy.units.utils import is_effectively_unity | ||
| from astropy.utils import classproperty | ||
| from astropy.utils.misc import did_you_mean | ||
| import astropy.utils.parsing as parsing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| import astropy.utils.parsing as parsing | |
| from astropy.utils import parsing |
astropy/units/format/generic.py
Outdated
| from .base import Base | ||
| from astropy.utils import classproperty | ||
| from astropy.utils.misc import did_you_mean | ||
| import astropy.utils.parsing as parsing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| import astropy.utils.parsing as parsing | |
| from astropy.utils import parsing |
astropy/units/format/ogip.py
Outdated
|
|
||
| from . import core, generic, utils | ||
|
|
||
| import astropy.utils.parsing as parsing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| import astropy.utils.parsing as parsing | |
| from astropy.utils import parsing |
Use from `astropy.utils import parsing` and not `import astropy.utils.parsing as parsing`.
It covers the currently-public API. There is still discussion over whether TAB_HEADER and/or _add_tab_header should be public.
Most of the coverage comes from the main test suite, but this covers the case where the lextab/parsetab does not yet exist and has to be generated.
I added a test that writes Python modules with a trivial lexer and parser to a temporary directory, then imported them to generate table files. It turned out to be reasonably straightforward. |
|
I think this is ready for another look; the one question left in my mind is whether TAB_HEADER and/or add_tab_header should be public or private. |
|
The 32-bit error seems related... |
mhvk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! The only thing left is to remove the HEADER from __all__ again - you convinced me it was better.
astropy/utils/parsing.py
Outdated
| import threading | ||
|
|
||
|
|
||
| __all__ = ['lex', 'TAB_HEADER', 'ThreadSafeParser', 'yacc'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @mhvk wants TAB_HEADER removed as per comment above?
| __all__ = ['lex', 'TAB_HEADER', 'ThreadSafeParser', 'yacc'] | |
| __all__ = ['lex', 'ThreadSafeParser', 'yacc'] |
Done (I also put the underscore back). |
Indeed. My guess is that whatever parallelisation is being used doesn't play nicely with |
Actually it's not the parallel run, it's the one where it runs the tests twice in the same process. I'm guessing the modules that are dynamically loaded in the test persist in sys.modules. I'll have to find a workaround for that - possibly monkey-patching sys.modules, or else generating a unique name. |
PLY has some restrictions about only defining one parser per module (no doubt related to whatever deep magic it's using), so generate a unique module name each time the test is run.
|
The 32-bit parallel build is passing now. readthedocs is failing, but I think that's because numpy's docs have vanished. |
|
Great! Thanks for fixing that double-test problem. Merging... |
|
Just to close the loop, RTD failure was transient, as it built fine after merge. Thanks again! |
Description
The lexer and parser generation is all collected together in one place
(a new astropy.utils.parsing module), with the side benefit that a lot
of code duplication is eliminated. This new shared code uses appropriate
locking to ensure that both parser creation and use is thread-safe.
Fixes #11225 (when combined with #11224).