Ensure registration of atexit function is done only once#10753
Ensure registration of atexit function is done only once#10753vepadulano wants to merge 1 commit intoroot-project:masterfrom
Conversation
Before this commit, the following triggered two calls to TROOT::EndOfProcessCleanups: python -c "import ROOT; ROOT.TH1F" This happened because the TApplication constructor registered a function that calls EndOfProcessCleanups with the C `atexit` and PyROOT registered another similar function with the Python atexit module in `__init__.py`. The Python atexit happened before the C atexit. This commit introduces a static function in TApplication that registers the call to EndOfProcessCleanups with the C `atexit` and this is guaranteed to be run only once through the use of std::call_once. This new function can be called both from PyROOT and the TApplication constructor, to make sure it will be registered by either of them. Furthermore, calling this from PyROOT means that now EndOfProcessCleanups is always called at the very end of the program, seemingly a more sensible choice.
|
Starting build on |
|
Build failed on ROOT-ubuntu16/nortcxxmod. Errors:
|
|
With the following diff And a build of current master, I get Whereas with a build of this PR I get |
|
This pull request introduces 1 alert when merging 697a0b0 into 867fe54 - view on LGTM.com new alerts:
|
|
Thanks Vincenzo! I think the most important change in this PR is that PyROOT does not register the I am not sure we need the combo With this said I'm the least qualified person to decide on whether we want to go with this or not, I'll leave it to the other reviewers. |
|
Build failed on ROOT-performance-centos8-multicore/cxx17. Failing tests: |
|
It seems the order of execution is now 'wrong' in the PyROOT cases, i.e. do we still implement the behavior described in the comment: See also 7a592f5 for some description of a challenging use case. |
It is unclear to me what is the weight of the behaviour described in the comment and in the commit message w.r.t. the logic of what we need to do at atexit time. From my understanding, the current behaviour (before this PR) is:
This logic seems faulty to me. I ran the reproducer of the original issue linked in the commit you linked and it runs smoothly after the changes of this PR. Copying here the reproducer for reference: import ROOT
def create():
f = ROOT.TFile("file.root", "RECREATE")
t = ROOT.TTree("tree", "")
return f, t
f, t = create()Admittedly, I have only tried with Python 3.10, I will see if I can also try with other Python 3 minor versions. But irrespective of this, I don't think |
|
A priori a 2nd running of EndOfProcessCleanups should be 'harmless' (i.e. what ever list of things to delete would be empty on the 2nd run). The issue here (well there #10743) does not seems to be the 2nd running of EndOfProcessCleanups but rather that there is a bad interaction between the 'early' running of the cleanups and that something in RDataFrame is not properly informed that something is delete early. The early cleanups is necessary because of the following issue.
Then the order of execution at tear down will be (1) tear down python During (5) the object attached the TFile (usually just histogram and However if the libraries for the user objects are already unloaded, their code (including a Streamer function or even their dictionary) are gone and this result in crash at best or worse in data loss. In order to prevent that we introduced (in 7a592f5 and friends), running the More likely that not this means than rather than removing early this call to |
|
Ok I understand a bit better the original motivation for the two calls to |
Well, maybe, maybe not. If it is currently not a true no-op then it might be either
It must be called (by python) during point (1) (which maybe what you meant) and RDF should not be doing anything after (5) so it should already not bother RDF. But maybe I am missing something. I.e. is the second call to |
|
Just to clarify, the second call is totally fine and the first call is creating a problem for RDF because of #10742 . Even then, it seemed weird that calling If Python's or PyROOT's teardown sequence unloads libraries that might be needed by EndOfProcessCleanups then I agree with Vincenzo that maybe the best we can do is run it once before Python's teardown (maybe with a comment that explains the issue, if the scenario in the commit message of 7a592f5 does not really need the early EndOfProcessCleanups call as Vincenzo mentioned). |
Unless something in the handling of the python handles changed since the commit, then it is needed.
The "once before ...." is more or so because there is no 'better' place to call it.
Indeed (this is scenario similar to the (broken) ones I tried to describe). Getting the order of destruction right is very challenging (way more corner cases than you would expect or want) and unless there is a strong reason (and #10742 is not a good reason, that bug leads to a broken state which leads to broken behavior :) ), I would leave this code as-is. |
Before and after this PR, the reproducer of 7a592f5 works, so it means that currently we have no coverage for the edge cases that it was covering. |
Did you run it under valgrind? (it could 'just' appear to work) |
|
... but even if it did come clean it would not necessarily prove that it is not needed. Order of tear down is notoriously fickled and may depends on many variable (version of python, compilation options, general order of operation). I.e. unless we unearth the actual commits and/or changes that make this step obsolete, I would not touch it. |
|
Closing this PR, the only thing needed is probably more test coverage to better understand the need for the current approach. I opened a bug issue for that |
Before this commit, the following triggered two calls to
TROOT::EndOfProcessCleanups:
This happened because the TApplication constructor registered a
function that calls EndOfProcessCleanups with the C
atexitandPyROOT registered another similar function with the Python atexit
module in
__init__.py. The Python atexit happened before the Catexit.
This commit introduces a static function in TApplication that
registers the call to EndOfProcessCleanups with the C
atexitandthis is guaranteed to be run only once through the use of
std::call_once. This new function can be called both from PyROOT
and the TApplication constructor, to make sure it will be
registered by either of them.
Furthermore, calling this from PyROOT means that now
EndOfProcessCleanups is always called at the very end of the
program, seemingly a more sensible choice.
This PR fixes #10743