Skip to content

Additional fixes for std::pair's StreamerInfo handling#10230

Merged
pcanal merged 19 commits intoroot-project:masterfrom
pcanal:master-issue-10131
Apr 6, 2022
Merged

Additional fixes for std::pair's StreamerInfo handling#10230
pcanal merged 19 commits intoroot-project:masterfrom
pcanal:master-issue-10131

Conversation

@pcanal
Copy link
Copy Markdown
Member

@pcanal pcanal commented Mar 24, 2022

This repairs crash seem in testProxiesAndCategories on the v6.24 branches and a few additional issues seen during testing.

@pcanal pcanal requested a review from Axel-Naumann as a code owner March 24, 2022 22:06
@pcanal pcanal self-assigned this Mar 24, 2022
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@pcanal
Copy link
Copy Markdown
Member Author

pcanal commented Mar 24, 2022

For the record the following issue is still left:

Issues with 'interpreted' + load-file-first mode.

Loading the file creates a TClass for the map.  Additional code (eg accessing
the CollectionProxy) might also create a TClass for the pair.

When the header file is loaded in the interpreter, unless there is explicit
uses of the map (or pair), there is no decl for the instantiation of thus
the TClass for the map and pair are not refreshed.

When the pair or map are instantiated (eg. `gInterpreter->Declare("pair<...> pl")`),
the TClass for the pair is informed (via `TCling::UpdateClassInfoWithDecl`
and `TCling::RefreshClassInfo`).

We could update `RefreshClassInfo` to refresh the `StreamerInfo` for the pair
but it would also need to also refresh the map's CollectionProxy (size,
hints, etc?) [and there is an arbitrary number because they are thread-local]

So at that point, it might actually be better to recreate the TClass for the
map ...

But wait ... there is currently no support for generating a collection proxy for
an interpreted class ... so it is actually an emulated collection proxy ...

That proxy does not match the interpreted (nor the compiled) version of the
map ... so there is no good point to match the pair either ....

So the solution above are (a tad bit) complex and .... not enough ...

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@pcanal pcanal marked this pull request as draft March 25, 2022 03:35
@pcanal pcanal force-pushed the master-issue-10131 branch from bfc3736 to 8c249ba Compare March 30, 2022 00:20
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-performance-centos8-multicore/default.
Running on olbdw-01.cern.ch:/data/sftnight/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-ubuntu2004/soversion.
Running on root-ubuntu-2004-1.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-ubuntu16/nortcxxmod.
Running on sft-ubuntu-1604-1.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

pcanal added 16 commits April 1, 2022 11:14
When encountering a data member of a class that is a enum or if it is use in a map or multimap,
record the enum in the rootpcm.
This was a fatal typo in 9678405 pair: properly update StreamerInfo.
We already had TListOfEnumsWithLock but TListOfEnums::FindObject is used by rootcling
In the 'root_serialization' use case, we ended in TBufferFile::WriteClassBuffer with no
StreamerInfo created yet for the TClass for `std::vector<float>` and thus go into
the branch that creates the StreamerInfo ... However, at least in that case,
the call to `BuildRealData` induced the creation of the StreamerInfo but the code
in `TBufferFile::WriteClassBuffer` did not notice and thus create a second
one and tried to register it in the same slot.  (The actual negative side
effect was only an error message and a memory leak).
This happened in a user case where the container (map) for a pair of an enum, which had not dictionary
and something else that did have a dictionary.
This was happening when, inadvertently, StreamerInfo were created multiple times and then immeditately
deleted (for map<enumtype_without_dict, somethinelse>)
@pcanal pcanal force-pushed the master-issue-10131 branch from a16f3c8 to 6189bb9 Compare April 1, 2022 16:15
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

Copy link
Copy Markdown
Member

@Axel-Naumann Axel-Naumann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm more or less :-)

virtual ~TVirtualCollectionProxy() {};

// Reset the info gathered from StreamerInfos and value's TClass.
virtual Bool_t Reset() { return kTRUE; };
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
virtual Bool_t Reset() { return kTRUE; };
virtual Bool_t Reset() { return kTRUE; }

What does the return value indicate?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That the reset failing somehow ... I guess this could/should be replaced by throwing an exception but this code is not using them anywhere else ....


fCollectionProxy->Reset();
TIter nextClass(gROOT->GetListOfClasses());
while (auto acl = (TClass*)nextClass()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really?! This can iterate through a million classes with the experiments...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, only, 10s of thousands but yep ... However the case should be rare-ish this days. It requires: "load file, read data and/or manipulate the TClass and then load the library" (Note that we have a similar loop in TClass::ReplaceWith that so far as not been 'noticed').

TIter nextClass(gROOT->GetListOfClasses());
while (auto acl = (TClass*)nextClass()) {
if (acl == this) continue;
if (acl->fCollectionProxy && acl->fCollectionProxy->GetValueClass() == pcl) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively you could flag the state of the collection proxy as "before we have authoritative info", and then reset and update in GetValueClass() when pcl's state doesn't correspond to that flag?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would change much. It is already functionally the case (the GetValueClass() == pcl). For fully initialized proxy this is "just" a function call and an atomic pointer load (the alternative would only, possibly, save the function call by inlining).

}
}

TIter next(pcl->GetStreamerInfos());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not make sense to update this before telling all collection proxies to update their value class, just in case in the future we get a dependency of the proxies on the value steamer info?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, this is a 'high' risk change (potential order of deletions and/or state change issues, etc.). So I will try it in a different PR (and there is a potential optimization with the same type of risk 5 lines above that I'll try too)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it is confirmed. The objects (ActionSequence) deleted in the CollectionProxy's Reset are pointing/using the information from the StreamerInfo (that is deleted/destroyed by the StreamerInfo's Clear).

/// Specialize FindObject to do search for the
/// a enum just by name or create it if its not already in the list

TObject *TListOfEnums::FindObject(const char *name) const
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do other overloads of FindObject also add? I don't remember that... Maybe consider renaming to FindOrAdd()?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should :) ... at least for this (type) of container. Semantically 'nothing' is added, it is just delayed loading. This is a trade-off between pre-loading all enums (and actively harvesting them at each transaction) and this type of gymnastics. (i.e. the cost of avoiding this const_cast is added unused memory alllocation/retention and added (unused) run-time cost)

TInterpreter::DeclId_t decl;
if (GetClass()) decl = gInterpreter->GetEnum(GetClass(), name);
else decl = gInterpreter->GetEnum(nullptr, name);
if (decl) result = const_cast<TListOfEnums *>(this)->Get(decl, name);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eew. Can we make this function non-const?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not without being backward incompatible (We would probably need to expose this type rather than its base). Note: this is existing code that is just moved around.

{
auto temp = new TListOfEnums();
auto temp = cl->fEnums.load() ? cl->fEnums.load() :
IsFromRootCling() ? new TListOfEnums() : new TListOfEnumsWithLock();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the lock so expensive to warrant the extra code complexity?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not, especially since the mutex are probably not even on in the IsFromRootCling. This one liners is the easiest fix and so we should remove this complexity in a different PR. See #10339 for tracking.

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@pcanal pcanal merged commit cd90139 into root-project:master Apr 6, 2022
@pcanal pcanal deleted the master-issue-10131 branch April 6, 2022 15:50
@pcanal pcanal linked an issue Apr 6, 2022 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RF] Crash in reading some RooWorkspaces after recent TStreamerInfo update Open too many different non-versioned layouts for pair

3 participants