Skip to content

[ntuple] Add RRVecField for type-erased I/O of RVecs#8770

Merged
eguiraud merged 13 commits intoroot-project:masterfrom
eguiraud:rntuple-rvec
May 19, 2022
Merged

[ntuple] Add RRVecField for type-erased I/O of RVecs#8770
eguiraud merged 13 commits intoroot-project:masterfrom
eguiraud:rntuple-rvec

Conversation

@eguiraud
Copy link
Copy Markdown
Contributor

@eguiraud eguiraud commented Jul 29, 2021

This PR:

  • adds the RRVecField type for type-erased I/O of RVecs with RNTuple
  • removes some special-casing of RVec<bool> which is unnecessary now that we have RVec 2.0
  • adds RPrintValueVisitor support for RVecs
  • adds tests for type-erased I/O of RVecs
  • adds tests for ntuple->Show + RVecs
  • adds tests for interop I/O of std::vectors and RVecs (i.e. writing one and reading the other)

Checklist:

  • tested changes locally
  • updated the docs (if necessary): I don't think it's necessary

This PR fixes #6347 .

@eguiraud eguiraud requested a review from jblomer July 29, 2021 12:39
@eguiraud eguiraud self-assigned this Jul 29, 2021
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@ghost
Copy link
Copy Markdown

ghost commented Jul 29, 2021

DeepCode's analysis on #47e2b3 found:

  • ⚠️ 2 warnings 👇

Top issues

Description Example fixes
The result of malloc, which may return null flows to the first argument of memcpy. This could result in undefined behavior. Consider adding a check for nullness. Occurrences: 🔧 Example fixes
Memory leak. Memory is never freed if realloc fails to allocate memory. Occurrences: 🔧 Example fixes

👉 View analysis in DeepCode’s Dashboard | Configure the bot

👉 The DeepCode service and API will be deprecated in August, 2021. Here is the information how to migrate. Thank you for using DeepCode 🙏 ❤️ !

If you are using our plugins, you might be interested in their successors: Snyk's JetBrains plugin and Snyk's VS Code plugin.

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft11.dyndns.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

Copy link
Copy Markdown
Contributor

@jblomer jblomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Some comments inline. I marked some std::int32_t instead of int32_t but not all. I think it would be also good to warn in the documentation of RVec that changes to the memory allocation and to the class layout need to be reflected here (perhaps that's already there).

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

1 similar comment
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft11.dyndns.cern.ch:/home/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

2 similar comments
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-debian10-i386/cxx14.
Running on pcepsft10.dyndns.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Errors:

  • [2021-07-30T16:09:18.100Z] stderr: error: could not read '.git/rebase-apply/head-name': No such file or directory

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@eguiraud
Copy link
Copy Markdown
Contributor Author

eguiraud commented Oct 7, 2021

The only major things left should be the extra tests required by @jblomer and support for RPrintValueVisitor. Let's see if jenkins agrees.

@phsft-bot
Copy link
Copy Markdown

Build failed on mac1014/python3.
Running on macitois22.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2021-10-07T16:41:18.314Z] builtins/xrootd/XROOTD-prefix/include/xrootd/XrdCl/XrdClOptional.hh:58:29: warning: unused parameter 'n' [-Wunused-parameter]

@phsft-bot
Copy link
Copy Markdown

Build failed on mac11.0/cxx17.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2021-10-07T18:57:06.763Z] builtins/xrootd/XROOTD-prefix/include/xrootd/XrdCl/XrdClOptional.hh:58:29: warning: unused parameter 'n' [-Wunused-parameter]

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on mac11.0/cxx17.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@eguiraud eguiraud marked this pull request as ready for review October 11, 2021 17:13
@eguiraud eguiraud requested a review from pcanal as a code owner October 11, 2021 17:13
@phsft-bot
Copy link
Copy Markdown

Build failed on windows10/cxx14.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2022-04-14T16:20:57.573Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\tools\clang\include\clang/AST/Decl.h(2112,1): fatal error C1088: Cannot flush compiler intermediate file: 'C:\Users\sftnight\AppData\Local\Temp_CL_150a260dex': Invalid argument (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\tools\clang\lib\ASTMatchers\ASTMatchFinder.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\tools\clang\lib\ASTMatchers\obj.clangASTMatchers.vcxproj]
  • [2022-04-14T16:21:11.690Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\vector(739,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Target\NVPTX\NVPTXPeephole.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Target\NVPTX\LLVMNVPTXCodeGen.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\xutility(1208,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Transforms\IPO\FunctionImport.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Transforms\IPO\LLVMipo.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Analysis\TypeBasedAliasAnalysis.cpp(741,1): fatal error C1060: compiler is out of heap space [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Analysis\LLVMAnalysis.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\include\llvm/IR/Metadata.h(496,19): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Target\NVPTX\NVPTXAssignValidGlobalNames.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Target\NVPTX\LLVMNVPTXCodeGen.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\tuple(190,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Transforms\IPO\Attributor.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Transforms\IPO\LLVMipo.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\tuple(494,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Analysis\ScalarEvolutionNormalization.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Analysis\LLVMAnalysis.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\ratio(78,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Target\NVPTX\NVPTXTargetTransformInfo.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Target\NVPTX\LLVMNVPTXCodeGen.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\include\llvm/ADT/ArrayRef.h(184,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Transforms\IPO\SCCP.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Transforms\IPO\LLVMipo.vcxproj]
  • [2022-04-14T16:21:11.691Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\include\ratio(28,53): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\Target\NVPTX\NVPTXReplaceImageHandles.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\Target\NVPTX\LLVMNVPTXCodeGen.vcxproj]

And 365 more

@eguiraud
Copy link
Copy Markdown
Contributor Author

eguiraud commented Apr 22, 2022

@jblomer about the discussion at #8770 (comment) , I remember that for RVec we said we would not call destructors for collection elements at every entry, but I see that RVectorField does at line 1039?

auto oldNItems = typedValue->size() / fItemSize;
for (std::size_t i = nItems; i < oldNItems; ++i) {
auto itemValue = fSubFields[0]->CaptureValue(typedValue->data() + (i * fItemSize));
fSubFields[0]->DestroyValue(itemValue, true /* dtorOnly */);
}
typedValue->resize(nItems * fItemSize);
for (std::size_t i = oldNItems; i < nItems; ++i) {
fSubFields[0]->GenerateValue(typedValue->data() + (i * fItemSize));
}

Also if the resize at line 1041 needs to reallocate is it correct to only call GenerateValue for new elements? If my understanding is correct, the bytes of the old elements will be correct, but no placement new for a value of the correct type will have been called in the new memory location so technically the object's lifetimes is not started.

(asking about RVectorField to understand what to do with RRVecField)

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@jblomer
Copy link
Copy Markdown
Contributor

jblomer commented May 4, 2022

@jblomer about the discussion at #8770 (comment) , I remember that for RVec we said we would not call destructors for collection elements at every entry, but I see that RVectorField does at line 1039?

It does so only for the left-over elements when the collection shrinks.

Also if the resize at line 1041 needs to reallocate is it correct to only call GenerateValue for new elements? If my understanding is correct, the bytes of the old elements will be correct, but no placement new for a value of the correct type will have been called in the new memory location so technically the object's lifetimes is not started.

Good point! I created #10520.

@eguiraud
Copy link
Copy Markdown
Contributor Author

eguiraud commented May 4, 2022

It does so only for the left-over elements when the collection shrinks.

uhm why are leftover elements different from elements that are replaced? if some resource management logic has to be called when an element is removed, it might also have to be called when it is replaced?

EDIT: ah, I see now that #10520 basically covers this

Copy link
Copy Markdown
Contributor

@jblomer jblomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, except one tiny comment on the change in the RNTupleModel API

@eguiraud
Copy link
Copy Markdown
Contributor Author

Sorry I have to rebase this on top of master again in order to be able to compile the branch with gcc 12..

eguiraud and others added 13 commits May 19, 2022 10:12
With RVec's redesign, RVec<bool> now behaves consistently with
all other RVec<T>'s (it does not inherit vector<bool>'s behavior).
This removes some code duplication and makes it possible for
specialized RField<RVec<T>> instances to act as general RRVecField
instances for what regards visitor implementations (immediately
useful to implement a single RPrintValueVisitor for all RVec fields).
Plus tests for type-erased and non-type-erased value printing.
RNTuple does not explicitly destroy existing values when reading,
it only overwrites the bytes.
When reading RVecs, we now:
- destroy excess elements if size is reduced w.r.t. the previous entry,
  for consistency with std::vector I/O
- generate new values for new elements if size is increased w.r.t. the
  previous entry -- only if needed instead of unconditionally as before
- when reallocating, we call destructors on the old elements and
  generate the corresponding new elements in the new memory buffer
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-ubuntu16/nortcxxmod, ROOT-ubuntu2004/soversion, mac1015/python3, mac11/cxx17, windows10/cxx14
How to customize builds

@eguiraud eguiraud requested a review from jblomer May 19, 2022 08:19
@eguiraud
Copy link
Copy Markdown
Contributor Author

This should be ready to go, with two performance optimizations (marked as TODOs) that are left for future PRs (this one is already terribly large, I think).

Copy link
Copy Markdown
Contributor

@jblomer jblomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I'm looking forward to adjusting the ntuple data source to this.

@eguiraud eguiraud merged commit d262262 into root-project:master May 19, 2022
@eguiraud eguiraud deleted the rntuple-rvec branch May 19, 2022 22:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ntuple] Off-by-one-byte error when deserializing arrays as RVecs

4 participants