Skip to content

[ntuple] Make cluster entries and elements per page configurable#8113

Merged
jblomer merged 6 commits intoroot-project:masterfrom
mxxo:write-config
May 11, 2021
Merged

[ntuple] Make cluster entries and elements per page configurable#8113
jblomer merged 6 commits intoroot-project:masterfrom
mxxo:write-config

Conversation

@mxxo
Copy link
Copy Markdown
Contributor

@mxxo mxxo commented May 6, 2021

Allow users to adjust the number of entries per cluster and the number of elements per page using RNTupleWriteOptions.
I also took the opportunity to fix some whitespace errors in RNTupleOptions.hxx. Fixes #7853.

Usage (see tests as well):

RNTupleWriteOptions opt;
opt.SetNClusterEntries(100000);
opt.SetNElementsPerPage(40000);
auto ntuple = RNTupleWriter::Recreate(
   std::move(model), "ntuple", fileGuard.GetPath(), opt
);

As was pointed out in #7112 (comment), there are some NElementsPerPage inputs that could cause compression problems (i.e. those where the total page memory is larger than 0xffffff). Should we have some error checking at the RNTupleWriteOptions level? Or maybe this should be considered a bug on the compression side of things.

cc @jalopezg-r00t

@mxxo mxxo requested review from jalopezg-git and jblomer May 6, 2021 20:01
@mxxo mxxo requested a review from pcanal as a code owner May 6, 2021 20:01
@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@phsft-bot
Copy link
Copy Markdown

Build failed on mac11.0/cxx17.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2021-05-06T20:22:54.820Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:194:21: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.597Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:194:21: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.597Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:128:24: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.598Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:140:54: warning: 'max_size' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.598Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.598Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct<ROOT::VecOps::RVec<unsigned long>>' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.598Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.598Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct<unsigned long>' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.598Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:140:54: warning: 'max_size' is deprecated [-Wdeprecated-declarations]
  • [2021-05-06T20:23:01.598Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:128:24: warning: 'destroy' is deprecated [-Wdeprecated-declarations]

And 1476 more

Failing tests:

Copy link
Copy Markdown
Contributor

@jalopezg-git jalopezg-git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I have just left an small comment. :-)

value.GetField()->Append(value);
}
fNEntries++;
if ((fNEntries % fClusterSizeEntries) == 0)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the fClusterSizeEntries data member is unused now? I would either remove it or initialize it in the RNTupleWriter constructor using the value of fSink->GetWriteOptions().GetNClusterEntries().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Javier, done in 6930897.

: fSink(std::move(sink))
, fModel(std::move(model))
, fMetrics("RNTupleWriter")
, fClusterSizeEntries(kDefaultClusterSizeEntries)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above.

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@mxxo
Copy link
Copy Markdown
Contributor Author

mxxo commented May 7, 2021

Future extension: add soft limits to cluster memory sizes, since we don't want to have a case where a single entry is 1MB and the number of cluster entries is the default 64K.

@phsft-bot
Copy link
Copy Markdown

Build failed on mac11.0/cxx17.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2021-05-07T15:27:56.368Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:194:21: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.268Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:194:21: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.268Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:128:24: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.268Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:140:54: warning: 'max_size' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.268Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.268Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct<ROOT::VecOps::RVec<unsigned long>>' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.269Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.269Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct<unsigned long>' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.269Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:140:54: warning: 'max_size' is deprecated [-Wdeprecated-declarations]
  • [2021-05-07T15:28:03.269Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:128:24: warning: 'destroy' is deprecated [-Wdeprecated-declarations]

And 1497 more

Failing tests:

Copy link
Copy Markdown
Contributor

@jblomer jblomer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful! Thanks a lot!


bool GetUseBufferedWrite() const { return fUseBufferedWrite; }
void SetUseBufferedWrite(bool val) { fUseBufferedWrite = val; }
NTupleSize_t GetNClusterEntries() const { return fNClusterEntries; }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's rename to GetNEntriesPerCluster so that it is consistent with GetNElementsPerPage (also for the setter and the data member)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, done in b9654b4

}

auto ntuple = RNTupleReader::Open("ntuple", fileGuard.GetPath());
auto col0_pages = ntuple->GetDescriptor().GetClusterDescriptor(0).GetPageRange(0).Clone();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this be a const auto & to avoid the cloning?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so (changed in 557854b). I know similar style is used throughout the other tutorials too but for whatever reason on Friday I was being paranoid about dangling references with so many chained calls (ubsan misses easy mistakes here too). On second thought, I think we're OK because every method returns a const reference, which in turn points to a member of the ntuple's descriptor which is always in scope.

@jblomer
Copy link
Copy Markdown
Contributor

jblomer commented May 10, 2021

@phsft-bot build with flags -DCTEST_TEST_EXCLUDE_NONE=On

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14 with flags -DCTEST_TEST_EXCLUDE_NONE=On
How to customize builds

mxxo added 2 commits May 10, 2021 09:32
We were using Clone to avoid any possibility of a dangling reference,
but in this case we are OK as everything in the call chain returns a
const ref to a member of the ntuple's descriptor.
@phsft-bot
Copy link
Copy Markdown

Build failed on ROOT-ubuntu16/nortcxxmod.
Running on sft-ubuntu-1604-2.cern.ch:/build/workspace/root-pullrequests-build
See console output.

Failing tests:

@phsft-bot
Copy link
Copy Markdown

Build failed on windows10/cxx14.
Running on null:C:\build\workspace\root-pullrequests-build
See console output.

Errors:

  • [2021-05-10T15:13:50.231Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\include\llvm/ADT/STLExtras.h(940,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\Pass.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\include\llvm/ADT/STLExtras.h(1252,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\Metadata.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.23.28105\include\tuple(26,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\Module.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\LLVMContextImpl.cpp(252,1): fatal error C1060: compiler is out of heap space [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\include\llvm/IR/Metadata.h(50,31): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\Statepoint.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.23.28105\include\tuple(460,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\SafepointIRVerifier.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.23.28105\include\utility(211,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\PassManager.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\include\llvm/ADT/PointerUnion.h(51,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\lib\IR\TypeFinder.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\lib\IR\LLVMCore.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\include\llvm/Support/Casting.h(227,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\tools\clang\lib\Serialization\ASTWriterDecl.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\tools\clang\lib\Serialization\obj.clangSerialization.vcxproj]
  • [2021-05-10T15:13:50.633Z] C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.23.28105\include\xutility(2410,1): fatal error C1060: compiler is out of heap space (compiling source file C:\build\workspace\root-pullrequests-build\root\interpreter\llvm\src\tools\clang\lib\CodeGen\CGCUDANV.cpp) [C:\build\workspace\root-pullrequests-build\build\interpreter\llvm\src\tools\clang\lib\CodeGen\obj.clangCodeGen.vcxproj]

And 28 more

@phsft-bot
Copy link
Copy Markdown

Starting build on ROOT-debian10-i386/cxx14, ROOT-performance-centos8-multicore/default, ROOT-fedora30/cxx14, ROOT-fedora31/noimt, ROOT-ubuntu16/nortcxxmod, mac1014/python3, mac11.0/cxx17, windows10/cxx14
How to customize builds

@mxxo mxxo requested a review from jblomer May 10, 2021 16:43
@phsft-bot
Copy link
Copy Markdown

Build failed on mac11.0/cxx17.
Running on macphsft23.dyndns.cern.ch:/Users/sftnight/build/workspace/root-pullrequests-build
See console output.

Warnings:

  • [2021-05-10T19:07:30.609Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:194:21: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:36.904Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:194:21: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:36.904Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:128:24: warning: 'destroy' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:37.182Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:140:54: warning: 'max_size' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:37.183Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:37.183Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct<ROOT::VecOps::RVec<unsigned long>>' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:37.183Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:37.183Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:99:21: warning: 'construct<unsigned long>' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:37.183Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:140:54: warning: 'max_size' is deprecated [-Wdeprecated-declarations]
  • [2021-05-10T19:07:37.183Z] /Users/sftnight/build/workspace/root-pullrequests-build/root/math/vecops/inc/ROOT/RAdoptAllocator.hxx:128:24: warning: 'destroy' is deprecated [-Wdeprecated-declarations]

And 1497 more

Failing tests:

@jblomer jblomer merged commit cbab56c into root-project:master May 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ntuple] Allow the user to customize cluster/page sizes

5 participants