Monster PR#1197
Conversation
tddontje
left a comment
There was a problem hiding this comment.
I noticed in doc/source/tutorials/format-description.rst that the current format version number is said to be "2". Looks like constants.cc changed it to 3.
56e4523 to
076d42d
Compare
tdenniston
left a comment
There was a problem hiding this comment.
Finished RTree review; did not see anything major.
|
RTree changes LGTM. Moving on to the next review. |
tdenniston
left a comment
There was a problem hiding this comment.
API review done (I didn't really look at the associated unit test sources though). Mostly these are questions, I don't know how much time @stavrospapadopoulos and @jakebolewski you want to spend on the API at this time.
tdenniston
left a comment
There was a problem hiding this comment.
Subarray and SubarrayPartitioner reviewed.
tdenniston
left a comment
There was a problem hiding this comment.
VFS changes LGTM -- @joe-maley is also taking a look
tdenniston
left a comment
There was a problem hiding this comment.
Read algorithm and Reader class review done -- LGTM.
b8f82aa to
1c06178
Compare
* Added Subarray class to implement tiledb_subarray_t. * Added functionality for estimating the result size in Subarray. * Added RTree class to expedite checking for tile overlap when estimating the result size. * Now the MBRs in FragmentMetadata are organized into an RTree after loading.
…d consolidator. Also fixed an issue with consolidating kv. Deleted consistency test, since now flushing the fragment metadata after writing to an array requires an exclusive lock. The TileDB consistency model is now better captured by the tiledb_array_open_at functionality anyway.
…in consolidator. Added \bigobj flag for Windows build.
…he read batch stitching. If there are fewer than vfs.batch_gap_size bytes between two batches, the batches get stitched. The current default is 500KB
…arrays. Fixes issue with indefinite growing of the URI to encryption key mapping in `StorageManager` (the mapping is no longer needed).
536cf0f to
f778783
Compare
This PR removes some unused code, specifically: * A patch file that should have been removed in #4553. * The `EncryptionKeyValidation` class that is unused since #1197. * Setting CMake policies to NEW that are already set by `cmake_minimum_required(VERSION 3.21)` --- TYPE: NO_HISTORY (cherry picked from commit 49e8752)
This is unfortunately the outcome of numerous changes we had to experiment with to address various important performance and functionality challenges during a contract with a customer. It should have been better modularized into separate commits, but this did not happen as the work got interleaved as we were progressing with the contract. To be absolutely avoided in the future!
To make the reviewing progress somewhat easier, I am breaking down the changes per meaningful group of files.
High-level Changes
tiledb_array_open_atfunctionality anyway.vfs.min_batch_size.vfs.batch_gap_size) to control the read batch stitching. If there are fewer thanvfs.batch_gap_sizebytes between two batches, the batches get stitched. The current default is 500KBTODO after merge
SubarrayPartitionerto the C/C++ API viatiledb_subarray_partitioner_t.StorageManager::get_fragment_info(there are some TODOs there regarding xlocking)StorageManager. Use an alternative thread pool instead.Review Help
tiledb/sm/rtree/rtree.cctiledb/sm/rtree/rtree.htest/src/unit-rtree.ccRTreeclass.tiledb/sm/subarray/subarray.cctiledb/sm/subarray/subarray.htiledb/sm/misc/tile_overlap.htiledb/sm/cpp_api/deleter.htiledb/sm/cpp_api/query.htiledb/sm/cpp_api/subarray.htiledb/sm/cpp_api/tiledbtiledb/sm/query/query.htiledb/sm/query/query.ccSubarray, which now handles multi-range subarrays.tiledb/sm/subarray/subarray_partitioner.cctiledb/sm/subarray/subarray_partitioner.hSubarrayPartitioner, which manages the result size and facilitates incomplete queries.tiledb/sm/c_api/tiledb.htiledb/sm/c_api/tiledb.cctest/src/unit-capi-subarray.cctest/src/unit-capi-subarray-partitioner.cctest/src/unit-cppapi-subarray.ccdoc/source/c-api.rsttiledb_subarray_t(which wrapsSubarray)tiledb_subarray_partitioner_t(which wrapsSubarrayPartitioner)tiledb/sm/vfs/vfs.htiledb/sm/vfs/vfs.cctest/src/unit-vfs.ccVFS::read_alltakes as input a thread pooltiledb/sm/storage_manager/config.htiledb/sm/storage_manager/config.cctiledb/sm/cpp_api/config.htest/src/unit-config.ccdocs/source/tutorials/config.rstdocs/source/tutorials/performance-factors.rsttiledb/sm/tile/tile.cctiledb/sm/fragment_metadata/fragment_metadata.htiledb/sm/fragment_metadata/fragment_metadata.ccdoc/source/tutorials/format-description.rstFragmentMetadatainstead ofStorageManagertiledb/sm/storage_manager/open_array.htiledb/sm/storage_manager/open_array.ccOpenArrayinstead ofStorageManagerStorageManagertiledb/sm/storage_manager/consolidator.htiledb/sm/storage_manager/consolidator.cctest/src/unit-capi-consolidation.cctiledb/sm/fragment/fragment_info.hvoid*for non empty domain.tiledb/sm/array_schema/domain.htiledb/sm/misc/uri.htiledb/sm/misc/uri.cctiledb/sm/misc/utils.htiledb/sm/misc/utils.cctiledb/sm/query/writer.htiledb/sm/query/writer.ccvoid*subarrays.void*subarrays and useSubarrayinstead for writes.tiledb/sm/query/reader.htiledb/sm/query/reader.cctest/src/unit-capi-incomplete-2.cctest/src/unit-capi-sparse_array-2.cctest/src/unit-capi-sparse_neg-2.cctest/src/unit-capi-sparse_real-2.ccRead::read_2. This will substituteRead::readonce the dense multi-range subarray are implemented.void*subarray still works).*-2.cc, which will substitute the old ones once the dense multi-range subarray are implemented.tiledb/sm/storage_manager/storage_manager.htiledb/sm/storage_manager/storage_manager.ccWriter.tiledb/sm/misc/constants.htiledb/sm/misc/constants.cctiledb/sm/misc/parallel_functions.htiledb/sm/misc/status.htiledb/sm/misc/status.cctiledb/sm/array.htiledb/sm/array.cctest/src/unit-capi-array.cctest/src/unit-dense-array.cctest/src/unit-kv.cctest/src/unit-cppapi-array.cctest/src/unit-capi-consistency.ccopen_{array,kv}_atfunctions.HISTORY.md