Backport: Fix a race condition during multiple DB opening (#8574) #13

Merged
azat merged 1 commit into ClickHouse:master from azat-archive:backport-data-race-fix
Sep 26, 2021

azat commented Sep 26, 2021

Summary:
ObjectLibrary is shared between multiple DB instances, so
Register() could have a race condition.

Pull Request resolved: facebook#8574

Test Plan: pass the failed test

Reviewed By: ajkr

Differential Revision: D29855096

Pulled By: jay-zhuang

fbshipit-source-id: 541eed0bd495d2c963d858d81e7eabf1ba16153c
(cherry picked from commit c4a503f)
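
A race of the kind described in the summary (one object shared by several DB instances, with Register() running while other threads use the same object) is commonly handled by guarding the shared container with a mutex. The block below is a minimal sketch under that assumption only. `SharedRegistry`, `Find`, `Count`, and `SimulateConcurrentOpen` are hypothetical names chosen for illustration; this is not RocksDB's `ObjectLibrary` API and not the upstream patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an object shared by several DB instances.
// This is NOT RocksDB's ObjectLibrary; names and layout are illustrative.
class SharedRegistry {
 public:
  // Writer path: appends an entry while holding the lock.
  void Register(const std::string& type, const std::string& name) {
    std::lock_guard<std::mutex> lock(mu_);
    entries_[type].push_back(name);
  }

  // Reader path: takes the same lock, so it never observes a vector
  // that another thread is in the middle of growing.
  bool Find(const std::string& type, const std::string& name) const {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = entries_.find(type);
    if (it == entries_.end()) return false;
    for (const auto& entry : it->second) {
      if (entry == name) return true;
    }
    return false;
  }

  // Number of entries registered for a given type.
  std::size_t Count(const std::string& type) const {
    std::lock_guard<std::mutex> lock(mu_);
    auto it = entries_.find(type);
    return it == entries_.end() ? 0 : it->second.size();
  }

 private:
  mutable std::mutex mu_;  // mutable so const readers can lock it
  std::unordered_map<std::string, std::vector<std::string>> entries_;
};

// Simulates several databases being opened at once: each thread registers
// its own entry in the shared registry and then looks it up again.
std::size_t SimulateConcurrentOpen(std::size_t num_threads) {
  SharedRegistry registry;
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < num_threads; ++i) {
    threads.emplace_back([&registry, i] {
      const std::string name = "factory_" + std::to_string(i);
      registry.Register("TableFactory", name);
      assert(registry.Find("TableFactory", name));
    });
  }
  for (auto& t : threads) t.join();
  return registry.Count("TableFactory");
}
```

Calling `SimulateConcurrentOpen(8)` should leave eight entries in the registry regardless of thread interleaving. Removing either `lock_guard` would reintroduce an unsynchronized read or write on the shared map, which is the kind of access a TSan build flags.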

azat commented Sep 26, 2021

Check buck targets and code format / Check TARGETS file and code format (pull_request): failing after 40s

Actions are broken in the fork.

azat merged commit 296c1b8 into ClickHouse:master on Sep 26, 2021
azat added a commit to azat/ClickHouse that referenced this pull request Sep 26, 2021
This should fix the following SIGSEGV, which was found on CI [1]:

    <Fatal> BaseDaemon: Address: NULL pointer. Access: read. Unknown si_code.
    <Fatal> BaseDaemon: 4.4. inlined from ../contrib/rocksdb/utilities/object_registry.cc:19: rocksdb::ObjectLibrary::FindEntry() const
    ...
    <Fatal> BaseDaemon: 7.3. inlined from ../contrib/rocksdb/options/cf_options.cc:678: rocksdb::$_7::operator()()

  [1]: https://clickhouse-test-reports.s3.yandex.net/29341/2b2bec3679df7965af908ce3f1e8e17e39bd12fe/integration_tests_flaky_check_(asan).html#fail1

I also checked manually with a TSan binary, and here is the data race
it reported:

    WARNING: ThreadSanitizer: data race (pid=3356)
      Read of size 8 at 0x7b0c0008cca8 by thread T40:
        2 rocksdb::ObjectLibrary::FindEntry() const obj-x86_64-linux-gnu/../contrib/rocksdb/utilities/object_registry.cc:18:27 (clickhouse-tsan+0x1b839a6c)
        ...
        6 rocksdb::$_7::operator()() const obj-x86_64-linux-gnu/../contrib/rocksdb/options/cf_options.cc:676:32 (clickhouse-tsan+0x1b6bfa63)
        ...
        28 rocksdb::GetColumnFamilyOptionsFromMap() obj-x86_64-linux-gnu/../contrib/rocksdb/options/options_helper.cc:727:10 (clickhouse-tsan+0x1b6fffd2)
        29 DB::StorageEmbeddedRocksDB::initDb() obj-x86_64-linux-gnu/../src/Storages/RocksDB/StorageEmbeddedRocksDB.cpp:359:26 (clickhouse-tsan+0x14195e31)
        ...

      Previous write of size 8 at 0x7b0c0008cca8 by thread T41:
        ...
        9 rocksdb::ObjectLibrary::AddEntry() obj-x86_64-linux-gnu/../contrib/rocksdb/utilities/object_registry.cc:31:19 (clickhouse-tsan+0x1b8392fc)
        ...
        11 rocksdb::RegisterTableFactories()::$_0::operator()() const obj-x86_64-linux-gnu/../contrib/rocksdb/table/table_factory.cc:23:14 (clickhouse-tsan+0x1b7ea94c)
        ...
        43 rocksdb::GetColumnFamilyOptionsFromMap() obj-x86_64-linux-gnu/../contrib/rocksdb/options/options_helper.cc:727:10 (clickhouse-tsan+0x1b6fffd2)
        44 DB::StorageEmbeddedRocksDB::initDb() obj-x86_64-linux-gnu/../src/Storages/RocksDB/StorageEmbeddedRocksDB.cpp:359:26 (clickhouse-tsan+0x14195e31)

Refs: ClickHouse/rocksdb#13
Fixes: ClickHouse#29341