Skip to content

Fix UUID overlap in DROP TABLE for internal DDL from MaterializeMySQL#28533

Merged
tavplubix merged 1 commit intoClickHouse:masterfrom
azat:MaterializeMySQL-drop-fix
Sep 6, 2021
Merged

Fix UUID overlap in DROP TABLE for internal DDL from MaterializeMySQL#28533
tavplubix merged 1 commit intoClickHouse:masterfrom
azat:MaterializeMySQL-drop-fix

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Sep 2, 2021

Changelog category (leave one):

  • Bug Fix (user-visible misbehaviour in official stable or prestable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix UUID overlap in DROP TABLE for internal DDL from MaterializeMySQL

Detailed description / Documentation draft:
This will fix race with DatabaseCatalog::loadMarkedAsDroppedTables(),
since MaterializeMySQL, and MaterializedMySQLSyncThread in background,
will be started earlier then
DatabaseCatalog::loadMarkedAsDroppedTables() and will move those tables
to metadata_dropped, and after loadMarkedAsDroppedTables() will start
and try to load partially dropped tables and will hit UUID overlap:

12:02:51.536783 [ 3026034 ] {} <Information> Application: starting up
12:02:53.019282 [ 3026034 ] {} <Information> DatabaseMaterializeMySQL<Atomic> (mysql): Total 9 tables and 0 dictionaries.
12:02:53.041699 [ 3026200 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): Loading data parts
12:02:53.041740 [ 3026200 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): There are no data parts
12:02:53.620382 [ 3026034 ] {} <Information> DatabaseMaterializeMySQL<Atomic> (mysql): Starting up tables.
12:03:00.669730 [ 3026183 ] {} <Debug> executeQuery: (internal) /*Materialize MySQL step 1: execute MySQL DDL for dump data*/ DROP TABLE mysql.data
12:03:00.741894 [ 3026269 ] {} <Information> DatabaseCatalog: Trying load partially dropped table mysql.data (7143b65f-6982-4600-b143-b65f6982e600) from /var/lib/clickhouse/metadata_dropped/mysql.data.7143b65f-6982-4600-b143-b65f6982e600.sql
12:03:00.742582 [ 3026269 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): Loading data parts
12:03:00.742650 [ 3026269 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): There are no data parts
12:03:00.773137 [ 3026034 ] {} <Error> Application: Caught exception while loading metadata: Code: 57, e.displayText() = DB::Exception: Mapping for table with UUID=7143b65f-6982-4600-b143-b65f6982e600 already exists. It happened due to UUID collision, most likely because some not random UUIDs were manually specified in CREATE queries., Stack trace (when copying this message, always include the lines below):
12:03:01.224557 [ 3026034 ] {} <Error> Application: DB::Exception: Mapping for table with UUID=7143b65f-6982-4600-b143-b65f6982e600 already exists. It happened due to UUID collision, most likely because some not random UUIDs were manually specified in CREATE queries.

Cc: @zhang2014

NOTE: this had been marked as Bug fix but not sure, since MaterializeMySQL is not production ready anyway

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Sep 2, 2021
@azat
Copy link
Copy Markdown
Member Author

azat commented Sep 3, 2021

Functional stateless tests (thread) — Timeout, fail: 0, passed: 1532, skipped: 6
Integration tests (asan) — fail: 1, passed: 1573, flaky: 16
Integration tests (release) — fail: 1, passed: 1574, flaky: 17
Integration tests (thread) — fail: 2, passed: 1572, flaky: 17

Same in master

Stress test (memory) — Server failed to start

2021.09.03 00:12:24.474558 [ 28793 ] {} <Error> Application: DB::Exception: The local set of parts of table test_1.alter_table_4 doesn't look like the set of parts in ZooKeeper: 295.09 thousand rows of 295.09 thousand total rows in filesystem are suspicious. There are 57 unexpected parts with 295090 rows (9 of them is not just-written with 42556 rows), 0 missing parts (with 0 blocks).: Cannot attach table `test_1`.`alter_table_4` from metadata file /var/lib/clickhouse/metadata/test_1/alter_table_4.sql from query ATTACH TABLE test_1.alter_table_4 (`a` UInt8, `b` Int16, `c` Float32, `d` String, `e` Array(UInt8), `f` Nullable(UUID), `g` Tuple(UInt8, UInt16)) ENGINE = ReplicatedMergeTree('/clickhouse/tables/00993_system_parts_race_condition_drop_zookeeper_test_1/alter_table', 'r_4') PARTITION BY b % 10 ORDER BY a SETTINGS old_parts_lifetime = 1, cleanup_delay_period = 0, cleanup_delay_period_random_add = 0, index_granularity = 8192: while loading database `test_1` from path /var/lib/clickhouse/metadata/test_1

@tavplubix tavplubix self-assigned this Sep 3, 2021
@tavplubix
Copy link
Copy Markdown
Member

AST fuzzer (debug) — Received signal 11 - #28545

@tavplubix
Copy link
Copy Markdown
Member

But how does it fixes the race?

This will fix race with DatabaseCatalog::loadMarkedAsDroppedTables(),
since MaterializeMySQL, and MaterializedMySQLSyncThread in background,
will be started earlier then
DatabaseCatalog::loadMarkedAsDroppedTables() and will move those tables
to metadata_dropped, and after loadMarkedAsDroppedTables() will start
and try to load partially dropped tables and will hit UUID overlap:

    12:02:51.536783 [ 3026034 ] {} <Information> Application: starting up
    12:02:53.019282 [ 3026034 ] {} <Information> DatabaseMaterializeMySQL<Atomic> (mysql): Total 9 tables and 0 dictionaries.
    12:02:53.041699 [ 3026200 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): Loading data parts
    12:02:53.041740 [ 3026200 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): There are no data parts
    12:02:53.620382 [ 3026034 ] {} <Information> DatabaseMaterializeMySQL<Atomic> (mysql): Starting up tables.
    12:03:00.669730 [ 3026183 ] {} <Debug> executeQuery: (internal) /*Materialize MySQL step 1: execute MySQL DDL for dump data*/ DROP TABLE mysql.data
    12:03:00.741894 [ 3026269 ] {} <Information> DatabaseCatalog: Trying load partially dropped table mysql.data (7143b65f-6982-4600-b143-b65f6982e600) from /var/lib/clickhouse/metadata_dropped/mysql.data.7143b65f-6982-4600-b143-b65f6982e600.sql
    12:03:00.742582 [ 3026269 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): Loading data parts
    12:03:00.742650 [ 3026269 ] {} <Debug> mysql.data (7143b65f-6982-4600-b143-b65f6982e600): There are no data parts
    12:03:00.773137 [ 3026034 ] {} <Error> Application: Caught exception while loading metadata: Code: 57, e.displayText() = DB::Exception: Mapping for table with UUID=7143b65f-6982-4600-b143-b65f6982e600 already exists. It happened due to UUID collision, most likely because some not random UUIDs were manually specified in CREATE queries., Stack trace (when copying this message, always include the lines below):
    12:03:01.224557 [ 3026034 ] {} <Error> Application: DB::Exception: Mapping for table with UUID=7143b65f-6982-4600-b143-b65f6982e600 already exists. It happened due to UUID collision, most likely because some not random UUIDs were manually specified in CREATE queries.

Cc: @zhang2014
@azat azat force-pushed the MaterializeMySQL-drop-fix branch from 37cff81 to bf6ba79 Compare September 4, 2021 07:41
@azat
Copy link
Copy Markdown
Member Author

azat commented Sep 4, 2021

But how does it fixes the race?

Egh, sorry you are right, it does not fixes the issue, since the problem is not the race between DROP and CREATE but between DROP and loadMarkedAsDroppedTables(), moved loadMarkedAsDroppedTables() before loadMetadata(), this should fix the issue.

@azat azat changed the title Use sync DROP TABLE for internal DDL in MaterializeMySQL Fix UUID overlap in DROP TABLE for internal DDL from MaterializeMySQL Sep 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants