Should fail ddl query as soon as possible if table is shutdown by yiguolei · Pull Request #19684 · ClickHouse/ClickHouse

yiguolei · 2021-01-27T02:59:53Z

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
The background thread that executes ON CLUSTER queries could hang while waiting for a dropped replicated table to do something. It's fixed.

Detailed description / Documentation draft:

Check the table status during DDL execution. Set the task status to an exception if the table has been shut down by a DROP command or by something else.

tavplubix · 2021-01-27T11:43:24Z

Ok, but could you please explain which problem you are trying to solve? If the table is partially shut down, then status.is_leader must be false and everything should work fine, because DDLWorker takes that into account and tries to execute the query on another replica. If that's not possible for some reason, the query fails with Task ... was not executed by anyone ... after several retries.

yiguolei · 2021-01-27T13:02:02Z

@tavplubix If all replicas are dropped, status.is_leader is false on all of them, so the number of tries is never updated by any replica. The task then has to wait for the timeout, which is currently MAX_EXECUTION_TIMEOUT_SEC = 3600s. That is too long, and it blocks other DDL tasks.
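Editor's illustration (not code from this PR or from DDLWorker): a toy model of the hang described above, assuming a wait loop in which a try is counted only when some replica is a leader. `ReplicaState`, `WaitResult` and `waitForLeader` are hypothetical names, and one loop iteration stands in for one second of waiting.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for the per-replica status in the discussion.
struct ReplicaState
{
    bool is_leader;
};

struct WaitResult
{
    int seconds_waited = 0;
    int tries = 0;
    bool executed = false;
};

// Toy model: a try is counted only when a leader replica picks the task up.
// If no replica is a leader, only the clock advances until the timeout.
WaitResult waitForLeader(const std::vector<ReplicaState> & replicas, int max_tries, int timeout_sec)
{
    assert(max_tries > 0 && timeout_sec > 0);

    WaitResult result;
    while (result.seconds_waited < timeout_sec && result.tries < max_tries)
    {
        for (const auto & replica : replicas)
        {
            if (replica.is_leader)
            {
                ++result.tries;
                result.executed = true;
                return result;
            }
        }
        ++result.seconds_waited; // nobody attempted the task, so the retry counter stays at 0
    }
    return result;
}
```

With two non-leader replicas and a timeout of 3600, the model returns with `seconds_waited == 3600` and `tries == 0`, which is the behaviour the comment complains about: the retry limit never gets a chance to end the wait early.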

tavplubix · 2021-01-27T14:35:34Z

You're right, there is a bug in DDLWorker. However, DROP TABLE is not the only thing that partially shuts down a replicated table (that's why test_ddl_worker_non_leader/test.py::test_non_leader_replica failed); it also happens on a lost ZooKeeper connection and on SYSTEM RESTART REPLICA. I'm not sure we should finish the DDL query with Table is shutdown ... in these cases. Maybe check the IStorage::is_dropped flag instead of partial_shutdown_called? On the other hand, is_dropped is false for detached tables, but DETACH/ATTACH are usually used to restart a replica (and that's how SYSTEM RESTART REPLICA works). To avoid waiting on permanently detached tables, we can use something like this:

bool replica_dropped = storage->is_dropped;
bool all_replicas_likely_detached = status.active_replicas == 0 && !DatabaseCatalog::instance().isTableExist(table_id, context);
if (replica_dropped || all_replicas_likely_detached) /// Table is shutdown ...

What do you think?

Btw, could you please add a test? It can be simple stateless test with test_shard_localhost cluster with one replica.
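Editor's illustration (not part of the original comment): the condition proposed above can be exercised in isolation as a plain predicate. This is a minimal sketch under stated assumptions: `TableSnapshot` and `shouldFailFast` are hypothetical names, and the `exists_in_catalog` field stands in for the `DatabaseCatalog::instance().isTableExist(table_id, context)` call, which is not reproduced here.

```cpp
#include <cassert>

// Hypothetical snapshot of the three inputs used by the proposed condition.
struct TableSnapshot
{
    bool is_dropped;        // stands in for storage->is_dropped
    int active_replicas;    // stands in for status.active_replicas
    bool exists_in_catalog; // stands in for the isTableExist(...) lookup
};

// Returns true when the DDL task should fail immediately instead of waiting.
bool shouldFailFast(const TableSnapshot & table)
{
    assert(table.active_replicas >= 0);

    const bool replica_dropped = table.is_dropped;
    const bool all_replicas_likely_detached
        = table.active_replicas == 0 && !table.exists_in_catalog;
    return replica_dropped || all_replicas_likely_detached;
}
```

Under this model, a table that has no active replicas but still exists in the catalog (for example during a lost ZooKeeper connection or a replica restart) does not fail fast, while a dropped table or a table with no active replicas that is also missing from the catalog does.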

yiguolei · 2021-01-28T12:40:02Z

@tavplubix You are right. I have another commit. ^-^
But I find it hard to test, because the table has to be dropped while the DDL task is running. Such a test is very difficult to write. Could you please suggest some ideas?

tavplubix · 2021-01-28T13:44:06Z

Take a look at the *.sh tests (such as 01150_ddl_guard_rwr and 00993_system_parts_race_condition_drop_zookeeper). You can run two threads: one thread executes CREATE and DROP, and the other executes ALTER ... ON CLUSTER.

Should fail ddl query as soon as possible if table is shutdown

9d086f4

robot-clickhouse added the pr-improvement Pull request with some product improvements label Jan 27, 2021

qoega added the can be tested label Jan 27, 2021

fix code style

6693f77

tavplubix self-assigned this Jan 27, 2021

robot-clickhouse added pr-bugfix Pull request with bugfix, not backported by default and removed pr-improvement Pull request with some product improvements labels Jan 27, 2021

check active replicas and detached tables

b0d645e

yiguolei and others added 4 commits February 1, 2021 10:40

add functional test

768e461

fix functional test

bef5af3

fix functional test

fa03fbd

Update 01671_ddl_hang_timeout.sh

1e44e3f

tavplubix approved these changes Feb 1, 2021

tavplubix merged commit befee42 into ClickHouse:master Feb 2, 2021

robot-clickhouse pushed a commit that referenced this pull request Feb 2, 2021

Backport #19684 to 21.1: Should fail ddl query as soon as possible if…

375a4b3

… table is shutdown

robot-clickhouse mentioned this pull request Feb 2, 2021

Backport #19684 to 21.1: Should fail ddl query as soon as possible if table is shutdown #19983

Merged

robot-clickhouse pushed a commit that referenced this pull request Feb 2, 2021

Backport #19684 to 20.12: Should fail ddl query as soon as possible i…

c8923ed

…f table is shutdown

robot-clickhouse mentioned this pull request Feb 2, 2021

Backport #19684 to 20.12: Should fail ddl query as soon as possible if table is shutdown #19984

Merged

robot-clickhouse pushed a commit that referenced this pull request Feb 2, 2021

Backport #19684 to 20.11: Should fail ddl query as soon as possible i…

6c599ff

…f table is shutdown

robot-clickhouse mentioned this pull request Feb 2, 2021

Backport #19684 to 20.11: Should fail ddl query as soon as possible if table is shutdown #19985

Merged

robot-clickhouse pushed a commit that referenced this pull request Feb 2, 2021

Backport #19684 to 21.2: Should fail ddl query as soon as possible if…

acf6673

… table is shutdown

robot-clickhouse mentioned this pull request Feb 2, 2021

Backport #19684 to 21.2: Should fail ddl query as soon as possible if table is shutdown #19986

Merged

tavplubix added a commit that referenced this pull request Feb 3, 2021

Merge pull request #19983 from ClickHouse/backport/21.1/19684

0b13443

Backport #19684 to 21.1: Should fail ddl query as soon as possible if table is shutdown

tavplubix added a commit that referenced this pull request Feb 3, 2021

Merge pull request #19984 from ClickHouse/backport/20.12/19684

eb54e2d

Backport #19684 to 20.12: Should fail ddl query as soon as possible if table is shutdown

tavplubix added a commit that referenced this pull request Feb 3, 2021

Merge pull request #19985 from ClickHouse/backport/20.11/19684

e5780e4

Backport #19684 to 20.11: Should fail ddl query as soon as possible if table is shutdown

tavplubix added a commit that referenced this pull request Feb 3, 2021

Merge pull request #19986 from ClickHouse/backport/21.2/19684

a01e04a

Backport #19684 to 21.2: Should fail ddl query as soon as possible if table is shutdown

This was referenced Feb 12, 2021

Cherry pick #19684 to 20.8: Should fail ddl query as soon as possible if table is shutdown #20439

Merged

Cherry pick #19684 to 20.3: Should fail ddl query as soon as possible if table is shutdown #20440

Closed

robot-clickhouse pushed a commit that referenced this pull request Feb 15, 2021

Backport #19684 to 20.8: Should fail ddl query as soon as possible if…

e2fc52d

… table is shutdown

robot-clickhouse mentioned this pull request Feb 15, 2021

Backport #19684 to 20.8: Should fail ddl query as soon as possible if table is shutdown #20525

Merged

tavplubix added a commit that referenced this pull request Feb 16, 2021

Merge pull request #20525 from ClickHouse/backport/20.8/19684

f0144ce

Backport #19684 to 20.8: Should fail ddl query as soon as possible if table is shutdown

jorisgio mentioned this pull request Mar 17, 2021

ThreadPool hanging forever in wait() #21831

Closed

tavplubix mentioned this pull request Nov 2, 2023

Fix rare logical error in Replicated database #56272

Merged

tavplubix merged 7 commits into ClickHouse:master from yiguolei:master