Possible deadlock when running stateless tests (ddl guards) #53894

@Algunenano

Report: https://s3.amazonaws.com/clickhouse-test-reports/53050/0940b5d3aa3897441b21e073cda1f8fe13549a06/stateless_tests__release__s3_storage__[2_2].html

The following 3 tests have timed out:

Test name	Test status	Test time, sec.
02437_drop_mv_restart_replicas	FAIL	600.0
01161_all_system_tables	FAIL	600.01
01109_exchange_tables	FAIL	600.01

Looking at the report, there seem to be 5 queries running:

system restart replicas
drop database if exists db_test_19avmcsv
drop database if exists db_test_19avmcsv -- Seems to be repeated, but it's different
SELECT * FROM system.databases LIMIT 10000 FORMAT Null
CREATE TABLE t1 ENGINE=Log() AS SELECT * FROM system.tables AS t JOIN system.databases AS d ON t.database=d.name;

All of these queries appear to make some kind of getDDLGuard call. I don't see the deadlock at first glance, but something is odd: the tests only failed after the 10-minute timeout, and the queries are still running.

A related problem is that these mutexes do not appear to be timed (lock_acquire_timeout isn't respected), so this execution would likely have stayed stuck forever until a restart was forced.
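
To illustrate the difference between a timed and an untimed acquisition, here is a minimal C++ sketch using only the standard library. It is not ClickHouse's DDL guard code: `tryLockWithTimeout` and `contendedLockTimesOut` are hypothetical names, and the explicit `timeout` argument merely stands in for a setting such as lock_acquire_timeout. The idea is that a waiter on a `std::timed_mutex` can give up and report an error instead of blocking forever.

```cpp
#include <chrono>
#include <mutex>
#include <optional>
#include <thread>
#include <utility>

// Hypothetical helper (not part of ClickHouse): try to take the mutex,
// giving up after `timeout`. Returns an owning lock on success and
// std::nullopt if the mutex could not be acquired in time.
std::optional<std::unique_lock<std::timed_mutex>>
tryLockWithTimeout(std::timed_mutex & mutex, std::chrono::milliseconds timeout)
{
    std::unique_lock<std::timed_mutex> lock(mutex, std::defer_lock);
    if (!lock.try_lock_for(timeout))
        return std::nullopt;
    return std::move(lock);
}

// Hypothetical demo: one thread holds the mutex (simulating a stuck owner)
// while a second thread tries to acquire it with a short timeout.
// Returns true if the second thread gave up instead of blocking forever.
bool contendedLockTimesOut()
{
    std::timed_mutex mutex;
    std::unique_lock<std::timed_mutex> holder(mutex);

    bool timed_out = false;
    std::thread waiter([&] {
        auto lock = tryLockWithTimeout(mutex, std::chrono::milliseconds(50));
        timed_out = !lock.has_value();
    });
    waiter.join();
    return timed_out;
}
```

With a plain `std::mutex` and `lock()`, the waiter thread in this sketch would never return while the holder keeps the mutex, which matches the behaviour described above.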

Labels: fuzz (Problem found by one of the fuzzers)
