Flaky 00613_shard_distributed_max_execution_time #31657

@Algunenano

From #31636 CI: https://clickhouse-test-reports.s3.yandex.net/31636/de3fe0e9299effaa2cffec31f0487e327c59103e/functional_stateless_tests_(debug).html#fail1

But also from master (only 1 failure in the last 60 days there): https://clickhouse-test-reports.s3.yandex.net/0/b58f8197899cc54bb089c0e2593a3e304cd2234e/functional_stateless_tests_(debug).html

Sample error:

2021-10-26 11:53:36 --- /usr/share/clickhouse-test/queries/0_stateless/00613_shard_distributed_max_execution_time.reference	2021-10-26 10:25:48.000000000 -0400
2021-10-26 11:53:36 +++ /tmp/clickhouse-test/0_stateless/00613_shard_distributed_max_execution_time.stdout	2021-10-26 11:53:36.532950517 -0400
2021-10-26 11:53:36 @@ -1,10 +0,0 @@
2021-10-26 11:53:36 -0
2021-10-26 11:53:36 -1
2021-10-26 11:53:36 -2
2021-10-26 11:53:36 -3
2021-10-26 11:53:36 -4
2021-10-26 11:53:36 -5
2021-10-26 11:53:36 -6
2021-10-26 11:53:36 -7
2021-10-26 11:53:36 -8
2021-10-26 11:53:36 -9
2021-10-26 11:53:36 
2021-10-26 11:53:36 
2021-10-26 11:53:36 Database: test_27tunu

The test:

SET max_execution_time = 1, timeout_overflow_mode = 'break';
SELECT DISTINCT * FROM remote('127.0.0.{2,3}', system.numbers) WHERE number < 10;

I think that in an extreme situation (slow machine, slow build, slow universe) both servers might take more than 1 second to send a reply with the first batch of numbers. When the query is then cancelled by the timeout, it has not received any input yet, and with timeout_overflow_mode = 'break' it returns an empty result instead of the expected 0 to 9.
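To make the suspected race concrete, here is a toy model in Python. It is only a sketch of the hypothesis above, not ClickHouse code: the function name `run_with_deadline`, the per-shard delays, and the rule that a block is kept only if it arrives before the deadline are all assumptions made for illustration.

```python
def run_with_deadline(shard_delays, deadline, rows_per_shard=10):
    """Toy model of the suspected race (hypothetical, not ClickHouse internals).

    Each shard delivers its first block (numbers 0..rows_per_shard-1) after
    `delay` seconds. In 'break' mode the query stops at the deadline and
    returns whatever it has received so far, without raising an error.
    """
    seen = set()
    for delay in shard_delays:
        # Assumption: a block counts only if it arrives before the deadline.
        if delay < deadline:
            # DISTINCT merges identical rows coming from different shards.
            seen.update(range(rows_per_shard))
    return sorted(seen)


# Normal run: both shards answer well within max_execution_time = 1.
fast = run_with_deadline([0.05, 0.08], deadline=1.0)

# Extreme run: both shards need more than 1 second for their first block.
slow = run_with_deadline([1.3, 1.7], deadline=1.0)

print(fast)  # the ten expected rows
print(slow)  # empty, matching the failing diff
```

In this model a single fast shard is enough to produce the reference output, so the test would only fail when both shards miss the deadline, which would fit how rarely the failure shows up.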

Labels: testing (Special issue with list of bugs found by CI)
