Skip to content

Enable keeper fault injection for inserts in functional tests#43117

Merged
devcrafter merged 2 commits intomasterfrom
igor/keeper_retries_enable_fault_injection_in_tests
Nov 17, 2022
Merged

Enable keeper fault injection for inserts in functional tests#43117
devcrafter merged 2 commits intomasterfrom
igor/keeper_retries_enable_fault_injection_in_tests

Conversation

@devcrafter
Copy link
Copy Markdown
Member

@devcrafter devcrafter commented Nov 10, 2022

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Follow up #42607 #43122

@devcrafter devcrafter force-pushed the igor/keeper_retries_enable_fault_injection_in_tests branch from 47c1a5b to b12ebab Compare November 11, 2022 18:08
@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Nov 11, 2022
@SmitaRKulkarni SmitaRKulkarni self-assigned this Nov 13, 2022
@SmitaRKulkarni
Copy link
Copy Markdown
Member

Some stress tests fail with 'Cannot start clickhouse-server', could this be related to the change ?

@devcrafter
Copy link
Copy Markdown
Member Author

@tavplubix is this PR valid way to enable ZK fault injections for inserts in tests?

@tavplubix
Copy link
Copy Markdown
Member

Yep, it looks valid

@devcrafter
Copy link
Copy Markdown
Member Author

devcrafter commented Nov 17, 2022

Test failures:

  • test_consistent_parts_after_clone_replica/test.py::test_inconsistent_parts_if_drop_while_replica_not_active - cluster failed to start for a reason
  • test_read_only_table/test.py::test_restart_zookeeper(link) - can be non-deterministic, - we hardly can guarantee that server'll reconnect in 5 secs timeout to another keeper node (despite doing our best), technically, I guess we only can measure the time after which we can successfully execute INSERT:
        # ClickHouse should +- immediately reconnect to another zookeeper node
        cluster.stop_zookeeper_nodes([node1_zk])
        time.sleep(5)
    
        for table_id in range(NUM_TABLES):
>           node1.query(
                f"INSERT INTO test_table_{table_id} VALUES (6), (7), (8), (9), (10);"
            )

The simplest we can do - is to increase timeout. A bit more sophisticated - retries for inserts on test side. We could use keeper retries on insert as well, but then we need to get from query_log time to execute the query - it'll make the test more complicated. @tavplubix WDYT?

@tavplubix
Copy link
Copy Markdown
Member

time.sleep(5) never works, we should replace it with something like node.query_with_retry

@devcrafter
Copy link
Copy Markdown
Member Author

time.sleep(5) never works, we should replace it with something like node.query_with_retry

Fixed in #43326

@devcrafter devcrafter merged commit 00920b1 into master Nov 17, 2022
@devcrafter devcrafter deleted the igor/keeper_retries_enable_fault_injection_in_tests branch November 17, 2022 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants