Skip to content

Standalone keeper server#24059

Merged
alesapin merged 27 commits intomasterfrom
standalone_keeper
May 20, 2021
Merged

Standalone keeper server#24059
alesapin merged 27 commits intomasterfrom
standalone_keeper

Conversation

@alesapin
Copy link
Copy Markdown
Member

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add standalone clickhouse-keeper symlink to the main clickhouse binary. Now it's possible to run coordination without the main clickhouse server.

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label May 12, 2021
@alesapin alesapin added the jepsen-test Need to test this PR with jepsen tests label May 12, 2021
@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented May 13, 2021

I'm not sure about installation. Maybe it would be better to create a separate PR.

@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented May 13, 2021

Target: replace zookeeper in integration tests.

(Without great efforts)

@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented May 15, 2021

Only six tests failed! Nice!

And most of them because of incorrect IP protocol (need to make keeper config pluggable <listen_host>::</listen_host> in IPv6 compose env and <listen_host>0.0.0.0</listen_host> in IPv4 net)

@filimonov
Copy link
Copy Markdown
Contributor

Target: replace zookeeper in integration tests.

(Without great efforts)

Cool! BTW it's interestion if it will also work for Kafka (integration tests setup separate zookeeper for Kafka).

@alexey-milovidov
Copy link
Copy Markdown
Member

It will work with Kafka unless it is using ACL.
ACL will also be implemented nevertheless.

@alesapin
Copy link
Copy Markdown
Member Author

test_rename_column/test.py::test_rename_with_parallel_merges strange failure with thread sanitizer, but possibly it's OK, because Keeper also built with thread sanitizer. Let's try one more time with more verbose logs. In test_zookeeper_config/test.py::test_identity we still need to use original ZooKeeper :(

@alesapin
Copy link
Copy Markdown
Member Author

test_rename_column/test.py::test_rename_with_parallel_merges -- failed again, need to add retries to this test.

@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented May 17, 2021

Hmmm, even no session expired messages:

$ test_rename_column/_instances$ ag 'Session expired' | wc -l
0
$ test_rename_column/_instances$ ag 'Connection loss' | wc -l
0

@alesapin
Copy link
Copy Markdown
Member Author

There are no strange errors in the clickhouse-server's log https://gist.github.com/alesapin/20ca2b5f492d8c3702e82c10bf0d6b4b. So the test was just too slow? I'll make it more lightweight.

@alesapin alesapin marked this pull request as ready for review May 17, 2021 09:37
@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented May 17, 2021

Stress test (undefined) — Logical error thrown (see clickhouse-server.log) wow


2021.05.17 13:43:17.010805 [ 327 ] {} <Error> test_27.dst_8 (ReplicatedMergeTreeQueue): Code: 49, e.displayText() = DB::Exception: Part 2_69_79_2 intersects previous part 2_0_72_999999999. It is a bug., Stack trace (when copying this message, always include the lines below):

Seems like #23997 didn't completely helped.

@tavplubix tavplubix self-assigned this May 17, 2021
else if (config().has("keeper_server.snapshot_storage_path"))
path = config().getString("keeper_server.snapshot_storage_path");
else
path = std::filesystem::path{DBMS_DEFAULT_PATH} / "coordination/logs";
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it still possible to use embedded keeper in clickhouse-server? Do we need some protection from running multiple keepers (for example, embedded and standalone ones) in one directory?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just shouldn't start because of the same ports or the same PID files. But it's always possible to turn your data into the garbage. It's also possible with clickhouse-server. Maybe it would be better to have a separate path, I'll add KEEPER_DEFAULT_PATH.

@alesapin
Copy link
Copy Markdown
Member Author

test_rename_column/test.py::test_rename_with_parallel_merges yaaarrrr

@alesapin
Copy link
Copy Markdown
Member Author

I'm curious why there are so many likes?) We still have to implement data import from ZooKeeper. But it should be OK for fresh installations if you don't use zookeeper auth (quite weird thing).

@tonyhb
Copy link
Copy Markdown

tonyhb commented May 19, 2021

I'm curious why there are so many likes?) We still have to implement data import from ZooKeeper. But it should be OK for fresh installations if you don't use zookeeper auth (quite weird thing).

Not sure about everyone else, but I've been watching your progress closely since #15090 and am really looking forward to using this :)

@alesapin
Copy link
Copy Markdown
Member Author

All failures are not related to changes

@alesapin alesapin merged commit cf94bc9 into master May 20, 2021
@alesapin alesapin deleted the standalone_keeper branch May 20, 2021 07:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jepsen-test Need to test this PR with jepsen tests pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants