Skip to content

Support insert into system.zookeeper#37596

Merged
tavplubix merged 21 commits intoClickHouse:masterfrom
hanfei1991:hanfei/zk-write
Jun 2, 2022
Merged

Support insert into system.zookeeper#37596
tavplubix merged 21 commits intoClickHouse:masterfrom
hanfei1991:hanfei/zk-write

Conversation

@hanfei1991
Copy link
Copy Markdown
Member

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

This PR aims to resolve #22130 which allows inserting into system.zookeeper. To simplify, this PR is designed as:

  1. only allow to write (name, path, value), three columns must exist in the same insert statement.
  2. allow recursive create if a path and its prefix of path don't exist.
  3. No transaction size limitation.
  4. Not allow "/" identifier in "name" attribution.
  5. Allow "/a/b/c///d/" in "path", it will resolve to "/a/b/c/d" automatically.

In a same insert statement, transaction are ganranteed by zookeeper, but when reading data from zookeeper, changes may happen in the same time. So this is a "Read Commited" isolation level.
This PR uses a trie to cache all the path change to avoid creating path repeatedly.

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented May 27, 2022

CLA assistant check
All committers have signed the CLA.

@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-feature Pull request with new product feature label May 27, 2022
@KochetovNicolai KochetovNicolai added the can be tested Allows running workflows for external contributors label May 27, 2022
@tavplubix tavplubix self-assigned this May 27, 2022
@hanfei1991
Copy link
Copy Markdown
Member Author

@KochetovNicolai @tavplubix Any suggestions to the fast check fail? I ran this test in my local enviroument:

root@figo-laptop-dell:~/ClickHouse# PATH=$PATH:./build/programs/ tests/clickhouse-test --zookeeper 02311_system_zookeeper_insert
Using queries from '/root/ClickHouse/tests/queries' directory
Using clickhouse-client as client program (expecting split build)
Connecting to ClickHouse server... OK

Running 1 stateless tests (MainProcess).

02311_system_zookeeper_insert:                                          [ OK ]

1 tests passed. 0 tests skipped. 0.20 s elapsed (MainProcess).
Won't run stateful tests because test data wasn't loaded.
All tests have finished.

and succeeded.

@tavplubix
Copy link
Copy Markdown
Member

See runlog.log:

2022-05-27 16:12:45 02311_system_zookeeper_insert:                                          [ UNKNOWN ] 0.00 sec. - Test internal error: 
2022-05-27 16:12:45 UnicodeDecodeError
2022-05-27 16:12:45 'utf-8' codec can't decode byte 0x91 in position 7514: invalid start byte
2022-05-27 16:12:45   File "/ClickHouse/tests/clickhouse-test", line 1040, in run
2022-05-27 16:12:45     result = self.process_result_impl(proc, stdout, stderr, total_time)
2022-05-27 16:12:45 
2022-05-27 16:12:45   File "/ClickHouse/tests/clickhouse-test", line 828, in process_result_impl
2022-05-27 16:12:45     diff = Popen(
2022-05-27 16:12:45 
2022-05-27 16:12:45   File "/usr/lib/python3.8/subprocess.py", line 1015, in communicate
2022-05-27 16:12:45     stdout = self.stdout.read()
2022-05-27 16:12:45 
2022-05-27 16:12:45   File "/usr/lib/python3.8/codecs.py", line 322, in decode
2022-05-27 16:12:45     (result, consumed) = self._buffer_decode(data, self.errors, final)

Seems like some of SELECT queries from the test returned binary data, so clickhouse-test cannot parse output of the test.

Try to run locally other tests that use zookeeper before running your test:

clickhouse-test --zookeeper -j 4 --order=random zoo

@hanfei1991
Copy link
Copy Markdown
Member Author

Got it! The problem is the test outputs all the zookeeper content but they are not cleaned by previous tests. I will change the test.

@hanfei1991
Copy link
Copy Markdown
Member Author

@tavplubix @KochetovNicolai Hi, I failed the flaky test, could you please take a look of it?
in the flaky test, the diff of the result is

2022-05-27 15:52:11 -/2-insert-testx	testz	z
2022-05-27 15:52:11 +/2-insert-testx	testz	y

However, the test is written as

insert into system.zookeeper (name, path, value) values ('testz', '/2-insert-testx', 'y');
insert into system.zookeeper (name, path, value) values ('testc', '2-insert-testz//c/cd/dd//', 'y');
insert into system.zookeeper (name, value, path) values ('testz', 'z', '/2-insert-testx');

insert z happens after insert y. I am not familiar with ClickHouse insert mechanism. Are multiple inserts executing concurrently? Do you have any suggestions to look into it? Thanks a lot!

@tavplubix
Copy link
Copy Markdown
Member

It's because we run tests in parallel by default. You should add no-parallel tag or (much better) use unique path prefix for each test run. You can use current database name as unique value like this:

insert into system.zookeeper (name, path, value) values ('c', '/1-insert-testc/' || currentDatabase() ||  '/c/c/c/c/c/c', 11) ...

@hanfei1991
Copy link
Copy Markdown
Member Author

It's because we run tests in parallel by default. You should add no-parallel tag or (much better) use unique path prefix for each test run. You can use current database name as unique value like this:

insert into system.zookeeper (name, path, value) values ('c', '/1-insert-testc/' || currentDatabase() ||  '/c/c/c/c/c/c', 11) ...

We cannot check the unique path in the reference file though. I guess no-parallel is a cleaner way.

@tavplubix
Copy link
Copy Markdown
Member

We cannot check the unique path in the reference file though

Actually we can, clickhouse-test replaces randomly generated database name (like test_z20leh) with default in test output before comparing it with reference file

@hanfei1991
Copy link
Copy Markdown
Member Author

hanfei1991 commented May 27, 2022 via email

@hanfei1991
Copy link
Copy Markdown
Member Author

@tavplubix In fact, insert into values ("/" || currentDatabase() || "abc") always insert "/abc", meaning currentDatabase() always return emtry string in insert statement. Is it by design?
To work around it, I insert the values into a temp table and select them by select "/" || currentDatabase() || path and then insert into zookeeper. It seems working well by far.

@hanfei1991
Copy link
Copy Markdown
Member Author

@tavplubix the flaky tests fail again. The error message is:

2022-05-28 20:50:42 --- /usr/share/clickhouse-test/queries/0_stateless/02311_system_zookeeper_insert.reference	2022-05-28 20:48:11.361901234 +0930
2022-05-28 20:50:42 +++ /tmp/clickhouse-test/0_stateless/02311_system_zookeeper_insert.658.stdout	2022-05-28 20:50:41.884122509 +0930
2022-05-28 20:50:42 @@ -18,7 +18,7 @@
2022-05-28 20:50:42  /default/1-insert-testc/c/c/kk	g	14
2022-05-28 20:50:42  -------------------------
2022-05-28 20:50:42  /default/2-insert-testx	testc	x
2022-05-28 20:50:42 -/default/2-insert-testx	testz	z
2022-05-28 20:50:42 +/default/2-insert-testx	testz	y
2022-05-28 20:50:42  /default/2-insert-testz	c	
2022-05-28 20:50:42  /default/2-insert-testz/c	cd	
2022-05-28 20:50:42  /default/2-insert-testz/c/cd	dd	
2022-05-28 20:50:42 
2022-05-28 20:50:42 
2022-05-28 20:50:42 Settings used in the test: --max_insert_threads=14 --group_by_two_level_threshold=100000 --group_by_two_level_threshold_bytes=1 --distributed_aggregation_memory_efficient=1 --fsync_metadata=1 --output_format_parallel_formatting=0 --input_format_parallel_parsing=1 --min_chunk_bytes_for_parallel_parsing=10630691 --max_read_buffer_size=964684 --prefer_localhost_replica=1 --max_block_size=13784 --max_threads=54 --optimize_or_like_chain=0
2022-05-28 20:50:42 
2022-05-28 20:50:42 Database: test_crzwrw
2022-05-28 20:50:42 
2022-05-28 20:50:42 Running about 12 stateless tests (ForkPoolWorker-9).
2022-05-28 20:50:42 

Are insert statements running concurrently in the same sql file?

@hanfei1991
Copy link
Copy Markdown
Member Author

@tavplubix anyway I refined the test and the flaky tests are still failing. Please see https://s3.amazonaws.com/clickhouse-test-reports/37596/d883194eb21ea53e32f7465e7064cf0d85edf49d/stateless_tests_flaky_check__address__actions_.html

However, I cannot figure out any info from the report. Could you please take a look?

@tavplubix
Copy link
Copy Markdown
Member

insert into values ("/" || currentDatabase() || "abc") always insert "/abc", meaning currentDatabase() always return emtry string in insert statement. Is it by design?

No, it's a bug.

However, I cannot figure out any info from the report. Could you please take a look?

Open runlog.log and and it will become clear:

2022-05-28 17:04:29 02311_system_zookeeper_insert:                                          [ FAIL ] 65.09 sec. - Test runs too long (> 60s). Make it faster.
2022-05-28 17:04:29 Settings used in the test: --max_insert_threads=0 --group_by_two_level_threshold=100000 --group_by_two_level_threshold_bytes=50000000 --distributed_aggregation_memory_efficient=1 --fsync_metadata=1 --output_format_parallel_formatting=0 --input_format_parallel_parsing=1 --min_chunk_bytes_for_parallel_parsing=13082811 --max_read_buffer_size=501217 --prefer_localhost_replica=1 --max_block_size=57938 --max_threads=12 --optimize_or_like_chain=0
2022-05-28 17:04:29 
2022-05-28 17:04:29 Database: test_3hiaaa

But probably it's because of ThreadFuzzer and parallel run, so you can simply ignore this failure.

@tavplubix
Copy link
Copy Markdown
Member

Our RBAC works a bit weird with system database. For example, a user sill can read from system tables like system.one and system.numbers even if all grants are revoked. Please add a test that user without necessary grants cannot insert into system.zookeeper (just to be sure).

@nvartolomei
Copy link
Copy Markdown
Contributor

@tavplubix maybe this feature should be behind a feature flag? e.g. I would never want non-readonly XML users to write to zookeeper.

@hanfei1991
Copy link
Copy Markdown
Member Author

@tavplubix could you provide some suggestions to write "user" case. I got stuck on it. For example, in default config, when I run some user case, it reports error:

root@figo-laptop-dell:~/ClickHouse# PATH=$PATH:~/ClickHouse/build/programs/ tests/clickhouse-test 01939_user_with_default_database
Using queries from '/root/ClickHouse/tests/queries' directory
Using clickhouse-client as client program (expecting split build)
Connecting to ClickHouse server... OK

Running 1 stateless tests (MainProcess).

01939_user_with_default_database:                                       [ FAIL ] - having stderror:
Received exception from server (version 22.6.1):
Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: default: Not enough privileges. To execute this query it's necessary to have grant DROP USER ON *.*. (ACCESS_DENIED)
(query: drop user if exists u_01939)
Received exception from server (version 22.6.1):
Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: default: Not enough privileges. To execute this query it's necessary to have grant CREATE USER ON *.*. (ACCESS_DENIED)
(query: create user u_01939 default database db_01939)
Code: 516. DB::Exception: Received from localhost:9000. DB::Exception: u_01939: Authentication failed: password is incorrect or there is no user with such name. (AUTHENTICATION_FAILED)

Received exception from server (version 22.6.1):
Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: default: Not enough privileges. To execute this query it's necessary to have grant ALTER USER ON *.*. (ACCESS_DENIED)
(query: alter user u_01939 default database NONE)
Received exception from server (version 22.6.1):
Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: default: Not enough privileges. To execute this query it's necessary to have grant SHOW USERS ON *.*. (ACCESS_DENIED)
(query: show create user u_01939)
Received exception from server (version 22.6.1):
Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: default: Not enough privileges. To execute this query it's necessary to have grant ALTER USER ON *.*. (ACCESS_DENIED)
(query: alter user u_01939 default database `NONE`)
Received exception from server (version 22.6.1):
Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: default: Not enough privileges. To execute this query it's necessary to have grant SHOW USERS ON *.*. (ACCESS_DENIED)
(query: show create user u_01939)
Received exception from server (version 22.6.1):
Code: 497. DB::Exception: Received from localhost:9000. DB::Exception: default: Not enough privileges. To execute this query it's necessary to have grant DROP USER ON *.*. (ACCESS_DENIED)
(query: drop user u_01939 )

What the config should I do in advance when I test these cases locally

@tavplubix
Copy link
Copy Markdown
Member

Do you have <access_management>1</access_management> in default user's definition in your users.xml?

@tavplubix
Copy link
Copy Markdown
Member

@tavplubix maybe this feature should be behind a feature flag? e.g. I would never want non-readonly XML users to write to zookeeper.

Makes sense, we should disable this feature by default, because it's too dangerous. For example, if someone can insert into system.zookeeper they can insert GRANT query into the queue of ON CLUSTER queries or change something in ReplicatedAccessStorage and get arbitrary grants. So looks like "user is allowed to insert into system.zookeeper" == "user is allowed to do anything". Maybe we can check access for inserting into system.zookeeper special way and allow it only if user has all privileges. @nvartolomei, what do you think?

@nvartolomei
Copy link
Copy Markdown
Contributor

nvartolomei commented May 30, 2022

@tavplubix if it were up to me, then I wouldn't allow writing from within ClickHouse to ZooKeeper at all, too low level and it opens the way to corrupt the system from within. E.g. possibly making it unable to (re-)start at all.

What's the use case for this feature?

Edit: answered my question by reading the original issue.

@hanfei1991
Copy link
Copy Markdown
Member Author

Our RBAC works a bit weird with system database. For example, a user sill can read from system tables like system.one and system.numbers even if all grants are revoked. Please add a test that user without necessary grants cannot insert into system.zookeeper (just to be sure).

Added, PTAL

Copy link
Copy Markdown
Member

@tavplubix tavplubix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but let's add a configuration parameter that enables this feature (disabled by default). See Context::getConfigRef()

@tavplubix
Copy link
Copy Markdown
Member

Also ClickHouse special build check (actions) and Style Check (actions) failures are related (other failures are not)

@hanfei1991
Copy link
Copy Markdown
Member Author

LGTM, but let's add a configuration parameter that enables this feature (disabled by default). See Context::getConfigRef()

Good Suggestion. I have added the config check and fixed the code style.

\
M(Bool, allow_unrestricted_reads_from_keeper, false, "Allow unrestricted (without condition on path) reads from system.zookeeper table, can be handy, but is not safe for zookeeper", 0) \
\
M(Bool, allow_writes_to_zookeeper, false, "Allow writes to system.zookeeper table, can be handy, but is not safe for zookeeper", 0) \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not protect from anything. Please, read previous comments in the conversation carefully.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean a config item like "allow_zookeeper_write : true/false" in the config file? Well, I was too hurry

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then how do we enable it in the tests?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found 01594_too_low_memory_limits.sh also uses ad-hoc config file, I will look into it

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

01594_too_low_memory_limits is bad test that was added by mistake, it should have been integration test. The correct way to enable it in tests is to add xml file to tests/config/config.d/ and update tests/config/install.sh

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. It makes better sense

@hanfei1991
Copy link
Copy Markdown
Member Author

@tavplubix I added the config allow_zookeeper_write for safeguard and adapted the test according to 01594_too_low_memory_limits

@tavplubix
Copy link
Copy Markdown
Member

AST fuzzer (UBSan, actions) - #37741
Stateless tests flaky check (address, actions) - OK
Stress test (memory, actions) - #37665

@tavplubix
Copy link
Copy Markdown
Member

Hm, I've downloaded build from this PR, tried it locally and found some unexpected behavior:

dell9510 :) insert into system.zookeeper (name, version, czxid) values ('x', 1, 42)

INSERT INTO system.zookeeper (name, version, czxid) FORMAT Values

Query id: 6cc1ec2c-f9e4-4f5a-b547-24b196845c6a

Ok.

1 row in set. Elapsed: 0.026 sec. 

dell9510 :) insert into system.zookeeper (name) values ('x')

INSERT INTO system.zookeeper (name) FORMAT Values

Query id: eeaeb659-56ce-4272-a68e-949099c42ea9

Ok.

1 row in set. Elapsed: 0.019 sec. 

dell9510 :) insert into system.zookeeper (name, path) values ('x', '')

INSERT INTO system.zookeeper (name, path) FORMAT Values

Query id: 1fc282fa-e6c4-4160-b7d3-b7613925a603

Ok.

1 row in set. Elapsed: 0.019 sec.

Queries like this should throw an exception.

@hanfei1991
Copy link
Copy Markdown
Member Author

hanfei1991 commented Jun 1, 2022 via email

@hanfei1991
Copy link
Copy Markdown
Member Author

Hm, I've downloaded build from this PR, tried it locally and found some unexpected behavior:

dell9510 :) insert into system.zookeeper (name, version, czxid) values ('x', 1, 42)

INSERT INTO system.zookeeper (name, version, czxid) FORMAT Values

Query id: 6cc1ec2c-f9e4-4f5a-b547-24b196845c6a

Ok.

1 row in set. Elapsed: 0.026 sec. 

dell9510 :) insert into system.zookeeper (name) values ('x')

INSERT INTO system.zookeeper (name) FORMAT Values

Query id: eeaeb659-56ce-4272-a68e-949099c42ea9

Ok.

1 row in set. Elapsed: 0.019 sec. 

dell9510 :) insert into system.zookeeper (name, path) values ('x', '')

INSERT INTO system.zookeeper (name, path) FORMAT Values

Query id: 1fc282fa-e6c4-4160-b7d3-b7613925a603

Ok.

1 row in set. Elapsed: 0.019 sec.

Queries like this should throw an exception.

I marked other columns as Materialize to prohibit writing, and allow insert without value, because I think insert(name, path) means creating a path. PTAL

@tavplubix
Copy link
Copy Markdown
Member

Stateless tests flaky check (address, actions) - OK
Stress test - #37738

@tavplubix tavplubix merged commit a893595 into ClickHouse:master Jun 2, 2022

SinkToStoragePtr StorageSystemZooKeeper::write(const ASTPtr &, const StorageMetadataPtr &, ContextPtr context)
{
if (!context->getConfigRef().getBool("allow_zookeeper_write", false))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should be done via user grants?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the discussion above

@hanfei1991 hanfei1991 deleted the hanfei/zk-write branch December 9, 2024 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow writes into system.zookeeper table

8 participants