ZooKeeper replication for users, roles, row policies, quotas and profiles.#27426

Merged
vitlibar merged 2 commits into ClickHouse:master from aiven:kmichel-replicated-access-storage
Aug 18, 2021
Conversation

@kmichel-aiven (Contributor) commented Aug 8, 2021

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

  • Add replicated storage of user, roles, row policies, quotas and settings profiles through ZooKeeper (experimental).

Detailed description / Documentation draft:
This allows storing access entities (users, roles, row policies, quotas and settings profiles) in ZooKeeper to enable replication across an entire cluster.

On a cluster already configured with ZooKeeper, you can enable replication using the following configuration:

    <user_directories>
        <replicated>
            <zookeeper_path>/clickhouse/access/</zookeeper_path>
        </replicated>
    </user_directories> 

The SQL for each access entity is stored at {zookeeper_path}/uuid/{entity_uuid}.

In addition, to enforce name uniqueness, each entity's UUID is stored at {zookeeper_path}/{type_code}/{name}, where the type code is one of U, R, S, P, Q.
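As an illustration, the node layout above can be sketched as a small path-building helper. The mapping of type codes to entity kinds is inferred from the initials U, R, S, P, Q and is an assumption, as are the example names:

```python
# Illustrative sketch of the ZooKeeper node layout described above.
# The type-code mapping is inferred from the initials (assumption):
# users, roles, settings profiles, row policies, quotas.
TYPE_CODES = {
    "user": "U",
    "role": "R",
    "settings_profile": "S",
    "row_policy": "P",
    "quota": "Q",
}

def entity_sql_path(zookeeper_path: str, entity_uuid: str) -> str:
    """Node holding the SQL definition of one access entity."""
    return f"{zookeeper_path.rstrip('/')}/uuid/{entity_uuid}"

def entity_name_path(zookeeper_path: str, entity_type: str, name: str) -> str:
    """Node enforcing name uniqueness; it stores the entity UUID."""
    return f"{zookeeper_path.rstrip('/')}/{TYPE_CODES[entity_type]}/{name}"
```

For example, a user named alice would occupy one node under uuid/ for its SQL and one node under U/ for uniqueness.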

This is conceptually similar to the experimental Replicated database engine.

Unlike the ON CLUSTER option, users who can manage other users through SQL cannot opt out of the replication.



@robot-clickhouse robot-clickhouse added the doc-alert and pr-feature (Pull request with new product feature) labels Aug 8, 2021
@kmichel-aiven kmichel-aiven force-pushed the kmichel-replicated-access-storage branch 3 times, most recently from 42931c0 to 58e9635 Compare August 8, 2021 21:28
@vitlibar vitlibar self-assigned this Aug 10, 2021
@vitlibar (Member) commented Aug 13, 2021

Integration tests (thread) say there is a potential deadlock.

To fix that, it seems to be enough to call get_zookeeper() before you lock the mutex of ReplicatedAccessStorage.
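The suggested lock ordering can be sketched as follows. This is a toy Python model with hypothetical names, not the real C++ class:

```python
import threading

class ReplicatedAccessStorageSketch:
    """Toy model of the suggested lock ordering (illustrative only)."""

    def __init__(self, get_zookeeper):
        self.get_zookeeper = get_zookeeper
        self.mutex = threading.Lock()
        self.calls = []  # records the order of operations, for illustration

    def insert(self, entity):
        # Acquire the ZooKeeper session BEFORE taking the storage mutex.
        # get_zookeeper() may itself block on another lock, so calling it
        # while holding self.mutex risks the deadlock the tests reported.
        zookeeper = self.get_zookeeper()
        self.calls.append("get_zookeeper")
        with self.mutex:
            self.calls.append("locked_mutex")
            # ... update state using `zookeeper` and `entity` ...
```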

@vitlibar (Member)

This error message doesn't seem to be very helpful. Maybe it's better to add a new parameter throw_exception to tryInsert(), so that if it's true tryInsert() throws an exception instead of returning false. I mean: call tryInsert() with throw_exception=false multiple times, and the last time with throw_exception=true.

@kmichel-aiven (Contributor, author)

Good point. I couldn't make the message much more detailed than the original ZooKeeper error, and the error is supposed to be rare anyway, so I just let the original exception bubble up on the last try.
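The agreed behaviour can be sketched as a retry loop. This is an illustrative Python model of the discussed tryInsert()/throw_exception contract, with hypothetical names:

```python
class ZooKeeperError(Exception):
    """Stands in for the original ZooKeeper exception."""

def insert_with_retries(try_insert, max_retries=10):
    """Call try_insert() several times; on the last attempt ask it to
    throw, so the original ZooKeeper error reaches the user instead of
    a generic failure. Names and retry count are illustrative."""
    for attempt in range(max_retries):
        last_attempt = (attempt == max_retries - 1)
        # With throw_exception=True, try_insert raises on failure
        # instead of returning False.
        if try_insert(throw_exception=last_attempt):
            return
```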

@vitlibar (Member) commented Aug 13, 2021

Probably we should call refreshEntityNoLock() for existing_entity_uuid too?

@kmichel-aiven (Contributor, author) commented Aug 17, 2021

It's not necessary, because refreshEntityNoLock() on the new entity calls setEntityNoLock(), which detects the conflicting existing entity and removes it from local storage.
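The conflict handling described here can be sketched roughly as follows. This is a toy Python model of the local cache, not the real C++ structures:

```python
class LocalCacheSketch:
    """Toy model of the local entity cache (illustrative only)."""

    def __init__(self):
        self.entries_by_id = {}  # uuid -> entity (here just the name)
        self.ids_by_name = {}    # name -> uuid

    def set_entity(self, uuid, name):
        # If a different UUID already holds this name, the freshly
        # replicated entity wins and the stale conflicting entry is
        # dropped from local storage.
        existing = self.ids_by_name.get(name)
        if existing is not None and existing != uuid:
            self.entries_by_id.pop(existing, None)
        self.ids_by_name[name] = uuid
        self.entries_by_id[uuid] = name
```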

@vitlibar (Member) commented Aug 13, 2021

It doesn't seem the mutex of ReplicatedAccessStorage actually needs to be locked until this point.

@kmichel-aiven (Contributor, author)

That's true. I moved things around, so now there is a clear separation between updating ZooKeeper first, and then locking and updating the local state.
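The separation can be sketched as a two-phase operation. Again a toy Python model with illustrative names, where a plain dict stands in for ZooKeeper:

```python
import threading

class TwoPhaseRemoveSketch:
    """Toy model: mutate ZooKeeper with no lock held, then take the
    mutex only for the quick local update (illustrative only)."""

    def __init__(self, zookeeper):
        self.zookeeper = zookeeper  # a dict standing in for ZooKeeper
        self.local = {}
        self.mutex = threading.Lock()

    def remove(self, uuid):
        # Phase 1: no lock held while talking to ZooKeeper, so a slow
        # or blocking ZooKeeper call cannot stall other readers.
        self.zookeeper.pop(uuid, None)
        # Phase 2: the mutex only guards the fast local-state update.
        with self.mutex:
            self.local.pop(uuid, None)
```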

@vitlibar (Member) commented Aug 13, 2021

Usually it's better to do initialization in startup() than in a worker thread: if initialization fails, it may be because of a wrong configuration, so we display an error and stop.

@kmichel-aiven (Contributor, author)

I moved it to the main thread to add the early failure, but kept it here too, to handle the "reset and recover after a major ZooKeeper failure" case.
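The resulting split can be sketched like this, with hypothetical names: initialization runs once in startup() so a bad configuration fails fast, while the worker can also (re)initialize to recover after an outage.

```python
import threading

class StartupSketch:
    """Toy model: fail fast in startup(), but let the worker thread
    re-initialize after a ZooKeeper failure (illustrative only)."""

    def __init__(self, initialize):
        self.initialize = initialize
        self.initialized = False

    def startup(self):
        # Runs in the main thread: a wrong configuration raises here,
        # so the server can log the error and stop immediately.
        self.initialize()
        self.initialized = True
        threading.Thread(target=self.run_worker, daemon=True).start()

    def run_worker(self):
        # The worker also initializes when needed, covering the
        # "reset and recover after major ZooKeeper failure" case.
        if not self.initialized:
            self.initialize()
            self.initialized = True
        # ... process the replication queue ...
```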

@vitlibar (Member) commented Aug 13, 2021

Just an idea: maybe we should not watch, fetch and parse every entity from ZooKeeper. Instead we could make entries_by_id contain only the entities that are actually interesting to us. For example, readImpl() could search entries_by_id first; if the entity is not found, get it from ZooKeeper, start watching it, and finally update entries_by_id. What do you think?

@kmichel-aiven (Contributor, author) commented Aug 17, 2021

I think it would make the subscription/notification system a bit too unpredictable if we didn't have all entities available.
Operations like findAllImpl or existsImpl could become quite challenging too.
I also like that the current proposal is essentially as fast as MemoryStorage for all read-only operations (and these operations keep working when ZooKeeper is temporarily down).

@vitlibar (Member)

@kmichel-aiven Very nice implementation, thank you!

@kmichel-aiven kmichel-aiven force-pushed the kmichel-replicated-access-storage branch from 58e9635 to e55c0a4 Compare August 17, 2021 14:55
This stores Access Entities in ZooKeeper and replicates them across an entire cluster.

This can be enabled by using the following configuration:

    <user_directories>
        <replicated>
            <zookeeper_path>/clickhouse/access/</zookeeper_path>
        </replicated>
    </user_directories>
