Formal Operator configuration design v2 by galderz · Pull Request #77 · infinispan/infinispan-operator

galderz · 2019-07-04T11:41:42Z

Version 2 of the operator configuration design here. Noteworthy changes:

General

Instead of 1 full example, there are now 2 full examples. One for using it as a cache service, one for using it as a data grid service.
spec.size is now spec.replicas.
spec.profile has been added as a result of the feedback on v1. Essentially, this is a way for the user to select the base settings of Infinispan. By default this is Secured, that means that both connector and cluster authentication and encryption are enabled. Alternatively, users can select Development where both connector and cluster authentication and encryption are disabled for ease of local development. The 3rd option is Performance which is like Secured but without cluster encryption. Thoughts? @tristantarrant @Crumby.
spec.connector.secret.authentication has been moved to spec.connector.authentication.
spec.container.cpu has been added.
spec.logging categories have been moved to spec.logging.categories, with attribute names defining package names and values referring to logging levels.
spec.management.prometheus boolean has been transformed to a subtree and spec.management.prometheus.enabled flag has been added to enable or disable it. By default would be disabled as in v1. I have some doubts about this though, see question section.
spec.management.secret becomes spec.management.authentication.

Cache

spec.evictionPolicy has been moved to spec.cache.evictionPolicy.
spec.replicationFactor has been moved to spec.cache.replicationFactor.

Data Grid

spec.container.storage has been moved to spec.datagrid.storage since this option is only relevant for datagrid usages.
spec.datasources has been moved to spec.datagrid.datasources.
spec.sites has been moved to spec.datagrid.sites.
spec.sites.remotes[].secret has been renamed to spec.datagrid.sites.remotes[].authentication

Questions

spec.management.prometheus could be removed completely by making using the profile as a way to signal whether prometheus is enabled or not. As example, with Secured and Performance prometheus would be enabled, but with Development it'd be disabled. Thoughts?

tristantarrant · 2019-07-08T14:27:33Z

I like the profile concept, however I'd rename "Secured" to "Default", because we don't want to imply that "Performance" is not secure.

galderz · 2019-07-11T16:33:43Z

One more thing:

.spec.container.jvmOptionsAppend name sounds a bit clunky. I'm liking more the idea of naming it .spec.container.extraJvmOptions. Thoughts?

galderz · 2019-07-12T09:42:16Z

@tristantarrant Something else came to my mind while 😴, would it make sense for connector authentication to take a list of credentials instead of a single one? E.g. if you want to add multiple users for data access. Each could either be defined with credentials... Or would such situation be handled by a single Keystore authentication entry, where the keystore can store usernames and passwords for multiple users?

rigazilla · 2019-07-24T08:44:31Z

Do we have/need something to manage application roles?

galderz · 2019-07-26T15:58:18Z

The question popped in a chat with @tristantarrant about the need to keep management and connector authentication separate...

Tristan Are we keeping application and management user separation in new image?
No.
Tristan: It was pretty pointless, especially in the context of allowing users to create caches remotely on demand.

So, we'll further coalesce the authentication part, probably under spec.authentication.

Do we have/need something to manage application roles?

We probably do. @tristantarrant What should we do here? What is the bare minimum you'd need? @rigazilla We could see what's done in Kubernetes with security in this area of role definitions...

tristantarrant · 2019-07-26T16:07:44Z

There must be two ways

simple, which uses property files created with serverng's user-tool. This can support Basic/Plain, digest and scram Auth methods. This is identical to the old server
token, which delegates user/password/roles to keycloak. The infinispan operator must interact with the keycloak one to provision keycloak "client id/secret" pairs for the server and the clients. This means hot rod /rest clients can use bearer tokens. We need to talk to the keycloak team about this
In truth, both methods can be active at the same time, but let's not overcomplicate things for now

Crumby · 2019-08-01T12:16:33Z

@galderz V2 specification looks good to me. I wasn't sure at first about datagrid and cache elements in context of services but seeing it written down I'm convinced. Eg.: storage element would more fit container as container groups hw resources but that depends on the angle of view. Current way the API is sending clear message that there are two modes to Infinispan with completely different set of capabilities and it's quite clear what can be configured for one but not for the other which wouldn't be as clear with storage being under container.

What's not so clear just by looking at yaml API is that you can either use cache or datagrid element (or you can use both?). Yaml won't naturally stop you from doing that. That'll require either good documentation or something like:

spec:
  service:
    type: cache
    cacheProp1: xxx
    cacheProp2: yyy


spec:
  service:
    type: datagrid
    datagridProp1: xxx
    datagridProp2: yyy

.spec.container.jvmOptionsAppend name sounds a bit clunky. I'm liking more the idea of naming it .spec.container.extraJvmOptions. Thoughts?

+1, Maybe extraJvmOpts?

spec.management.prometheus could be removed completely by making using the profile as a way to signal whether prometheus is enabled or not. As example, with Secured and Performance prometheus would be enabled, but with Development it'd be disabled. Thoughts?

What if user would like to play with the stats in Development mode so he can set up and prepare his monitoring solution? Maybe the best would be to enable Prometheus in any case and drop that option completely. Prometheus agent isn't anything that would give you any advantage if you turn it off.

rigazilla

More words on how/why the setup is partitioned under: cache, datagrid, container would be good IMO.

galderz · 2019-08-07T12:49:38Z

@rigazilla

More words on how/why the setup is partitioned under: cache, datagrid, container would be good IMO.

Cache is the most simple way to use DG, as a cache. We simplified the config in OpenShift images already for 7.3. The existing options there come from the work we did there. Check that out.

Data Grid is everything else: x-site, plug remote stores, local PV data persistence...etc.

Container is per pod configuration: how much CPU to give each pod, how much memory, any extra JVM settings to pass.

galderz · 2019-08-07T12:52:50Z

@Crumby

storage element would more fit container as container groups hw resources but that depends on the angle of view.

Good point, I'll amend that for next revision.

galderz · 2019-08-07T12:56:11Z

@Crumby

Current way the API is sending clear message that there are two modes to Infinispan with completely different set of capabilities and it's quite clear what can be configured for one but not for the other which wouldn't be as clear with storage being under container.

Replied too early. Good point about this, I'll try to think it differently.

What's not so clear just by looking at yaml API is that you can either use cache or datagrid element (or you can use both?). Yaml won't naturally stop you from doing that. That'll require either good documentation or something like:

Good point. Your suggestion makes sense, I'll address that for next rev.

galderz · 2019-08-07T12:56:57Z

+1, Maybe extraJvmOpts?

+1

What if user would like to play with the stats in Development mode so he can set up and prepare his monitoring solution? Maybe the best would be to enable Prometheus in any case and drop that option completely. Prometheus agent isn't anything that would give you any advantage if you turn it off.

+1

Crumby · 2019-08-09T07:06:45Z

@galderz Have you thought about adding some option for exposing the rest/hotrod out? Or users will be required to create the route always manually? Asking because I noticed that opc's cluster config contains option defaultRoute: false and if you set it to true then it exposes the OpenShift's internal registry. I found out myself using this option quite often as it's way quicker and simpler then creating the route manually.

galderz · 2019-08-09T08:10:21Z

@tristantarrant Something else came to my mind while 😴, would it make sense for connector authentication to take a list of credentials instead of a single one? E.g. if you want to add multiple users for data access. Each could either be defined with credentials... Or would such situation be handled by a single Keystore authentication entry, where the keystore can store usernames and passwords for multiple users?

FYI, this will be sorted in v3, where you'll be able to define multiple user/pass credentials.

galderz · 2019-08-09T12:13:48Z

Closing PR, v3 can be found in #108.

galderz requested review from Crumby, rigazilla and tristantarrant July 4, 2019 11:43

galderz mentioned this pull request Jul 4, 2019

Formal operator configuration design #46

Closed

galderz force-pushed the t_spec_design_v2 branch 2 times, most recently from 2a86edc to 23bf786 Compare July 8, 2019 13:50

galderz force-pushed the t_spec_design_v2 branch from 23bf786 to 11ddab9 Compare July 10, 2019 17:05

galderz force-pushed the t_spec_design_v2 branch 2 times, most recently from 2d40bf6 to e489d87 Compare July 16, 2019 09:24

galderz force-pushed the t_spec_design_v2 branch from e489d87 to 5a36034 Compare July 24, 2019 14:53

rigazilla reviewed Aug 2, 2019

View reviewed changes

galderz added 2 commits August 6, 2019 12:09

Operator configuration v1 draft design

3ab6d0b

Operator configuration v2 draft design

a4647ed

galderz force-pushed the t_spec_design_v2 branch from 3d6d6f3 to a4647ed Compare August 6, 2019 10:09

galderz mentioned this pull request Aug 9, 2019

Formal Operator configuration design v3 #108

Merged

galderz closed this Aug 9, 2019

Conversation

galderz commented Jul 4, 2019

General

Cache

Data Grid

Questions

Uh oh!

tristantarrant commented Jul 8, 2019

Uh oh!

galderz commented Jul 11, 2019

Uh oh!

galderz commented Jul 12, 2019

Uh oh!

rigazilla commented Jul 24, 2019

Uh oh!

galderz commented Jul 26, 2019

Uh oh!

tristantarrant commented Jul 26, 2019

Uh oh!

Crumby commented Aug 1, 2019

Uh oh!

rigazilla left a comment

Choose a reason for hiding this comment

Uh oh!

galderz commented Aug 7, 2019

Uh oh!

galderz commented Aug 7, 2019

Uh oh!

galderz commented Aug 7, 2019

Uh oh!

galderz commented Aug 7, 2019

Uh oh!

Crumby commented Aug 9, 2019

Uh oh!

galderz commented Aug 9, 2019

Uh oh!

galderz commented Aug 9, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants