Add helm chart for unitycatalog #895
Conversation
helm/chart/values.yaml
Outdated
```yaml
image:
  repository: unitycatalog/unitycatalog-ui
  tag: latest
```
Image naming and versions will be modified when #698 is closed
Or, for example, would it be better to bundle the UI into the backend image? Looking for guidance on how I can help here, as I am not very familiar with frontend/backend distinctions on k8s.
My instinct is that we should overhaul the current …
Oh, I thought both images would be published in that PR. Usually it is best practice to keep images as decoupled as possible. IMO the best approach would be separate images for the backend and frontend.
In K8s, deploying the app and serving it are usually separate concerns. It is most common for Helm charts to deploy the bare minimum: the app itself and a Service for accessing it.
@sdwbgn @tnightengale - will you be able to make it to the next contributors meeting #902 so we can discuss the best approach for this?
ognis1205
left a comment
I thought it would be preferable to support TLS.
@dennyglee I start a new role that week, so it's unclear if I will have a conflict, but I will endeavor to attend!
Makes sense! Currently we can achieve that by terminating the TLS/SSL connection on an NLB or other ingress. Additionally, we might want to consider supporting TLS in UC itself.
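For illustration, terminating TLS at the ingress layer could look roughly like this. Note this is a sketch, not part of the chart; the host, issuer, Secret name, Service name, and port are all hypothetical, and it assumes cert-manager is installed in the cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: unitycatalog
  annotations:
    # Hypothetical cert-manager issuer; TLS terminates at the ingress controller.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts: [uc.example.com]
      secretName: uc-tls          # Secret created/managed by cert-manager
  rules:
    - host: uc.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: unitycatalog-server   # hypothetical Service name
                port:
                  number: 8080              # hypothetical port
```

Traffic between the ingress and the pod would then remain plain HTTP unless UC itself gains TLS support, which is the second option mentioned above.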
Adding @roeap to review!
Overall I think the Helm chart looks good; however, I do have some questions around how we manage storage - and please excuse me if I am completely off track here :)
IIRC, in a StatefulSet, each Pod is assigned dedicated storage (PV & PVC) isolated from all other pods. So when using it for the file-based storage, we may end up with each replica effectively having its own version of the on-disk database that would always be out of sync?
For the configuration - since we render it on every pod start - I think we can just use an ephemeral volume and avoid creating a PV?
If the above is in fact true, we may want to consider using a Deployment and creating a separate ReadWriteMany PV/PVC that all pods can mount, or force all pods onto the same node via affinities somehow? Or, since on-disk storage seems most suited for evaluation deployments, just limit it to one replica in that case? This would then allow a wider range of storage backends, since I believe only a limited subset supports RWMany.
I was a little surprised by the storage credentials, but I guess this is simply due to the fact that the Credentials API surface is not implemented yet, so no dynamic handling of credentials is possible?
Lastly, does it make sense to also include the creation of an ingress (and/or gateway) object that users can optionally enable?
```yaml
{{- if not (and (eq .Values.db.type "file") .Values.db.fileConfig.persistence.enabled) }}
- name: db-volume
  emptyDir: {}
{{- end }}
```
This may be a trivial question, as I am not familiar with Hibernate. In case we are not using a file-based DB, is the volume mount even necessary? i.e. should we have the condition on the volume / volume mount rather than mounting an ephemeral volume when we are not using the file-based DB?
Good point, will fix that!
I will keep db volume mounts for "file" type
Added postgresql DB support, fixed it
```yaml
{{- with .Values.server.statefulset.livenessProbe }}
livenessProbe:
  {{- toYaml . | nindent 12 }}
{{- end }}
```
My stance on liveness probes has evolved over time and may not yet be fully stable :) - that said, personally I do find the reasoning here convincing, and we may decide not to specify a liveness probe?
Agreed, will remove default livenessProbe
helm/chart/values.yaml
Outdated
```yaml
volumes: []

volumeMounts: []
```
should we prefix these with `extra` to indicate that we are always mounting some volumes?
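A sketch of what that rename might look like in values.yaml, following the `extraVolumes`/`extraVolumeMounts` convention common in other charts (the commented examples are illustrative, not part of this chart):

```yaml
# Additional volumes, on top of the ones the chart always mounts.
extraVolumes: []
# - name: custom-config
#   configMap:
#     name: custom-config

# Additional volume mounts for the server container.
extraVolumeMounts: []
# - name: custom-config
#   mountPath: /etc/extra
#   readOnly: true
```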
helm/chart/templates/_helpers.tpl
Outdated
```yaml
{{- define "unitycatalog.server.apiEndpoint" -}}
http://{{ include "unitycatalog.server.fullname" . }}:{{ .Values.server.service.port }}
{{- end }}
```
This template initially confused me a bit, as I was thinking of an external endpoint. Since we are using it only once, would it make sense to remove the template and just inline it where we use it?
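Inlined, the single use site could look roughly like this in the UI deployment template. The environment variable name here is hypothetical; the actual variable depends on what the UI image expects:

```yaml
env:
  - name: PROXY_URL   # hypothetical variable consumed by the UI
    value: http://{{ include "unitycatalog.server.fullname" . }}:{{ .Values.server.service.port }}
```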
```yaml
{{- if .Values.server.config.persistence.enabled }}
- metadata:
    name: config-volume
  spec:
    accessModes: {{ .Values.server.config.persistence.accessModes }}
    resources:
      requests:
        storage: {{ .Values.server.config.persistence.size }}
    storageClassName: {{ .Values.server.config.persistence.storageClassName }}
{{- end }}
```
I may be wrong, but I do believe init containers run for every pod that starts. Since we always render the config, do we need a dedicated volume for this, or could we just use an ephemeral volume with a lifetime tied to the pod?
There was a need for that due to the JWT private key created by UC. I moved it to a Secret resource and removed the need for config persistence (but I need to test it further).
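As a sketch, the pod-scoped ephemeral config volume plus a Secret holding the JWT key could look like this (the volume and Secret names are assumptions, not the chart's actual manifests):

```yaml
volumes:
  - name: config-volume
    emptyDir: {}      # lifetime tied to the pod; re-rendered by the init container on every start
  - name: jwt-key
    secret:
      secretName: unitycatalog-jwt-key   # hypothetical Secret created once, shared by all replicas
```

Keeping the key in a Secret rather than a PV means every replica sees the same key, which a per-pod volume in a StatefulSet would not guarantee.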
```yaml
- metadata:
    name: db-volume
  spec:
    accessModes: {{ .Values.db.fileConfig.persistence.accessModes }}
```
If we are supporting file-based DBs, can we even allow all access modes here, or do we require it to be ReadWriteMany?
I would consider a file-based DB to be a single-replica solution, hence the default value is `["ReadWriteOnce"]`. ReadWriteMany implementations in most clouds are not perfectly reliable, have quite a lot of limitations (performance, latency, race conditions), and overall cover very specific use cases.
Setting multiple accessModes is an incredibly rare case; I have never seen a production case where that was a requirement. If it is really needed, it can currently be overridden in values.yaml, but as a default value ReadWriteOnce should be enough.
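For the rare case where it is needed, the override would go in values.yaml under the key path the template already reads (`.Values.db.fileConfig.persistence.accessModes`):

```yaml
db:
  fileConfig:
    persistence:
      accessModes: ["ReadWriteMany"]   # requires a StorageClass that actually supports RWX
```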
I think this cannot go to production yet :< I am trying it, but I get a lot of missing-config errors.
helm/chart/values.yaml
Outdated
```yaml
db:
  type: file
  # Supported values: file
  # TODO: add support for postgres and mysql
```
Thanks for this pull request ❤️ Cannot wait!
I assume that in the future Postgres will help avoid using a Persistent Volume, and then multiple replicas can be supported (?)
I added PostgreSQL support and tested a simple case of multi-replica UC (without auth); it seems to work.
Any updates on this?
@roeap Can you review when you get the chance? Thanks!
Sorry for the long wait; I am currently working through the current comments and will push a new revision today or tomorrow.
Published the next iteration.
Currently, I have tested the following:
What's left to test:
Overall, there is still a need for a proper UI Docker image.
Example of values.yaml for a prod-like deployment with 2 replicas, PostgreSQL, and routing using HTTPRoute
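A minimal sketch of what such a values.yaml might contain. The exact keys are assumptions based on the values discussed in this thread (`db.type`, the server replica count, and an HTTPRoute toggle), not the chart's final schema:

```yaml
server:
  statefulset:
    replicas: 2               # multi-replica requires a shared DB, not the file backend
db:
  type: postgresql            # instead of the default file-based DB
  # connection settings for the external PostgreSQL instance would go here
httpRoute:
  enabled: true               # hypothetical toggle for rendering the HTTPRoute object
```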
@sdwbgn Thanks for pushing on this and also for adding Postgres support. Any interest in adding the typical array config for …
@sdwbgn A couple of items on the Gateway implementation. I didn't dig into that template specifically since I wouldn't use it, and I don't think the Gateway object belongs in the chart. My simple proposal would be to change the default …
Typically I add my routes and backendpolicy k8s objects via Kustomize on top of Helm. So if I were to use this, I would disable the Gateway/HTTPRoute in the chart, but then it will try to use the frontend proxy in the Deployment. My second choice would be to enable the HTTPRoute using the …
And a minor correction: the chart docs say that …
I am happy to collaborate more on this if you would like, but I'm also not trying to throw a bunch of changes into this PR, as I know the goal is to get a first iteration merged.
@unfrgivn Thanks for your comments!
Nice idea, added.
I gave it a thought, and I agree with you; removed it and left only …
Fixed the doc type, thanks!
roeap
left a comment
@sdwbgn - looking great!
Since this is quite a complex thing with many nuances I expect we may need to iterate once this gets into the wild, but this is expected :).
One more update from main and a need to fix the DCO, but I think it's good to go.
Signed-off-by: Vasilii Bulatov <[email protected]>
Gerrit-K
left a comment
Hey @sdwbgn, sorry for randomly chiming in here with a review. I was evaluating Unity with my team, and we were happy to see that there might be an official Helm chart just around the corner. Unfortunately, while trying it out, we stumbled over a few things that we couldn't work around - most of which were related to the security enforcements of our target cluster. I figured I might comment on some of them here already, but please don't see this as a blocker for the PR.
Thanks for your work!
```yaml
- /bin/sh
- -c
- |-
  apk add --no-cache envsubst openssl curl
```
Installing packages at runtime requires running the container with root privileges, which is not recommended. Instead, it would be advisable to either use off-the-shelf images or provide a custom pre-built image that contains the required tools. (The same applies to the create-users-job.)
Made a separate PR with that - #1019. Will update the chart as soon as we merge it.
```yaml
#
# [Kubernetes docs](https://kubernetes.io/docs/concepts/policy/security-context/)
#
securityContext:
```
As a general suggestion, I would propose to add a few settings by default and let users decide whether they need to extend/override the securityContext. A sensible default could be:
```yaml
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 100
  runAsGroup: 101
  fsGroup: 101
  seccompProfile: { type: RuntimeDefault }
securityContext:
  allowPrivilegeEscalation: false
  capabilities: { drop: [ALL] }
```
At least the runAs... settings are useful so that users don't need to inspect the image to get the user and group IDs to run the container with.
Great idea! Currently, due to the Alpine images and package installation, it's not possible to drop root permissions. I have prepared a PR adding those packages to the main UC image - #1019. After it's merged, I will drop the bare Alpine image and the usage of apk, and we can use these defaults.
Thanks for the quick reply! Sounds reasonable 👍
Looking forward to the further improvements
Signed-off-by: Vasilii Bulatov <[email protected]>
@Gerrit-K Thanks for your suggestions! They are exactly what was missing to make the deployment more secure. I've addressed the certificate issue and the securityContext for the UI initContainer. For the init image and the default securityContext I want to make a separate change, after PR #1019 is merged.
PR Checklist
docs is updated
Description of changes
Bringing the Helm chart from https://github.com/sdwbgn/unitycatalog-helm
Features:
Relates: #433