PV reconciliation error causes PV to be outOfSync (requires pruning) on argocd #958

@naydoo

Hi operator team,

I ran into an issue with the ClickHouse operator (CHOP): when a PV reconciliation error occurs, the CHOP deletes the PV.

How to reproduce:

  • ClickHouse operator 0.18.5
  • ClickHouse cluster image: clickhouse/clickhouse-server:22.3
  • Kubernetes version 1.23.3

Below is the error I see in the logs.

I0614 09:30:02.235030 1 worker.go:364] clear():clickhouse/clickhouse/4d093249-f976-4e1d-aa37-a0c43039a34d:Non-reconciled objects: PV: /pvc-35ff1a7d-3b12-4aa2-ae9f-1829a46d9324 PV: /pvc-6445af67-33af-4126-ac56-621537eeab2f
I0614 09:30:02.235049 1 worker.go:374] clear():clickhouse/clickhouse/4d093249-f976-4e1d-aa37-a0c43039a34d:remove items scheduled for deletion
I0614 09:30:02.447857 1 worker.go:378] worker.go:378:dropReplicas():start:clickhouse/clickhouse/4d093249-f976-4e1d-aa37-a0c43039a34d:drop replicas based on AP
I0614 09:30:02.447880 1 worker.go:399] worker.go:399:dropReplicas():end:clickhouse/clickhouse/4d093249-f976-4e1d-aa37-a0c43039a34d:processed replicas: 0
I0614 09:30:02.447889 1 worker.go:352] includeStopped():clickhouse/clickhouse/4d093249-f976-4e1d-aa37-a0c43039a34d:add CHI to monitoring
I0614 09:30:02.673250 1 worker.go:424] markReconcileComplete():clickhouse/clickhouse/4d093249-f976-4e1d-aa37-a0c43039a34d:reconcile completed
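
For anyone triaging similar logs, here is a minimal sketch of how the PV names could be pulled out of the "Non-reconciled objects" entry so they can be cross-checked against what Argo CD reports. The helper name `non_reconciled_pvs` and the regular expression are my own assumptions for illustration; they are not part of the operator.

```python
import re
from typing import List

# Hypothetical helper (not part of the operator): extracts PV names from a
# "Non-reconciled objects" log entry in the format shown above.
PV_PATTERN = re.compile(r"PV: /(pvc-[0-9a-f-]+)")


def non_reconciled_pvs(log_text: str) -> List[str]:
    """Return the PV names listed after 'Non-reconciled objects:'."""
    _, marker, tail = log_text.partition("Non-reconciled objects:")
    if not marker:
        # The marker is absent, so there is nothing to report.
        return []
    return PV_PATTERN.findall(tail)


sample = (
    "I0614 09:30:02.235030 1 worker.go:364] "
    "clear():clickhouse/clickhouse/4d093249-f976-4e1d-aa37-a0c43039a34d:"
    "Non-reconciled objects: "
    "PV: /pvc-35ff1a7d-3b12-4aa2-ae9f-1829a46d9324 "
    "PV: /pvc-6445af67-33af-4126-ac56-621537eeab2f"
)

for name in non_reconciled_pvs(sample):
    print(name)
```

The names it prints are the same two PVs that appear in the log entry, which makes it easier to confirm they are the volumes that start terminating.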
(Screenshot attached: 2022-06-14 14-06-16)

I have already added reclaimPolicy: Retain to the volume claim template to guard against complete deletion.
volumeClaimTemplates:
  - name: data-storage-vc-template
    reclaimPolicy: Retain
    spec:
      storageClassName: "standard"
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi

I have also tried adding the reconciling config below, but it did not solve the problem.

apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "{{ include "clickhouse.fullname" . }}"
spec:
  reconciling:
    policy: "nowait"
    configMapPropagationTimeout: 90
    cleanup:
      unknownObjects:
        statefulSet: Delete
        pvc: Retain
        configMap: Delete
        service: Delete
      reconcileFailedObjects:
        statefulSet: Retain
        pvc: Retain
        configMap: Retain
        service: Retain
  configuration:

I am not sure whether PV needs to be added to the reconciliation parameters.
How can this be resolved so that PVs do not start terminating after the ClickHouse operator and cluster are installed?
