stuck on propagation check failed DNS record not yet propagated #5515

@rgl

Describe the bug:

The cert-manager container is stuck in a loop, repeatedly logging the following error message and never making progress:

E1017 16:19:41.560836       1 sync.go:190] cert-manager/challenges "msg"="propagation check failed" "error"="DNS record for \"auth.example.com\" not yet propagated" "dnsName"="auth.example.com" "resource_kind"="Challenge" "resource_name"="auth-jt5dj-1674015084-2821633056" "resource_namespace"="default" "resource_version"="v1" "type"="DNS-01"

Expected behaviour:

I expected it to eventually succeed, but cert-manager stays stuck in the loop for far longer than the DNS propagation time.

Steps to reproduce the bug:

  1. Configure cert-manager as in https://github.com/rgl/terraform-azure-aks-example/blob/master/cert-manager.tf.
    • This essentially configures the Staging Let's Encrypt DNS-01 challenge against a new Azure DNS zone.
    • Please note that, for practical purposes, the domain does NOT exist until I, as the Human Operator, manually create the DNS delegation. So it's expected that cert-manager will loop for a long time until the Human Operator actually creates the DNS delegation and it properly propagates (10m).
  2. Let cert-manager loop with the mentioned expected error of DNS record not yet propagated.
  3. Manually delegate the domain to the Azure DNS zone (also created by the terraform code).
  4. Observe that even after the DNS propagation time (10m), cert-manager does NOT recover, and does not create the certificate.
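
To confirm independently that the challenge record is visible after step 3, the propagation wait can be sketched outside cert-manager. This is a minimal Python sketch, not cert-manager's own check (which is implemented in Go). The helper names `challenge_record_name`, `dig_txt`, and `wait_for_txt` are hypothetical, and the lookup assumes the `dig` CLI is installed. The `_acme-challenge.` prefix is the record name that ACME DNS-01 validation uses.

```python
import subprocess
import time
from typing import Callable, List


def challenge_record_name(dns_name: str) -> str:
    """Return the TXT record name queried for a DNS-01 challenge."""
    # A wildcard name such as *.example.com is validated at the base domain.
    if dns_name.startswith("*."):
        dns_name = dns_name[2:]
    return "_acme-challenge." + dns_name.rstrip(".")


def dig_txt(name: str, nameserver: str = "") -> List[str]:
    """Look up TXT values with `dig` (assumed to be available on PATH)."""
    cmd = ["dig", "+short", "TXT", name]
    if nameserver:
        # Query a specific server, e.g. one of the zone's authoritative servers.
        cmd.insert(1, "@" + nameserver)
    out = subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
    return [line.strip().strip('"') for line in out.splitlines() if line.strip()]


def wait_for_txt(
    name: str,
    lookup: Callable[[str], List[str]] = dig_txt,
    attempts: int = 30,
    delay: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the lookup returns at least one value or attempts run out."""
    for attempt in range(attempts):
        if lookup(name):
            return True
        if attempt + 1 < attempts:
            sleep(delay)
    return False
```

For the name in this report, `wait_for_txt(challenge_record_name("auth.example.com"))` would poll `_acme-challenge.auth.example.com`. If this returns `True` while cert-manager keeps logging "not yet propagated", the record is visible from the machine running the script, which suggests the stale result is on the cert-manager side.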

To unstick cert-manager, I deleted the cert-manager pod (and ONLY this pod; the other two pods, cert-manager-cainjector and cert-manager-webhook, were not touched).

After k8s recreates the cert-manager pod, it works as expected.
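
The workaround above can be scripted. This is a minimal sketch, assuming `kubectl` is configured for the cluster. The `cert-manager` namespace and the `app.kubernetes.io/component=controller` label selector are assumptions about the helm install and may differ in a given setup. The function names are hypothetical.

```python
import subprocess
from typing import Callable, List


def delete_controller_pod_cmd(
    namespace: str = "cert-manager",  # assumption: namespace used by the helm release
    selector: str = "app.kubernetes.io/component=controller",  # assumption: controller pod label
) -> List[str]:
    """Build the kubectl command that deletes only the controller pod."""
    return ["kubectl", "delete", "pod", "--namespace", namespace, "--selector", selector]


def restart_controller(run: Callable = subprocess.run, **kwargs) -> int:
    """Delete the controller pod; its Deployment then recreates it."""
    cmd = delete_controller_pod_cmd(**kwargs)
    return run(cmd, check=False).returncode
```

Because the selector matches only the controller component, the cert-manager-cainjector and cert-manager-webhook pods are left alone, as in the manual workaround.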

Anything else we need to know?:

Environment details:

  • Kubernetes version: 1.24.6
  • Cloud-provider/provisioner: Azure AKS/terraform/helm/Staging Let's Encrypt DNS-01
  • cert-manager version: 1.10.0
  • Install method: helm

/kind bug

Labels: kind/bug, lifecycle/rotten