Description
In zarf we run a localhost registry on a nodeport without TLS. We have seen, and had reported to us, failures on AKS, EKS, RKE2, and k3s because config_path in the containerd config.toml is set to a default path (which may or may not actually exist).
Relevant context: zarf-dev/zarf#592
Similar to: #7392
Steps to reproduce the issue
- Run a k8s cluster w/ containerd
- Use this config:
  [plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"
- Run an insecure, plain HTTP registry as a pod in the cluster exposed via a nodeport and push an image to it
- Run a pod that tries to pull/use an image from the in-cluster registry
Describe the results you received and expected
Expected result: containerd sends a plain HTTP request to pull the image and it succeeds.
Received result: K8s logs and the containerd.log file show that containerd sends an HTTPS request rather than a plain HTTP request when trying to pull an image from the insecure localhost registry.
Here's a containerd.log file as a result of running zarf init on a linux/arm64 Ubuntu VM to create a k3s cluster and bootstrap a registry in the cluster:
containerd.log
Primary logs of interest:
"host will try HTTPS first since it is configured for HTTP with a TLS configuration, consider changing host to HTTPS or removing unused TLS configuration" host="127.0.0.1:30152"
This message comes from containerd/core/remotes/docker/config/hosts.go, line 254 in b0d00f8:
log.G(ctx).WithField("host", host.host).Info("host will try HTTPS first since it is configured for HTTP with a TLS configuration, consider changing host to HTTPS or removing unused TLS configuration")
"trying next host" error="failed to do request: Head \"https://127.0.0.1:30152/v2/library/registry/manifests/2.8.3\": net/http: TLS handshake timeout" host="127.0.0.1:30152"
There appears to be an HTTP fallback mechanism that is not working as expected (containerd/core/remotes/docker/config/hosts.go, line 256 in b0d00f8):
rhosts[i].Client.Transport = docker.HTTPFallback{RoundTripper: rhosts[i].Client.Transport}
What version of containerd are you using?
v1.7.11
Any other relevant information
Versions affected appear to be >=1.7.7
Show configuration if it is related to CRI plugin.
# File generated by k3s. DO NOT EDIT. Use config.toml.tmpl instead.
version = 2
[plugins."io.containerd.internal.v1.opt"]
path = "/var/lib/rancher/k3s/agent/containerd"
[plugins."io.containerd.grpc.v1.cri"]
stream_server_address = "127.0.0.1"
stream_server_port = "10010"
enable_selinux = false
enable_unprivileged_ports = true
enable_unprivileged_icmp = true
sandbox_image = "rancher/mirrored-pause:3.6"
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
disable_snapshot_annotations = true
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/var/lib/rancher/k3s/data/b239951455c1937c9a602d6f648eb66e85742877bacc677f28cb07c2962f9d3a/bin"
conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/var/lib/rancher/k3s/agent/etc/containerd/certs.d"
Note that the config_path "/var/lib/rancher/k3s/agent/etc/containerd/certs.d" doesn't actually exist.
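For context, when config_path points at a directory that does exist, containerd reads per-registry settings from a hosts.toml file in a subdirectory named after the registry host. A minimal sketch of such a file, declaring the registry from the logs above as plain HTTP, might look like the following. The file location (<config_path>/127.0.0.1:30152/hosts.toml) and the capability list are assumptions for illustration. This has not been verified as a workaround on the affected versions, and since k3s generates its containerd config, it may manage this directory itself.

```toml
# Hypothetical file: <config_path>/127.0.0.1:30152/hosts.toml
# Host and port are taken from the log lines in this report; adjust to the actual nodeport.

# Default endpoint for this registry namespace, with an explicit http:// scheme
server = "http://127.0.0.1:30152"

[host."http://127.0.0.1:30152"]
  # Operations containerd is allowed to perform against this host
  capabilities = ["pull", "resolve"]
```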