_detector.GoogleCloudResourceDetector misbehaving on Buildkite #363

@xrmx

Description

_detector.GoogleCloudResourceDetector is misbehaving when running under Buildkite (a CI tool) on GKE.

When I load the resource detector, the process gets restarted and I end up in an infinite loop, with one of these messages from the OTel SDK per run:

Detector <GoogleCloudResourceDetector object at 0x7f8ae0522250> took longer than 5 seconds, skipping

I've debugged this a bit and there is something in _gke_resource that causes this.

As an experiment, I've open-coded the logic into another resource detector and, to my surprise, the following code works fine:

from opentelemetry.sdk.resources import Resource, ResourceDetector


class GoogleDebugCloudResourceDetector(ResourceDetector):
    def detect(self) -> Resource:
        from opentelemetry.resourcedetector.gcp_resource_detector import _metadata, _gke  # , _detector
        from opentelemetry.resourcedetector.gcp_resource_detector._constants import (
            ResourceAttributes,
        )

        try:
            print(_metadata.get_metadata())
        except Exception:
            return Resource.get_empty()

        if _gke.on_gke():
            cluster_location = _metadata.get_metadata()["instance"]["attributes"]["cluster-location"]
            hyphen_count = cluster_location.count("-")
            if hyphen_count == 1:
                zone_or_region_key = ResourceAttributes.CLOUD_REGION
            elif hyphen_count == 2:
                zone_or_region_key = ResourceAttributes.CLOUD_AVAILABILITY_ZONE
            else:
                print("oops no zone_or_region_key")
                zone_or_region_key = "oops"

            cluster_name = _metadata.get_metadata()["instance"]["attributes"]["cluster-name"]
            host_id = str(_metadata.get_metadata()["instance"]["id"])
            return Resource(
                {
                    ResourceAttributes.CLOUD_PLATFORM_KEY: ResourceAttributes.GCP_KUBERNETES_ENGINE,
                    zone_or_region_key: cluster_location,
                    ResourceAttributes.K8S_CLUSTER_NAME: cluster_name,
                    ResourceAttributes.HOST_ID: host_id,
                }
            )

        # Not on GKE: return an empty resource instead of implicitly returning None.
        return Resource.get_empty()

Unrelated question:

  • WDYT about making _metadata.is_available a wrapper around _metadata.get_metadata() that returns False if an exception is raised? That way we'd make one fewer HTTP call.
