Prometheus 2.41 and above cannot discover EC2 SD via a proxy when using instance/task credentials #11991

@nhinds

What did you do?

In Prometheus 2.40.7 and below, launching Prometheus with the environment variables HTTP_PROXY=http://myproxy:3128, HTTPS_PROXY=http://myproxy:3128, and NO_PROXY=169.254.0.0/16,172.16.0.0/12,192.168.0.0/16,*.myprivatedomain and configuration like:

scrape_configs:
  - job_name: example
    ec2_sd_configs:
      - region: us-west-2

resulted in Prometheus fetching its own IAM credentials from the EC2 metadata service http://169.254.169.254 (or from the ECS task metadata service http://169.254.170.2), and then using the HTTP proxy http://myproxy:3128 to connect to the EC2 API.
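To illustrate why the metadata endpoints were reached directly under this environment, here is a minimal sketch of NO_PROXY matching. It is a simplified approximation written for this issue, not the logic Go or Prometheus actually runs; the function name `bypasses_proxy` is hypothetical, and only CIDR entries, exact host names, and `*.` suffix patterns are handled.

```python
import ipaddress

# The NO_PROXY value from the report above.
NO_PROXY = "169.254.0.0/16,172.16.0.0/12,192.168.0.0/16,*.myprivatedomain"


def bypasses_proxy(host: str, no_proxy: str = NO_PROXY) -> bool:
    """Return True if host matches any NO_PROXY entry (simplified rules)."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # host is a DNS name, not an IP literal

    for entry in (e.strip() for e in no_proxy.split(",")):
        if not entry:
            continue
        if ip is not None:
            # IP literals are compared against IP/CIDR entries only.
            try:
                if ip in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # entry is a domain pattern, which cannot match an IP
        elif entry.startswith("*."):
            # "*.example" matches any host ending in ".example".
            if host.lower().endswith(entry[1:].lower()):
                return True
        elif host.lower() == entry.lower():
            return True
    return False


if __name__ == "__main__":
    for host in ("169.254.169.254", "169.254.170.2", "ec2.us-west-2.amazonaws.com"):
        print(host, "direct" if bypasses_proxy(host) else "via proxy")
```

Both metadata addresses fall inside 169.254.0.0/16, so they are contacted directly, while the regional EC2 API host matches no entry and goes through the proxy.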

After upgrading to Prometheus 2.41.0 or 2.42.0, this no longer works. There is a new proxy_url field in ec2_sd_configs, so I attempted the following configuration:

scrape_configs:
  - job_name: example
    ec2_sd_configs:
      - region: us-west-2
        proxy_url: http://myproxy:3128

What did you expect to see?

Setting proxy_url in an EC2 SD config should make Prometheus use that proxy when connecting to the EC2 API (and probably when connecting to the STS API to assume roles), but it should not use the proxy when connecting to link-local services such as the EC2 metadata API or the ECS task API.

This should happen either automatically (e.g. Prometheus could hardcode knowledge of the EC2/ECS metadata endpoints and bypass the proxy for them) or via configuration similar to the NO_PROXY environment variable.
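The "automatic" option could look roughly like the sketch below: a proxy-selection function that always goes direct for link-local IP literals and uses the configured proxy for everything else. This only illustrates the requested behaviour; `proxy_for` is a hypothetical name, and nothing here reflects how Prometheus or the AWS SDK is actually implemented.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def proxy_for(url: str, proxy_url: str) -> Optional[str]:
    """Return the proxy to use for url, or None to connect directly."""
    host = urlsplit(url).hostname or ""
    try:
        # 169.254.0.0/16 is link-local; it covers both the EC2 metadata
        # address (169.254.169.254) and the ECS task address (169.254.170.2).
        if ipaddress.ip_address(host).is_link_local:
            return None
    except ValueError:
        pass  # host is a DNS name, so it is never treated as link-local here
    return proxy_url


if __name__ == "__main__":
    proxy = "http://myproxy:3128"
    for target in (
        "http://169.254.169.254/",
        "http://169.254.170.2/",
        "https://ec2.us-west-2.amazonaws.com/",
    ):
        print(target, "->", proxy_for(target, proxy) or "direct")
```

With this rule, credential lookups against the metadata services skip the proxy, while EC2 and STS API calls still go through it.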

What did you see instead? Under which circumstances?

When proxy_url is set in an EC2 SD config, Prometheus attempts to reach the link-local API http://169.254.169.254 through the HTTP proxy. The proxy correctly refuses to let Prometheus read the proxy's own IAM credentials, so Prometheus fails to discover EC2 instances and logs an error about having no credentials.

System information

Linux 5.10.165-143.735.amzn2.x86_64 x86_64

Prometheus version

prometheus, version 2.42.0 (branch: HEAD, revision: 225c61122d88b01d1f0eaaee0e05b6f3e0567ac0)
  build user:       root@c67d48967507
  build date:       20230201-07:53:32
  go version:       go1.19.5
  platform:         linux/amd64

prometheus, version 2.41.0 (branch: HEAD, revision: c0d8a56c69014279464c0e15d8bfb0e153af0dab)
  build user:       root@d20a03e77067
  build date:       20221220-10:40:45
  go version:       go1.19.4
  platform:         linux/amd64

Prometheus configuration file

scrape_configs:
  - job_name: example
    ec2_sd_configs:
      - region: us-west-2
        proxy_url: http://myproxy:3128

Alertmanager version

No response

Alertmanager configuration file

No response

Logs

ts=2023-02-19T23:18:54.812Z caller=refresh.go:80 level=error component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
ts=2023-02-19T23:19:06.257Z caller=refresh.go:80 level=error component="discovery manager scrape" discovery=ec2 config=example msg="Unable to refresh target groups" err="could not describe instances: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
ts=2023-02-19T23:20:06.260Z caller=refresh.go:99 level=error component="discovery manager scrape" discovery=ec2 config=example msg="Unable to refresh target groups" err="could not describe instances: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
