Export AWS quotas on Prometheus
A subset of the AWS service quotas are labelled adjustable. Adjustments can be made at the account or region level. If some quotas are adjusted for only some regions, the quotas are no longer homogeneous across regions. This complicates monitoring and alerting logic in Prometheus that is based on the service quotas.
The aim of the aws_quota_exporter is to export these quotas to Prometheus to solve the above problem. At the time of writing, this feature is not available in the Prometheus yace exporter.
Version 1.0.0+ will introduce clustering functionality that groups similar metrics. This was requested here. The words common to a metric group are extracted as the metric name, and the words unique to each metric form a label. Two new labels are added:
- kind: The label for the unique word.
- name: The AWS metric name.
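An example transformation of the metric grouping is shown below: first the ungrouped metrics (one metric name per quota), then the single grouped metric with the kind and name labels: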
# HELP aws_quota_ec2_all_dl_spot_instance_requests All DL Spot Instance Requests
aws_quota_ec2_all_dl_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_f_spot_instance_requests All F Spot Instance Requests
aws_quota_ec2_all_f_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_g_and_vt_spot_instance_requests All G and VT Spot Instance Requests
aws_quota_ec2_all_g_and_vt_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_inf_spot_instance_requests All Inf Spot Instance Requests
aws_quota_ec2_all_inf_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_p4__p3_and_p2_spot_instance_requests All P4, P3 and P2 Spot Instance Requests
aws_quota_ec2_all_p4__p3_and_p2_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_p5_spot_instance_requests All P5 Spot Instance Requests
aws_quota_ec2_all_p5_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_standard__a__c__d__h__i__m__r__t__z__spot_instance_requests All Standard (A, C, D, H, I, M, R, T, Z) Spot Instance Requests
aws_quota_ec2_all_standard__a__c__d__h__i__m__r__t__z__spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 5
# HELP aws_quota_ec2_all_trn_spot_instance_requests All Trn Spot Instance Requests
aws_quota_ec2_all_trn_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_x_spot_instance_requests All X Spot Instance Requests
aws_quota_ec2_all_x_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",region="us-west-1",unit="None"} 0
# HELP aws_quota_ec2_all_spot_instance_requests Amazon Elastic Compute Cloud (Amazon EC2): All Spot Instance Requests
# TYPE aws_quota_ec2_all_spot_instance_requests gauge
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="DL",name="All DL Spot Instance Requests",region="us-west-1",unit="None"} 0
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="F",name="All F Spot Instance Requests",region="us-west-1",unit="None"} 0
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="G and VT",name="All G and VT Spot Instance Requests",region="us-west-1",unit="None"} 0
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="Inf",name="All Inf Spot Instance Requests",region="us-west-1",unit="None"} 0
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="P4, P3 P2",name="All P4, P3 and P2 Spot Instance Requests",region="us-west-1",unit="None"} 0
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="P5",name="All P5 Spot Instance Requests",region="us-west-1",unit="None"} 0
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="Standard (A, C, D, H, I, M, R, T, Z)",name="All Standard (A, C, D, H, I, M, R, T, Z) Spot Instance Requests",region="us-west-1",unit="None"} 5
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="Trn",name="All Trn Spot Instance Requests",region="us-west-1",unit="None"} 0
aws_quota_ec2_all_spot_instance_requests{account="126485599999",adjustable="true",global_quota="false",kind="X",name="All X Spot Instance Requests",region="us-west-1",unit="None"} 0
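The grouping described above can be sketched in a few lines of Python. This is only an illustration of the idea (common words become the metric name, the remaining words become the kind label), not the exporter's actual implementation. The function name `group_quota_names` and the sanitisation rule are assumptions, and the real exporter's output may differ in details.

```python
import re


def group_quota_names(service_code, names):
    """Derive a shared metric name and a per-quota `kind` label (hypothetical helper)."""
    tokenized = [name.split() for name in names]
    # Words that appear in every quota name of the group.
    common = set(tokenized[0]).intersection(*(set(t) for t in tokenized[1:]))
    # Keep the common words in their original order to build the metric name.
    shared = [word for word in tokenized[0] if word in common]
    slug = re.sub(r"[^a-z0-9]", "_", " ".join(shared).lower())
    metric = f"aws_quota_{service_code}_{slug}"
    # The words left over in each name form that series' `kind` label.
    kinds = {
        name: " ".join(word for word in words if word not in common)
        for name, words in zip(names, tokenized)
    }
    return metric, kinds


NAMES = [
    "All DL Spot Instance Requests",
    "All F Spot Instance Requests",
    "All G and VT Spot Instance Requests",
]

metric, kinds = group_quota_names("ec2", NAMES)
print(metric)
for name, kind in kinds.items():
    print(f'kind="{kind}", name="{name}"')
```

For these three quota names the sketch yields the metric name `aws_quota_ec2_all_spot_instance_requests` with the kinds `DL`, `F` and `G and VT`, which matches the grouped output shown above.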
- Run the following command
go run . --prom.port=10100 --config.file=config.yml
- Example of config.yml
jobs:
  - serviceCode: lambda
    accountName: dev-account # optional
    regions:
      - us-west-1
      - us-east-1
    role: arn:aws:iam::ACCOUNT-ID:role/rolename # optional
  - serviceCode: cloudformation
    accountName: prod-account # optional
    regions:
      - us-west-1
      - us-east-1
- Use the optional role key if you want the exporter to assume that role when retrieving the metrics for that specific job
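To make the shape of config.yml concrete, the sketch below mirrors the example configuration as a Python dict and expands each job into one lookup per region. `expand_jobs` is a hypothetical helper, not part of the exporter; it only illustrates that `accountName` and `role` are optional per job while `serviceCode` and `regions` drive what is collected.

```python
# Mirrors the example config.yml above as plain Python data.
CONFIG = {
    "jobs": [
        {
            "serviceCode": "lambda",
            "accountName": "dev-account",  # optional
            "regions": ["us-west-1", "us-east-1"],
            "role": "arn:aws:iam::ACCOUNT-ID:role/rolename",  # optional
        },
        {
            "serviceCode": "cloudformation",
            "accountName": "prod-account",  # optional
            "regions": ["us-west-1", "us-east-1"],
        },
    ]
}


def expand_jobs(config):
    """Return one target per (job, region) pair; optional keys default to None."""
    targets = []
    for job in config["jobs"]:
        for region in job["regions"]:
            targets.append(
                {
                    "serviceCode": job["serviceCode"],
                    "region": region,
                    "accountName": job.get("accountName"),
                    "role": job.get("role"),  # None means: use the default credentials
                }
            )
    return targets


for target in expand_jobs(CONFIG):
    print(target)
```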
- View program help:
$ ./aws_quota_exporter -h
Usage of ./aws_quota_exporter:
  -cache.duration duration
        Cache expiry time. (default 5m0s)
  -cache.serve-stale
        Serve stale cache data during cache refresh. This avoids delays in serving metrics. (default: false)
  -collect.usage
        Collect quotas usage where available (NOTE: CloudWatch calls aren't free, default: false)
  -config.file string
        Path to configuration file. (default "/etc/aqe/config.yml")
  -log.folder string
        Folder to store logfiles. logs to stdout if not specified. (default "stdout")
  -log.format string
        Format of log messages (text or json). (default "text")
  -log.level string
        Log level to log from (DEBUG|INFO|WARN|ERROR). (default "INFO")
  -prom.port int
        Port to expose prometheus metrics. (default 10100)
  -version
        Display aqe version
- Display version
$ ./aws_quota_exporter -version
{
App: "AWS Quota Exporter (AQE)",
Version: "dev",
Date: "Sun Sep 3 17:54:45 UTC 2023",
Platform: "darwin/arm64",
Commit: "none",
GoVersion: "go1.21.13"
}
The serviceCode is the AWS service identifier. To identify the serviceCode for a particular service, use the following AWS CLI command:
aws service-quotas list-services
You can enable quota usage collection with the -collect.usage flag (ℹ️ Not all quotas have usage; see docs). The exporter collects the latest usage value from CloudWatch using the GetMetricStatistics API method. The type label (type="usage" or type="quota") differentiates the metrics: series with type="usage" carry usage values, while series with type="quota" carry quota values.
Example PromQL query to get the quota usage ratio:
{job="quota-exporter", type="usage"} / ignoring(type) {job="quota-exporter", type="quota"}
NOTE: Usage collection requires the cloudwatch:GetMetricStatistics permission in the IAM policy.
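The same usage ratio can also be computed offline from scraped text, which is handy when checking the exporter's output by hand. The sketch below is a minimal parser for the Prometheus text format, not a full one. The metric name and label set in `SAMPLE` are made up for illustration, and it assumes that a usage series and its quota series share the same metric name and all labels except type.

```python
import re

# One sample line: metric_name{label="value",...} number
SAMPLE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Hypothetical exporter output; the real metric names and labels may differ.
SAMPLE = """\
# HELP aws_quota_lambda_concurrent_executions Concurrent executions
aws_quota_lambda_concurrent_executions{account="126485599999",region="us-west-1",type="quota"} 1000
aws_quota_lambda_concurrent_executions{account="126485599999",region="us-west-1",type="usage"} 250
"""


def usage_ratios(exposition):
    """Pair usage and quota series on everything except `type` and divide them."""
    quotas, usages = {}, {}
    for line in exposition.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # skips comments (# HELP / # TYPE) and blank lines
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels))
        kind = labels.pop("type", None)
        key = (name, tuple(sorted(labels.items())))
        if kind == "usage":
            usages[key] = float(value)
        elif kind == "quota":
            quotas[key] = float(value)
    # Series without a matching, non-zero quota are dropped.
    return {key: usages[key] / quotas[key] for key in usages if quotas.get(key)}


for (name, labels), ratio in usage_ratios(SAMPLE).items():
    print(name, dict(labels), ratio)
```

With the sample above, the single pair of series gives a ratio of 0.25 (250 used out of a quota of 1000).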
Using the Docker image available on Docker Hub
docker run --name my-aqe -d -p 10100:10100 -e AWS_ACCESS_KEY=111222 -e AWS_SECRET_KEY=secret ugwuanyi/aqe:main
This program relies on the AWS SDK for Go V2 for handling authentication.
The AWS SDK uses its default credential chain to find AWS credentials. This default credential chain looks for credentials in the following order:
- Environment variables
  - Static Credentials: (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN)
  - Web Identity Token: (AWS_WEB_IDENTITY_TOKEN_FILE)
- Shared configuration files
  - SDK defaults to the credentials file and config file under the .aws folder in the home folder on the host.
- IAM role for tasks.
- IAM role for Amazon EC2.
By default, the SDK checks the AWS_PROFILE environment variable to determine which profile to use. If no AWS_PROFILE variable is set, the SDK uses the default profile.
To set the profile to use:
$ export AWS_PROFILE=test_profile
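The profile selection rule can be expressed as a one-line lookup. The helper below is a hypothetical illustration of that rule, not the SDK's code; it treats an empty AWS_PROFILE the same as an unset one, which is an assumption of this sketch.

```python
import os


def active_profile(environ=None):
    """Return the profile name the rule above would select (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Fall back to the "default" profile when AWS_PROFILE is unset or empty.
    return env.get("AWS_PROFILE") or "default"


print(active_profile({}))
print(active_profile({"AWS_PROFILE": "test_profile"}))
```

Passing a dict instead of reading `os.environ` directly makes it easy to check the rule without touching the real environment.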
Steps to use the helm chart
- Add chart to local repository
helm repo add aws_quota_exporter https://emylincon.github.io/aws_quota_exporter
- View the configurable values. You can edit any of them.
helm show values aws_quota_exporter/aqe
- In this example, we will set the aws credentials in values.yaml
secret:
  # base64 encoded secrets
  AWS_ACCESS_KEY_ID: QVdTX0FDQ0VTU19LRVlfSUQK
  AWS_SECRET_ACCESS_KEY: QVdTX1NFQ1JFVF9BQ0NFU1NfS0VZCg==
- We will create a new namespace and install the chart in the namespace
kubectl create namespace aqe
helm install -n aqe -f values.yaml aqe aws_quota_exporter/aqe
- View installed chart
helm list -A
- Uninstall chart
helm uninstall -n aqe aqe
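When preparing the base64 values for the secret block in values.yaml, watch for a trailing newline. The placeholder value for AWS_ACCESS_KEY_ID shown in the steps above decodes to the text `AWS_ACCESS_KEY_ID` followed by a newline, which is what shell encoding of a value that ends in a newline typically produces. A stray newline inside a real credential can break authentication. The helper below is a hypothetical sketch that strips surrounding whitespace before encoding.

```python
import base64


def encode_secret(value):
    """Base64-encode a secret after stripping surrounding whitespace (hypothetical helper)."""
    return base64.b64encode(value.strip().encode("utf-8")).decode("ascii")


# The placeholder from the values.yaml example includes a trailing newline.
print(base64.b64decode("QVdTX0FDQ0VTU19LRVlfSUQK"))

# Encoding the same placeholder text without the newline.
print(encode_secret("AWS_ACCESS_KEY_ID\n"))
```

Replace the placeholder text with your own credential values before putting the result in values.yaml.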
The exporter requires the permissions in the AWS managed policy ServiceQuotasReadOnlyAccess. Which of them you actually need depends on the jobs specified in the config.yml file; probably not all of the permissions are required. The policy document for ServiceQuotasReadOnlyAccess is as follows:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAccountLimits",
                "cloudformation:DescribeAccountLimits",
                "cloudwatch:DescribeAlarmsForMetric",
                "cloudwatch:DescribeAlarms",
                "cloudwatch:GetMetricData",
                "cloudwatch:GetMetricStatistics",
                "dynamodb:DescribeLimits",
                "elasticloadbalancing:DescribeAccountLimits",
                "iam:GetAccountSummary",
                "kinesis:DescribeLimits",
                "organizations:DescribeAccount",
                "organizations:DescribeOrganization",
                "organizations:ListAWSServiceAccessForOrganization",
                "rds:DescribeAccountAttributes",
                "route53:GetAccountLimit",
                "tag:GetTagKeys",
                "tag:GetTagValues",
                "servicequotas:GetAssociationForServiceQuotaTemplate",
                "servicequotas:GetAWSDefaultServiceQuota",
                "servicequotas:GetRequestedServiceQuotaChange",
                "servicequotas:GetServiceQuota",
                "servicequotas:GetServiceQuotaIncreaseRequestFromTemplate",
                "servicequotas:ListAWSDefaultServiceQuotas",
                "servicequotas:ListRequestedServiceQuotaChangeHistory",
                "servicequotas:ListRequestedServiceQuotaChangeHistoryByQuota",
                "servicequotas:ListServices",
                "servicequotas:ListServiceQuotas",
                "servicequotas:ListServiceQuotaIncreaseRequestsInTemplate",
                "servicequotas:ListTagsForResource"
            ],
            "Resource": "*"
        }
    ]
}
Please remove the permissions that you do not use.
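One way to trim the policy is to filter its actions by service prefix. The sketch below is illustrative only: `trim_policy` is a hypothetical helper, the sample policy is a shortened copy of the document above, and the set of prefixes to keep is an assumption. Check which API calls your configured jobs really need before applying a trimmed policy.

```python
import copy
import json

# Shortened copy of the managed policy, for illustration.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAccountLimits",
                "cloudwatch:GetMetricStatistics",
                "dynamodb:DescribeLimits",
                "servicequotas:ListAWSDefaultServiceQuotas",
                "servicequotas:ListServiceQuotas",
            ],
            "Resource": "*",
        }
    ],
}


def trim_policy(policy, keep_prefixes):
    """Return a copy of the policy keeping only actions of the given services."""
    trimmed = copy.deepcopy(policy)
    for statement in trimmed["Statement"]:
        # Assumes "Action" is a list, as in the policy document above.
        statement["Action"] = [
            action
            for action in statement["Action"]
            if action.split(":", 1)[0] in keep_prefixes
        ]
    return trimmed


# Assumption: only Service Quotas and CloudWatch calls are needed.
print(json.dumps(trim_policy(POLICY, {"servicequotas", "cloudwatch"}), indent=4))
```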
- include default port here when finished
- Guide on how to write an exporter
- AWS Service Quota Documentation
  - list-service-quotas: Lists the applied quota values for the specified AWS service. For some quotas, only the default values are available. If the applied quota value is not available for a quota, the quota is not retrieved.
  - list-aws-default-service-quotas: Lists the default values for the quotas for the specified AWS service. A default value does not reflect any quota increases.