Auto-scale DynamoDB provision based on Prometheus metrics #841
Conversation
At the weekly table changeover (which happens Wednesday night for me) it initially does a scale-down of the new table, because the error rate is zero. Having thought about it a bit, I think this is best addressed by setting the minimum write capacity to a level which is enough to handle the expected traffic at midnight. I like the idea that the core algorithm should operate without knowing anything specific about the individual tables.
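A minimal sketch of that per-table minimum write capacity (the helper name and values are hypothetical, not from the PR):

```go
package main

import "fmt"

// clampToMinimum enforces a configured minimum write capacity, so a fresh
// table's zero error rate at the weekly changeover can't scale it below
// what midnight traffic needs. (Illustrative only; names are assumed.)
func clampToMinimum(target, minWriteCapacity int64) int64 {
	if target < minWriteCapacity {
		return minWriteCapacity
	}
	return target
}

func main() {
	fmt.Println(clampToMinimum(100, 2000))  // scale-down clamped to the floor
	fmt.Println(clampToMinimum(5000, 2000)) // above the floor, unchanged
}
```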
Scaling up by a fixed fraction of current provision isn't very good. As a simple improvement I might floor the increment to a fraction of the max, e.g. if max is 10K then min increment is 1K.
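That flooring rule could look something like this (a sketch; the fractions and the function name are assumptions, not the PR's actual code):

```go
package main

import "fmt"

// scaleUpTarget grows write capacity by a fixed fraction of the current
// provision, but never by less than a fraction of the maximum, and never
// beyond the maximum. (Illustrative only; parameters are assumed.)
func scaleUpTarget(current, maxCap int64, fractionOfCurrent, minFractionOfMax float64) int64 {
	increment := int64(float64(current) * fractionOfCurrent)
	if floor := int64(float64(maxCap) * minFractionOfMax); increment < floor {
		increment = floor
	}
	target := current + increment
	if target > maxCap {
		target = maxCap
	}
	return target
}

func main() {
	// With max 10000, the minimum increment is 1000 even when current is tiny.
	fmt.Println(scaleUpTarget(500, 10000, 0.2, 0.1))  // 1500
	fmt.Println(scaleUpTarget(5000, 10000, 0.2, 0.1)) // 6000
}
```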
```go
func chunkTagsToDynamoDB(ts chunk.Tags) []*dynamodb.Tag {
	if ts == nil {
		return nil
	}
```
Is this redundant? Looks like result will be nil if `len(ts) == 0` (or ts is nil).
yes; it's cut/pasted from what was there before
```go
	err = d.disableAutoScaling(ctx, expected)
} else if current.WriteScale != expected.WriteScale {
	err = d.enableAutoScaling(ctx, expected)
	if expected.WriteScale.Enabled && d.metrics != nil {
```
Perhaps the check `expected.WriteScale.Enabled && d.metrics != nil` should be put in the constructor so we can fail early? Then we can just check `expected.WriteScale.Enabled` here.
Actually, in what situation is metrics nil?
metrics is nil when you want AWS auto-scaling
I don't see how - must be missing something. Would you mind explaining a little further for me?
plus or minus a bug...
```go
}); err != nil {
	return err
	if awsErr, ok := err.(awserr.Error); ok && awsErr.Code() == "LimitExceededException" {
		level.Warn(util.Logger).Log("msg", "update limit exceeded", "err", err)
```
Perhaps export a counter for this too?
pkg/chunk/aws/metrics_autoscaling.go
```go
if err != nil {
	return nil, err
}
promAPI = promV1.NewAPI(client)
```
If I read this right, this will return a non-functioning but non-nil metricsData if the addr is empty? That seems odd to me; I'd expect either an error or a fully-functioning metricsData - perhaps this logic should be pushed up to the caller?
pkg/chunk/aws/metrics_autoscaling.go
```go
}

func extractRates(matrix model.Matrix) (map[string]float64, error) {
	ret := make(map[string]float64)
```
IIRC throughout the codebase we tend to prefer `map[string]float64{}`, and only use make when we know the size.
pkg/chunk/aws/metrics_autoscaling.go
```go
}

// fetch write error rate per DynamoDB table
deMatrix, err := promQuery(ctx, m.promAPI, `sum(rate(cortex_dynamo_failures_total{error="ProvisionedThroughputExceededException",operation=~".*Write.*"}[1m])) by (table) > 0`, 0, time.Second)
```
I think we should be able to inject extra selectors into these queries; for instance we run multiple cortex clusters monitored by the same prometheus.
I have made the whole promql expression configurable, on the basis that other things like rate periods might need tweaking too.
```go
return chunk.TableDesc{
	Name: name,
}, dynamodb.TableStatusActive, nil
}, true, nil
```
```go
f.Int64Var(&cfg.OutCooldown, argPrefix+".out-cooldown", 3000, "DynamoDB minimum time between each autoscaling event that increases provision capacity.")
f.Int64Var(&cfg.InCooldown, argPrefix+".in-cooldown", 3000, "DynamoDB minimum time between each autoscaling event that decreases provision capacity.")
f.Int64Var(&cfg.OutCooldown, argPrefix+".out-cooldown", 1800, "DynamoDB minimum seconds between each autoscale up.")
f.Int64Var(&cfg.InCooldown, argPrefix+".in-cooldown", 1800, "DynamoDB minimum seconds between each autoscale down.")
```
All usages of this are as a time.Duration, so perhaps we should make this a DurationVar?
Good idea, but not backwards-compatible. I guess we could add two new flags and deprecate the old.
```go
	})
}
return result
}
```
```go
	queueLengths []float64
	errorRates   map[string]float64
	usageRates   map[string]float64
}
```
There is quite a lot of coupling between metricsData and dynamoTableClient; I wonder if more work needs to be done to tease out a clean interface?
OK, I think much of this was simply the metrics methods were on the wrong struct, but I went ahead and created an abstract interface so the choice between old-style and new-style is in the constructor. Let me know what you think.
A few minor nits; main concern is about coupling between the metricsData and tableClient. I wonder if a better interface could be teased out?
pkg/chunk/aws/metrics_autoscaling.go
```go
	}
}

func newMetrics(cfg DynamoDBConfig) (*metricsData, error) {
```
This probably belongs at the top of the file now?
```go
}, []string{"operation", "status_code"})

// Pluggable auto-scaler implementation
type autoscale interface {
	CreateTable(ctx context.Context, desc chunk.TableDesc) error
```
This is really PostCreateTable, right?

yeah, I was wrestling with how similar it is to the higher-level interface
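The pluggable-interface idea, reduced to a sketch (method set abbreviated, `TableDesc` stubbed in, context params dropped; the real interface and constructor logic in the PR differ):

```go
package main

import "fmt"

// TableDesc is a stand-in for chunk.TableDesc.
type TableDesc struct{ Name string }

// autoscale is a cut-down version of the pluggable auto-scaler interface.
type autoscale interface {
	PostCreateTable(desc TableDesc) error
}

// awsAutoscale drives AWS application auto-scaling (old style).
type awsAutoscale struct{}

func (awsAutoscale) PostCreateTable(desc TableDesc) error {
	fmt.Println("registering", desc.Name, "with application auto-scaling")
	return nil
}

// metricsAutoscale scales from Prometheus metrics (new style).
type metricsAutoscale struct{}

func (metricsAutoscale) PostCreateTable(desc TableDesc) error {
	fmt.Println("tracking", desc.Name, "for metrics-based scaling")
	return nil
}

// newAutoscale picks the implementation once, in the constructor, so
// callers never branch on old-style vs new-style. (Sketch; the real
// selection involves -metrics.url and -applicationautoscaling.url.)
func newAutoscale(metricsURL string) autoscale {
	if metricsURL != "" {
		return metricsAutoscale{}
	}
	return awsAutoscale{}
}

func main() {
	a := newAutoscale("http://prometheus:9090")
	a.PostCreateTable(TableDesc{Name: "chunks_table"})
}
```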
If the current provision is
and bump the failure counter too
so we can supply multiple sets of results
Don't want to scale down if all our metrics are returning zero, or all the ingesters have crashed, etc
50 minutes seems weird. Currently using 30 minutes in production.
Move old-style AWS autoscaling to a separate file; make metrics autoscaling implement the common interface.
If the current capacity is very low, adding 20% doesn't get us very far, so add minimum 10% of max capacity. Refactored to extract 'compute' functions for readability.
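The "don't scale down on all-zero metrics" guard mentioned above could be sketched as follows (heuristic and names assumed, not the PR's actual logic):

```go
package main

import "fmt"

// safeToScaleDown refuses to reduce capacity when every observed
// queue-length sample is zero, since that is indistinguishable from
// "all the ingesters have crashed" or "metrics are broken".
func safeToScaleDown(queueLengths []float64) bool {
	if len(queueLengths) == 0 {
		return false // no data at all: assume metrics are broken
	}
	for _, q := range queueLengths {
		if q > 0 {
			return true // at least one live, nonzero sample
		}
	}
	return false // all zero: too suspicious to act on
}

func main() {
	fmt.Println(safeToScaleDown([]float64{0, 0, 0}))  // false
	fmt.Println(safeToScaleDown([]float64{0, 12, 3})) // true
}
```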
LGTM!
Fixes #735
To use this feature, set `-dynamodb....scale.enabled` on in command-line args but do not supply an AWS autoscaler `-applicationautoscaling.url`. Also set `-metrics.url` to point at a Prometheus API-compatible server which will serve the metrics. (At the time of writing this can't be a multi-tenant server because I didn't put in an option for the tenant.)
PR also includes much refactoring of existing tests because I couldn't stand so much repetition.
Outstanding things I should do:

- `queueLengthAcceptable` to be changed
- `out-cooldown` and `in-cooldown`

For extra credit: