-
Notifications
You must be signed in to change notification settings - Fork 10.3k
tsdb.DefaultStripeSize is way too big at 16384 #17100
Description
This is more of an issue for multi-tenant derivatives, but I think it's also wrong inside Prometheus.
The number is defined here:
Line 1896 in ba808d1
| DefaultStripeSize = 1 << 14 |
It is used to size a central data structure for TSDB - the lookup maps for series, and the locks that protect them.
Lines 1922 to 1933 in ba808d1
| series: make([]map[chunks.HeadSeriesRef]*memSeries, stripeSize), | |
| hashes: make([]seriesHashmap, stripeSize), | |
| locks: make([]stripeLock, stripeSize), | |
| seriesLifecycleCallback: seriesCallback, | |
| } | |
| for i := range s.series { | |
| s.series[i] = map[chunks.HeadSeriesRef]*memSeries{} | |
| } | |
| for i := range s.hashes { | |
| s.hashes[i] = seriesHashmap{ | |
| unique: map[uint64]*memSeries{}, |
With the default number of stripes, 16384, each of locks, series and hashes allocates around 1MB, which may not seem much in the Prometheus context, but given the attention paid to avoiding cache line sharing, worth reflecting that 1MB is about the size of a typical L2 cache.
The justification given for striping the data structure is to avoid lock contention. By "birthday problem" reasoning, 16384 gives us 128 parallel operations with less than 50% chance of collision. This is at the absolute top end of what anyone would be running Prometheus at, so way too high for a default.
I would suggest changing it to 256.
Update: I was under the impression that #6644 "made stripe size configurable" added a flag, but it was just internal configurability. So probably we need to add that flag to let people change it back.