Conversation
> Token is the same as a virtual node (or vnode) in consistent hashing.
> The number of tokens per ring is configurable via `--ingester.num-tokens`. The default is 128.
Q: why can't we have it configurable for distributors?
because distributors are not part of the ring :)
I think distributor also has its own ring. At least that's what I see.
> Loki in microservices mode usually has multiple ingesters and distributors. These multiple instances of the same component (ingester or distributor) form a ring (more precisely, a consistent hash ring).
> Both distributors and ingesters have their own ring. The write path looks like Client -> Distributor (ring) -> Ingester (ring). The read path looks like Client -> Querier -> Ingester (ring).
Not sure what you mean by:
> Both distributors and ingesters have their own ring.
I actually mean two separate rings, one for the distributor and one for the ingester.
distributor - https://github.com/grafana/loki/blob/master/pkg/distributor/distributor.go#L52
ingester - https://github.com/grafana/loki/blob/master/pkg/ingester/ingester.go#L52
I see, but they're not all hash rings. It seems that you're explaining hash rings here. I would focus on the ring type to avoid confusion.
> Loki aggregates incoming log lines into something called a log stream. A log stream is just a set of logs associated with a single tenant and a unique labelset (key-value pairs).
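As a sketch of the idea in the quoted text: a stream's identity is its tenant ID plus its labelset, and hashing that pair gives a stable fingerprint for the stream. The function name and the md5-based scheme below are illustrative assumptions, not Loki's actual fingerprinting code:

```python
import hashlib

def stream_fingerprint(tenant_id, labels):
    """Illustrative stream key: hash of the tenant ID plus the sorted labelset.

    Sorting the labels makes the fingerprint independent of label order,
    so the same stream always maps to the same value. (This is only a
    sketch of the idea, not Loki's real implementation.)
    """
    canonical = tenant_id + "|" + ",".join(
        f"{k}={v}" for k, v in sorted(labels.items())
    )
    return int(hashlib.md5(canonical.encode()).hexdigest(), 16) % (2**32)

a = stream_fingerprint("tenant-1", {"app": "nginx", "env": "prod"})
b = stream_fingerprint("tenant-1", {"env": "prod", "app": "nginx"})
print(a == b)  # label order does not matter
```

Because the labelset is canonicalized before hashing, two writes carrying the same labels in different order still belong to the same stream.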
> Here is the problem: say I have a large number of log streams coming in and a bunch of Loki servers (ingesters or distributors). Now I want to distribute these log streams across the servers in a way that lets me find them later when I want to read a log stream back. Here is the tricky part: I want to do this without any global directory or lookup service.
This could be rephrased a bit. The technical assumptions are correct.
> Each ingester belongs to a single hash ring. This hash ring, stored in Consul, is used to achieve consistent hashing.
> Every ingester (also distributor) that is part of the ring has two things associated with it.
Suggested change:
> Every ingester that is part of the ring has two things associated with it.
See, for distributors we store rate limits, not tokens.
Co-authored-by: Cyril Tovena <[email protected]>
owen-d left a comment:
I think this has some good bits, but we'll need to reorganize it. I'd like you to focus on what the ring is first. I think a section at the end describing how other components use the ring (rulers for scheduling alerting rule evaluation, distributors for calculating rate limits) would be helpful, but we should simplify most of the doc to only talk about our ingestion path.
The Cortex ring docs are a good resource here as well.
I'd like the structure to look like:
- how to spread writes across a pool of Ingesters?
- why is round robin bad?
  - creates chunks per stream proportional to the number of ingesters
- how can we make sure streams are only on replication_factor nodes?
  - hashing!
- show how hashing helps
  - but re-hashing hurts us when ring membership changes :(
- consistent hashing! tokens, etc.
- what are our ring options? memberlist, etcd, consul
  - why do we only need 1 consul/etcd replica?
  - nodes can re-register themselves if the ring store restarts
I also think it'd help if you added some diagrams showing hashing into the ring, how that translates to tokens & thus ingesters, etc.
Finally, there seems to be a bit of confusion about ingesters vs distributors here. Distributors technically store their own ring, but only store one token. The only thing this is used for is to get the total number of distributors so they can calculate per distributor rate limits like (total_rate_limit / n_distributors). Ingesters use the ring to register themselves & an associated set of tokens (vnodes) in order to help route log traffic to them deterministically. The distributors read the ingester ring to know which ingesters they need to send logs to. Again, I would defer the distributor & ruler rings to the end of the document in their own section.
> Another way to solve this problem is via hashing. We assign an integer to each server (say 0, 1, 2, 3, etc.), hash our log stream (labelset + tenantID) to an integer value h, and hand it over to the server numbered h % n, where n is the number of servers.
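The mod-n scheme in the quoted text could be sketched as follows (the function name and the md5-based hash are illustrative assumptions):

```python
import hashlib

def mod_n_owner(stream_key, n_servers):
    """Naive placement: hash the stream key and take it modulo n (sketch only)."""
    h = int(hashlib.md5(stream_key.encode()).hexdigest(), 16)
    return h % n_servers

# The same stream key always lands on the same server:
s1 = mod_n_owner("tenant-1|{app=nginx}", 4)
s2 = mod_n_owner("tenant-1|{app=nginx}", 4)
print(s1 == s2)
```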
> Interestingly, this approach solves some of the problems we have: the same log stream goes to the same server (because the same set of labels + tenant gives the same hash value), and while reading, we can hash it back to find the server where its value is stored.
> and while reading, we can hash it back to find the server where its value is stored.
I wouldn't mention this because we don't actually perform this optimization. It's very difficult to do as it changes when the ring membership changes. To get around this, we query all ingesters.
> But this solution lacks some scaling properties. Say we introduce a new server or remove an existing server (intentionally or unintentionally); then every single log stream may map to a different server.
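To see how badly mod-n placement behaves when the server count changes, this sketch (illustrative hash choice, not Loki code) counts how many of 1000 stream keys change owner when going from 10 to 11 servers; the vast majority remap:

```python
import hashlib

def mod_n_owner(stream_key, n_servers):
    # Naive placement: hash the key, take it modulo the server count.
    h = int(hashlib.md5(stream_key.encode()).hexdigest(), 16)
    return h % n_servers

keys = [f"stream-{i}" for i in range(1000)]
moved = sum(mod_n_owner(k, 10) != mod_n_owner(k, 11) for k in keys)
print(f"{moved} of {len(keys)} streams remapped")
```

For uniformly distributed hashes, a key keeps its owner only when h % 10 == h % 11, which happens for roughly 1 key in 11. Consistent hashing exists precisely to avoid this mass remapping.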
Suggested change ("will" → "may"):
> But this solution lacks some scaling properties. Say if we introduce a new server or remove an existing server (intentionally or unintentionally), then every single log stream may map to a different server now.
> If Loki is run in microservice mode, both ingester and distributor expose their ring status via the endpoint `api/v1/ruler/ring`.
> ### Other uses of consistent hashing
This is not actually true. We use the index to find the chunks. Chunks are content addressed to de-amplify writes: there's no need to store replication_factor identical chunks if their contents are the same. Content addressing helps ensure this because writing the same chunk would have the same address, deduping itself naturally.
> ### Ring Status page
> If Loki is run in microservice mode, both ingester and distributor expose their ring status via the endpoint `api/v1/ruler/ring`.
This endpoint is only the ruler ring. The ingester ring is stored at /ring and that's the one I'd mention here. Followup: we should really expose all the rings on their own endpoints.
This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
* Add tenant resolver package: implements the multi tenant resolver as described by the [proposal] for multi tenant query-federation. By default it behaves like before, but its implementation can be swapped out. [proposal]: cortexproject/cortex#3364
* Replace usages of `ExtractOrgID`: use TenantID or UserID depending on which of the methods is meant to be used.
* Replace usages of `ExtractOrgIDFromHTTPRequest`: replaced by ExtractTenantIDFromHTTPRequest, which makes sure that exactly one tenant ID is set.
* Add methods to `tenant` package to use the resolver directly.
* Remove UserID method from Resolver interface: we need a better definition of what we are trying to achieve with UserID before we can add it to the interface.
* Update comment on TenantID/TenantIDs.
* Improve performance of NormalizeTenantIDs: reduce allocations by reusing the input slice during de-duplication.

Signed-off-by: Christian Simon <[email protected]>
> - LEAVING
> - UNHEALTHY
> The state ACTIVE may receive both read and write requests, while the state JOINING can receive only write requests and the state LEAVING may receive only read requests.
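The state behavior quoted above can be captured in a small lookup table. The UNHEALTHY entry is an assumption (the quoted text does not say what that state accepts); the other entries follow the description:

```python
# Ring member states mapped to the operations they accept, per the
# quoted doc text. UNHEALTHY serving nothing is an assumption.
STATE_OPS = {
    "ACTIVE": {"read", "write"},
    "JOINING": {"write"},
    "LEAVING": {"read"},
    "UNHEALTHY": set(),
}

def can_serve(state, op):
    """Return True if a ring member in `state` may serve operation `op`."""
    return op in STATE_OPS.get(state, set())

print(can_serve("JOINING", "write"), can_serve("JOINING", "read"))
```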
Closing this as there has been no activity for a long time. Feel free to send a new PR if anyone wants to revive the work.
Seems a shame to have lost this improvement in documentation 😢
What this PR does / why we need it:
Document the Hash Ring component used in Loki
Which issue(s) this PR fixes:
NA
Special notes for your reviewer:
Based on discussion with @owen-d :)
Checklist