Ring doc #3486

Closed

kavirajk wants to merge 5 commits into grafana:main from kavirajk:ring-doc

Conversation

Contributor

@kavirajk kavirajk commented Mar 15, 2021

What this PR does / why we need it:

Document the Hash Ring component used in Loki

  • What is a hash ring
  • What problems it solves
  • How it works

Which issue(s) this PR fixes:
NA

Special notes for your reviewer:

Based on discussion with @owen-d :)

Checklist

  • Documentation added
  • Tests updated

@kavirajk kavirajk requested a review from a team March 15, 2021 07:37

A token is the same as a virtual node (or vnode) in consistent hashing.

The number of tokens per ring is configurable via `--ingester.num-tokens`. The default is 128.
Contributor Author

Q: why can't we have it configurable for distributors?

Contributor

because distributors are not part of the ring :)
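The token/vnode mechanics discussed above can be sketched in a few lines. This is a hypothetical illustration, not Loki's actual implementation (which lives in its Go ring package); the function names, the SHA-256 choice, and the ingester names are all made up for the example.

```python
import bisect
import hashlib
import random

NUM_TOKENS = 128  # mirrors the --ingester.num-tokens default


def stream_hash(labels: str, tenant_id: str) -> int:
    """Hash a log stream (tenant + labelset) to a point on the ring."""
    digest = hashlib.sha256(f"{tenant_id}/{labels}".encode()).hexdigest()
    return int(digest, 16) % (2 ** 32)


def build_ring(ingesters, num_tokens=NUM_TOKENS, seed=42):
    """Give every ingester `num_tokens` random tokens (vnodes) on the ring."""
    rng = random.Random(seed)
    return sorted(
        (rng.randrange(2 ** 32), name)
        for name in ingesters
        for _ in range(num_tokens)
    )


def owner_of(ring, point: int) -> str:
    """The stream is owned by the first token clockwise from its hash."""
    tokens = [token for token, _ in ring]
    idx = bisect.bisect_right(tokens, point) % len(ring)
    return ring[idx][1]


ring = build_ring(["ingester-0", "ingester-1", "ingester-2"])
owner = owner_of(ring, stream_hash('{app="api"}', "tenant-a"))
```

Because each ingester holds many tokens, adding or removing one ingester only moves the streams that fall between its tokens and their predecessors, instead of reshuffling everything.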


@kavirajk kavirajk requested a review from owen-d March 15, 2021 07:39
Comment thread docs/sources/architecture/ring.md Outdated

Loki in microservices mode usually has multiple ingesters and distributors. These multiple instances of the same component (ingester or distributor) form a ring (more precisely, a consistent hash ring).

Both distributors and ingesters have their own ring. The write path looks like Client -> Distributor (ring) -> Ingesters (ring). The read path looks like Client -> Querier -> Ingesters (ring).
Contributor

Not sure what you mean by:

Both distributors and ingesters have their own ring.

Contributor Author

@kavirajk kavirajk Mar 16, 2021

Contributor

@cyriltovena cyriltovena Mar 16, 2021

I see, but they're not all hash rings. It seems that you're explaining hash rings here. At the least I would focus on the ring type to avoid confusion.

Comment thread docs/sources/architecture/ring.md Outdated

Loki aggregates incoming log lines into something called a log stream. A log stream is just a set of logs associated with a single tenant and a unique label set (key-value pairs).

Here is the problem: say I have a large number of log streams coming in and a bunch of Loki servers (these can be ingesters or distributors). Now I want to distribute these log streams across the servers in a way that I can find them later when I want to read a log stream back. Here is the tricky part: I want to do this without any global directory or lookup service.
Contributor

This could be rephrased a bit. The technical assumptions are correct.

Comment thread docs/sources/architecture/ring.md Outdated

Each ingester belongs to a single hash ring. This hash ring, stored in Consul, is used to achieve consistent hashing.

Every ingester (also distributor) that is part of the ring has two things associated with it.
Contributor

Suggested change:
- Every ingester (also distributor) that is part of the ring has two things associated with it.
+ Every ingester that is part of the ring has two things associated with it.

Contributor

See, for distributors we store rate limits, not tokens.

Contributor

@owen-d owen-d left a comment

I think this has some good bits, but we'll need to reorganize it. I'd like you to focus on what the ring is first. I think a section at the end describing how other components use the ring (rulers for scheduling alerting rule evaluation, distributors for calculating rate limits) would be helpful, but we should simplify most of the doc to only talk about our ingestion path.

The Cortex ring docs are a good resource here as well.

I'd like the structure to look like:

- how to spread writes across a pool of Ingesters?
  - why is round robin bad?
    - creates chunks per stream proportional to the number of ingesters
  - how can we make sure streams are only on replication_factor nodes?
    - hashing!
      - show how hashing helps
    - but re-hashing hurts us when ring membership changes :(
      - consistent hashing! tokens, etc.
- what are our ring options? memberlist, etcd, consul
  - Why do we only need 1 consul/etcd replica?
    - nodes can re-register themselves if the ring store restarts

I also think it'd help if you added some diagrams showing hashing into the ring, how that translates to tokens & thus ingesters, etc.

Finally, there seems to be a bit of confusion about ingesters vs distributors here. Distributors technically store their own ring, but only store one token. The only thing this is used for is to get the total number of distributors so they can calculate per distributor rate limits like (total_rate_limit / n_distributors). Ingesters use the ring to register themselves & an associated set of tokens (vnodes) in order to help route log traffic to them deterministically. The distributors read the ingester ring to know which ingesters they need to send logs to. Again, I would defer the distributor & ruler rings to the end of the document in their own section.
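The distributor-ring usage described in the comment above can be sketched as follows. This is a hypothetical illustration under the stated assumption that each distributor enforces an equal share of the global limit; the function name is made up and the real logic lives in Loki/Cortex Go code.

```python
def per_distributor_limit(total_rate_limit: float, n_distributors: int) -> float:
    """Split a tenant's global rate limit evenly across distributors.

    `n_distributors` is the number of healthy instances registered in the
    distributor ring; each instance enforces only its own share.
    """
    if n_distributors <= 0:
        # No ring information yet: fall back to the global limit.
        return total_rate_limit
    return total_rate_limit / n_distributors


# e.g. a 10 MB/s tenant limit enforced by 4 distributors
share = per_distributor_limit(10_000_000, 4)  # 2.5 MB/s per distributor
```

This is why a distributor needs only one token in its ring: the token exists to register membership (so instances can be counted), not to route traffic.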


Another way to solve this problem is via hashing. We assign an integer to each server (say 0, 1, 2, 3, etc.), hash our log stream (labelset + tenantID) to an integer value `h`, and hand it over to the server `h % n`, where `n` is the number of servers.

Interestingly, this approach solves some of the problem we have, say same log stream goes to same server (because same set of labels + tenant gives same hash value), and while reading, we can hash it back to find the server where its value is stored.
Contributor

and while reading, we can hash it back to find the server where its value is stored.

I wouldn't mention this because we don't actually perform this optimization. It's very difficult to do as it changes when the ring membership changes. To get around this, we query all ingesters.
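The modulo-hashing scheme quoted above can be sketched in a few lines. A hypothetical illustration: the function name and SHA-256 choice are assumptions, not Loki's actual hash.

```python
import hashlib


def naive_assign(labels: str, tenant_id: str, n_servers: int) -> int:
    """Plain modulo hashing: map a stream to server h % n."""
    h = int(hashlib.sha256(f"{tenant_id}/{labels}".encode()).hexdigest(), 16)
    return h % n_servers


# The same tenant + labelset always hashes to the same server,
# which keeps each stream's writes on one owner.
first = naive_assign('{app="api"}', "tenant-a", 3)
second = naive_assign('{app="api"}', "tenant-a", 3)
```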


Interestingly, this approach solves some of the problem we have, say same log stream goes to same server (because same set of labels + tenant gives same hash value), and while reading, we can hash it back to find the server where its value is stored.

But this solution lacks some scaling properties. Say if we introduce new server or remove existing server (intentionally or unintentially) then every single log stream will map to different server now.
Contributor

Suggested change:
- But this solution lacks some scaling properties. Say if we introduce new server or remove existing server (intentionally or unintentially) then every single log stream will map to different server now.
+ But this solution lacks some scaling properties. Say if we introduce new server or remove existing server (intentionally or unintentially) then every single log stream may map to different server now.
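The remapping cost that the quoted paragraph and the suggestion describe can be measured directly. A hypothetical sketch (names and hash choice are assumptions):

```python
import hashlib


def server_for(key: str, n_servers: int) -> int:
    """Modulo hashing: the scheme described in the quoted paragraph."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % n_servers


# Count how many of 1000 streams change owner when a 4th server joins.
keys = [f"tenant-a/stream-{i}" for i in range(1000)]
moved = sum(server_for(k, 3) != server_for(k, 4) for k in keys)
# With h % n, roughly (n-1)/n of all streams remap when going from
# 3 to 4 servers, so `moved` lands near 750 out of 1000; consistent
# hashing would remap only about 1/n of them.
```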


If Loki is run in microservices mode, both the ingester and the distributor expose their ring status via the endpoint `api/v1/ruler/ring`

### Other uses of consistent hashing.
Contributor

@owen-d owen-d Mar 17, 2021

This is not actually true. We use the index to find the chunks. Chunks are content addressed to de-amplify writes: there's no need to store replication_factor identical chunks if their contents are the same. Content addressing helps ensure this because writing the same chunk would have the same address, deduping itself naturally.


### Ring Status page

If Loki is run in microservices mode, both the ingester and the distributor expose their ring status via the endpoint `api/v1/ruler/ring`
Contributor

This endpoint is only the ruler ring. The ingester ring is stored at /ring and that's the one I'd mention here. Followup: we should really expose all the rings on their own endpoints.

@CLAassistant

CLAassistant commented Apr 20, 2021

CLA assistant check
All committers have signed the CLA.

@stale

stale bot commented Jun 2, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Jun 2, 2021
@kavirajk kavirajk added keepalive An issue or PR that will be kept alive and never marked as stale. and removed stale A stale issue or PR that will automatically be closed. labels Jun 6, 2021
cyriltovena pushed a commit to cyriltovena/loki that referenced this pull request Jun 11, 2021
* Add tenant resolver package

This implements the multi tenant resolver as described by the [proposal]
for multi tenant query-federation.

By default it behaves like before, but it's implementation can be
swapped out.

[proposal]: cortexproject/cortex#3364

Signed-off-by: Christian Simon <[email protected]>

* Replace usages of `ExtractOrgID`

Use TenantID or UserID depending on which of the methods are meant to be
used.

Signed-off-by: Christian Simon <[email protected]>

* Replace usages of `ExtractOrgIDFromHTTPRequest`

This is replaced by ExtractTenantIDFromHTTPRequest, which makes sure
that exactly one tenant ID is set.

Signed-off-by: Christian Simon <[email protected]>

* Add methods to `tenant` package to use resolver directly

Signed-off-by: Christian Simon <[email protected]>

* Remove UserID method from Resolver interface

We need a better definition for what we are trying to achieve with
UserID before we can add it to the interface

Signed-off-by: Christian Simon <[email protected]>

* Update comment on the TenantID/TenantIDs

Signed-off-by: Christian Simon <[email protected]>

* Improve performance of NormalizeTenantIDs

- reduce allocations by reusing the input slice during de-duplication

Signed-off-by: Christian Simon <[email protected]>
- LEAVING
- UNHEALTHY

The state ACITVE may receive both read and write requests. While state JOINING can receive only write requests, the state LEAVING may receive read requests.
Contributor

just typo: ACITVE vs ACTIVE
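The read/write permissions the quoted paragraph describes (with ACTIVE spelled correctly) can be summarized in a small table. A hypothetical sketch; Loki's real state handling lives in its Go ring package:

```python
# Which operations each ring state accepts, per the quoted paragraph:
# ACTIVE serves reads and writes, JOINING only writes, LEAVING only reads.
STATE_OPS = {
    "ACTIVE": {"read", "write"},
    "JOINING": {"write"},
    "LEAVING": {"read"},
    "UNHEALTHY": set(),
}


def can_serve(state: str, op: str) -> bool:
    """Return True if an ingester in `state` may serve operation `op`."""
    return op in STATE_OPS.get(state, set())
```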

@kavirajk
Contributor Author

Closing this as there has been no activity for a long time. Feel free to send a new PR if anyone wants to revive this work.

@kavirajk kavirajk closed this Mar 18, 2022
@danpoltawski
Contributor

Seems a shame to have lost this improvement in documentation 😢


Labels

keepalive An issue or PR that will be kept alive and never marked as stale. size/M


6 participants