Conversation
pkg/tenant/resolver.go
Outdated
There was a problem hiding this comment.
Using a global variable here is not something I am particularly happy about.
I have tried to pass a tenantResolver reference around that would itself be instantiated in a Cortex module. But this approach has a massive changeset, as its touching so many areas of the codebase and I decided to go with this approach.
There was a problem hiding this comment.
I think a global is definitely acceptable in this case. A package could always instantiate it's own resolver down the line if there was a reason to go down that path.
pkg/tenant/resolver.go
Outdated
There was a problem hiding this comment.
I think a global is definitely acceptable in this case. A package could always instantiate it's own resolver down the line if there was a reason to go down that path.
pkg/tenant/resolver.go
Outdated
There was a problem hiding this comment.
Cool, so we are going to continue using the weaveworks/common library as is? It just won't be aware of how the header value will be resolved.
There was a problem hiding this comment.
Yes being able intercept headers would require us to do something that I have trailed #3416, which came with a massive change set across the project.
pkg/alertmanager/api.go
Outdated
There was a problem hiding this comment.
suggestion(if-minor): Instead of retrieving the default resolver and calling these functions, what if you expose the functions of the Resolver interface as top level functions in the tenant package and just assume they will utilize the default in the package. That could make the syntax a bit cleaner.
| userID, err := tenant.DefaultResolver().TenantID(r.Context()) | |
| userID, err := tenant.TenantID(r.Context()) |
There was a problem hiding this comment.
I have done that, golint complained about the stuttering issue: tenant.TenantID[s], but I think it's important to have it there to make it clear UserID vs TenantID.
f6e7bb8 to
c7f2471
Compare
csmarchbanks
left a comment
There was a problem hiding this comment.
I would love some more detail around when UserID will be used vs TenantID/TenantIDs. I don't see any info in the proposal around that (or I am blind which is totally possible). Is this meant to be something like a username? Just an extra header that will be passed?
The comment "used to identify a user in log messages" scares me a bit as we could be getting into PII which I certainly do not want to encourage people to log.
@csmarchbanks agreed. This is something where the current comments on the interface What I think we should be trying to solve is how to move away from the overloaded meaning that is currently derived from While I think using the default settings cortex will always behave As you are aware we are using Cortex as a basis of Grafana Metric Enterprise and we have some plans that involve distinguishing more details between different actors using the API within the same tenant. I see that as a first step in getting a good understanding what is involved there. I hope that makes sense and wonder what if you think this could go forward, with some of that baked into the comments of the |
I don't see this as overloaded. Introducing new userID concept is confusing, since in Cortex we don't deal with individual users in any way.
In the proposal, multiple tenants were sorted. I am bit confused by this statement (esp. by "TenantIDs[0]" part, since that will always be tenant with lowest ID??) |
There was a problem hiding this comment.
Using userID without tenantID is misleading. Different tenants may have same set of users. For log messages, metrics or fairness, both need to be used.
For metrics especially, using user can lead to additional cardinality explosion, and is explicitly called out as bad practice.
From the code it seems like what you're calling "user ID" is basically a sorted set of tenant IDs. This mix of terminology is confusing.
This code replaces user.ExtractOrgID calls, but not user.InjectOrgID calls. Furthermore, there are components from weaveworks library, that we use, that still use original user.ExtractOrgID (instrumentation, tenant propagation for grpc calls). I think this needs to be fixed, preferably in the weaveworks common library. (Otherwise we need to reimplement these helper components on top of new "tenant" package).
Totally agreed, yes eventually we will need information in the
Both of those are correct and I know this is far from ideal, but that was called out as part of the proposal to be a step-gap solution, that really only affects you if you enabled multi-tenant support and ideally log both and so on
I don't think this is a problem as we ourselves in the
Sorry I have really phrased that in aquite confusing way, what I meant to say. Using the default resolver, which is
I think that's how my vision for the future work would look like, either making I am right now not yet in a position to predict how we would achieve that and how interfaces and more specific would look like, but by gradually carving parts out (like this PR) I think we can make this transition work. |
|
I am wondering if this PR would have it easier to get consensus if I would remove the
wdyt @pstibrany? |
Withdrawing my request in case someone else wants to approve. (I plan to take another look but not sure when)
These suggestions would make it less controversial |
23c4d2b to
af2045e
Compare
pkg/tenant/tenant.go
Outdated
There was a problem hiding this comment.
Performance wise a couple of things:
- The input
idsis just fine if there are no duplicates (which I would expect to be the common case) resultmake be pre-allocated with the length ofids
pstibrany
left a comment
There was a problem hiding this comment.
Thank you for addressing the feedback, LGTM. Currently it's a noop in Cortex, and I'm looking forward to see more support for multitenant.
This implements the multi tenant resolver as described by the [proposal] for multi tenant query-federation. By default it behaves like before, but it's implementation can be swapped out. [proposal]: cortexproject#3364 Signed-off-by: Christian Simon <[email protected]>
Use TenantID or UserID depending on which of the methods are meant to be used. Signed-off-by: Christian Simon <[email protected]>
This is replaced by ExtractTenantIDFromHTTPRequest, which makes sure that exactly one tenant ID is set. Signed-off-by: Christian Simon <[email protected]>
Signed-off-by: Christian Simon <[email protected]>
We need a better definition for what we are trying to achieve with UserID before we can add it to the interface Signed-off-by: Christian Simon <[email protected]>
Signed-off-by: Christian Simon <[email protected]>
- reduce allocations by reusing the input slice during de-duplication Signed-off-by: Christian Simon <[email protected]>
af2045e to
cd074e9
Compare
* Add tenant resolver package This implements the multi tenant resolver as described by the [proposal] for multi tenant query-federation. By default it behaves like before, but it's implementation can be swapped out. [proposal]: cortexproject/cortex#3364 Signed-off-by: Christian Simon <[email protected]> * Replace usages of `ExtractOrgID` Use TenantID or UserID depending on which of the methods are meant to be used. Signed-off-by: Christian Simon <[email protected]> * Replace usages of `ExtractOrgIDFromHTTPRequest` This is replaced by ExtractTenantIDFromHTTPRequest, which makes sure that exactly one tenant ID is set. Signed-off-by: Christian Simon <[email protected]> * Add methods to `tenant` package to use resolver directly Signed-off-by: Christian Simon <[email protected]> * Remove UserID method from Resolver interface We need a better definition for what we are trying to achieve with UserID before we can add it to the interface Signed-off-by: Christian Simon <[email protected]> * Update comment on the TenantID/TenantIDs Signed-off-by: Christian Simon <[email protected]> * Improve performance of NormalizeTenantIDs - reduce allocations by reusing the input slice during de-duplication Signed-off-by: Christian Simon <[email protected]>
* Add tenant resolver package This implements the multi tenant resolver as described by the [proposal] for multi tenant query-federation. By default it behaves like before, but it's implementation can be swapped out. [proposal]: cortexproject/cortex#3364 Signed-off-by: Christian Simon <[email protected]> * Replace usages of `ExtractOrgID` Use TenantID or UserID depending on which of the methods are meant to be used. Signed-off-by: Christian Simon <[email protected]> * Replace usages of `ExtractOrgIDFromHTTPRequest` This is replaced by ExtractTenantIDFromHTTPRequest, which makes sure that exactly one tenant ID is set. Signed-off-by: Christian Simon <[email protected]> * Add methods to `tenant` package to use resolver directly Signed-off-by: Christian Simon <[email protected]> * Remove UserID method from Resolver interface We need a better definition for what we are trying to achieve with UserID before we can add it to the interface Signed-off-by: Christian Simon <[email protected]> * Update comment on the TenantID/TenantIDs Signed-off-by: Christian Simon <[email protected]> * Improve performance of NormalizeTenantIDs - reduce allocations by reusing the input slice during de-duplication Signed-off-by: Christian Simon <[email protected]>
* Add tenant resolver (cortexproject/cortex#3486) * Add tenant resolver package This implements the multi tenant resolver as described by the [proposal] for multi tenant query-federation. By default it behaves like before, but it's implementation can be swapped out. [proposal]: cortexproject/cortex#3364 Signed-off-by: Christian Simon <[email protected]> * Replace usages of `ExtractOrgID` Use TenantID or UserID depending on which of the methods are meant to be used. Signed-off-by: Christian Simon <[email protected]> * Replace usages of `ExtractOrgIDFromHTTPRequest` This is replaced by ExtractTenantIDFromHTTPRequest, which makes sure that exactly one tenant ID is set. Signed-off-by: Christian Simon <[email protected]> * Add methods to `tenant` package to use resolver directly Signed-off-by: Christian Simon <[email protected]> * Remove UserID method from Resolver interface We need a better definition for what we are trying to achieve with UserID before we can add it to the interface Signed-off-by: Christian Simon <[email protected]> * Update comment on the TenantID/TenantIDs Signed-off-by: Christian Simon <[email protected]> * Improve performance of NormalizeTenantIDs - reduce allocations by reusing the input slice during de-duplication Signed-off-by: Christian Simon <[email protected]> * Add multi tenant query federation (cortexproject/cortex#3250) * Add tenant query federation This experimental feature allows queries to cover data from more than a single Cortex tenant and can be activated by supplying `-tenant-federation.enabled` to all cortex instances. To query multiple tenants a `|` separated tenant list can be specified in the `X-Scope-OrgID` header. The source tenant of a metric will be exposed in the label `__tenant_id__`. Signed-off-by: Christian Simon <[email protected]> * Aggregate the limit of maxQueriers correctly This ensures the limit is aggregated correctly of the setting of each individual tenant. It also implements the logic for the v2 query frontend. Signed-off-by: Christian Simon <[email protected]> * Fix tenant labels and make LabelNames more efficient Signed-off-by: Christian Simon <[email protected]> * Use tsdb_errors for error handling Signed-off-by: Christian Simon <[email protected]> * Fix uninitialized label matcher Regexp matcher need to be initialized, this adds a test for regexp matcher and fixes the underlying issue. Signed-off-by: Christian Simon <[email protected]> * Use map for filterValuesByMatchers to reduce allocations Signed-off-by: Christian Simon <[email protected]> * Address review suggestions Signed-off-by: Christian Simon <[email protected]> * Add validation.SmallestPositiveNonZeroIntPerTenant to avoid code duplication Signed-off-by: Christian Simon <[email protected]> * Add limitations and experimental status to docs Signed-off-by: Christian Simon <[email protected]> * Remove unnecessary cast of defaultTenantLabel Signed-off-by: Christian Simon <[email protected]> * Handle query-range limits for multi tenant queries Signed-off-by: Christian Simon <[email protected]> * Skip results cache, if multi tenants query Signed-off-by: Christian Simon <[email protected]> * Add failint to ensure query path supports multiple tenants To avoid any future regressions in the multi tenant support within the query path, a faillint command tests if TenantID is used and if it finds one, it suggestest using TenantIDs instead> Signed-off-by: Christian Simon <[email protected]> * Align CHANGELOG line with the flag description Signed-off-by: Christian Simon <[email protected]> * Add a limitation about missing results cache Signed-off-by: Christian Simon <[email protected]> * Restrict path segments in TenantIDs (cortexproject/cortex#4375) (cortexproject/cortex#4376) This prevents paths generated from TenantIDs to become vulnerable to path traversal attacks. CVE-2021-36157 Signed-off-by: Christian Simon <[email protected]> * Update nolint statement Co-authored-by: Bryan Boreham <[email protected]>
What this PR does:
Which issue(s) this PR fixes:
This implements the multi tenant resolver as described by the proposal for multi tenant query-federation.
By default it behaves like before, but it's implementation can be swapped out.
This replaces:
user.ExtractOrgIDby eitherTenantIDorUserIDdepending on if in this context a multi tenant usage should be alloweduser.ExtractOrgIDFromHTTPRequestwithExtractTenantIDFromHTTPRequestto gather exactly one tenant id.I have also added lint test to check for usage of the upstream methods directly
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]