Skip to content

Conversation

@jhesketh
Copy link
Contributor

@jhesketh jhesketh commented Sep 19, 2024

Native histograms are ignored when calculating stddev or stdvar.

However, for the first series that is loaded a groupedAggregation would always be created. If the first series that was loaded is a histogram then it acts as the equivalent of a 0 point.

This change does not create a groupedAggregation if the point is a histogram, thus ignoring it like the rest of the aggregation function does.

Additionally, when doing stddev or stdvar on a single series with a NaN, 0 was returned. Whereas any aggregation group with more than one series would return NaN. Make this consistent by always returning NaN.

Similarly, stddev and stdvar would return NaN when infinity exists in one series. However if infinity was the only point loaded it would return 0. Make this consistent by always treating infinity as NaN.*

* Note that this likely isn't the correct way to handle infinity in a standard deviation. For example, given two points 1 and inf, I would expect the stddev to be inf. For now, however, this change keeps the functionality of stddev as it was and makes it consistent for when handling the case of just infinity.

(The added tests fail without this change).

@juliusv juliusv requested a review from beorn7 September 19, 2024 18:28
@beorn7
Copy link
Member

beorn7 commented Sep 19, 2024

Thanks, will have a look ASAP.

Note that the tracking bug for this is #13934 . /cc @NeerajGartia21

@beorn7
Copy link
Member

beorn7 commented Sep 19, 2024

Actually, in #13934, we assumed the implemented behavior is already correct, but you found (and fixed here) that that's not entirely true.

Native histograms are ignored when calculating stddev or stdvar.

However, for the first series that is loaded a groupedAggregation would
always be created. If the first series that was loaded is a histogram
then it acts as the equivalent of a 0 point.

This change does not create a groupedAggregation if the point is a
histogram, thus ignoring it like the rest of the aggregation function does.

(The added tests fail without this change).

Signed-off-by: Joshua Hesketh <[email protected]>
Signed-off-by: Joshua Hesketh <[email protected]>
Signed-off-by: Joshua Hesketh <[email protected]>
@jhesketh jhesketh changed the title Fix stddev/stdvar when aggregating histograms Fix stddev/stdvar when aggregating histograms, NaN, or Infinities Sep 23, 2024
@jhesketh jhesketh changed the title Fix stddev/stdvar when aggregating histograms, NaN, or Infinities Fix stddev/stdvar when aggregating histograms and NaNs Sep 23, 2024
@jhesketh jhesketh changed the title Fix stddev/stdvar when aggregating histograms and NaNs Fix stddev/stdvar when aggregating histograms, NaNs, and infinities Sep 25, 2024
@jhesketh
Copy link
Contributor Author

I've also added a fix to infinity to be consistent. However, how infinities are handled should likely be visited in more detail. (please see updated topic).

promql/engine.go Outdated
}
case parser.STDVAR, parser.STDDEV:
group.floatValue = 0
if h == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of initialising the groupedAggregation for each case, can we just set the group.seen=false when histogram is encountered for the first time in the group? If YES, then it will reduce a lot of code repeatations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this works. I have made some changes and pushed it to @jhesketh's branch. @jhesketh WDYT?

@beorn7
Copy link
Member

beorn7 commented Oct 2, 2024

Thanks for having a look @NeerajGartia21 . And apologies for my continued delay. I'll try to get to this next week.

@beorn7
Copy link
Member

beorn7 commented Oct 9, 2024

Thank you again @jhesketh . I think this works fine, but I have played with @NeerajGartia21's idea, and it seems to work. (But I'm definitely not an export when it comes to the PromQL engine, so I might have missed something.)
In any case, I have simply pushed my attempt to your branch. Let me know what you think.

@beorn7
Copy link
Member

beorn7 commented Oct 9, 2024

Also resolved merge conflict. Now let's see if the CI tests will run this time.

- Use canonical formatting in PromQL expressions.
- Add tests `by (label)`.
- Consistent indentation.

Signed-off-by: beorn7 <[email protected]>
@beorn7
Copy link
Member

beorn7 commented Oct 9, 2024

Added more test nuances.

@beorn7
Copy link
Member

beorn7 commented Oct 15, 2024

@jhesketh do you have any thoughts on my pushed additions?

@NeerajGartia21 maybe you could review my additions. If you agree they are correct, we can merge this.

@beorn7
Copy link
Member

beorn7 commented Oct 15, 2024

I solved merge conflicts once more.

@jhesketh
Copy link
Contributor Author

Sorry for the delay on this (been on PTO). Looks good to me, thank you for the improvements and solving merge conflicts :-)

@NeerajGartia21
Copy link
Contributor

It looks good to me as well! I think it’s ready to merge.

@beorn7
Copy link
Member

beorn7 commented Oct 16, 2024

Thank you all. I'll squash this all into one and merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants