Skip to content

add quantile#428

Merged
zyzhu merged 3 commits intofslaborg:masterfrom
alexpantyukhin:add_quantile
Aug 21, 2018
Merged

add quantile#428
zyzhu merged 3 commits intofslaborg:masterfrom
alexpantyukhin:add_quantile

Conversation

@alexpantyukhin
Copy link
Copy Markdown
Contributor

No description provided.

@tpetricek
Copy link
Copy Markdown
Member

This looks great to me! 👍

@zyzhu
Copy link
Copy Markdown
Contributor

zyzhu commented Aug 20, 2018

@alexpantyukhin It looks great. Three comments.

  1. Make the function inline to support integer and decimal values. The following lines would do.
  static member inline quantile (quantiles:float[], series:Series<'K, 'V>) =
    let vals = series.Values |> Seq.sort |> Seq.toArray
    let valsLength = vals |> Array.length
    
    quantiles
    |> Array.map(fun q ->
      let floatIndex = q * float(valsLength - 1)
      if floatIndex % 1.0 = 0.0 then
        string q, float vals.[int(floatIndex)]
      else
        let index = int(floatIndex)
        let l = float vals.[index]
        let r = float vals.[index + 1]

        string q, l + (r - l) * (floatIndex % 1.0)
    )
    |> Series.ofObservations
  1. Math.Net supports 9 different ways of quantile calculation from R project. https://numerics.mathdotnet.com/descriptivestatistics.html
    The one you implemented is essentially R7, the Excel version. You can be explicit in the comments of the function saying it's the Excel version of quantile, equivalent to QuantileDefinition.R7 from Math.Net.

In addition, you could add another unit test in the end of stats.fs to use FsCheck and Math.Net quantilecustom function to confirm your number has the same output as Math.Net.

  1. We could add wrapper in the future Deedle.Math to support all other quantilecustom functions from Math.Net. You implementation here is good enough for quick descriptive study.

Thanks a lot.

@alexpantyukhin
Copy link
Copy Markdown
Contributor Author

alexpantyukhin commented Aug 20, 2018

Thank you @zyzhu for your notes!
About the first point: do you mean the decimal and integer for series values?
Actually I forgot to make a check there that all values in quantiles:float[] should be <= 1.0 .

@zyzhu
Copy link
Copy Markdown
Contributor

zyzhu commented Aug 20, 2018

@alexpantyukhin Yes. I meant series could have integer or decimal values. Quantile on that series should still work and output float value. Just like all the other inline stats functions we refactored recently.

You are right about the second point. The parameter values should be checked between 0.0 and 1.0

@zyzhu
Copy link
Copy Markdown
Contributor

zyzhu commented Aug 21, 2018

@alexpantyukhin
Copy link
Copy Markdown
Contributor Author

Yes, I saw it. Interesting that it didn't pass only for .netcore. I will check that.

@zyzhu
Copy link
Copy Markdown
Contributor

zyzhu commented Aug 21, 2018

Looks great. Maybe it's time to add quantile info to describe function. Thanks!

@alexpantyukhin
Copy link
Copy Markdown
Contributor Author

Still need to check tests :) It passed well even in previos time.

@zyzhu
Copy link
Copy Markdown
Contributor

zyzhu commented Aug 21, 2018

Travis has passed test. It looks like the status here is stuck. I'll merge it.

@zyzhu zyzhu merged commit 43fd4e1 into fslaborg:master Aug 21, 2018
@alexpantyukhin alexpantyukhin deleted the add_quantile branch August 21, 2018 18:36
@alexpantyukhin
Copy link
Copy Markdown
Contributor Author

Cool, thanks! I guess it’s time to add it into describe method. I would like to make describe method according the pandas way. For different types different collection of keys.

@alexpantyukhin
Copy link
Copy Markdown
Contributor Author

I will do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants