Skip to content

docs(example): Adds Confidence Interval Ellipses#3747

Merged
mattijn merged 32 commits intomainfrom
vegalite_v2_examples
Jan 18, 2025
Merged

docs(example): Adds Confidence Interval Ellipses#3747
mattijn merged 32 commits intomainfrom
vegalite_v2_examples

Conversation

@dangotbanned
Copy link
Copy Markdown
Member

@dangotbanned dangotbanned commented Jan 4, 2025

Will close #3715

Description

Adds an example inspired by (ggplot2|plotnine).stat_ellipse().

As can be seen in the first commit, this PR began by rebasing a closed PR from almost 7 years ago.

Relevant info from (#3715 (comment)):

I believe plotnine.stat_ellipse would be an example of implementing this with numpy, scipy.
Source code

I also found an old closed PR (#514 by @essicolo) that would have added an example for this.
The blocker at the time is no longer an issue as (#3202 by @joelostblom) added scipy as a docs dependency.

Example

image

Tasks

Future Work

I think a more generalized version of this would be a good fit for https://github.com/vega/altair_ally.
An issue might be the scipy dependency, which I really was hoping to be able to avoid here.
The dendrogram example shows some kind of inlining from scipy - but I have no idea if that is possible for:

Serge-Étienne Parent and others added 9 commits January 4, 2025 16:16
example showing bivariate deviation ellipses of petal length and width of three iris species
Happy with the end result, but not comfortable merging so much complexity I don't understand yet

#3715
@dangotbanned
Copy link
Copy Markdown
Member Author

Cannot express how relieved I am to see the CI finally green 😅
be087d2

Comment thread tests/examples_arguments_syntax/deviation_ellipses.py Outdated
Comment thread tests/examples_arguments_syntax/deviation_ellipses.py Outdated
@dangotbanned dangotbanned changed the title docs(DRAFT): Add Confidence Interval Ellipse example docs: Add Confidence Interval Ellipse example Jan 5, 2025
@dangotbanned dangotbanned changed the title docs: Add Confidence Interval Ellipse example docs(example): Adds Confidence Interval Ellipses Jan 5, 2025
Comment thread tests/examples_arguments_syntax/deviation_ellipses.py Outdated
Includes comment removal suggestion in (#3747 (comment))
- Fixed a type ignore (causes by incomplete stubs)
- Renamed variables
- Make replace the implicit `"index"` column with naming it `"order"`

#3747 (comment)
@dangotbanned dangotbanned marked this pull request as ready for review January 10, 2025 16:45
@dangotbanned dangotbanned requested a review from mattijn January 10, 2025 16:53
@mattijn
Copy link
Copy Markdown
Contributor

mattijn commented Jan 16, 2025

Thanks for this example! Can we add this to the category case studies instead of distributions? If the uncertainty was computed within Altair (not yet possible) than it was a good example for the distribution section.

On-the-fly computation of these confidence regions by doing selections would be really great though. Continue dreaming for vega/vega-lite#6043.

Can we remove this line regarding the import annotation from __future__? Probably also requires removing other types from the function, but that is fine for this example.

Can we rename the functions? Eg can we use confidence_region_2d and grouped_confidence_regions instead?

I also tried to reduce the usage of pandas by using dicts directly. I think that can work, but the readability is slightly less than current situation. But doing more with dicts as containers is always good 😊

@dangotbanned
Copy link
Copy Markdown
Member Author

Thanks for this example!

No problem @mattijn, although I feel my role in this was mostly making the connection between #3715, #514 and ggplot2.stat_ellipse.
It was interesting though brushing up on some scipy/numpy - as I don't usually interact with these packages directly.

Can we add this to the category case studies instead of distributions? If the uncertainty was computed within Altair (not yet possible) than it was a good example for the distribution section.
On-the-fly computation of these confidence regions by doing selections would be really great though. Continue dreaming for vega/vega-lite#6043.

Huh, I guess I'd never considered where the computation occured being a factor in the category.
Yeah I can recategorise if you think that would be a better fit

Can we remove this line regarding the import annotation from __future__? Probably also requires removing other types from the function, but that is fine for this example.

Yeah I think we might not need that line at all actually.
I added it out of habit when I used tuple[...] here, but that should be valid since https://docs.python.org/3.9/whatsnew/3.9.html#type-hinting-generics-in-standard-collections

I'll take a look today and see what works 👍

arr: np.ndarray[tuple[int, int], np.dtype[np.float64]],

Can we rename the functions? Eg can we use confidence_region_2d and grouped_confidence_regions instead?

Yeah those seem like much better names!
Bikeshedding a little here, but maybe confidence_region_2d and confidence_region(s?)_grouped?.
I don't feel too strongly about it since this is an example - but I'd usually try to match the prefix for related functionality

I also tried to reduce the usage of pandas by using dicts directly. I think that can work, but the readability is slightly less than current situation. But doing more with dicts as containers is always good 😊

Interesting, I hadn't thought to try this without a dataframe at all 🤔
I'm not a fan of pandas, but since we're already using it (vega_datasets) for iris - I think it is simpler just to keep the data in that container.
We'll also be able to easily adapt this example to penguins (#2231) after (#3631) and in #2213

@dangotbanned dangotbanned mentioned this pull request Jan 17, 2025
6 tasks
@dangotbanned dangotbanned marked this pull request as draft January 17, 2025 13:23
@dangotbanned dangotbanned marked this pull request as ready for review January 17, 2025 14:08
@dangotbanned
Copy link
Copy Markdown
Member Author

@mattijn hopefully I've addressed all of your points in (#3747 (commits))

@dangotbanned dangotbanned requested review from mattijn and removed request for mattijn January 18, 2025 11:36
@mattijn mattijn enabled auto-merge (squash) January 18, 2025 11:51
@mattijn mattijn merged commit 144befb into main Jan 18, 2025
@dangotbanned
Copy link
Copy Markdown
Member Author

Thanks for your help with this @mattijn

@dangotbanned dangotbanned deleted the vegalite_v2_examples branch January 18, 2025 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support/document Confidence Interval Ellipse

3 participants