Collaborator

@ahuang11 ahuang11 commented Sep 9, 2024

Closes #1406

Adds a warning and checks whether kind == curve so that the data is not randomly sampled in that case; only the head is taken instead.
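For readers skimming the thread, the behaviour described above can be sketched roughly as follows. This is only an illustration, not the code merged in hvplot/ui.py: `limit_rows` is a hypothetical helper, the standard `warnings` module stands in for the `param` warning seen in the review hunk, and the `'line'` check and the 10000 limit are taken from the review discussion.

```python
import warnings

import pandas as pd

MAX_ROWS = 10000  # value quoted in the review; assumed here for illustration


def limit_rows(df: pd.DataFrame, kind: str, max_rows: int = MAX_ROWS) -> pd.DataFrame:
    """Hypothetical helper: cap the number of rows handed to the plot."""
    if len(df) <= max_rows:
        return df
    if kind == "line":
        # Random sampling would shuffle the row order and scramble the line,
        # so keep the first rows and tell the user about the truncation.
        warnings.warn(
            f"Plotting only the first {max_rows} of {len(df)} rows.",
            stacklevel=2,
        )
        return df.head(max_rows)
    # Other kinds (e.g. scatter) do not depend on row order.
    return df.sample(n=max_rows)


df = pd.DataFrame({"x": range(25), "y": range(25)})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    head = limit_rows(df, kind="line", max_rows=10)
```

With the small `max_rows=10` used here, `head` holds the first ten rows in their original order and one warning is recorded.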

@ahuang11 ahuang11 requested a review from maximlt September 9, 2024 16:28

codecov bot commented Sep 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.81%. Comparing base (6c96c7e) to head (058bf41).
Report is 27 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1408      +/-   ##
==========================================
+ Coverage   87.39%   88.81%   +1.41%     
==========================================
  Files          50       51       +1     
  Lines        7490     7618     +128     
==========================================
+ Hits         6546     6766     +220     
+ Misses        944      852      -92     


@ahuang11 ahuang11 added this to the 0.11.0 milestone Sep 10, 2024
Member

@maximlt maximlt left a comment

  • I feel like MAX_ROWS could be increased; 10000 is rather low, especially now that HoloViews uses the WebGL backend by default for Bokeh. How about 100000?
  • Just for discussion for now, do you think we should expose this kind of setting in the explorer? Maybe in an advanced tab?

hvplot/ui.py Outdated
):
df = df.sample(n=MAX_ROWS)
if self.kind == 'line':
param.main.param.warning(
Member

How about displaying these warnings in the alert box, in addition to or in place of a programmatic warning?

@hagaishalevaei hagaishalevaei Sep 13, 2024

I suggest doing df = df.sample(n=MAX_ROWS).sort_index(). Otherwise the rows come back in random order and the line plot will not be drawn as it should.
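A small sketch of what this suggestion does, assuming the default integer index is what ends up on the x axis (the tiny frame and `random_state` are only there to keep the example reproducible):

```python
import pandas as pd

# The default RangeIndex plays the role of x here.
df = pd.DataFrame({"y": [i * i for i in range(100)]})

sampled = df.sample(n=10, random_state=0)  # rows are returned in random order
restored = sampled.sort_index()            # same rows, ascending index again

print(list(sampled.index))
print(list(restored.index))
```

`sort_index()` keeps the same ten rows and only reorders them, so it helps exactly when the index is the plotted x.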

Collaborator Author

I disagree with sort_index; x may not always be the index.

@hagaishalevaei hagaishalevaei Sep 13, 2024

Good point. Could sort_index be added as an option?
Something like df.hvplot.explorer(x='x', y='y', kind='line', sort_index=True),
where the default is sort_index=False?

Collaborator Author

@ahuang11 ahuang11 Sep 13, 2024

I think it probably makes more sense to do sort_values(selected_x) when sampling for line plots, but I still think head is better, and perhaps a slider for sample size.
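To make the difference concrete, here is a minimal sketch in which x is an ordinary column unrelated to the index. `selected_x` is a hypothetical stand-in for whatever column is chosen as x in the explorer, and the data is made up for illustration.

```python
import pandas as pd

# x is a permutation of 0..99 that does not follow the index order.
df = pd.DataFrame({"x": [(37 * i) % 100 for i in range(100)], "y": range(100)})
selected_x = "x"  # hypothetical: the column picked as x in the explorer

sampled = df.sample(n=10, random_state=0)

by_index = sampled.sort_index()          # ordered by index, not by x
by_x = sampled.sort_values(selected_x)   # ordered along the plotted x axis
head = df.head(10)                       # first rows, original order untouched

print(list(by_index[selected_x]))
print(list(by_x[selected_x]))
```

Sorting by the selected column orders the sample along x regardless of the index, while `head` avoids resampling altogether at the cost of showing only the start of the data.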

@maximlt maximlt merged commit bd3e590 into main Sep 20, 2024
@maximlt maximlt deleted the fix_max_rows branch September 20, 2024 14:06

Development

Successfully merging this pull request may close these issues.

hvplot Explorer is giving nuisance Line plot when using over 10,000 points

4 participants