-
-
Notifications
You must be signed in to change notification settings - Fork 44
Catch hanging figure payloads in parse step #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- to catch figure payload that can cause image-exporter to hang
|
@scjody is this PR still worth merging? |
|
We need something like this before we can close https://github.com/plotly/streambed/issues/9865. Right now image requests are being delayed unnecessarily due to server restarts, and we need to run 12 servers at all times (whereas we should be able to scale down to 3 off-peak). I think it's worth waiting until you've had the chance to analyze some more server-breaking plots though, since this work is based on only one. I expect to have time to collect some more next week by modifying the |
|
Thanks for the info @scjody 👌 |
|
@scjody your recommendations from https://github.com/plotly/streambed/issues/9865#issuecomment-370514774 have now been addressed. Ready to review. |
scjody
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I feel like someone with more plotly.js experience like @alexcjohnson should review it too.
💃
| function maxPtsPerTraceType (type) { | ||
| switch (type) { | ||
| case 'scattergl': | ||
| case 'splom': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that we really have this info yet, but I'd have thought splom would need to be limited quite a bit lower, perhaps the same as parcoords provisionally but potentially even lower.
src/component/plotly-graph/parse.js
Outdated
| function findMaxArrayLength (cont) { | ||
| const lengths = Object.keys(cont) | ||
| .filter(k => Array.isArray(cont[k])) | ||
| .map(k => cont[k].length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2D arrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right. We can get a better estimate for those. Thanks for noticing 👍
src/component/plotly-graph/parse.js
Outdated
| case 'histogram2dcontour': | ||
| case 'box': | ||
| case 'violin': | ||
| return 1e5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these really only handle 1e5? They're not actually displaying every point, I'd have thought we could bump up to 1e6 perhaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of box and violin for (box)points turned on. We could make a special case for those I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good call - most of the time outliers are a tiny fraction of total points, but not always... and certainly not if ...points='all'
| if (Array.isArray(trace.dimensions)) { | ||
| return trace.dimensions | ||
| .map(findMaxArrayLength) | ||
| .reduce((a, v) => a + v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done
src/component/plotly-graph/parse.js
Outdated
| // cap the number of point per trace type | ||
| if (numberOfPtsPerTraceType[type] > maxPtsPerTraceType(type)) { | ||
| return true | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm imagining someone trying to hang our servers, sending in one trace of each type, each just under our limit... do we want to do something like:
budget += estimateDataLength(trace) / maxPtsPerTraceType(type);
if(budget > 1) return true;
src/component/plotly-graph/parse.js
Outdated
| .filter(k => Array.isArray(cont[k])) | ||
| .map(k => cont[k].length) | ||
|
|
||
| return Math.max(...lengths) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better fall back to 0 if there are no arrays - otherwise the user could start with an empty trace (Math.max() = -Infinity) and every trace after that would be ignored.
- dive into 2d arrays to estimate their lengths - bump box, violin, histogram and histogram2d max pt threshold to 1e6, and add special case for box and violin with sample points displayed - use budget so that multiple traces close to threshold won't hang exporter - make traces w/o array don't mess up count
|
@alexcjohnson thanks for the feedback! I think all your comments have been addressed in 884afdf |
src/component/plotly-graph/parse.js
Outdated
|
|
||
| return Math.max(...lengths) | ||
| const lengths = arrays.map(arr => { | ||
| const innerArrays = arr.filter(Array.isArray) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this going to be super slow for big 1D arrays? Most if not all of our modules that support 1D or 2D differentiate just based on the first element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call -> done in 851b803
src/component/plotly-graph/parse.js
Outdated
| if ( | ||
| type === 'box' && | ||
| trace.boxpoints === 'all' && | ||
| len > 5e4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle I guess the special cases should get included in the budget as well - otherwise you could include 199 x 5e4-point box traces...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea -> 0372d82
src/component/plotly-graph/parse.js
Outdated
| } | ||
| }) | ||
|
|
||
| return arrays.length ? Math.max(...lengths) : 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐄 not that it matters, but Math.max(0, ...lengths) works too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤠 -> 851b803
|
Great, thanks! Still concerned about splom but we'll see once it exists 😄 |
I have a good feeling splom with handle |
A proof of concept for https://github.com/plotly/streambed/issues/9865#issuecomment-365322166 for more info.
cc @scjody @alexcjohnson