Skip to content

Added higlightable scatter plot#415

Merged
swharden merged 38 commits intoScottPlot:masterfrom
bclehmann:scatter-highlight
May 31, 2020
Merged

Added higlightable scatter plot#415
swharden merged 38 commits intoScottPlot:masterfrom
bclehmann:scatter-highlight

Conversation

@bclehmann
Copy link
Member

New contributors should review CONTRIBUTING.md:
https://github.com/swharden/ScottPlot/blob/master/CONTRIBUTING.md

Purpose:
#414

New functionality (code):
There is a demo, but the gist of it is this:

PlottableScatterHighlight highlightPlot = plt.PlotScatterHighlight(x, rand);
highlightPlot.HighlightPoint(4);
highlightPlot.HighlightPointNearest(8);
highlightPlot.HighlightPointNearest(5, 1);
//highlightPlot.HighlightClear();

New functionality (image):
image

@swharden
Copy link
Member

haha, wow you're so fast @Benny121221! This is pretty cool. Nice LINQ in there too.

Don't throw these ideas in just because I suggested them - I'm more asking what you think about these potential other features... what about a flag to turn on/off drawing the Y value of the point on the image, similar in appearance (and perhaps with similar options) as the annotation plottable? Also, what do you think about a flag to enable a "crosshair" mode which draws lines (perhaps with similar options as HLine and VLine) to the highlghted point? This could make it feel really interactive when I refactor the code to make it a little easier to show values under the mouse cursor.

Looking forward to hearing your thoughts!

@bclehmann
Copy link
Member Author

Do you mean this thing:
image

I don't know that that would make much sense for this plottable, but I could see value in a method to return the point closest to a specified coordinate pair? And I think its a similar story for the crosshair mode. I think those both make the most sense in a scenario where you want to snap the mouse to the nearest point. It would look quite messy to have 12 annotations and 24 lines drawn because the user decided to highlight points because they are outliers.

I think it could get messy to be calling other plottables from this one. And if you don't create it as a plottable (i.e. just using the drawing library) it becomes less easy to customize, because you can't expose the same API you use for drawing normal VLines to the user.

I think the best option is to add a function to make it easier to get the closest point to a set of coordinates. This also makes it trivial to merge the two scatter plottables

@swharden
Copy link
Member

swharden commented May 22, 2020

I think the best option is to add a function to make it easier to get the closest point to a set of coordinates. This also makes it trivial to merge the two scatter plottables

I like this conclusion. We can stick with smaller, simpler pieces that snap together. This also makes me think crosshair (essentially a VLine and HLine combined) might be a useful plot type 🤔 I'll keep this in mind!

Thanks again for your work on this @Benny121221! I'll work on this a bit more over the weekend.

@bclehmann
Copy link
Member Author

No problem. I probably can't work on this tomorrow because I have a test, but on the weekend I can help out.

@bclehmann
Copy link
Member Author

Also, if there's a crosshair plottable, I think it is necessary that there is an easter egg which lets you quickscope the graph

@swharden
Copy link
Member

quickscope the graph

XDXDXD

@swharden
Copy link
Member

quickscope the graph

XDXDXD

counter-terrorists win

@StendProg
Copy link
Contributor

Hi there.
This PR contains interesting idea, and good implementation.
I have a few comments on the code and want to review if you don't mind.

First of all, it's better to make PlottableScatterHighlight dirived from PlottableScatter to remove a lots of copy/pasted code(GetLimits(), GetCSV(), Scatter constructor etc).
If you go further, you can put marker rendering code into a separate virtual function in PlottableScatter. This will prevent you from copying the Render() too.

Good idea to declare IHighlightable interface with all this HighlightPointNearest() ... methods.

All lambdas like:

points.Select((p, i) => (p.x, p.y, i)).OrderBy(p => pointDistance(p.x, p.y)).First().i

looks cool, but not most effective way to find minimum value.
Using sort you get result in O(N*log(N)).
Using simple minimum search you will get the same result for O(N).
There is no way to find minimum index using lambdas in .NetFramework. (Its possible in MoreLinq, but it's external package).
Finding index using for loop, looks less elegant, but its few lines of code and simple too.

@bclehmann
Copy link
Member Author

First of all, it's better to make PlottableScatterHighlight dirived from PlottableScatter to remove a lots of copy/pasted code(GetLimits(), GetCSV(), Scatter constructor etc).
If you go further, you can put marker rendering code into a separate virtual function in PlottableScatter. This will prevent you from copying the Render() too.

@StendProg I agree, although I think it makes even more sense for the two scatterplots to be merged. For now I'll work on your suggestions.

@bclehmann
Copy link
Member Author

bclehmann commented May 22, 2020

I like the changes, PlottableScatterHighlight.cs is much shorter now. The only thing is the constructor signature is a tad long:
https://github.com/Benny121221/ScottPlot/blob/18a56600ecacffa994d50bb5ab8e3b0d8aff7381/src/ScottPlot/plottables/PlottableScatterHighlight.cs#L22

@swharden
Copy link
Member

I like the changes, PlottableScatterHighlight.cs is much shorter now. The only thing is the constructor signature is a tad long

They are large by C# standards, but it's consistent with the style of other ScottPlot plottables so I probably wouldn't worry too much about it.

ScottPlot's API uses command-like methods to create plots, and optional arguments to customize them. This style results in methods (and constructors) with lots of arguments. I think methods with lots of arguments is considered a code smell .NET land... but interestingly it is considered Pythonic because that language has some advanced "kwarg" features. Since I try to style the ScottPlot API after matplotlib (a Python library that makes heavy use of kwargs), this tilts toward methods with lots of optional arguments.

It wasn't until recently that I realized how uncommon this style is in the .NET world. A MVC-style (requiring the user to compose plot objects, style them by setting properties, then pass them into a plot viewer) would be more .NETish, but why I like ScottPlot is that it's much more familiar and easier to pick up by people who haven't used .NET/MVC before, and modeling its API after matplotlib is how I keep it easy to use for newcomers to the language (many data scientists, like me, come from Python backgrounds).

That's a longer response than it needed to be, but I wanted to put it out there for context and perhaps see if @StendProg has any comments, as @StendProg has a lot more .NET experience than I do

@bclehmann
Copy link
Member Author

My observation about the constructor signature was more a comment on the : base(args....) than it was about the parameter list. I have a Java background, so that syntax for constructor overloading is uncomfortable. In Java it would be part of the constructor body and it would look like super(args...). The base constructor call technically doesn't technically count as part of the method header. It's borrowed from the C++ member initializer list syntax (which until a recent revision was not used for constructor overloading) and it's not considered part of the header.

I think the most .NET-ic solution is to create a settings struct which carries all of the parameters. Something like: var scatter = plt.PlotScatter(settings);

Keep in mind most of my .NET experience is from web development, where the above solution is preferred, as the client is sending a JSON object already. Although .NET supports binding JSON to a list of arguments, it's simpler to just bind it to a class or struct, especially as there is no argument list exposed to the client.

I normally don't like the pythonic way, but I think here it is more clear, if perhaps more verbose. Of course, the enterprise solution is to use a PlotSettingsManagerServiceFactory to instantiate a PlotSettingsManagerService which sorts it all out for us.

@StendProg
Copy link
Contributor

@Benny121221 .
Binary search do the trick.
Additionally you can optimize

public void HighlightPoint(int index)
{
    highlightedIndexes.Add(index);
    highlightedIndexes.Sort();
}

if you insert index value each time in right place of List, you keep it sorted without manual Sort call.

another microoptimization.

minDistance = value[0];
for ( int i = 1; ....)

You replace spaces with tabs in last commit. And have problem with formating in PlottableScatterHighlight.cs. Please fix it.

All others looks perfect, good job.

@bclehmann
Copy link
Member Author

I got you. The only thing is this method is written a little unintuitively now: https://github.com/swharden/ScottPlot/blob/f4550c8ed2e21eee961a52187dcc1448f692e6f2/src/ScottPlot/plottables/PlottableScatterHighlight.cs#L49:L60

I added a comment explaining how it works because I would normally assume that List<T>.BinarySearch() would return -1 if it doesn't find the value. Plus, most people who aren't systems programmers are made uncomfortable by bitwise operators in general.

@StendProg
Copy link
Contributor

@swharden
I thinked a lot on this library API.
In fact, it is very friendly to beginners. And it also attracted me in due time.
And even having experience working with the library, it’s very convenient to set about 5 parameters and do what you need with one line of code. As a result, the library is very convenient from the point of view of the user (IMHO).

But from the point of view of the developer, supporting this API quickly steps into hell. Methods with dozen of parameters that need to be thrown through several function calls. Besides being annoying, it is also a potential source of various errors.
The standard technique for refactoring functions with a large number of parameters is to combine the parameters into a structure and pass it as a parameter. But from the point of view of the user, you need to create a structure, fill in its fields, for this you need to go into the documentation to read what kind of fields and how to fill them out. Many may choose to go look for another library.

As a result, I think that the upper level of the API(user level) should remain as it is, otherwise it will be a completely different library.
At developer level (not visible to user) you can move away from strict rules, and transmit data through fields, you can even use DI and factories, etc., just don’t rush to redo everything, but constantly think about this problem and apply it only in the new code.

@swharden
Copy link
Member

swharden commented May 22, 2020

it’s very convenient to set about 5 parameters and do what you need with one line of code

from the point of view of the developer, supporting this API quickly steps into hell

This is a very good description of the advantage and disadvantage of this design 😅

I think that the upper level of the API (user level) should remain as it is

I agree

constantly think about this problem and apply it only in the new code

What do you think about an intermediate strategy that seeks the best from both worlds? The public API can remain as it is (lots of simple arguments), but inside those methods we developers can start to use a "style object" model to create Plottables, simplifying the constructors and making errors less likely.

After we make a few style objects, we could slowly refactor toward this model one Plottable at a time, and the public API never changes. This may slowly make the existing code base easier for the developer to support 🤔

@swharden swharden mentioned this pull request May 23, 2020
53 tasks
@StendProg
Copy link
Contributor

StendProg commented May 23, 2020

@Benny121221
I can't commit directly to this PR, so I make PR to branch of you fork with Unit tests for this new plottable, please check it.
bclehmann#2

@swharden
Copy link
Member

swharden commented May 24, 2020

I made a mouse interaction demo here

image

image

I refactored this module a bit to return tuples, making it easier to get x/y/index information back out of the method that finds the nearest point

private void formsPlot1_MouseMove(object sender, MouseEventArgs e)
{
    double mouseX = formsPlot1.plt.CoordinateFromPixelX(e.Location.X);
    double mouseY = formsPlot1.plt.CoordinateFromPixelY(e.Location.Y);

    sph.HighlightClear();
    var (x, y, index) = sph.HighlightPointNearest(mouseX, mouseY);
    formsPlot1.Render();

    label1.Text = $"Closest point to ({mouseX:N2}, {mouseY:N2}) " +
        $"is index {index} ({x:N2}, {y:N2})";
}

This is a vastly simpler "show value on hover" solution!

@swharden
Copy link
Member

swharden commented May 24, 2020

@StendProg thank you for making these thorough tests! They revealed an uncommon but fundamental issue with scatter plots: xs and ys must always be the same length, but this is not enforced. Same for errorX and errorY if not null. The problem is when the length of the array changes. Users can manually modify these public properties to cause the scatter plot to enter a bad state. With ScatterHighlight, this is also true with the isHighlightable array.

One solution may be to make all the properties {get; private set;} and create methods like Update(newXs, newYs) which handle the new array creation and force constraints, but this would be a breaking change.

Another idea I've been considering is something like a ScatterArray() which uses readonly double[] (valid state checked at initialization), and something like ScatterList() which uses List<double> to make it easier for data we know will resize (valid state checked on every render). The public interface could remain the same, with overloads separating the two behaviors.

As far as this PR goes, I think we've gotten about as good as we can with the base class we have.

@StendProg
Copy link
Contributor

StendProg commented May 25, 2020

The problem is when the length of the array changes. Users can manually modify these public properties to cause the scatter plot to enter a bad state. With ScatterHighlight, this is also true with the isHighlightable array.

Simpiest way to fix it, is copy xs ys arrays to temp arrays at the begining of Render loop. Check them for length equality and work with temp arrays only in render loop. Copy means only reference copy like tempxs=xs. The same way, you need to do in all methods that works correct only on equal length.
How it works: user can change array at any time(at half of Render loop too), but Scatter update it only on Render() start, and make Render() only then data correct.

You need a lot more complex solutions if you want to achive true thread safe result. Does it really needed? Maybe make user contract: Scatter support only equal arrays, and remove all check will be better solution?

@bclehmann bclehmann requested a review from swharden May 31, 2020 02:28
@bclehmann
Copy link
Member Author

There were some merge conflicts but I rebased. Everything seems dandy now

@swharden swharden merged commit 74246d0 into ScottPlot:master May 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants