Replies: 8 comments 24 replies
-
I guess my main response would be: this feature would not actually be moving away from arrays! `Memory<T>` and `Span<T>` can both natively wrap managed arrays, and are designed to have array-like performance. Neither side as it exists today would notice much difference. On the ScottPlot side, the data is accessed using the same notation as it is today, mostly through indexers. Basically, only a function's input types need to change in most cases; output types don't change unless you want to take it a step further and reduce ScottPlot's memory allocations (I'm looking into this separately). No additional allocations happen either, since `Memory<T>`, `Span<T>`, and `PlotData` are all readonly structs (hence the `in` qualifier); there are much slower things that far outweigh the extra couple of instructions needed to execute the implicit operator that wraps a `double[]` into a `PlotData`.
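To make the kind of change being described concrete, here is a hypothetical before/after example (not ScottPlot's actual code); `ReadOnlySpan<double>` stands in for the proposed `PlotData` wrapper because it already converts implicitly from `double[]`:

```csharp
using System;

public static class Example
{
    // Before: public static double Max(double[] ys)
    // After: only the parameter type changes; the body and any call sites
    // that pass a double[] stay exactly the same.
    public static double Max(ReadOnlySpan<double> ys)
    {
        double max = ys[0];
        for (int i = 1; i < ys.Length; i++)   // same indexer notation as an array
            if (ys[i] > max) max = ys[i];
        return max;
    }

    public static void Main()
    {
        double[] fromArray = { 1, 5, 3 };
        Console.WriteLine(Max(fromArray));    // array wrapped implicitly, no copy

        Span<double> fromStack = stackalloc double[] { 2, 9 };
        Console.WriteLine(Max(fromStack));    // non-array memory works too
    }
}
```

The call sites passing arrays compile unchanged, which is the "drop-in" property being claimed.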
-
I think the performance and compatibility conversation is an important one, but I'm wondering how one communicates to users what a parameter of the new type accepts. One option is obviously to maintain two overloads, one of which takes a plain array. I also wonder if one could simply rename the type to make its purpose clearer.
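A sketch of the two-overload option mentioned above, using placeholder names (`PlotApi` and this `AddSignal` signature are illustrative, not ScottPlot's actual surface):

```csharp
using System;

public static class PlotApi
{
    // New flexible entry point: accepts any Memory-backed data.
    public static int AddSignal(ReadOnlyMemory<double> ys)
    {
        // A real implementation would store ys for rendering; here we just
        // return the point count so the behavior is observable.
        return ys.Length;
    }

    // Familiar array overload, kept for discoverability and source
    // compatibility. It forwards to the Memory overload without copying.
    public static int AddSignal(double[] ys) => AddSignal(ys.AsMemory());
}
```

This keeps IntelliSense showing the familiar `double[]` signature for typical users, while the `Memory<double>` overload serves advanced callers; the cost is the duplication noted elsewhere in this thread.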
-
I ran some performance benchmarks:
I used Scott's built-in benchmark info. These are average values measured by eye; the error bars are about 10-20 ms.
-
@bclehmann Please re-read the original post. Performance is only included as a statement of non-degradation. The focus is very significantly (and I keep repeating this) on flexibility of input data. The fact that the lack of flexibility causes a performance issue does not mean the proposed changes are performance-centric; they are flexibility-centric. In fact, performance is mentioned only once, as the reason flexibility is needed. Not once did I propose a change to improve ScottPlot's performance.

@StendProg You are making all sorts of false analogies here. Let us look at some. First off, I showed you exactly the relevant portions of my benchmark. It includes the exact API I was running and analyzing, no more, no less. We can certainly apply further and further edge cases as you move the goalposts, so that I can show you the next way to address an obscure situation. If I show that 20 million points still outperforms any graphics drawing library in existence, you will raise the number to 100 million and tell me that some new library might be created.

I will start by saying that the highest-performance scenarios running .NET are moving from arrays to `Span<T>`/`Memory<T>`. The question must thus be: why should ScottPlot have more stringent performance requirements than any other high-performance scenario? And is that even a goal of ScottPlot? I would say it is fairer to assume that a balance of performance and flexibility is desired for a charting library. We're not running compute workloads on million-dollar hardware here; a bit of inefficiency is not the end of the world. A great example that should really set the stage is ImageSharp, an image processing library for .NET which does raw-pixel operations.

Next, we must ask: what does ScottPlot really do? Is ScottPlot a data processing/compute library? Right now there are many limitations, several of which would be removed with `Memory<T>`/`Span<T>`. Your argument keeps coming back to large datasets with millions of points. We have examined several extremely unlikely scenarios, and you have provided no normal scenario to consider. In fact, in all normal scenarios there is no noticeable change anyway, so you are focusing on unlikely events in absurd situations that should be far outside the scope of this project. To complete the thought, though: if we have 50 million data points, and they are not absurdly updated continuously, the answer to large processing requirements is never to improve performance through hardware or compiler optimizations.

I find it very unreasonable when someone uses edge cases and a poor understanding of out-of-scope scenarios, ones that are not even currently supported, as a blocker to forward progress. It causes others to spend energy discussing things that will either never happen or that should be addressed differently, it provides no benefit (since you're not actually developing a high-performance data analysis algorithm), and it is incredibly pedantic (not in a good way): everything has a performance penalty.

This is my conclusion: there are better ways to improve performance when displaying 20 million data points than making sure the compiler optimizes array access. Please don't pretend otherwise; this is not an opinion, it is a well-known fact in computing, and it really needs to be a separate discussion. Premature optimization is a very well-known phenomenon, and you have dug yourself into a very deep optimization hole. I am done talking about performance and optimization. Let's talk about general-purpose, non-specialized features and techniques that will help general-purpose users use a general-purpose library in a general-purpose way. I promise you, any tiny edge case you can think of can be addressed independently as its own problem to be solved. This isn't high-performance computing. Please don't treat it as such.
-
As a final afterthought, here are the performance results for 100 million points:
Indeed, as you mentioned, span access is now more costly than the GDI drawing calls.
-
Unfortunately, while I do accept there is a flexibility benefit, I question how much value this would add over converting the data to an array up front. Even with the proposed change, I would be interested to see an application that needs to interop a very large amount of data (so that the copying cost is relevant), but where additions are so frequent that the copying cost cannot be written off as a one-time thing.
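For concreteness, here is a small sketch of the copying cost in question; the buffer size and names are made up for illustration. Wrapping existing data with `AsMemory()` is O(1) and allocation-free, while converting Memory-backed data back to an array (as an array-only API would force) copies every element:

```csharp
using System;

public static class Demo
{
    public static void Main()
    {
        double[] big = new double[1_000_000];
        big[500_000] = 42;

        // O(1): a view over the existing array; no allocation, no copy.
        // Useful for plotting a window into a larger buffer.
        Memory<double> window = big.AsMemory(500_000, 1_000);

        // O(n): the copy an array-only API would require on every update.
        double[] copy = window.ToArray();

        Console.WriteLine(copy.Length);   // 1000
        Console.WriteLine(copy[0]);       // 42
    }
}
```

The debate above is essentially whether that `ToArray()` call happens once (amortizable) or on every data change (not amortizable).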
-
The fact of the matter is that I can already do it with plain arrays.
I think the last word still belongs to the author of this library, @swharden. He, of course, listens very well and adjusts to the opinions of others, up to the point where they would cross out the main concept of the project (his vision). As for this discussion, it seems to me that it has already gone on long enough. To summarize why I am against the proposed changes:
The proposed solution forces you to make a choice at the library level; there is no way to keep both options and offer both to the user (while retaining the original performance of the first option). Based on these considerations, I choose the first option, especially since it has already been implemented.
-
A lot of messages have been posted in this discussion over the last few days. The topic of how ScottPlot should manage plot data in memory is advanced, complex, and nuanced, and this discussion is worth preserving! I'm appreciative to each of you for sharing your thoughts, code, tests, benchmarks, and ideas for how to make this library better.

Some of the messages in this discussion are more adversarial than is typical for this community. The opening of ScottPlot's Code of Conduct communicates our desire to foster an open and welcoming environment for contributors. I greatly value and wish to encourage open discourse and disagreement, but in response to the escalating intensity of the messages over the last several days I've decided to lock this discussion. Everyone has been given an opportunity to share their opinion, agreement has been reached on some topics but not all (likely due to differences in people's specific use cases and personal experiences), and I don't think this group will reach improved consensus through additional posts on this page. If someone wants to add or modify something on this final post they are welcome to message me directly: [email protected]

I'll briefly summarize this discussion, including the advantages and concerns associated with the proposal, then indicate what action I intend to take. I'm going to avoid technical discussion; we're all programmers, and with some time and effort we can figure out the details. I'll keep this summary at a high level.

Proposed Change
Advantages
Concerns
I'm just listing the concerns here, not addressing them. Detailed discussions of each of these topics can be found above.
Notes
Scott's Decision / Next Action
Resources
-
I'm doing some relevant personal work to enable the use of the `Memory<T>` and `Span<T>` model for plot data; you can check out the code at this fork/branch: https://github.com/Jmerk523/ScottPlot/tree/plot_data
The idea is to use `Memory<T>` and `Span<T>` here for the same reasons they are useful in general.
I won't go into a long explanation of why these types are useful; suffice it to say they are becoming more and more common in .NET, including in core APIs such as System.IO.Stream.
Basically, these types let you wrap arbitrary (managed or unmanaged) memory with fast read/write access.
Having to convert between arrays and any other form of memory is a significant performance issue, involving copying and allocations, so supporting more storage strategies seems useful. Plus, since `Memory<T>` and `Span<T>` are available for all major releases of .NET Framework and .NET Core, no version upgrades are needed.
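As an illustration of the point above (hypothetical `Sum` helper, not ScottPlot code): the same span-accepting function can consume a view over a managed array or over stack memory, with the same indexer syntax, and with pointers it can wrap native memory as well.

```csharp
using System;

public static class Demo
{
    // Works identically regardless of what memory backs the span.
    public static double Sum(ReadOnlySpan<double> values)
    {
        double total = 0;
        foreach (double v in values) total += v;
        return total;
    }

    public static void Main()
    {
        // View over a managed array: no copy is made.
        double[] managed = { 1, 2, 3 };
        Console.WriteLine(Sum(managed));   // 6

        // View over stack-allocated memory: no heap allocation at all.
        Span<double> stack = stackalloc double[] { 4, 5 };
        Console.WriteLine(Sum(stack));     // 9
    }
}
```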
In my test implementation, I created an intermediate `PlotData` struct, which is essentially a `Memory<double>` on which I can define my own operators, allowing existing array data to be converted implicitly (`Memory<T>` and `Span<T>` support array storage natively).
I would have used `Memory<T>` and `Span<T>` directly, but that would have required explicit conversions in a lot of locations that can instead be performed automatically by an implicit operator.
The result was that `PlotData` was almost a perfect drop-in replacement for externally visible APIs that take arrays as parameters, while also allowing `Memory<T>`-backed data to be used for plots.
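To make that description concrete, here is a minimal sketch of what such a `PlotData` struct could look like. This is not the actual code from the linked fork; the members shown are illustrative, and only the implicit-conversion idea is demonstrated.

```csharp
using System;

public readonly struct PlotData
{
    private readonly Memory<double> _memory;

    public PlotData(Memory<double> memory) => _memory = memory;

    public int Length => _memory.Length;
    public Span<double> Span => _memory.Span;
    public double this[int i] => _memory.Span[i];

    // These operators are what make PlotData a near drop-in replacement:
    // existing double[] arguments convert implicitly, and Memory<double>
    // data can be passed without copying.
    public static implicit operator PlotData(double[] array) => new PlotData(array);
    public static implicit operator PlotData(Memory<double> memory) => new PlotData(memory);
}
```

A call site that previously passed a `double[]` would compile unchanged, while a `Memory<double>` slice of a larger buffer could now be passed without copying.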
The main drawback is the API change: binary backwards compatibility would require duplicating every changed API.
However, recompiling is almost seamless and requires changes only under some very specific conditions.
Also, a dependency on the System.Memory package is required, but this is not a very large addition.
I'm wondering whether anyone else would find this useful, or has any related thoughts.