-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve performance and memory consumption #20
Conversation
ansicolor.js
Outdated
// Those are added in the actual parse, this is done for performance reasons to have the same hidden class | ||
this.css = ""; | ||
this.color = ""; | ||
this.bgColor = ""; | ||
this.bold = undefined; | ||
this.inverse = undefined; | ||
this.italic = undefined; | ||
this.underline = undefined; | ||
this.bright = undefined; | ||
this.dim = undefined; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding those properties later only if the styles
have them, we will always have those so we can take advantage of "hidden class"
cc45738
to
b304a93
Compare
b304a93
to
babeb80
Compare
const clean = obj => { | ||
for (const k in obj) { if (!obj[k]) { delete obj[k] } } | ||
return (O.keys (obj).length === 0) ? undefined : obj | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove clean as it changes the "hidden class of the object", it is better to just keep the properties but change their values
static str(x) { | ||
if(x === undefined) return ""; | ||
return "\u001b[" + Number(x) + "m"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to create the instance just for this, recreated the logic here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the number conversion can probably be removed
|
||
buffer += c | ||
// getString as function instead of string to allow garbage collection | ||
function* rawParse(getString) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of this:
const str = <1 GB of string>
rawParse(str)
the str
will be saved in memory until the execution finish
we get a function to get the string so when the function finish the string can be garbage collected
if (c === '\u001b') { state = BRACKET; buffer = c; } | ||
else { text += c } | ||
break | ||
const chunks = splitStringToChunksOfSize(getString(), ONE_MB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split the string into chunks of 1MB to allow fast garbage collection of 1MB at a time - instead of holding the reference to the string we split into chunks of 1MB and after processing is finished we can remove the reference so it can be GC
also we split to 1MB so it's not too little that after every processing we gonna change the array size (shift) and not too big that we hold reference to too much data
ansicolor.js
Outdated
else { state = TEXT; text += buffer } | ||
break | ||
while (chunks.length > 0) { | ||
const chunk = chunks.shift(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shift here is important so we don't have any reference to the data once we finish processing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a note, but shift is O(n) and causes the contents of the arr to be moved which makes this loop On^2. Depending on the amount of chunks you are expecting, you may be able to squeeze some extra perf by using a linked list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily, even though linked list are theoretically better for this they might not be, the case, I tried once to move nodejs webstream internal from using array to linked list and it was actually worst
what we can do instead is set the value to undefined so it can be garbage collected and we won't change the array size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct (hence my comment about chunk count). Context is important and I don't have much of it here, I wanted to flag this mostly for awareness.
In many cases when dealing with small iterators, micro-optimizations like these are redundant. What I have seen happen with code like this is that even though you don't see any improvement, someone in the future might run into some obscure edge case (maybe by processing something with an insanely high chunk count) and experience degraded perf, but you are the lib author so you know if this is ever likely to happen (if ever) and why you imo don't need to even handle it.
I wanted to raise this for awareness (sorry if I sidetracked the discussion), but I think I should add that you did a very good job on the entire PR 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are right, also I'm actually not the lib author
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be okay for sane uses (like 1k chunks), as the reallocating and copying of such an array would be peanuts compared to processing of chunks themselves...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shift here is important so we don't have any reference to the data once we finish processing
Would be reasonable to add a comment in the code, so it won't trigger whoever will be reading that code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assigned to undefined and added comment
|
||
buffer += c | ||
// getString as function instead of string to allow garbage collection | ||
function* rawParse(getString) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the processing can take time we don't want the user to not do anything with the data until all of it is ready, so this way the user can do something with it
and some use cases just don't need some of the spans, so it can be GCed faster
consider this:
function createSpans() {
return [{text: 'hello'}, {text: ''}, {text: 'world'}];
}
function run() {
const allSpans = createSpans()
for(const span of spans) {
if(span.text.length === 0) {
continue; // We don't care for empty spans
}
}
}
in this example, the span that we don't care can be GCed after the run
function finish
but if we use generators the empty span can be GCed immediately:
function* createSpans() {
yield {text: 'hello'};
yield {text: ''};
yield {text: 'world'};
}
function run() {
const allSpans = createSpans()
for(const span of spans) {
if(span.text.length === 0) {
continue; // We don't care for empty spans
}
}
}
(there is no reference to the empty span)
Wow! Thanks for your amazing work. I'll review and merge it shortly. |
@rluvaton I am getting troubles with running the bundled .js file (can be reproduced with
I am quite sure it is because of generators usage, as babel tries to compile it to ES5 and fails to include some polyfill for emulation of generators or something... But it's almost 2024 and I am not sure we still need to compile packages to ES5, seems ancient! What do you think if we remove the |
Sounds great, could break older browsers of course but I think it's fine |
use es2016 |
Should I bump the version as well? |
@rluvaton Yes, bump plz to 2.0.0 — as we lose the compatibility with older browsers, we don't want ppl to get autoupdated to it. I'll publish |
You should really use CI/CD for running the test + build + deployed with auto bumping |
I originally used Travis CI for that (there was a CD script, you can see it's remnants in the repo), but it had been breaking occasionally and I didn't have the time to maintain it, so I dropped it — as manually publishing wasn't really an issue. Since we now have Github actions, it would be reasonable to add a GH-based CD, as I guess the maintenance burden will be lower than with Travis. I still don't have the time for that, so if you desire to make a PR, it would be awsome. |
@rluvaton Running the bundled But now I noticed that some tests are failing — could you please investigate and fix whatever is causing that? You could reproduce it by running |
Most of the tests because of missing properties that now always shown and one weird failure (checking now) |
All tests are now passing |
@rluvaton I would recommend re-running the original benchmark posted in your PR description so we can see if any of the final changes impacted it. Awesome work 👏🏼 |
done |
Improve performance and memory consumption drastically
(the benchmark for the after used the parseIterator)
benchmark
Migrated after improving the performance in my colorize ansi chrome extension that bundled this library