-
Notifications
You must be signed in to change notification settings - Fork 20.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecated: Improve $.trim performance for strings with lots of whitespace #5068
Conversation
61fb761
to
66018e1
Compare
Thanks for the PR. This is adding 63 bytes to the gzipped minified build, we consider that a significant number. As I suggested in the other thread, can you try removing the branch that leverages native To verify the size increase, please run |
Do you mean I should completely remove native trim approach? |
Yes, that was the idea. Are there any issues with this approach? As I understand, your regex changes should resolve the performance issues that you reported. You can also try the version without a native |
Native trim should be faster than any regexp, and I think the recent browsers should not require the regexp. Just in case:
|
Here's a sample benchmark: https://jsbench.me/oml5j9kjfu/1 Note that even the optimized regex still needs to walk over the entire string while the native implementation can walk the string backwards. Results on my machine are: trim 1000 char text like
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the test cases.
I added some comments that should contribute to making this patch much smaller.
It's a bit tricky as this is a deprecated util that should generally not be used at all... The use case presented at jquery/jquery-migrate#461 (comment) says something about a slow JIRA instance. I wonder: would updating jQuery with this patch help here, considering that JIRA would have to update the included jQuery?
That said, I'd be leaning towards accepting this change with my suggestions as trimming is a pretty common operation and I imagine $.trim
must be used a lot, especially in older code. My suggestions should make the size increase significantly down.
Thoughts, @timmywil @dmethvin @gibson042?
I've merged functions into one:
|
I've created
|
Of course, it would definitely help. |
81d37f7
to
0621c3e
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few minor comments, otherwise looks good! I'm also happy where this landed size-wise.
Adding 29 bytes for performance that only matters with pathological input seems like too much to me, but I would be comfortable with still improving performance by orders of magnitude for the earlier 6-byte increase: @@ -15,7 +15,7 @@ define( [
// Support: Android <=4.0 only
// Make sure we trim BOM and NBSP
-var rtrim = /^[\s\uFEFF\xA0]+|[\s\uFEFF\xA0]+$/g;
+var rtrim = /^[\s\uFEFF\xA0]+|([^\s\uFEFF\xA0])[\s\uFEFF\xA0]+$/g;
// Bind a function to a context, optionally partially applying any
// arguments.
@@ -82,6 +82,6 @@ jQuery.isNumeric = function( obj ) {
jQuery.trim = function( text ) {
return text == null ?
"" :
- ( text + "" ).replace( rtrim, "" );
+ ( text + "" ).replace( rtrim, "$1" );
};
} ); |
6051fc7
to
159f9c1
Compare
src/deprecated/support.js
Outdated
|
||
"use strict"; | ||
|
||
support.trim = "" === "\uFEFF\xA0".trim(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I am arguing against the introduction of support.trim
at +29 bytes (and really, even at +10).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me personally the biggest issue with size increases is they creep in, adding up to a meaningful increase over time. In this case, this danger does not exist as this is deprecated code that's already removed in 4.0.0-pre
. That makes +29 bytes more acceptable to me.
BTW, we could still make it smaller by dropping \xA0
- this is actually trimmed properly by Android 4.0, \uFEFF
is not, I just checked. That lowers the size increase to +24 bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Native implementation is still can be an order of magnitude faster, and trim
is used a lot in the wild.
So it looks wrong to default to regexp implementation which is needed for a single stale browser only.
I agree that keeping the dependencies small is important, however, I would say that battery life might be even more important than 29 bytes of extra transfer.
Code size is easy to compare as long as npm prints it. However, it would be nice to have benchmarks that would capture energy consumed by "real-life websites".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you claiming that a real-life website, let alone a significant number of them, are spending so much time in jQuery.trim
that an improvement from O(n²) to O(n) (with real effect of 240x for input with hundreds of medial whitespace characters and 23000x for input with tens of thousands, taking the OP 20 seconds down to about 1 ms) is not sufficient to address the overall experience? I'm not denying that native String.prototype.trim
is astronomically faster still, but how would such sites even get the input data fast enough for that further improvement to make a material difference?
That said, I'm not going to panic over 18 bytes in a legacy branch. If @mgol is in favor, that's fine with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A real site, can and should, of course, call the native trim()
method directly if its product requirements allows that, and could do so, at least for the handful of rare cases where this is known to make a significant difference.
While sites may be embedding a particular jQuery version of a variety of different reasons (e.g. blocked on migration of code, or offer jQuery API to other plugins, or browser support for one of the legacy browsers, such as the case for Wikipedia where the JS payload supports IE11 and Android 5 but not IE9 or Android 4). It's quite likely a consumer could even adopt String-trim unconditionally in its code, if all its clients support ES5, even if jQuery itself maintains wider support.
Thanks for your contribution @vlsi! We discussed this in the meeting and we'd prefer to go with @gibson042's shorter solution and not add |
For reference, I've captured the problematic text (~900KiB). Old regexp takes ~5 seconds. For trivial cases (e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Looks good to me now.
@gibson042 since your "Request changes" review is pending - can you have a look and approve if you have no remarks? This now implements your +6 bytes proposal, as I understand. |
… of whitespace runs Regex imp implementation takes O(N^2) time to trim the string when multiple adjacent spaces were present. The new expression require that the "whitespace run" starts from a non-whitespace to avoid O(N^2) behavior when the engine would try matching "\s+$" at each space position.
Landed, thanks! This change will be included in the upcoming |
…s of whitespaces The old implementation took O(N^2) time to trim the string when multiple adjacent spaces were present. For instance, consider the string "A B" (10 spaces between A and B). Then old regexp /[\s]+$/ would take 10 steps until it realizes the regexp does not match at position 1. Then it would try to match the regexp at position 2, and it would take 9 steps. Then 8 steps for position 3, ... so it would take 10*10/2 steps until it figures out the regexp does not match. See jquery/jquery#5068 The new approach is to require "non-whitespace" char before the whitespace run, so it spends just one step for each space in the string.
…space Regex imp implementation takes `O(N^2)` time to trim the string when multiple adjacent spaces were present. The new expression require that the "whitespace run" starts from a non-whitespace to avoid `O(N^2)` behavior when the engine would try matching `\s+$` at each space position. Closes gh-461 Ref jquery/jquery#5068
The old implementation took O(N^2) time to trim the string when multiple adjacent spaces
were present.
In my case, the issue impacts JIRA which takes ~20 seconds to display an issue where most of the time is spent in
jQuery.trim
.For instance, consider the string "A B" (10 spaces between A and B).
Then old regexp /[\s]+$/ would take 10 steps until it realizes
the regexp does not match at position 1.
Then it would try to match the regexp at position 2, and it would take 9 steps.
Then 8 steps for position 3, ... so it would take 10*10/2 steps until
it figures out the regexp does not match.
The new approach is to require "non-whitespace" char before the whitespace run,
so it spends just one step for each space in the string.