-
Notifications
You must be signed in to change notification settings - Fork 4.9k
Avoid string.Split allocations in System.Net.Http #7468
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if colonIndex was -1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub The current code seems to assume that the resulting array will have at least 2 elements, so I just mirrored that. I can change it if need be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current code will throw if that's not the case. Yours could result in corrupt results.
We have the type at https://github.com/dotnet/corefx/blob/master/src/Common/src/System/IO/StringParser.cs to avoid needing to do lots of these manual IndexOf operations when converting String.Splits, e.g. instead of: string input = ...;
char separator = ...;
string[] parts = input.Split(separator);
if (parts.Length != 3) return false;
Use(parts[0]);
Use(parts[2]);it can be something like that: string input = ...;
char separator = ...;
var parser = new StringParser(input, separator);
string part0, part2;
if (!parser.MoveNext()) return false;
part0 = parser.ExtractCurrent();
if (!parser.MoveNext()) return false;
if (!parser.MoveNext()) return false;
part2 = parser.ExtractCurrent();
if (parser.MoveNext()) return false;
Use(part0);
Use(part2);Note that I'm not implying we want to use that here; it's more complex than String.Split, and I've not seen evidence that the usage of String.Split you highlight is on any kind of hot path that would demand the extra optimization and complexity. If we did decide it was important, though, I'd prefer such an approach. |
|
@stephentoub The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add exception text, it needs to be localized. This is also changing the exception type and message from what it would have been
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this method already throw InvalidOperationException? If not, we should not be changing behavior by throwing a new exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidsh These methods are internal and shouldn't throw under normal circumstances, though (as @stephentoub commented below). Would you recommend changing these to Debug.Asserts? The code consuming the array seems to assume that it will have at least 2 elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the input for the method come from? If the input strings come from the developer/end-user, then this code path needs exceptions. But if the input is coming from code logic generated internally, then we would use Asserts to validate our algorithm logic.
This PR changed from using .Append to .IndexOf methods. Looking at this, it seems like we still need exceptions. But we can't surface them if they are of a different exception type than before since that would change external behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend you write new tests from the public API to validate that you are not changing behavior (i.e. the types of exceptions being thrown).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidsh I don't think this code is actually being used anywhere else in the repo: https://github.com/dotnet/corefx/search?utf8=%E2%9C%93&q=GetDateTimeString (it's not being used in MailBnfHelper either)
Would it be a good idea to limit my changes to just ContentDispositionHeaderValue for now, and investigate changing/removing this function in another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this code is actually being used anywhere else in the repo
Then what's the value in changing it? We should only be churning code to improve performance in places where it matters, and it doesn't matter in code that's not being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub OK then, I'll remove the changes to everything except ContentDispositionHeaderValue so this PR can get through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change removed a newline from the end of the file. The newline needs to be put back.
|
I've reduced the changes in this PR to just |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any tests that hit this? I don't see any in either the functional or unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephentoub Added tests in a new commit. Let me know if/when this is ready to be merged, and I'll squash it.
|
Thanks, @jamesqo. LGTM. |
…et/corefx#7468) * Avoid string.Split allocations in ContentDispositionHeaderValue * Add tests for changes in ContentDispositionHeaderValue Commit migrated from dotnet/corefx@5bb6088
Similar changes that were made in the
FrameworkNamePR in #7288, except for System.Net.Http. Also switched a couple of places to usenameof.@stephentoub and @davidsh, I found another place that could be optimized similarly to not use
string.Splithere. Do you think it's worth it? It would lead to a lot of boilerplate code since this is a 5-element array, but would also help us avoid a couple ofSubstringallocations in addition to theSplitones, sinceparts[0],parts[4], andparts[2]wouldn't actually have to be used.