Conversation
FYI, building the string up one at a time is really slow... (I know, not new code). I was curious how slow, so I came up with four alternatives:
For test data without quotes, the times are:
In other words, it really pays off (2x faster) to add the check (if indexof('"') == -1 then append string). But what if the string has a quote (or more)? How much time does it add? For test data with a quote, the times are:
So adding the extra check added ~6% overhead in this case. (which we can make back up by using an incremental chunked approach). Probably worth it since the bulk of strings probably won't have embedded quotes, so we'll probably average 2x faster. Refers to: src/Microsoft.PowerShell.Commands.Utility/commands/utility/CSVCommands.cs:1081 in fe379e2. [](commit_id = fe379e2122f4ee04cccbd7d30fe00d27bbbd3899, deletion_comment = False) |
FYI, here's the optimized code: In reply to: 466544670 [](ancestors = 466544670) Refers to: src/Microsoft.PowerShell.Commands.Utility/commands/utility/CSVCommands.cs:1081 in fe379e2. [](commit_id = fe379e2122f4ee04cccbd7d30fe00d27bbbd3899, deletion_comment = False) |
In fairness, might be better as a separate change, since it's an optimization of existing code, only indirectly related to the new code. In reply to: 466545320 [](ancestors = 466545320,466544670) Refers to: src/Microsoft.PowerShell.Commands.Utility/commands/utility/CSVCommands.cs:1081 in fe379e2. [](commit_id = fe379e2122f4ee04cccbd7d30fe00d27bbbd3899, deletion_comment = False) |
|
@DavidBerg-MSFT Many thanks! It is very interesting! @SteveL-MSFT If your team is interesting to have this in 6.2 please review. |
fe379e2 to
f64493f
Compare
| /// <summary> | ||
| /// Never escape output. | ||
| /// </summary> | ||
| internal static void AppendStringWithEscapeNever(StringBuilder dest, string source) |
There was a problem hiding this comment.
This method isn't actually used. I understand the original intent, but I think just calling .Append() rather than calling this method is preferred. So you should remove this.
|
@DavidBerg-MSFT agree that the code optimization should be a separate PR. @iSazonov can you address the Codacy issues? Having taken a look at the code, the risk is small so we may be able to take this for 6.2-GA, but it won't make it into RC. |
Done. |
|
@iSazonov Looks like there is still 2 minor Codacy updates needed. |
|
@anmenaga First was fixed. Second is false positive. |
|
@SteveL-MSFT Will this be added to 6.2-GA? |
|
Can you ask this in the maintainers teams channel with a link to the PR with your reason why and an analysis of the regression risk? |
|
@TravisEz13 This is only an informational question to the previous Steve's comment. I don't need to have it in 6.2. |
|
We have already forked for 6.2-GA based no RC.1. So, this won't be in the build. |
|
@DavidBerg-MSFT I tested your code on .Net Core 3.0 Preview3 and can not confirm results. In my tests original code is faster. |
|
I looked the code again. StringBuilder delays only on reallocating buffers. We reuse StringBuilder and exclude such re-allocations. Underlying method in StringBuilder copies very efficient by means of indexer for both char and strings. So I can not find how we could speed up the code - IndexOf() slow down the code and my tests confirm this. |
PR Summary
Close #8890
Add new parameter
-UseQuotesto Export-Csv and ConvertTo-Csv cmdlets:The PR doesn't change default to
AsNeededas requested in #8890. We could do this later because it is breaking change and we need to discuss this.The PR doesn't add
StringsOnly. This allows to produce broken output csv files where values contain unquoted delimiter.AsNeededis best choice. If I wrong we can add this later.PR Context
Please look #8890.
Some application don't like quotes in cvs files.
/cc @DavidBerg-MSFT
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.[feature]to your commit messages if the change is significant or affects feature tests