Pass /utf-8 to cl.exe, sanitize encoding#458
Conversation
Use UTF-8 as source and executable encodings for C++ projects that contain UTF-8 source files. Tree 35229a7 fails to build where the default code page is 936 (GBK), due to multiple files containing UTF-8 encoded strings and/or comments, and these projects set to fail-on-warning mode. Additionally, `src/host/ut_host/UnicodeLiteral.hpp` contains a Windows-1252 encoded string. This commit adds the `/utf-8` option to the affected projects, and converts `src/host/ut_host/UnicodeLiteral.hpp` to UTF-8.
| // Licensed under the MIT license. | ||
|
|
||
| #define REMOTE_STRING L"�remote �npipe:pipe=foo,server=bar�\t" | ||
| #define REMOTE_STRING L"–remote “npipe:pipe=foo,server=bar”\t" |
There was a problem hiding this comment.
Are these even the right type of quotes? Especially since on the next line, they expect 'normal' quotes?
There was a problem hiding this comment.
I believe that the quotes are correct. it's possible to ask the console to filter some of the "smart" characters back to their normal ascii equivalents so that they don't break other applications that might not handle them correctly. This particular header is only included in the clipboard unit tests but isn't actually used, I suspect they were part of a test that was removed/lost/not written.
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
| <ItemDefinitionGroup> | ||
| <ClCompile> | ||
| <AdditionalIncludeDirectories>..;$(SolutionDir)src\inc;$(SolutionDir)src\inc\test;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> | ||
| <AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='AuditMode|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions> |
There was a problem hiding this comment.
checking the configuration and platform isn't really necessary for this PR's change since it's the same setting for every combination of configurations and platforms.
| <AdditionalIncludeDirectories>..;$(SolutionDir)src\inc;$(SolutionDir)src\inc\test;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories> | ||
| <AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='AuditMode|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions> | ||
| <AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='Debug|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions> | ||
| <AdditionalOptions Condition="'$(Configuration)|$(Platform)'=='Release|ARM64'">/utf-8 %(AdditionalOptions)</AdditionalOptions> |
There was a problem hiding this comment.
we have a common props file at /src/common.build.pre.props that already defines some additional options for all of the projects in the solution. It might make more sense to enable utf-8 across the whole repo and standardize all of the files on that instead of mixing encodings.
There was a problem hiding this comment.
There are like 200+ C3437: #pragma execution_character_set is not supported when /source-charset, /execution-charset, or /utf-8 has been specified errors jumping out when /utf-8 is applied globally. Most of them seem to result from macros like TRACELOGGING_DEFINE_PROVIDER and TraceLogging*.
There was a problem hiding this comment.
hrmm, I see what you mean. so it sounds like the tracelogging stuff doesn't play nice with the /utf-8 flag. It looks like it's setting the execution_character_set to be utf-8, which we want anyway. I'll need to do some more reading about what's going on here.
There was a problem hiding this comment.
Yeah, it's kind of self-contradictory that the Windows SDK, even the latest versions, uses a compiler directive that has been deprecated since VS 2015...
| // Licensed under the MIT license. | ||
|
|
||
| #define REMOTE_STRING L"�remote �npipe:pipe=foo,server=bar�\t" | ||
| #define REMOTE_STRING L"–remote “npipe:pipe=foo,server=bar”\t" |
There was a problem hiding this comment.
I believe that the quotes are correct. it's possible to ask the console to filter some of the "smart" characters back to their normal ascii equivalents so that they don't break other applications that might not handle them correctly. This particular header is only included in the clipboard unit tests but isn't actually used, I suspect they were part of a test that was removed/lost/not written.
|
Does |
|
I just built the latest master (6c7dfd2) without encountering encoding issues. I'm not clear if some commit into master has resolved this; if this is the case then I think I can close this. |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
It still fails with current(2019-06-14) master(d82eab4) on a Chinese version of Windows whose code page is 936(GBK). Issues got solved by passing these utf-8 options to cl. So I think this commit helps non-english users. |
|
I did this in #1929 for the entire project and deleted UnicodeLiteral.hpp because no one was using it. So I'll close this. |
Use UTF-8 as source and executable encodings for C++ projects that
contain UTF-8 source files.
Tree 35229a7 fails to build where the
default code page is 936 (GBK), due to multiple files containing UTF-8
encoded strings and/or comments, and these projects set to
fail-on-warning mode. Additionally,
src/host/ut_host/UnicodeLiteral.hppcontains a Windows-1252 encodedstring. This commit adds the
/utf-8option to the affected projects,and converts
src/host/ut_host/UnicodeLiteral.hppto UTF-8.