Correct the information about the default encoding, document BOM behavior with respect to UTF-8 and other Unicode encodings #3188
Description
Topics for cmdlets that have an -Encoding parameter (Set-Content, Out-File, ...) need to be updated to reflect that PowerShell Core consistently uses BOM-less UTF-8 as the default encoding.
(The current information pertains to Windows PowerShell only, and, as an aside, is in part incorrect - see #1483).
Additionally, given that many cmdlets have an -Encoding parameter, it is worth considering introducing a new conceptual topic named, say, about_character_encodings, rather than repeating (too much) information in each cmdlet's topic.
In addition to an enumeration of the supported encodings and a description, such a new topic could:
- describe the fundamental differences between the editions with respect to the default encoding in more detail, along with the ramifications.
- provide more guidance, along the lines of the following:
A note re the use of a BOM (a.k.a. Unicode signature) with UTF-8 and other Unicode encodings:
- In Windows PowerShell, -Encoding utf8 invariably creates a BOM (this applies not just to Set-Content, but also to other cmdlets that produce file output, such as Out-File and Export-Csv).
  - Direct use of the .NET Framework is required to create BOM-less UTF-8 files. Note that the .NET Framework's own file-writing APIs have always defaulted to BOM-less UTF-8.
- PowerShell Core creates BOM-less UTF-8 files by default (and also when you explicitly use -Encoding utf8); you can opt to have a BOM created with -Encoding utf8BOM.
For best overall compatibility, BOMs in UTF-8 files should be avoided: Unix platforms, and Unix-heritage utilities that are also used on Windows, generally don't know how to handle them.
Similarly, -Encoding UTF7 should be avoided, because it is not a standard Unicode encoding (and is written without a BOM in both PowerShell editions).
In both PowerShell editions, all other Unicode encodings available with -Encoding do create an (encoding-appropriate) BOM: Unicode (UTF-16LE), bigendianunicode (UTF-16BE), and utf32 (UTF-32).
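To let readers verify the BOM behavior described here for themselves, the topic could include a small checker along these lines. Get-BomName is a hypothetical helper, not a built-in cmdlet, and the sample file name is likewise an assumption. It only inspects the first bytes of a file, so a result of "none" says nothing about which BOM-less encoding the file actually uses.

```powershell
function Get-BomName {
    # Hypothetical helper: reports which Unicode BOM, if any, a file starts with.
    param([Parameter(Mandatory)][string] $LiteralPath)

    # Convert to a full file-system path, because .NET does not know
    # PowerShell's current location.
    $fullPath = Convert-Path -LiteralPath $LiteralPath
    $buffer = New-Object byte[] 4
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $count = $stream.Read($buffer, 0, 4)
    } finally {
        $stream.Dispose()
    }

    # Build a hex string from the bytes that were actually read.
    $hex = ''
    for ($i = 0; $i -lt $count; $i++) { $hex += $buffer[$i].ToString('X2') }

    # Test UTF-32LE before UTF-16LE: its BOM (FF FE 00 00) begins with
    # the UTF-16LE BOM (FF FE).
    if     ($hex -like 'FFFE0000*') { 'UTF-32LE' }
    elseif ($hex -like '0000FEFF*') { 'UTF-32BE' }
    elseif ($hex -like 'EFBBBF*')   { 'UTF-8' }
    elseif ($hex -like 'FFFE*')     { 'UTF-16LE' }
    elseif ($hex -like 'FEFF*')     { 'UTF-16BE' }
    else                            { 'none' }
}

# Usage: write a sample file with -Encoding Unicode, then inspect it.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'bom-sample.txt'
Set-Content -LiteralPath $sample -Value 'hi' -Encoding Unicode
$bom = Get-BomName -LiteralPath $sample
$bom
```

Because the check is a prefix match, a UTF-16LE file whose first character is NUL would be misreported as UTF-32LE; that edge case is rare enough to accept in a sketch like this.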
Version(s) of document impacted
- Impacts 6.next document
- Impacts 6 document
- Impacts 5.1 document
- Impacts 5.0 document
- Impacts 4.0 document
- Impacts 3.0 document
Reason(s) for not selecting all version of documents
- The documented feature was introduced in selected version of PowerShell
- This issue only shows up in selected version of the document