Correct the information about the default encoding, document BOM behavior with respect to UTF-8 and other Unicode encodings #3188

@mklement0

Description

Topics for cmdlets that have an -Encoding parameter (Set-Content, Out-File, ...) need to be updated to reflect that PowerShell Core consistently uses BOM-less UTF-8 as the default encoding.
(The current information pertains to Windows PowerShell only, and, as an aside, is in part incorrect - see #1483).

Additionally, given that many cmdlets have an -Encoding parameter, it is worth considering introducing a new conceptual topic named, say, about_character_encodings, rather than repeating (too much) information in each cmdlet's topic.

In addition to an enumeration of the supported encodings and a description, such a new topic could:

  • describe the fundamental differences between the editions with respect to the default encoding in more detail, along with the ramifications.

  • provide more guidance, along the lines of the following:


A note re the use of a BOM (a.k.a. Unicode signature) with UTF-8 and other Unicode encodings:

  • In Windows PowerShell, -Encoding utf8 invariably creates a BOM (applies not just to Set-Content, but also to other cmdlets that produce file output, such as Out-File and Export-Csv).

    • Direct use of the .NET Framework is required to create BOM-less UTF-8 files. Note that the .NET Framework's file-writing APIs (for example, System.IO.File.WriteAllText and System.IO.StreamWriter) have always defaulted to BOM-less UTF-8.
  • PowerShell Core creates BOM-less UTF-8 files by default (and also when you explicitly use
    -Encoding utf8); you can opt to have a BOM created with -Encoding utf8BOM.
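
To make the byte-level difference concrete, here is a minimal sketch in Python, used only as a neutral way to inspect bytes (it is not part of PowerShell). It assumes that Python's `utf-8-sig` codec yields the same 3-byte signature that `-Encoding utf8BOM` writes, and that plain `utf-8` matches BOM-less output; the sample string is arbitrary.

```python
import codecs

text = "héllo"  # arbitrary sample containing one non-ASCII character

# BOM-less UTF-8: just the encoded characters, no signature.
no_bom = text.encode("utf-8")

# UTF-8 with BOM: the signature EF BB BF precedes the same payload.
with_bom = text.encode("utf-8-sig")

print(no_bom.hex())
print(with_bom.hex())

# The two byte sequences differ only by the leading 3-byte signature.
assert with_bom == codecs.BOM_UTF8 + no_bom
```

A tool that does not know about the signature sees those three extra bytes as content at the start of the file.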

For best overall compatibility, avoid BOMs in UTF-8 files: Unix platforms, and Unix-heritage utilities that are also used on Windows, generally don't know how to handle them.

Similarly, -Encoding UTF7 should be avoided, because it is not a standard Unicode encoding (and is written without a BOM in both PowerShell editions).

In both PowerShell editions, all other Unicode encodings available with -Encoding do create an (encoding-appropriate) BOM: unicode (UTF-16LE), bigendianunicode (UTF-16BE), and utf32 (UTF-32LE).
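
One way to verify which signature a given output file actually carries is to inspect its first bytes. The following Python sketch is illustrative only: the `detect_bom` helper and the `BOMS` table are hypothetical names, and the BOM constants come from Python's standard `codecs` module rather than from PowerShell.

```python
import codecs

# Longer signatures are checked first: the UTF-32LE BOM (FF FE 00 00)
# begins with the UTF-16LE BOM (FF FE), so the order matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

For a file on disk, passing the first four bytes is enough, for example `detect_bom(open(path, "rb").read(4))`, where `path` is a placeholder. A result of `None` means only that no BOM is present, which is the expected outcome for BOM-less UTF-8 and for UTF-7. A UTF-16LE file whose first character is U+0000 would be misreported as UTF-32LE; this sketch does not try to resolve that ambiguity.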

Version(s) of document impacted

  • Impacts 6.next document
  • Impacts 6 document
  • Impacts 5.1 document
  • Impacts 5.0 document
  • Impacts 4.0 document
  • Impacts 3.0 document

Reason(s) for not selecting all versions of documents

  • The documented feature was introduced in a selected version of PowerShell
  • This issue only shows up in a selected version of the document

Metadata

Labels

  • area-core: Area - Microsoft.PowerShell.Core module
  • resolution-duplicate: Status - closed as duplicate issue
