Skip to content

Sort-Object inappropriately sacrifices stability for performance #7846

@IISResetMe

Description

@IISResetMe

PR #2518 by @KirkMunro adds the ability to select the top or bottom N items of a sorted set, introducing stability to the sorting method. However, this behavior is undermined whenever N equals or exceeds the input size.

This deviance can be observed when sorting by congruence modulo for example:

Steps to reproduce

1..20 |Sort-Object {$_ % 3} -Top 20

Expected behavior

3
6
9
12
15
18
1
4
7
10
13
16
19
2
5
8
11
14
17
20

Actual behavior

18
3
15
6
12
9
1
16
13
10
7
4
19
11
8
14
5
17
2
20

The culprit seems to be the last two conditions in this branch: https://github.com/KirkMunro/PowerShell/blob/master/src/Microsoft.PowerShell.Commands.Utility/commands/utility/sort-object.cs#L247

            // If -Top & -Bottom were not used, or if -Top or -Bottom would return all objects, invoke
            // an in-place full sort
            if ((Top == 0 && Bottom == 0) || Top >= dataToProcess.Count || Bottom >= dataToProcess.Count)
            {
                sortedItemCount = FullSort(dataToProcess, comparer);
            }
            // Otherwise, use an indexed min-/max-heap to perform an in-place sort of all objects
            else
            {
                sortedItemCount = Heapify(dataToProcess, comparer);
            }

Suggested fix would be to either add a -Stable switch parameter that forces Sort-Object to use the stable min/max heap sort implementation, or change the branch above to only default to FullSort() when -Top/-Bottom are omitted completely.

My personal preference would be to go for both changes - -Stable would be nice for when you don't know input size up front but want to maintain input order for equal items

Environment data

> $PSVersionTable

Name                           Value
----                           -----
PSVersion                      6.1.0
PSEdition                      Core
GitCommitId                    6.1.0-57-g0f0c46dfe51440470b86f67b258e14aea32fa0aa
OS                             Microsoft Windows 10.0.17134 
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    Issue-Discussionthe issue may not have a clear classification yet. The issue may generate an RFC or may be reclassifWG-Cmdlets-Utilitycmdlets in the Microsoft.PowerShell.Utility module

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions