
Why is the speedup of this parallelization of CPU-bound runspaces limited to ~3x? #6965

@alx9r

I'm experimenting with running PowerShell unit tests in parallel. The execution of most tests in my corpus is CPU-bound.

Steps to reproduce

Run the following on a computer with 8 or more cores that are otherwise idle.

$processorCount = [System.Environment]::ProcessorCount

Write-Host "Processor Count: $processorCount"

$sb = {
    function fibonacci {
        param([int]$n)
        [bigint]$a=0
        [bigint]$b=1
        foreach ($x in 0..$n)
        {
            $a,$b = $b,($a+$b)
        }
        $b
    }
    fibonacci 100000
}


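# Baseline: time a single run of the workload in the current runspace.
$t_sync = Measure-Command $sb

# One pool of up to $processorCount runspaces, shared by every run below.
$rsp = [runspacefactory]::CreateRunspacePool(1,$processorCount)
$rsp.Open()

foreach ( $n in 1..$processorCount )
{
    # Create $n PowerShell instances that all draw runspaces from the pool.
    $ps = 1..$n | % { [powershell]::Create().AddScript($sb) }
    $ps | % { $_.RunspacePool = $rsp }

    # Time how long it takes to start all $n invocations.
    $t_begin = Measure-Command {
        $invocation = $ps.BeginInvoke()
    }

    # Poll every 100 ms until every invocation has completed.
    $t_wait = Measure-Command {
        while ( $invocation.IsCompleted -contains $false )
        {
            sleep 0.1
        }
    }

    # speedup = (estimated serial time for $n workloads) / (measured parallel wait time)
    [pscustomobject]@{
        'n '                 = $n
        'BeginInvoke() (ms)' = [int]$t_begin.TotalMilliseconds
        'Wait (ms)'          = [int]$t_wait.TotalMilliseconds
        'speedup'            = (($t_sync.TotalMilliseconds*$n)/$t_wait.TotalMilliseconds) | % { [math]::Round($_,3) }
    }
}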
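
The wait loop above polls every 100 ms, which adds at most about 0.1 s to each measurement. That is small next to multi-second runs and is not expected to change the conclusion. For anyone who wants to rule it out, here is a minimal sketch that blocks on EndInvoke() instead of polling and times the run with a Stopwatch. The `$work` script block and the variable names are hypothetical stand-ins, not part of the original repro; `$work` is deliberately tiny so the sketch finishes quickly.

```powershell
# Hypothetical small workload (stand-in for $sb) so the sketch runs quickly.
$work = { $sum = 0; foreach ($i in 1..100000) { $sum += $i }; $sum }
$n = 2

$pool = [runspacefactory]::CreateRunspacePool(1, $n)
$pool.Open()
try {
    # Pair each PowerShell instance with the IAsyncResult it will get from BeginInvoke().
    $jobs = foreach ($i in 1..$n) {
        $ps = [powershell]::Create().AddScript($work)
        $ps.RunspacePool = $pool
        [pscustomobject]@{ PowerShell = $ps; Handle = $null }
    }

    $sw = [System.Diagnostics.Stopwatch]::StartNew()
    foreach ($job in $jobs) { $job.Handle = $job.PowerShell.BeginInvoke() }

    # EndInvoke() blocks until that invocation completes, so no polling interval is involved.
    $results = foreach ($job in $jobs) { $job.PowerShell.EndInvoke($job.Handle) }
    $sw.Stop()

    "Elapsed (ms): $([int]$sw.Elapsed.TotalMilliseconds)"
}
finally {
    # Release the PowerShell instances and the pool.
    foreach ($job in $jobs) { $job.PowerShell.Dispose() }
    $pool.Dispose()
}
```

Substituting `$sb` for `$work` and looping over `$n` as in the script above should give wait times comparable to the polled ones, minus the polling granularity.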

Expected behavior

I expected the speedup and processor utilization to be somewhat proportional to the number of runspaces as long as the number of runspaces is fewer than the number of cores.

Actual behavior

The actual speedup seems limited to ~3x even on a computer with 16 cores. As the number of runspaces nears the number of cores, the actual processor utilization seems to peak only momentarily and settles to approximately 50%.

Here are the results of a typical run:

Processor Count: 16

n  BeginInvoke() (ms) Wait (ms) speedup
-- ------------------ --------- -------
 1                  1      2910   1.162
 2                 15      2990   2.263
 3                 15      3655   2.776
 4                 38      6884   1.965
 5                 38      8144   2.077
 6                 38      8450   2.402
 7                 44      9712   2.438
 8                 53     11435   2.366
 9                 53     14680   2.074
10                 47     12754   2.652
11                 61     19348   1.923
12                 78     21901   1.853
13                 92     23640    1.86
14                 76     22776   2.079
15                 85     27332   1.856
16                136     29318   1.846
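
As a rough cross-check of the table, the speedup formula from the script, speedup = (t_sync × n) / t_wait, can be inverted to recover the baseline time, and dividing speedup by n gives the fraction of ideal linear scaling achieved. The sketch below does this for four rows copied from the table; the property names `WaitMs`, `Speedup`, and `Efficiency` are labels chosen for this illustration.

```powershell
# Rows copied from the results table above (n, Wait (ms), speedup).
$rows = @(
    [pscustomobject]@{ n = 1;  WaitMs = 2910;  Speedup = 1.162 }
    [pscustomobject]@{ n = 3;  WaitMs = 3655;  Speedup = 2.776 }
    [pscustomobject]@{ n = 8;  WaitMs = 11435; Speedup = 2.366 }
    [pscustomobject]@{ n = 16; WaitMs = 29318; Speedup = 1.846 }
)

$summary = foreach ($r in $rows) {
    [pscustomobject]@{
        n                     = $r.n
        # Invert speedup = (t_sync * n) / t_wait to recover the baseline time.
        'Implied t_sync (ms)' = [math]::Round($r.Speedup * $r.WaitMs / $r.n)
        # Fraction of ideal linear scaling: 1.0 would mean every runspace adds full throughput.
        Efficiency            = [math]::Round($r.Speedup / $r.n, 3)
    }
}

$summary | Format-Table
```

Every row implies a baseline of roughly 3.38 s, so the table is internally consistent. Efficiency falls from about 0.93 at n=3 to about 0.12 at n=16.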

Here is the CPU utilization graph from the end of the above test run. The peak occurred as the n=16 run began. The peak-and-plateau shape seems to be characteristic of every n.

[image: CPU utilization graph]

Environment data

> $PSVersionTable
Name                           Value
----                           -----
PSVersion                      6.1.0-preview.691
PSEdition                      Core
GitCommitId                    v6.1.0-preview.691
OS                             Microsoft Windows 6.3.9600
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Labels: Issue-Discussion, WG-Engine, WG-Engine-Performance
