Normal distribution: generation algorithm #2189

@arthurits

I've been using the functions RandomNormal from DataGen.cs and Population from Population.cs to visualize the distribution of a given variable:

public Population(Random rand, int pointCount, double mean = .5, double stdDev = .5)
{
    values = DataGen.RandomNormal(rand, pointCount, mean, stdDev);
    Recalculate();
}

I'm no expert in statistical algorithms, but it seems that both functions rely on RandomNormalValue, which implements a variant of the Box-Muller algorithm to generate Gaussian numbers:

public static double RandomNormalValue(Random rand, double mean, double stdDev, double maxSdMultiple = 10)
{
while (true)
{
double u1 = 1.0 - rand.NextDouble();
double u2 = 1.0 - rand.NextDouble();

However, the generated Gaussian data seems to be systematically biased low, as shown below: the grey vertical line marks the mean, and the bell curve appears to be slightly offset to the left.
image

A Wikipedia article explains the Box-Muller algorithm. It states:
"Suppose U1 and U2 are independent samples chosen from the uniform distribution on the unit interval (0, 1)."
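
For reference, the basic form of the transform maps such a pair to two independent standard normal values. The symbols Z_0, Z_1, \mu and \sigma are notation chosen here for illustration, with \mu and \sigma standing for the mean and stdDev parameters:

```latex
\begin{aligned}
Z_0 &= \sqrt{-2 \ln U_1}\,\cos(2\pi U_2), \\
Z_1 &= \sqrt{-2 \ln U_1}\,\sin(2\pi U_2), \\
X   &= \mu + \sigma Z_0 \quad \text{(or } \mu + \sigma Z_1 \text{)}.
\end{aligned}
```

The logarithm is the reason the endpoints matter: \ln U_1 is finite only for U_1 > 0.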

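Random.NextDouble() returns values in [0, 1), so it can return exactly 0. The current computation, 1.0 - rand.NextDouble(), therefore produces values in (0, 1], and both u1 and u2 will occasionally be exactly 1, which lies outside the open interval (0, 1).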
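
To make the endpoint behaviour concrete, here is a minimal sketch of what the radial term of the transform does at each end of the interval. It assumes C# 9 top-level statements, and RadialTerm is a hypothetical helper name used only for this illustration, not part of DataGen.cs:

```csharp
using System;

// Radial term of the Box-Muller transform: sqrt(-2 ln u1).
// RadialTerm is a hypothetical helper, not a ScottPlot function.
double RadialTerm(double u1) => Math.Sqrt(-2.0 * Math.Log(u1));

// 1.0 - NextDouble() lies in (0, 1], so u1 == 1 is reachable.
double upperEdge = RadialTerm(1.0);  // ln(1) = 0, so the radius is 0

// Plain NextDouble() lies in [0, 1), so u1 == 0 would be reachable.
double lowerEdge = RadialTerm(0.0);  // ln(0) = -infinity, so the radius is infinite

Console.WriteLine(upperEdge == 0.0);                      // True
Console.WriteLine(double.IsPositiveInfinity(lowerEdge));  // True
```

In other words, u1 == 1 yields a sample exactly equal to the mean, while u1 == 0 would yield a non-finite sample. The (0, 1] range avoids the logarithm of zero, although it admits the endpoint 1 that the Wikipedia statement excludes.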

Therefore, I've been using the following modification of RandomNormalValue:

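public double SampleGaussian(Random random, double mean, double stdDev)
{
    // Draw two uniform samples from the open interval (0, 1)
    double u1 = NextDouble(random);
    double u2 = NextDouble(random);

    // Box-Muller transform
    double y1 = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Sin(2.0 * Math.PI * u2);    // Math.Cos is also fine
    return mean + stdDev * y1;

    // The parameter is named rng so that it does not shadow the outer "random" parameter
    double NextDouble(Random rng)
    {
        // Next(1, Int32.MaxValue) includes 1 and excludes Int32.MaxValue, so the quotient lies strictly between 0 and 1
        return ((double)rng.Next(1, Int32.MaxValue)) / Int32.MaxValue;
    }
}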

This modification produces a Gaussian curve that is visibly better centered on the mean:
image
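
A numerical check can complement the visual comparison. The following is a minimal sketch, assuming C# 9 top-level statements, that draws many samples with the SampleGaussian logic above and compares the sample mean and standard deviation with their targets. The seed, sample count, and the names NextOpenDouble, targetMean and targetStdDev are arbitrary choices made for this illustration:

```csharp
using System;

// Uniform sample in the open interval (0, 1), as in SampleGaussian above.
double NextOpenDouble(Random rng) => (double)rng.Next(1, int.MaxValue) / int.MaxValue;

// Box-Muller sample with the requested mean and standard deviation.
double SampleGaussian(Random rng, double mean, double stdDev)
{
    double u1 = NextOpenDouble(rng);
    double u2 = NextOpenDouble(rng);
    double z = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Sin(2.0 * Math.PI * u2);
    return mean + stdDev * z;
}

// Hypothetical check parameters, chosen only for this sketch.
const int count = 1_000_000;
const double targetMean = 0.5;
const double targetStdDev = 0.5;
var rand = new Random(12345);

double sum = 0.0, sumSq = 0.0;
for (int i = 0; i < count; i++)
{
    double x = SampleGaussian(rand, targetMean, targetStdDev);
    sum += x;
    sumSq += x * x;
}

double sampleMean = sum / count;
double sampleStdDev = Math.Sqrt(sumSq / count - sampleMean * sampleMean);

Console.WriteLine($"sample mean   = {sampleMean:F4} (target {targetMean})");
Console.WriteLine($"sample stdDev = {sampleStdDev:F4} (target {targetStdDev})");
```

Running the same loop against DataGen.RandomNormalValue would show whether the offset seen in the plot is also present in the sample mean, or whether it only appears in the plot.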
