jonathanmfung/ProbTheory.jl

ProbTheory.jl

A script aiming to visualize basic probability theory concepts.

```sh
julia -e "include(\"ProbTheory.jl\");
          LLN();
          CLT();
          BE_binom_heatmap(tight = false);
          BE_binom_heatmap(tight = true);
          BE_binom_slice()"
```

Law of Large Numbers (n = 600) [Ref]

$$\bar{X}_n = \frac{1}{n} (X_1 + \cdots + X_n)$$

$$n → ∞ \Longrightarrow \bar{X}_n → μ$$

The samples are drawn from a Uniform(-100000, 100000) distribution, so $μ = 0$.

./media/LLN.gif
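
As a rough illustration of the statement above, the running sample mean can be computed directly. This is a minimal sketch using only the standard library; `running_mean` is a hypothetical helper and not necessarily how `LLN()` in ProbTheory.jl is implemented.

```julia
using Random

# Running mean of n i.i.d. draws from Uniform(a, b).
# `running_mean` is a hypothetical helper, not a function from ProbTheory.jl.
function running_mean(n; a = -100_000.0, b = 100_000.0, rng = Random.default_rng())
    x = a .+ (b - a) .* rand(rng, n)  # rand gives Uniform[0, 1); rescale to [a, b)
    return cumsum(x) ./ (1:n)         # k-th entry is the mean of the first k draws
end

means = running_mean(600; rng = MersenneTwister(1))
println("mean after 1 draw:    ", means[1])
println("mean after 600 draws: ", means[end])
```

With a range this wide, the mean after 600 draws can still be far from 0 in absolute terms; it is small only relative to the spread of a single draw.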

Central Limit Theorem - Classical (n = 600)

./media/CLT.gif
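
A small sketch of what the classical CLT predicts for the same Uniform(-100000, 100000) samples: the standardized sample mean should have mean near 0 and variance near 1. The helper name `standardized_mean` and the number of repetitions are assumptions for illustration, not taken from ProbTheory.jl; the standard deviation $(b-a)/\sqrt{12}$ is the one derived in the Appendix.

```julia
using Random, Statistics

# Standardized mean of n Uniform(a, b) draws: sqrt(n) * (mean - μ) / σ.
# `standardized_mean` is a hypothetical helper, not part of ProbTheory.jl.
function standardized_mean(n; a = -100_000.0, b = 100_000.0, rng = Random.default_rng())
    μ = (a + b) / 2
    σ = (b - a) / sqrt(12)
    x = a .+ (b - a) .* rand(rng, n)
    return sqrt(n) * (mean(x) - μ) / σ
end

rng = MersenneTwister(1)
z = [standardized_mean(600; rng = rng) for _ in 1:10_000]
println("mean ≈ ", mean(z), ", variance ≈ ", var(z))
```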

Berry-Esseen Theorem - Binomial Case [Ref (Thm 1)]

Theorem 1 Verbatim:

Let $p ∈ (0,1)$ and $n ∈ \mathbb{N}$ and let $F_{n,p}$ denote the distribution function of the binomial distribution with parameters $n$ and $p$. Then we have with $q := 1 - p$

$$\sup_{x ∈ \mathbb{R}} \left|F_{n,p}(x) - Φ\left(\frac{x - np}{\sqrt{npq}}\right) \right| < \frac{\sqrt{10} + 3}{6\sqrt{2π}} ⋅ \frac{p^2 + q^2}{\sqrt{npq}}$$

In the case $p ∈ \left[\frac{1}{3}, \frac{2}{3}\right]$ we even have the sharper inequality

$$\sup_{x ∈ \mathbb{R}} \left|F_{n,p}(x) - Φ\left(\frac{x - np}{\sqrt{npq}}\right) \right| < \frac{3 + |p - q|}{6\sqrt{2π}\sqrt{npq}}$$


This is a special case of the Berry-Esseen Theorem, which states that, for i.i.d. random variables with a finite third absolute moment, the distribution function of the standardized sample mean converges to the standard normal distribution function at a rate of order $\frac{1}{\sqrt{n}}$.

This Binomial variant gives two bounds: the first holds for all $p ∈ (0,1)$, and the second is tighter but holds only for $p$ in the middle third, $\left[\frac{1}{3}, \frac{2}{3}\right]$.

The plots below depict the “error” term between the Binomial and standard normal distribution functions. Essentially, they numerically support Schulz's theorem.

Binomial BE Heatmap (n = 500, tight = false) ./media/BE_binom_heatmap_500_regbound.png

Binomial BE Heatmap (n = 500, tight = true) ./media/BE_binom_heatmap_500_tightbound.png

Binomial BE Slice (n = 20, tight = true) ./media/BE_binom_slice.png
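
The quantities in Theorem 1 can be computed directly. The following is a minimal sketch under stated assumptions, not the code behind the plots: all function names are hypothetical, and $Φ$ is approximated with the Abramowitz-Stegun formula 7.1.26 for erf (absolute error about $1.5 \times 10^{-7}$) purely to avoid external packages. Since $F_{n,p}$ is a step function, the supremum only needs to be checked at the jumps $k = 0, \ldots, n$, from the right and from the left.

```julia
# Standard normal CDF via the Abramowitz-Stegun 7.1.26 approximation of erf.
# This is an approximation chosen to keep the sketch dependency-free.
function Φ(z)
    x = abs(z) / sqrt(2)
    t = 1 / (1 + 0.3275911 * x)
    poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
           t * (-1.453152027 + t * 1.061405429))))
    erf_x = 1 - poly * exp(-x^2)
    return z >= 0 ? 0.5 * (1 + erf_x) : 0.5 * (1 - erf_x)
end

# Binomial(n, p) CDF values F(0), ..., F(n), with the pmf computed in log space
# so that extreme p does not underflow. Requires 0 < p < 1.
function binom_cdf(n, p)
    q = 1 - p
    logC = cumsum([0.0; [log((n - i + 1) / i) for i in 1:n]])  # log binomial coefficients
    pmf = [exp(logC[k + 1] + k * log(p) + (n - k) * log(q)) for k in 0:n]
    return cumsum(pmf)
end

# sup over x of |F(x) - Φ((x - np) / sqrt(npq))|, evaluated at each jump k
# using both F(k) and the left limit F(k - 1).
function sup_error(n, p)
    F = binom_cdf(n, p)
    s = sqrt(n * p * (1 - p))
    err = 0.0
    for k in 0:n
        phi_k = Φ((k - n * p) / s)
        left = k == 0 ? 0.0 : F[k]  # F(k - 1), because F[1] holds F(0)
        err = max(err, abs(F[k + 1] - phi_k), abs(left - phi_k))
    end
    return err
end

# The two right-hand sides of Theorem 1, with q = 1 - p and |p - q| = |2p - 1|.
general_bound(n, p) = (sqrt(10) + 3) / (6 * sqrt(2π)) * (p^2 + (1 - p)^2) / sqrt(n * p * (1 - p))
tight_bound(n, p) = (3 + abs(2p - 1)) / (6 * sqrt(2π) * sqrt(n * p * (1 - p)))

n, p = 20, 0.5
println("sup error:     ", sup_error(n, p))
println("tight bound:   ", tight_bound(n, p))
println("general bound: ", general_bound(n, p))
```

Comparing `tight_bound(n, p) - sup_error(n, p)` over a grid of $p$ values is one way to see where the tight bound holds and where it does not.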

We can also check that none of the errors under the tight bound are negative for $p$ in the middle third, while some are negative for $p$ in the outer two thirds:

```sh
julia -e "include(\"ProbTheory.jl\"); BE_binom_bounds(1000)"
```

For n = 1000 on p = [0.001, 0.999], there are 183 negative differences.

For n = 1000 on p = [0.333, 0.666], there are 0 negative differences.

Appendix

Reference

$$E[X] = ∫_{-∞}^{∞} x f(x) \, dx$$

Using a Uniform Distribution:

$$f(x) = \frac{1}{b-a}, \quad a ≤ x ≤ b$$

Expectations

$$E[X] = ∫_a^b x \frac{1}{b-a} \, dx = \frac{1}{2(b-a)} [x^2]_a^b = \frac{a+b}{2}$$

$$E[X^2] = ∫_a^b x^2 \frac{1}{b-a} \, dx = \frac{1}{3(b-a)} [x^3]_a^b = \frac{(b-a)(b^2 + ba + a^2)}{3(b-a)} = \frac{b^2 + ba + a^2}{3}$$

Checking the Variance

$$Var(X) = E[X^2] - (E[X])^2$$

$$Var(X) = \frac{b^2 + ba + a^2}{3} - \left(\frac{a+b}{2}\right)^2$$

$$Var(X) = \frac{b^2 + ba + a^2}{3} - \frac{a^2+2ab+b^2}{4}$$

$$Var(X) = \frac{4b^2 + 4ba + 4a^2}{12} - \frac{3a^2+6ab+3b^2}{12}$$

$$Var(X) = \frac{b^2 - 2ba + a^2}{12} = \frac{(b-a)^2}{12}$$
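
The closed forms above can be sanity-checked numerically. This is a minimal sketch that approximates the integrals with a midpoint rule; `uniform_expectation` and the number of subintervals `m` are assumptions for illustration, not part of ProbTheory.jl.

```julia
# Midpoint-rule approximation of ∫_a^b g(x) f(x) dx for the Uniform(a, b)
# density f(x) = 1 / (b - a).
# `uniform_expectation` is a hypothetical helper, not part of ProbTheory.jl.
function uniform_expectation(g, a, b; m = 100_000)
    h = (b - a) / m
    return sum(g(a + (i - 0.5) * h) for i in 1:m) * h / (b - a)
end

a, b = -100_000.0, 100_000.0
EX  = uniform_expectation(x -> x, a, b)
EX2 = uniform_expectation(x -> x^2, a, b)
println("E[X]   ≈ ", EX, "   vs (a+b)/2    = ", (a + b) / 2)
println("Var(X) ≈ ", EX2 - EX^2, "   vs (b-a)^2/12 = ", (b - a)^2 / 12)
```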

E[|X|^3]

Assuming $a < 0 < b$, so that the integral splits at $0$:

$$E[|X|^3] = ∫_a^b |x|^3 \frac{1}{b-a} \, dx$$

$$ = \frac{1}{b-a} \left(∫_a^0 -x^3 \, dx + ∫_0^b x^3 \, dx\right)$$

$$ = \frac{1}{b-a} \left(-\frac{1}{4}[x^4]_a^0 + \frac{1}{4}[x^4]_0^b\right)$$

$$ = \frac{1}{b-a} \left(\frac{1}{4}a^4 + \frac{1}{4}b^4\right)$$

$$ = \frac{1}{4} \frac{1}{b-a} (a^4 + b^4)$$
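
A quick numerical check of this closed form, again as a sketch with hypothetical helper names. The closed form is only valid when the interval straddles zero ($a < 0 < b$), and floating-point endpoints are used because `a^4` would overflow 64-bit integers for endpoints as large as 100000.

```julia
# Closed form from the derivation above; valid only for a < 0 < b.
# Both helpers are hypothetical, not functions from ProbTheory.jl.
abs_third_moment(a, b) = (a^4 + b^4) / (4 * (b - a))

# Midpoint-rule approximation of ∫_a^b |x|^3 / (b - a) dx.
function abs_third_moment_numeric(a, b; m = 100_000)
    h = (b - a) / m
    return sum(abs(a + (i - 0.5) * h)^3 for i in 1:m) * h / (b - a)
end

a, b = -100_000.0, 100_000.0
println("closed form: ", abs_third_moment(a, b))
println("numeric:     ", abs_third_moment_numeric(a, b))
```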
