Statistical POWER
explained
simple & easy!
A/B Testing
What is power?
The whole idea of statistical testing is to
see if you can reject the null hypothesis
and tell your business partners:
“Hey, the effect is statistically significant.”
That’s not always the case, though.
Sometimes, you can’t reject the null
hypothesis due to “low power”. (Fret not,
we’ll dive into it shortly.)
Power, as the name suggests, is something
you want more of. The higher the power, the better.
Assuming the null hypothesis is true, we make a decision.
[Figure: distribution under the null hypothesis, with a decision line separating "accept null hypothesis" from "reject null hypothesis"]
Type I error rate (false positive) = α = 0.05:
rejecting the null hypothesis when it's actually true.
What if the null
hypothesis is false,
and the distribution
looks like this?
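[Figure: distribution under the alternative hypothesis, with the same decision line separating "accept null hypothesis" from "reject null hypothesis"]
Power = 1 - β:
correctly rejecting the null hypothesis when it’s false.
Type II error rate (false negative) = β:
failing to reject (accepting) the null hypothesis when it's actually false.
We should have rejected (because the null hypothesis is false) but didn’t.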
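To put numbers on α, β and power, here is a minimal Python sketch. It assumes a one-sided z-test with a known standard error, and the true effect of 2.5 standard errors is a made-up value for illustration. The function name `one_sided_power` is hypothetical too.

```python
from statistics import NormalDist

def one_sided_power(effect_in_se: float, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test when the true mean sits
    `effect_in_se` standard errors above the null mean (assumed setup)."""
    std_normal = NormalDist()
    # The decision line: reject the null when the z-score exceeds this value.
    critical_z = std_normal.inv_cdf(1 - alpha)
    # beta = share of the alternative distribution on the "accept" side of the line.
    beta = std_normal.cdf(critical_z - effect_in_se)
    return 1 - beta

power = one_sided_power(2.5)         # made-up effect of 2.5 standard errors
print(f"power = {power:.2f}, beta = {1 - power:.2f}")
```

With these assumed inputs the power comes out at about 0.80. When the effect is 0, the same function simply returns α, which is the Type I error rate.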
We want to make this yellow area bigger,
without moving the decision line to the left,
so that we have more power.
But HOW?
Sample size
affects the spread of the distributions: standard error = std / sqrt(n)
[Figure: wide distributions when the sample size is small vs narrow distributions when the sample size is large]
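To see the standard error formula at work, here is a tiny Python sketch. The standard deviation of 10 and the sample sizes are made-up numbers, and `standard_error` is just an illustrative helper.

```python
import math

def standard_error(std: float, n: int) -> float:
    """Standard error of the mean: std / sqrt(n)."""
    return std / math.sqrt(n)

# Made-up example: same standard deviation, growing sample size.
for n in (100, 400, 1600):
    print(f"n = {n:4d} -> standard error = {standard_error(10.0, n):.2f}")
```

Each time the sample size is multiplied by 4, the standard error is cut in half, so the distributions get narrower and overlap less.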
Effect size
distance between the two means
[Figure: effect size (distance) near 0 vs effect size (distance) is large]
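One way to feel the impact of effect size is to simulate many A/B tests and count how often the null hypothesis gets rejected. The sketch below is only an illustration: it assumes normally distributed metrics, a known standard deviation of 1, 100 users per group, and a one-sided z-test. The function name `rejection_rate` and all the numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def rejection_rate(true_effect, std=1.0, n=100, alpha=0.05, runs=1000, seed=0):
    """Fraction of simulated A/B tests that reject the null
    (one-sided z-test with known std, an assumption for this sketch)."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha)
    se_diff = std * math.sqrt(2 / n)  # standard error of the difference in means
    rejections = 0
    for _ in range(runs):
        control = [rng.gauss(0.0, std) for _ in range(n)]
        variant = [rng.gauss(true_effect, std) for _ in range(n)]
        z = (fmean(variant) - fmean(control)) / se_diff
        if z > critical_z:
            rejections += 1
    return rejections / runs

near_zero = rejection_rate(0.0)  # no real effect: every rejection is a false positive
large = rejection_rate(0.5)      # large effect: most runs reject the null
print(f"effect 0.0 -> rejected in {near_zero:.0%} of runs")
print(f"effect 0.5 -> rejected in {large:.0%} of runs")
```

With no real effect, the rejection rate should hover around alpha. With a large effect, the two distributions barely overlap and almost every run rejects the null, which is exactly what high power means.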
So, what do we make of all this
in an A/B test design?
Make A/B testing reliable
Increase alpha (decrease confidence level): risky
Increase sample size
Increase effect size: we can’t control it
Decrease standard deviation of the sample: we can’t control it
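Since sample size is the lever we can actually pull, a common design step is to estimate how many users each group needs before the test starts. Here is a rough Python sketch using the usual normal-approximation formula for comparing two means. It is not from the slides: the function name `sample_size_per_group`, the two-sided default, and the example numbers (effect 0.2, std 1, power 0.8) are all assumptions for illustration.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect, std, alpha=0.05, power=0.8, two_sided=True):
    """Approximate users needed per group to detect `effect`
    (difference in means) with the given alpha and power.
    Normal approximation, equal group sizes, known std (assumptions)."""
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)  # where the decision line sits
    z_power = std_normal.inv_cdf(power)     # how far past the line the true effect must be
    n = 2 * ((z_alpha + z_power) * std / effect) ** 2
    return math.ceil(n)

n_needed = sample_size_per_group(effect=0.2, std=1.0)
print(f"users per group: {n_needed}")
```

Under these assumptions the answer is 393 users per group. Halving the effect you want to detect roughly quadruples the requirement, which is why tiny effects need very large tests.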