8
Vertrouensintervalle
Confidence Intervals
A statistics professor dies and so the test scheduled for that day is cancelled.
A student rings the department at 5 minute intervals to ask if the test is on.
The guy answering the phone asks him, "Why the bloody hell are you ringing
so often? I've told you 16 times the professor has passed away! What are
you doing, some sort of research, are you experimenting on me? What the
bloody hell is it?"
"Nah," the student replies, "no research. I just like to hear you say it."
Inferensie
Inference
Hierdie kwartaal stel ons belang in wat in die populasie
gebeur.
This term we are interested in knowing what is going on in
the population.
15% van ‘n steekproef sal ‘n produk koop, hoeveel in die algemeen?
15% of a sample will buy a product, how many in general?
20 blikkies is gevul met gemiddeld 500ml, wat van alle blikkies?
20 cans are filled with an average of 500ml, what about all cans?
Beweeg vanaf bekende steekproef inligting, na onbekende
populasie hoeveelhede.
Move from known sample information, to unknown
population quantities.
Statistieke (Latyn) Parameters (Grieks)
Statistics (Latin) Parameters (Greek)
Ons weet vooraf dat ons nie 100% korrek kan wees nie.
We know beforehand that we cannot be 100% correct.
Gebruik steeds steekproewe as ons beginpunt.
We shall still use samples as our starting point.
Ons gaan byvoorbeeld ‘n steekproef van 20 Coke blikkies
gebruik om iets te probeer sê oor alle Coke blikkies.
We shall, for example, use a sample of 20 Coke cans to try
and say something about all Coke cans.
Verdelings behandel in die vorige hoofstukke gaan gebruik
word om die fout wat ons weet ons gaan maak te
kwantifiseer en te probeer beheer.
Distributions covered in the previous chapters will be used
to quantify and to try and control the errors that we
know we’ll make.
Die inferensie tegnieke wat ons gaan gebruik is:
The inferential techniques that we’ll be using are:
Vertrouensintervalle (Hoofstuk 8)
Confidence Intervals (Chapter 8)
Gee vir ons ‘n interval wat ons verwag die populasie
parameter sal insluit.
Gives us an interval that we expect includes the
population parameter.
Hipotese Toetse (Hoofstukke 9, 10)
Hypothesis Testing (Chapters 9, 10)
Toets ‘n spesifieke stelling oor ‘n populasie parameter.
Test a specific statement about a population parameter.
Vertrouensintervalle
Confidence Intervals
Vorige statistieke waarna ons gekyk het word ook
puntberamers genoem.
Statistics that we looked at previously are also called point
estimates.
Gebaseer op die huidige steekproef is dit my “beste”
beramer van die populasie parameter.
Based on the current sample this value is my “best”
estimate of the population parameter.
Probleem met ‘n punt beramer is dat ons vooraf weet dat
dit nooit die regte waarde kan wees nie.
Problem with a point estimate is that we know beforehand
that it can never be the correct value.
20 blikkies is dalk gevul met gemiddeld 501ml, alle blikkies
sal nooit ook met presies 501ml gevul wees nie.
20 cans might be filled with an average of 501ml, but all
cans will never be filled with exactly 501ml.
‘n Interval van moontlike waardes sal ‘n beter idee gee van
wat die populasie parameter is, [495 ; 505].
An interval of possible values will give us a better idea of
what the population parameter is, [495 ; 505].
Die interval gebruik steeds steekproef inligting, so as ons ‘n
“snaakse” steekproef het mag die interval steeds nie
baie werd wees nie.
The interval still uses sample information, so if we have a
“funny” sample the interval might still not be of much
use.
‘n Vertrouensinterval is afhanklik van drie waardes:
A confidence interval depends on three values:
1) Die puntberamer
The point estimate
2) Die verdeling van die beramer
The distribution of the estimator
3) Die variasie van die beramer
The variation of the estimator
Die steekproefverdelings wat in hoofstuk 7 gedoen is vorm
die basis vir vertrouensintervalle.
The sampling distributions discussed in Chapter 7 form the
basis of confidence intervals.
Vir hierdie kursus is die wiskunde agter die intervalle nie
belangrik nie, eerder die gebruik en interpretasie.
For this course the mathematics behind the intervals is
not important; what matters is their use and interpretation.
Vertrouensinterval vir μ, σ bekend
Confidence interval for μ, σ known
Indien σ bekend is kan ons die steekproefverdeling van X̄
gebruik om ‘n vertrouensinterval op te stel.
If σ is known we can use the sampling distribution of X̄ to
construct a confidence interval.
Drie dele van die interval:
Three parts of the interval:
$$\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
X̄: Steekproefgemiddelde (puntberamer). Bekend of bereken.
X̄: Sample average (point estimate). Known or calculated.
z_{α/2}: Waarde uit die normaal tabel wat ‘n aanduiding is
van die vertroue wat ons verlang. Lees af uit die normaal tabel (sakrekenaar).
z_{α/2}: Value from the normal table that gives an indication
of the confidence that we want in the interval. Read from the normal table (calculator).
σ/√n: Variasie (standaardfout) van die beramer. Bekend.
σ/√n: Variation (standard error) of the estimator. Known.
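The three parts above can be combined in a few lines of code. The following is a minimal sketch, assuming the usual interval X̄ ± z·σ/√n; the function name `z_interval` and the value σ = 8 ml are hypothetical, chosen only to illustrate the Coke can setting with 20 cans and a sample average of 501 ml.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided confidence interval for the mean when sigma is known."""
    # Value from the standard normal distribution for the chosen confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Variation of the estimator (standard error), scaled by z
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical numbers: 20 cans, sample average 501 ml, assumed sigma = 8 ml
lower, upper = z_interval(xbar=501, sigma=8, n=20)
print(f"95% interval: [{lower:.2f} ; {upper:.2f}]")
```

The call to `NormalDist().inv_cdf` plays the role of reading the normal table.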
Betekenispeil
Level of significance
Dit is die fout wat ons bereid is om te maak.
It is the error we are willing to make.
α = 0.05: bereid om ‘n 5% fout te maak, dus ons wil
95% korrek wees.
α = 0.05: willing to make a 5% error, so we want to
be 95% correct.
Kan teoreties α = 0 gebruik, dit is egter nie van veel
praktiese waarde nie.
We could theoretically use α = 0, but this is of little
practical use.
Ons is dus bereid om wel ‘n seker fout te aanvaar sodat
ons ‘n prakties betekenisvolle interval kan kry.
We are willing to accept a certain error to get a practically
meaningful interval.
Tipies kies ons:
Typically we select:
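As a minimal sketch, assuming the conventional choices α = 0.10, 0.05 and 0.01 (an assumption, not taken from the course material), the matching values from the normal table can be computed as follows. The helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha):
    """Two-sided standard normal value that leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Assumed conventional levels of significance
for alpha in (0.10, 0.05, 0.01):
    confidence = 100 * (1 - alpha)
    print(f"alpha = {alpha:.2f}  confidence = {confidence:.0f}%  z = {z_critical(alpha):.3f}")
```

A smaller α gives a larger z and therefore a wider interval, which is the trade-off between error and practical usefulness described above.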
EXAMPLE 8-1 (p. 255)
Steekproefgrote
Sample size
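Wydte van ‘n interval word bepaal deur
The width of an interval is determined by
$$E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
(die interval is X̄ ± E / the interval is X̄ ± E)
indien ons dus weet wat σ en E is kan ons n oplos.
so if we know what σ and E are, we can solve for n.
E word die steekproeffout genoem.
E is called the sampling error.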
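Om n te bereken:
To calculate n:
$$n = \left( \frac{z_{\alpha/2}\, \sigma}{E} \right)^2$$
met E die steekproeffout (die interval is X̄ ± E).
with E the sampling error (the interval is X̄ ± E).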
EXAMPLE 8-2 (p. 257)
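The sample size calculation can be sketched in code. This is an illustration under stated assumptions: it uses the standard relation n = (z·σ/E)², rounds up to the next whole observation, and the names `sample_size_mean` and `error`, as well as the numbers σ = 8 and E = 2, are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(sigma, error, alpha=0.05):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the sampling error."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Round up, because n must be a whole number of observations
    return ceil((z * sigma / error) ** 2)


# Hypothetical numbers: sigma = 8 ml, desired sampling error of 2 ml
print(sample_size_mean(sigma=8, error=2))
```

Halving the sampling error roughly quadruples the required sample size, because the error enters the formula squared.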
Vertrouensinterval vir μ, σ onbekend
Confidence interval for μ, σ unknown
Indien σ (populasie standaard afwyking) onbekend is moet
ons die ooreenstemmende steekproef waarde, S, gebruik
om dit eers te beraam.
If σ (population standard deviation) is unknown we have to
use the corresponding sample value, S, to first estimate
it.
Ongelukkig is (X̄ − μ)/(S/√n) nie normaal verdeel nie.
Unfortunately (X̄ − μ)/(S/√n) is not normally distributed.
Indien ons dus σ met S vervang kan ons nie meer die
normaal tabel gebruik om die interval op te stel nie.
If we replace σ with S we can no longer use the normal
table to construct the interval.
Gelukkig is daar ‘n ander verdeling wat ons kan gebruik,
die sogenaamde t-verdeling.
Luckily we can use a different distribution, the so-called
t-distribution.
[Figuur / Figure: t-verdeling teenoor normaalverdeling / t-distribution compared with the normal distribution]
(n – 1) is die vryheidsgrade van die t-verdeling en is ‘n
ekstra waarde wat ons moet gebruik indien ons waardes
vanaf die t-tabel wil lees.
(n – 1) is the degrees of freedom of the t-distribution and
is an extra value that we have to use when reading
values from the t-table.
Die vertrouensinterval indien ons nie σ ken nie is:
The confidence interval if σ is unknown is:
$$\bar{X} \pm t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}}$$
EXAMPLE 8-5 (p. 264)
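To illustrate the interval when σ is unknown, here is a minimal sketch using only the Python standard library. Since the standard library has no t-table, the t value is found numerically (Simpson's rule for the area under the density, then bisection); this is one possible approach, not the method of the course, and all function names and the numbers S = 8, n = 20 are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, using Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Value t with P(T <= t) = p, for p > 0.5, found by bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xbar, s, n, alpha=0.05):
    """Two-sided confidence interval for the mean when sigma is unknown."""
    t = t_quantile(1 - alpha / 2, n - 1)  # n - 1 degrees of freedom
    half_width = t * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical numbers: 20 cans, sample average 501 ml, sample S = 8 ml
lower, upper = t_interval(xbar=501, s=8, n=20)
print(f"95% interval: [{lower:.2f} ; {upper:.2f}]")
```

For the same numbers the t-based interval is wider than the one based on the normal table, reflecting the extra uncertainty from estimating σ with S.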
Vertrouensinterval vir die variansie (σ²)
Confidence interval for the variance (σ²)
EXAMPLE 8-6 (p. 269)
Slegs tweekantige intervalle
Only two-sided intervals
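A two-sided interval for the variance can be sketched as follows. The sketch assumes the standard textbook form based on the chi-square distribution with n − 1 degrees of freedom, ((n − 1)S²/χ²_upper ; (n − 1)S²/χ²_lower), and a normally distributed population; the course formula is not shown here, so treat this as an assumption. The chi-square values are computed numerically instead of being read from a table, and all function names and numbers are hypothetical.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable, via the incomplete gamma series."""
    if x <= 0:
        return 0.0
    a, z = df / 2, x / 2
    term = 1 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_quantile(p, df):
    """Value x with P(X <= x) = p, found by bisection."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:  # widen the bracket until it contains the answer
        hi *= 2
    for _ in range(80):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def variance_interval(s2, n, alpha=0.05):
    """Two-sided confidence interval for the population variance."""
    df = n - 1
    upper_q = chi2_quantile(1 - alpha / 2, df)  # large value gives the lower limit
    lower_q = chi2_quantile(alpha / 2, df)      # small value gives the upper limit
    return df * s2 / upper_q, df * s2 / lower_q


# Hypothetical numbers: n = 20 cans, sample variance S^2 = 64
lower, upper = variance_interval(s2=64, n=20)
print(f"95% interval for the variance: [{lower:.2f} ; {upper:.2f}]")
```

Unlike the intervals for the mean, this interval is not symmetric around the point estimate S².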
Vertrouensinterval vir die proporsie (π)
Confidence interval for the proportion (π)
EXAMPLE 8-7 (p. 271)
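For the proportion, a minimal sketch under the usual large-sample normal approximation is p ± z·√(p(1 − p)/n), where p is the sample proportion. The approximation, the function name `proportion_interval` and the sample size n = 200 are assumptions for illustration; the 15% figure echoes the product example from the introduction.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p_hat, n, alpha=0.05):
    """Approximate two-sided confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Estimated standard error of the sample proportion
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical numbers: 15% of a sample of 200 people will buy the product
lower, upper = proportion_interval(p_hat=0.15, n=200)
print(f"95% interval: [{lower:.3f} ; {upper:.3f}]")
```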
Proporsie / Proportion
Wydte van ‘n interval word bepaal deur
The width of an interval is determined by
$$E = z_{\alpha/2} \sqrt{\frac{\pi (1 - \pi)}{n}}$$
indien ons ‘n idee het wat π en E is kan ons n oplos.
so if we have an idea of what π and E are, we can solve for
n.
E word die steekproeffout genoem.
E is called the sampling error.
Om n te bereken:
To calculate n:
$$n = \frac{z_{\alpha/2}^2\, \pi (1 - \pi)}{E^2}$$
met E die steekproeffout.
with E the sampling error.
EXAMPLE 8-8 (p. 272)
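The sample size for a proportion can be sketched in the same way as for the mean. The sketch assumes the standard relation n = z²·π(1 − π)/E², rounded up; the function name `sample_size_proportion`, the parameter name `pi_guess` and the numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(pi_guess, error, alpha=0.05):
    """Smallest n for which the sampling error of a proportion is at most `error`."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Round up, because n must be a whole number of observations
    return ceil(z ** 2 * pi_guess * (1 - pi_guess) / error ** 2)


# Hypothetical numbers: prior idea that pi is about 0.15, sampling error of 0.03
print(sample_size_proportion(pi_guess=0.15, error=0.03))

# With no idea of pi, 0.5 gives the largest (most cautious) sample size
print(sample_size_proportion(pi_guess=0.5, error=0.03))
```

Using 0.5 as the guess is a common cautious choice, because π(1 − π) is largest there.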
Voorspellingsinterval
Prediction interval
EXAMPLE 8-9 (p. 275)
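As a point of reference, one common textbook form of a prediction interval for a single new observation is given below. It assumes a normally distributed population with unknown σ and may differ from the form used in the course.

```latex
\bar{X} \pm t_{\alpha/2,\, n-1}\, S \sqrt{1 + \frac{1}{n}}
```

Compared with the confidence interval for μ, the extra 1 under the square root accounts for the variation of the new observation itself, so a prediction interval is always wider.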