LEE
NOTES
INTRODUCTION TO STATISTICS
STATISTICS
* The science of collecting, organizing, analyzing, and interpreting data.
* Answers statistical questions.
* The task of making sense of the information contained in the data to fully understand the phenomena you are studying.
Descriptive Statistical Analysis
* The task of organizing, manipulating, and describing a set of data.
* It involves the computation of elementary descriptive measures such as averages, frequencies, and percentages.
* It also involves graphic portrayals such as pie charts, line charts, and bar graphs.
* Raw Data -> Descriptive Statistics
Inferential Statistical Analysis
* The process of making inferences from data.
* There are 5 types of statistical inference: Estimation, Decision-Making (Comparisons), Association (Relationships), Data Reduction, and Classification.
Classification: Ripe / Not Ripe
Identification: Disease / Not Disease
Prediction: e.g., in 12 months
Types of Statistical Analysis
* Descriptive: What are the characteristics of the respondents?
* Inferential: What are the characteristics of the population?
* Differences: Are two or more groups the same or different?
* Associative: Are two or more variables related in a systematic way?
* Predictive: Can we predict one variable if we know one or more other variables?
Surveys, interviews, etc.; experiments
Inferential statistics generalize information from a random sample to a population.
(Diagram: a sample drawn from a population of 10,0000.)
Descriptive statistics do not generalize, because they deal with the population itself.
(Diagram: population of 10,0000.)
The Scientific Method / Engineering Method
The method that is applicable in any situation where a problem is to be solved by collecting data in order to make inferences about some population (or universe) based upon the information contained in a sample from that population.
(Diagram: sample information is used to infer population characteristics through estimates, comparisons, relationships, data reduction, and classifications.)
The elements of the scientific method
* Individuals: objects that are described by data.
* Variable: a phenomenon or characteristic of the individuals.
* Data: information collected about the variable for the individuals after observation or measurement; usually numbers, but at times categories.
* Population: all individuals of interest; the entire group of individuals that a researcher wants to research or learn more about. This is the group that researchers make inferences about.
* Sample: a subgroup of individuals from the population of interest. This is the group that data is actually collected from, the group whose data the inferences are based upon.
* Census: an attempt to include the entire population of individuals in the sample.
Reliability
The probability that a statistical inference about a population, based upon a sample, will be correct.
Getting Data Worth Analyzing
Information is gathered in the form of samples, or collections of observations.
2 ways to organize the description of data:
* Graphically
* Numerically
  * Statistics: numerical descriptors that describe a sample.
  * Parameters: numerical descriptors that describe a population.
2 situations where samples are collected:
* OBSERVATIONAL STUDIES: data collected without manipulating or affecting the situation.
* EXPERIMENTS: data collected where the researcher manipulates the situation to determine the effect upon individuals.
STUDY DESIGN
* Experimental studies (intervention of the researcher):
  * Randomised controlled trials (RCTs)
  * Non-randomised controlled trials
* Observational studies (subjects are observed; no action from the researcher, only observation of what happens):
  * Cohort studies
  * Case-control studies
  * Cross-sectional studies
  * Ecological studies
  Ex.: survey sampling, interviews, or questionnaires.
EXPERIMENTAL STUDIES
Test: control group vs. experimental group.
* Independent variable: amount of water (+ water for the experimental group, - water for the control group).
* Dependent variable: fraction of seeds that sprout.
* Explanatory variable: the variable manipulated by the researcher in an experiment (also called the independent variable).
* Placebo treatment: considered an option.
* RESPONSE: the outcome variable in an experiment, where changes in this variable are measured and analyzed (dependent variable).
* CONFOUNDING VARIABLE: a variable that may affect the response in addition to the explanatory variable. Good experiments attempt to control these variables in the way the samples are selected.
(Diagram: a confounding (third) variable acts on both the independent variable and the dependent variable, producing a correlation between them.)
Confounding variables (aka third variables) are variables that the researcher failed to control or eliminate, damaging the internal validity of an experiment.
Minimizing the effects of confounding variables
A well-planned experimental design and constant checks will filter out the worst confounding variables. For example, randomizing groups, utilizing strict controls, and sound operationalization practice all contribute to eliminating potential third variables.
A confounding variable, also known as a third variable or, loosely, a mediator variable, influences both the independent variable and the dependent variable. Being unaware of or failing to control for confounding variables may cause the researcher to analyze the results incorrectly. The results may show a false correlation between the dependent and independent variables, leading to an incorrect rejection of the null hypothesis.
CHARACTERISTICS OF A CONFOUNDING VARIABLE
There are 3 conditions that must be present for confounding to occur:
1. The confounding factor must be associated with both the risk factor of interest and the outcome.
2. The confounding factor must be distributed unequally among the groups being compared.
3. A confounder cannot be an intermediary step in the causal pathway from the exposure of interest to the outcome of interest.
(Diagram: Exposure -> Outcome, with a confounding variable linked to both. Example: physical inactivity -> heart disease, with age as the confounder, since older people exercise less. What if the groups differ in age?)
SAMPLING DESIGN
4.1 Random sampling
We have previously seen that the term target population represents the collection of units (people, objects, etc.) in which we are interested. In the absence of time and budgetary constraints we conduct a census, that is, a total enumeration of the population. Its advantage is that there is no sampling error, because all population units are observed and so there is no estimation of population parameters. Due to the large size, N, of most populations, an obvious disadvantage with a census is cost, so it is often not feasible in practice. Even with a census, non-sampling error may occur, for example if we have to resort to using cheaper (hence less reliable) interviewers who may erroneously record data, misunderstand a respondent, etc.
Types of error
* Sampling error.
* Non-sampling error: this sort of error is quantified, typically, through separate investigation. We distinguish between 2 sorts of non-sampling error:
  * Selection bias: the sampling frame not being equal to the target population.
  * Response bias: the actual measurements might be wrong, for example because of ambiguous question wording, misunderstanding of a word in a questionnaire, or sensitivity of the information which is sought.
* Probability sampling means every population element has a known, non-zero probability of being selected in the sample.
* Simple random sampling (SRS): each element in the population has a known and equal probability of selection.
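Simple random sampling can be sketched in a few lines of Python. This is only an illustration under assumptions: the population of 100 unit labels, the sample size of 10, the seed, and the helper name `simple_random_sample` are all hypothetical and do not come from the notes.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; each unit has the same chance n/N of selection."""
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    return rng.sample(list(population), n)


# Hypothetical sampling frame: unit labels 1..100 (N = 100).
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=1)

# Under SRS without replacement every unit has inclusion probability n/N.
inclusion_probability = len(sample) / len(population)
print(sample)
print(inclusion_probability)
```

Because `random.Random.sample` draws without replacement, no unit can appear twice, which matches the idea of selecting distinct elements from the population.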
Descriptive Statistics
* Provide information by organizing and summarizing.
* Describe the nature of a sample.
* Measures that best characterize a frequency distribution.
* These measures describe scores that group around a central value.
Two sets of numerical descriptors of quantitative data:
1. Measures of Central Tendency are numbers that measure the central location of the data set. Oftentimes they are referred to as "measures of location" or "measures of typical". Three different measures of central tendency will be presented, compared, and contrasted in this section: mean, median, and mode. The mean, median, and mode are each a type of average.
2. Measures of Variability are numbers that measure the amount of variability or dispersion present in a set of data. Conversely, these statistics also measure the amount of consistency present in a data set, since variability and consistency are opposing characteristics in a data set. Oftentimes they are referred to as "measures of spread". Three different measures of variability will be presented, compared, and contrasted in this section: range, variance, and standard deviation.
Measures of central tendency
* The mean of a quantitative set of data is defined as the arithmetic mean.
* The mode is the most frequently occurring category. A frequency distribution may have more than one mode (bimodal or multimodal).
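A short Python sketch of finding the mode, including the bimodal case. The two lists are made-up illustrations, not data from the notes; `statistics.multimode` (Python 3.8+) returns every value tied for the highest frequency.

```python
from statistics import multimode

# Hypothetical categorical data: two categories tie, so the data are bimodal.
colors = ["red", "blue", "red", "green", "blue"]
# Hypothetical scores with a single mode.
scores = [2, 3, 3, 5, 7]

color_modes = multimode(colors)   # all categories with the highest count
score_modes = multimode(scores)

print(color_modes)
print(score_modes)
```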
The median
* In a frequency distribution, scores are placed in order from lowest to highest.
* The median is the middle of the distribution, that is, the 50th percentile: 50% of the scores in the frequency distribution fall below the median and 50% fall above it.
Properties of the median
Attributes of the median
* Stability: the median is not affected by extreme scores.
* It is calculated by counting the number of cases.
* It does not consider the value of the case.
Calculating the median
* The median can be calculated easily and determined by inspection.
N = the number of cases
Position of the median = (N + 1)/2
If n numbers are ordered from smallest to largest:
* If n is odd, the sample median is the number in position (n + 1)/2.
* If n is even, the sample median is the average of the numbers in positions n/2 and n/2 + 1.
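The position rule above translates directly into code. The following is a minimal sketch; the function name `sample_median` and the two small lists are illustrative assumptions.

```python
def sample_median(values):
    """Median by the position rule: (n + 1)/2 if n is odd,
    average of positions n/2 and n/2 + 1 if n is even (1-based positions)."""
    ordered = sorted(values)            # smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        # 1-based position (n + 1)/2 is 0-based index (n + 1)//2 - 1
        return ordered[(n + 1) // 2 - 1]
    # 1-based positions n/2 and n/2 + 1 are 0-based indices n//2 - 1 and n//2
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


odd_result = sample_median([5, 1, 3])        # ordered: 1, 3, 5
even_result = sample_median([4, 1, 3, 2])    # ordered: 1, 2, 3, 4
print(odd_result, even_result)
```

Only the positions of the ordered values matter here, which is why the median ignores how extreme the largest or smallest value is.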
The mean
* The average score in the distribution.
* Also called the arithmetic mean, or average.
* Calculated by adding all the scores in a distribution and dividing the total by the number of cases.
Let $X_1, \ldots, X_n$ be a sample. The sample mean is
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
Characteristics of the mean
* The mean is, unlike the mode and median, sensitive to extreme scores.
  - This attribute of the mean occurs because it is computed by using the value of each score in the distribution.
  - The mode and median fail to use the value of each score in a distribution. The mode is derived from the frequency of the scores. The median is based on the position of the scores, regardless of their values.
* The mean is amenable to statistical analysis and comparisons between distributions, while the mode and median are not.
  - This attribute of the mean occurs because it is computed by using the value of each score in the distribution.
* Also, the sum of the deviations from the mean (how far each score stands in relation to the mean) is zero.
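The last property, that the deviations from the mean sum to zero, can be checked numerically. The scores below are hypothetical; with non-integer data the sum may differ from zero by a tiny floating-point rounding error, so the check uses a tolerance.

```python
import math

scores = [2, 4, 6, 8, 10]                 # hypothetical scores
mean_score = sum(scores) / len(scores)

# Deviation of each score from the mean.
deviations = [x - mean_score for x in scores]
total_deviation = sum(deviations)

print(deviations)
print(math.isclose(total_deviation, 0.0, abs_tol=1e-9))
```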
* Symmetric distribution: zero skewness; mode = median = mean.
* Positively skewed: mean and median are to the right of the mode.
* Negatively skewed: mean and median are to the left of the mode.
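A small numerical sketch of these three cases. The three lists are invented for illustration, and the ordering of mode, median, and mean shown here is the typical pattern for skewed data rather than a guarantee for every data set.

```python
from statistics import mean, median, mode

# Hypothetical data sets chosen to show each shape.
symmetric = [1, 2, 3, 3, 3, 4, 5]
positive_skew = [1, 2, 2, 3, 4, 5, 12]     # long tail to the right
negative_skew = [1, 8, 9, 10, 11, 11, 12]  # long tail to the left

summary = {
    name: (mode(data), median(data), mean(data))
    for name, data in [
        ("symmetric", symmetric),
        ("positive", positive_skew),
        ("negative", negative_skew),
    ]
}

for name, (mo, md, mn) in summary.items():
    print(f"{name}: mode={mo}, median={md}, mean={mn:.2f}")
```

In the positively skewed list the single large value pulls the mean furthest to the right; in the negatively skewed list the single small value pulls it furthest to the left.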
Example 1
A simple random sample of five men is chosen from a large population of men, and their heights are measured. The five heights (in inches) are 65.51, 72.30, 68.31, 67.05, and 70.68. Find the sample mean.
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{65.51 + 72.30 + 68.31 + 67.05 + 70.68}{5} = 68.77 \text{ inches}$$
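The same calculation as a short Python check, using the five heights from Example 1. The variable names are illustrative.

```python
# Heights (in inches) from Example 1.
heights = [65.51, 72.30, 68.31, 67.05, 70.68]

# Sample mean: add all the scores and divide by the number of cases.
sample_mean = sum(heights) / len(heights)
print(round(sample_mean, 2))   # 68.77
```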
The standard deviation
* A quantity that measures the degree of spread in a sample.
* Consider two lists of numbers: 28, 29, 30, 31, 32 and 10, 20, 30, 40, 50. Which has more spread?
* This implies that when the spread is large, the sample values will tend to be far from the mean. When the spread is small, the values tend to be close to the mean.
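The two lists can be compared numerically. This sketch uses `statistics.stdev`, which computes the sample standard deviation with the n - 1 divisor; both lists have the same mean, so only their spread differs.

```python
from statistics import mean, stdev

narrow = [28, 29, 30, 31, 32]
wide = [10, 20, 30, 40, 50]

# Both lists are centered at 30.
narrow_mean, wide_mean = mean(narrow), mean(wide)

# Sample standard deviations (n - 1 in the denominator).
narrow_sd, wide_sd = stdev(narrow), stdev(wide)

print(narrow_mean, wide_mean)
print(round(narrow_sd, 2), round(wide_sd, 2))   # 1.58 15.81
```

The second list has values ten times as far from the mean, so its standard deviation is ten times as large.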
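The sample variance
* The average of the squared deviations (with n - 1 in place of n in the denominator).
Let $X_1, \ldots, X_n$ be a sample. The sample variance is the quantity
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
An equivalent formula, which can be easier to compute, is
$$s^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)$$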
The sample standard deviation
* The square root of the variance.
Let $X_1, \ldots, X_n$ be a sample. The sample standard deviation is the quantity
$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2}$$
An equivalent formula, which can be easier to compute, is
$$s = \sqrt{\frac{1}{n-1} \left( \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \right)}$$
The sample standard deviation is the square root of the sample variance.
Example 2
Find the sample variance and the sample standard deviation for the height data in Example 1.
$$s^2 = \frac{1}{4}\left[(65.51 - 68.77)^2 + (72.30 - 68.77)^2 + (68.31 - 68.77)^2 + (67.05 - 68.77)^2 + (70.68 - 68.77)^2\right] = 7.47665$$
$$s = \sqrt{7.47665} = 2.73$$
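Both variance formulas can be checked against the height data. The sketch below computes the sample variance from the definition and from the equivalent shortcut formula, then takes the square root; variable names are illustrative.

```python
import math

# Heights (in inches) from Example 1.
heights = [65.51, 72.30, 68.31, 67.05, 70.68]
n = len(heights)
x_bar = sum(heights) / n

# Definition: sum of squared deviations divided by n - 1.
var_definition = sum((x - x_bar) ** 2 for x in heights) / (n - 1)

# Equivalent formula: (sum of squares - n * mean^2) / (n - 1).
var_shortcut = (sum(x ** 2 for x in heights) - n * x_bar ** 2) / (n - 1)

std_dev = math.sqrt(var_definition)

print(round(var_definition, 5), round(var_shortcut, 5))
print(round(std_dev, 2))   # 2.73
```

The shortcut subtracts two large, nearly equal numbers, so in floating-point arithmetic it can lose a little precision; for hand calculation it is convenient, but the definition is the safer form in code.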
Outliers
* Points that are much larger or smaller than the rest. Consider the sample values: 4, 2, 3, ¢ and 2¢
* When a sample contains outliers, the median may be more representative of the sample than the mean is.
(Figure: positions of the median and the mean.)
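A brief numerical illustration of why the median can be more representative when an outlier is present. The values below are hypothetical and are not the sample values from the notes.

```python
from statistics import mean, median

# Hypothetical sample, then the same sample with its largest value
# replaced by an outlier.
without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 50]

mean_before, median_before = mean(without_outlier), median(without_outlier)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)   # 3 3
print(mean_after, median_after)     # 12 3
```

The single extreme value moves the mean from 3 to 12, while the median stays at 3.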
The trimmed mean
* A measure of center that is designed to be unaffected by outliers.
* Computed by arranging the sample values in order, trimming an equal number of them from each end, and computing the mean of those remaining.
* If p% of the data are trimmed from each end, the resulting trimmed mean is called the "p% trimmed mean".
Example 3
In an article, the following values of fracture stress (in megapascals) were measured for a sample of 24 mixtures of hot mixed asphalt (HMA).
30 75 79 80 80 105 126 138
149 179 179 191 223 232 232 236
240 242 245 247 254 274 384 470
Compute the mean, median, and the 5%, 10%, and 20% trimmed means.
To compute the 5% trimmed mean, round off (0.05)(24) = 1.2 to 1. Drop 1 observation from each end, and then average the remaining 22:
(75 + 79 + ... + 274 + 384)/22 = 190.45
To compute the 10% trimmed mean, round off (0.1)(24) = 2.4 to 2. Drop 2 observations from each end, and then average the remaining 20:
(79 + 80 + ... + 254 + 274)/20 = 186.55
To compute the 20% trimmed mean, round off (0.2)(24) = 4.8 to 5. Drop 5 observations from each end, and then average the remaining 14:
(105 + 126 + ... + 242 + 245)/14 = 194.07
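The trimming procedure in Example 3 can be sketched as a small function. The name `trimmed_mean` and the round-half-up rule for the number of dropped observations are assumptions made for this sketch; the notes only say to "round off".

```python
from statistics import mean, median


def trimmed_mean(values, percent):
    """Drop round(n * percent / 100) observations from each end, then average."""
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * percent / 100 + 0.5)    # round to the nearest whole number
    if 2 * k >= n:
        raise ValueError("trimming would remove every observation")
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


# Fracture stress values (in megapascals) from Example 3.
stress = [30, 75, 79, 80, 80, 105, 126, 138,
          149, 179, 179, 191, 223, 232, 232, 236,
          240, 242, 245, 247, 254, 274, 384, 470]

results = {p: trimmed_mean(stress, p) for p in (5, 10, 20)}

print(round(mean(stress), 2), median(stress))
for p, value in results.items():
    print(f"{p}% trimmed mean: {value:.2f}")
```

With 24 observations the function drops 1, 2, and 5 values from each end for the 5%, 10%, and 20% trimmed means, matching the hand calculation.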
Example 4
For each of the following data sets, compute the mean, median, and mode.
Data set
A, ©, 8,8
2,6, 3,3,4
1,2, BB), 3, 5,9
1,4, 313, 315, &) &
7441-23, Oe
Mean Median
2115=5-4 &
22/5 74-G 4
32lb=uy (ata)/2 23.5
30/1 8= 595 (346)/2 =4
101 =1.01 (044) )a=1.5
Mode
8
none
316
none
none
Example 5
Find the variance and the standard deviation for each trimmed mean (5%, 10%, and 20%) in the problem.
30, 75, 79, 80, 80, 105, 126, 138, 149, 179, 179, 191, 223, 232, 232, 236, 240, 242, 245, 247, 254, 274, 384, 470
Compute the mean, median, and the 5%, 10%, and 20% trimmed means.