Biostatistics & Data Analysis using SPSS & Minitab
Dr. Khalid Ghalib
Learning Objectives
Discuss what is meant by study variables.
Differentiate between types of data.
Use the appropriate scale of measurement.
Differentiate between descriptive and inferential statistics.
Select the appropriate statistical test.
Outlines
Study Variables
Scale of Measurement
Descriptive Vs Inferential Statistics
Study variables:
any entity that can take on different values
Independent variable
• Is responsible for bringing about change in a phenomenon, situation or circumstance.
Dependent variable
• Is the effect, impact or consequence of an independent variable.
Dependent Vs Independent
• "Students taught first aid by programmed instruction will achieve at a higher level than those taught first aid by the traditional method." The independent variable in this hypothesis is:
• A) students
• B) level of achievement
• C) programmed instruction
• D) method of instruction
The dependent variable is:
• A) students
• B) level of achievement
• C) programmed instruction
• D) method of instruction
Study variables
Background
• Age
• Gender
• etc.
Confounding
• Unmeasured variables affecting the cause-and-effect relationship.
Dependent Vs Independent
Smoking
Ca lung
Genetic predisposition
Types of Data:
Facts
• A fact is something that can be proved as true or false.
• Use measurement scales (temperature, pressure, weight, volume, etc.)
• They can be backed up with evidence.
Opinions
• An opinion is what someone thinks or feels.
• They take the form of perceptions or impressions.
• People might think or feel differently on the same topic.
Is Your Data a Fact (Quantitative) or an Opinion (Qualitative)?
• Nominal (categorical): gender, colour. Here the numbers do not imply order.
• Ordinal: such as rank or satisfaction level (the interval between the values may not be equal).
• Interval/ratio (also called scale): such as number of customers, age, weight, etc. It can be discrete or continuous data.
The level of information increases from nominal to ordinal to interval/ratio.
Level of measurement
• At a hospital nursing station, the following information is
available about a patient:
Temperature: 32.5
Blood group: A
Response to treatment: Excellent
Indicate the level of measurement (nominal, ordinal or
interval) of each variable.
What is statistics?
• Statistics is the science of collecting, organizing, summarizing and analyzing data, and interpreting the results.
• Good statistics come from a good sample, which is used to draw conclusions about the population.
Descriptive Vs Inferential
Descriptive
• Presenting, organizing and summarizing a given data set through numerical summaries and graphs.
Inferential
• Drawing conclusions about a population based on data observed in a sample.
Population
• Parameters
Sample
• Statistics
Descriptive Analysis
Measures of central tendency;
Mean
Mode
Median
Measures of dispersion;
Standard deviation
Variance
Range
Interquartile Range
• The central tendency of a
distribution is typically contrasted
with its dispersion (or variability).
• The most common measures of central
tendency are the mean, the median and
the mode.
Measures of Central Tendency
The Mean
To calculate the mean, we need to add all
the values up and then divide by the number of values.
The Median
To determine the median, we need to put the numbers in order and
find the middle value.
The Mode
To determine the mode, we need to
look at which value appears the
most often. It can help if the
numbers are in order.
• In statistics, dispersion (variability, scatter, or spread) describes the extent to which a distribution is stretched or squeezed.
• Common examples of measures of statistical
dispersion are:
i. The standard deviation,
ii. The variance
iii. The range
iv. The interquartile range.
A bigger standard deviation means a wider range of variation.
Example
• Consider the following blood lead concentration values (µg/dl) among a random sample of male workers exposed to lead over a 5-year period in a battery manufacturing factory:
• 64, 66, 77, 72, 80, 72, 78, 63, 68, 79
• Calculate the measures of central tendency:
- Mean
- Median
- Mode
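The three measures asked for above can be checked with a minimal Python sketch. It uses only the standard `statistics` module; the variable names are illustrative and not part of the original exercise.

```python
import statistics

# Blood lead concentrations (µg/dl) from the example above.
lead = [64, 66, 77, 72, 80, 72, 78, 63, 68, 79]

mean_lead = statistics.mean(lead)      # add all values, divide by how many there are
median_lead = statistics.median(lead)  # middle of the ordered values (average of the two middle ones here)
mode_lead = statistics.mode(lead)      # value that appears most often

print("Mean:", mean_lead)
print("Median:", median_lead)
print("Mode:", mode_lead)
```

With ten values the median is the average of the 5th and 6th ordered values, which is why sorting the data first helps when working by hand.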
Example
• Suppose the mean height of the water bottles is determined to be 300 mm.
• A random sample of 10 bottles was taken and the heights (mm) were found to be as follows:
301, 304, 296, 299, 303, 297, 305, 301, 302, 298
o What is the mean?
o What is the median?
o What is the standard deviation?
o What is the first quartile?
o What is the third quartile?
o What is the range?
o What is the interquartile range?
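A hedged sketch of how the questions in this example could be answered in Python with the standard `statistics` module. Two assumptions are made here that the slide does not state: the standard deviation is the sample standard deviation (divisor n - 1), and the quartiles use the module's default "exclusive" interpolation. Statistical packages differ in their quartile conventions, so Q1, Q3 and the interquartile range may differ slightly from one tool to another.

```python
import statistics

# Heights (mm) of the 10 sampled bottles from the example above.
heights = [301, 304, 296, 299, 303, 297, 305, 301, 302, 298]

mean_h = statistics.mean(heights)
median_h = statistics.median(heights)
sd_h = statistics.stdev(heights)  # ASSUMPTION: sample SD (n - 1 in the denominator)

# ASSUMPTION: default "exclusive" method; other quartile conventions give other values.
q1, q2, q3 = statistics.quantiles(heights, n=4)

range_h = max(heights) - min(heights)
iqr_h = q3 - q1

print("Mean:", mean_h)
print("Median:", median_h)
print("Standard deviation:", round(sd_h, 3))
print("Q1:", q1, "Q3:", q3)
print("Range:", range_h)
print("Interquartile range:", iqr_h)
```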
Test of Hypothesis
Hypothesis:
• A hypothesis is a premise or claim that we want to test (or
investigate).
• Hypothesis testing is an objective method of making
decisions or inferences from sample data (evidence).
• The status quo is called the null hypothesis, and the alternative claim is called the alternative hypothesis.
Null hypothesis: The currently accepted value for the parameter.
Alternative hypothesis: The research hypothesis. It involves the claim to be tested.
Possible outcomes of these tests:
• Reject the null hypothesis
• Fail to reject the null
hypothesis
• A test of normality is performed to assess whether the data have been drawn from a normally distributed population (at the chosen confidence level).
• The best-known normality tests are the Anderson–Darling and Kolmogorov–Smirnov tests.
Hypothesis:
• H0: Data are from a normally distributed population
• Ha: Data are not from a normally distributed population
Example of normality tests
[Figure: two normal probability plots, y-axis: Percent]
• Probability Plot of "normal": Mean 9.155, StDev 1.359, N 100, AD 0.221, P-Value 0.828
• Probability Plot of "not normal": Mean 9.304, StDev 9.426, N 100, KS 0.175, P-Value <0.010
The Normal Distribution Curve
• The normal distribution curve is the most familiar and useful reference to researchers and scientists.
• Normality is important in statistical inference.
• From a statistical point of view:
• 68% of all the values are within 1 standard deviation of the mean,
• 95% are within 2 standard deviations of the mean, and
• 99.7% are within 3 standard deviations of the mean.
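The 68%, 95% and 99.7% figures can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal population lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"Within {k} SD: {share_within(k) * 100:.1f}%")
```

The exact values are slightly above 68% and 95%; the percentages in the rule are rounded.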
Types of statistical tests
There is a wide range of statistical tests.
The decision of which statistical test to use depends on:
1. the research design or question,
2. the distribution of the data, and
3. the scale of measurement.
In general,
o If the data is normally distributed, we will choose from the parametric tests.
o If the data is non-normal or categorical, we will choose from the set of non-parametric tests.
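The general rule above can be written as a small decision helper. This is only a sketch of that rule, not a full test-selection procedure: the function name and the scale labels are hypothetical, and treating both nominal and ordinal data as "categorical" is an assumption made here. The research design or question still has to be considered separately.

```python
def choose_test_family(is_normal: bool, scale: str) -> str:
    """Return the family of tests suggested by the general rule.

    scale is one of "nominal", "ordinal" or "interval/ratio" (hypothetical labels).
    """
    # ASSUMPTION: nominal and ordinal scales are both treated as categorical.
    categorical = scale in ("nominal", "ordinal")
    if categorical or not is_normal:
        return "non-parametric"
    return "parametric"

print(choose_test_family(True, "interval/ratio"))   # normally distributed scale data
print(choose_test_family(False, "interval/ratio"))  # non-normal scale data
print(choose_test_family(True, "nominal"))          # categorical data
```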
Parametric Analysis
1-Sample t-test
Paired sample t-test
2-Sample t-test
One-way ANOVA
Two-way ANOVA
Correlation
Simple linear regression analysis
Nonparametric Analysis
1-Sample Wilcoxon signed-rank test
Mann-Whitney U test (Wilcoxon rank-sum test)
Kruskal-Wallis test
Friedman test
Chi-Square tests
Example of Hypothesis Testing (IQ testing)
For an IQ test, it is given that:
• µ = 100,
• σ = 15 and
• n = 75,
• sample mean = 105
Is the variation random variation or is there a significant difference?
Hypothesis:
H0: µ = 100
Ha: µ ≠ 100
Conclusion:
• p-value = 0.04, < α (0.05)
• We reject H0 and …
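The IQ example can be worked through numerically. The slide does not name the test, so the sketch below assumes a two-sided one-sample z-test with known population standard deviation; the variable names are illustrative. Under that assumption the p-value comes out near 0.004, smaller than the 0.04 quoted above, and the decision to reject H0 at α = 0.05 is the same either way.

```python
from math import sqrt
from statistics import NormalDist

# Values given in the IQ example.
mu0 = 100          # mean under H0
sigma = 15         # population standard deviation (assumed known)
n = 75             # sample size
sample_mean = 105
alpha = 0.05

# ASSUMPTION: two-sided one-sample z-test.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_h0 = p_value < alpha

print("z =", round(z, 3))
print("p-value =", round(p_value, 4))
print("Reject H0" if reject_h0 else "Fail to reject H0")
```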
“Learning by doing” is the best way to gain experience in the conduct of research.
khalidghalib18@[Link]
0912569966