INTRODUCTION TO STATISTICS
Instructor: Dr. Irum Naqvi
Meaning
Plural Noun (Statistics)
Statistics are numerical statements of facts in any
subject of inquiry, placed in relation to each other.
Not statistics
A goat has four legs.
Hassan has pocket money of Rs. 1000.
Statistics
There are 37 students in [Link] Morning and 28
students in [Link] Evening Group
Features
Aggregate of facts (e.g., 5000 students in a school: 2000 in science,
1500 in general science, and 1500 in humanities)
Numerically expressed (quantitative figures)
Affected by a multiplicity of causes (e.g., a 20% rise in petrol
prices and a 20% rise in phone prices)
Reasonable accuracy
Collected in a systematic manner (collection,
organization, summarizing data)
Enumerated or estimated (an exact count, such as students in an MPhil
class, or an estimate, such as people at an Imran Khan jalsa)
Singular Noun (Statistic)
In the singular sense, statistics means the science of statistical
methods.
It is defined as:
Data collection
Presentation
Analysis
Interpretation of numerical data
Definition
Statistics consist of facts and figures such as the average annual
snowfall in Buffalo or the average yearly income of recent college
graduates.
These statistics are usually informative and time-saving because they
condense large quantities of information into a few simple figures.
Specifically, we use the term statistics to refer to a general field of
mathematics. In this case, we are using the term statistics as a
shortened version of statistical methods or statistical procedures
that are used to summarize and evaluate research results in the
behavioral sciences.
Statistics are used to organize and summarize the
information so that the researcher can see what happened in
the research study and can communicate the results to others.
Statistics help the researcher to answer the questions that
initiated the research by determining exactly what general
conclusions are justified based on the specific results that
were obtained.
The term statistics refers to a set of mathematical
procedures for organizing, summarizing, and interpreting
information.
Scope of Statistics
Nature
Statistics is considered both a science and an art
Subject Matter
Descriptive Statistics (describing what is at hand)
Inferential Statistics (using a sample to generate
conclusions about the universe)
Limitations
Study of numerical facts only
Study of aggregates only
Homogeneity of data
Can be used properly only by experts
Results quoted without reference to their context may be misleading
IMPORTANT TERMS
Variables
A variable is a characteristic or condition that can change
or take on different values.
Most research begins with a general question about the
relationship between two variables for a specific group of
individuals.
Data
Data (plural) are measurements or observations. A
data set is a collection of measurements or
observations.
A datum (singular) is a single measurement or
observation and is commonly called a score or raw
score.
The goal of statistics is to help researchers organize and
interpret the data.
Descriptive Statistics
Descriptive statistics are methods for organizing and
summarizing data. For example, tables or graphs are used
to organize data, and descriptive values such as the
average score are used to summarize data.
A descriptive value for a population is called a
parameter and a descriptive value for a sample is called
a statistic.
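As a small illustration of summarizing data with a descriptive value, the sketch below computes the average of a made-up set of scores. The scores and variable names are hypothetical and are not taken from the lecture.

```python
from statistics import mean

# Hypothetical quiz scores, invented for illustration only.
scores = [7, 9, 6, 8, 10]

# If these five scores are the entire population of interest,
# the average is a parameter; if they are a sample drawn from a
# larger population, the same calculation yields a statistic.
average_score = mean(scores)

print(f"number of scores = {len(scores)}, average = {average_score}")
```

The calculation is identical in both cases; only the group being described (population or sample) decides which term applies.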
Inferential Statistics
Inferential statistics are methods for using sample data
to make general conclusions (inferences) about
populations.
Because a sample is typically only a part of the whole
population, sample data provide only limited information
about the population. As a result, sample statistics are
generally imperfect representatives of the corresponding
population parameters.
Parameter Vs. Statistic
A parameter is a value, usually a numerical value, that
describes a population. A parameter is usually derived
from measurements of the individuals in the
population.
A statistic is a value, usually a numerical value, that
describes a sample. A statistic is usually derived from
measurements of the individuals in the sample.
Population
The entire group of individuals is called the population.
For example, a researcher may be interested in the
relation between class size (variable 1) and academic
performance (variable 2) for the population of third-
grade children.
Sample
Usually populations are so large that a researcher cannot
examine the entire group. Therefore, a sample is selected
to represent the population in a research study. The goal
is to use the results obtained from the sample to help
answer questions about the population.
Sampling Error
The discrepancy between a sample statistic and its
population parameter is called sampling error.
Defining and measuring sampling error is a large part
of inferential statistics.
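The relationship between a parameter, a statistic, and sampling error can be sketched with a simulated population. Everything below (the population of 1,000 random scores, the sample size of 25, the fixed seed) is an assumption made for illustration, not data from the lecture.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 scores between 0 and 100.
population = [random.randint(0, 100) for _ in range(1000)]

# Select a sample of n = 25 individuals from the population.
sample = random.sample(population, 25)

parameter = mean(population)  # describes the whole population
statistic = mean(sample)      # describes only the sample

# Sampling error: the discrepancy between statistic and parameter.
sampling_error = statistic - parameter

print(f"parameter (population mean) = {parameter:.2f}")
print(f"statistic (sample mean)     = {statistic:.2f}")
print(f"sampling error              = {sampling_error:.2f}")
```

Changing the seed selects a different sample and, in general, gives a different sampling error, which is why inferential statistics has to account for it.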
Constructs are internal attributes or characteristics that
cannot be directly observed but are useful for describing
and explaining behavior.
An Operational Definition identifies a measurement
procedure (a set of operations) for measuring an external
behavior and uses the resulting measurements as a
definition and a measurement of a hypothetical construct.
Note that an operational definition has two components.
First, it describes a set of operations for measuring a
construct. Second, it defines the construct in terms of the
resulting measurements.
Types of Variables
Variables can be classified as discrete or continuous.
Discrete variables (such as class size) consist of
indivisible categories.
Continuous variables (such as time or weight) are
infinitely divisible into whatever units a researcher may
choose. For example, time can be measured to the nearest
minute, second, half-second, etc.
Real Limits
To define the units for a continuous variable, a
researcher must use real limits which are boundaries
located exactly halfway between adjacent categories.
A score of X=8 seconds actually represents an interval
bounded by the real limits 7.5 seconds and 8.5
seconds.
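A minimal sketch of the real-limits rule, assuming scores are recorded in equal-width units. The function name `real_limits` and its `unit` parameter are hypothetical, introduced here only for illustration.

```python
def real_limits(score, unit=1.0):
    """Return the (lower, upper) real limits of a measured score.

    `unit` is the width of one measurement category, for example
    1.0 when time is recorded to the nearest second.
    """
    half = unit / 2
    return (score - half, score + half)


lower, upper = real_limits(8)          # time to the nearest second
print(f"X = 8 covers {lower} to {upper} seconds")

lower_h, upper_h = real_limits(8, 0.5)  # time to the nearest half-second
print(f"X = 8 covers {lower_h} to {upper_h} seconds")
```

With a finer unit of measurement the interval around the same score becomes narrower, but the limits still sit halfway to the adjacent categories.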
Measuring Variables
To establish relationships between variables,
researchers must observe the variables and record their
observations. This requires that the variables be
measured.
The process of measuring a variable requires a set of
categories called a scale of measurement and a
process that classifies each individual into one
category.
4 Types of Measurement Scales
A nominal scale is an unordered set of categories identified only by name.
Measurements on a nominal scale label and categorize observations, but do not
make any quantitative distinctions between observations. Nominal measurements
only permit you to determine whether two individuals are the same or different.
An ordinal scale is an ordered set of categories. Ordinal measurements tell you the
direction of difference between two individuals. Measurements on an ordinal scale
rank observations in terms of size or magnitude.
An interval scale is an ordered series of equal-sized categories. Interval
measurements identify the direction and magnitude of a difference. The zero point is
arbitrary and does not indicate a zero amount of the variable being measured. Equal
differences between numbers on the scale reflect equal differences in magnitude.
A ratio scale is an interval scale where a value of zero indicates none of the variable.
Ratio measurements identify the direction and magnitude of differences and allow
ratio comparisons of measurements. Ratios of numbers do reflect ratios of
magnitude.
Differentiated based on
• names
• direction
• direction and magnitude (or distance) between categories
• magnitude
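The difference between an interval scale and a ratio scale can be sketched with temperature, an example chosen here for illustration and not taken from the lecture: Celsius has an arbitrary zero point (interval), while Kelvin has a true zero (ratio). The specific temperatures are hypothetical.

```python
# Hypothetical temperatures, chosen only for illustration.
celsius_a, celsius_b = 20.0, 10.0


def to_kelvin(celsius):
    """Convert Celsius (arbitrary zero) to Kelvin (true zero)."""
    return celsius + 273.15


# Differences are meaningful on both scales and agree
# (up to floating-point rounding).
diff_celsius = celsius_a - celsius_b
diff_kelvin = to_kelvin(celsius_a) - to_kelvin(celsius_b)

# Ratios of numbers reflect ratios of magnitude only on the ratio scale.
ratio_celsius = celsius_a / celsius_b                       # misleading
ratio_kelvin = to_kelvin(celsius_a) / to_kelvin(celsius_b)  # meaningful

print(f"difference: {diff_celsius:.2f} C, {diff_kelvin:.2f} K")
print(f"ratio of Celsius numbers: {ratio_celsius:.3f}")
print(f"ratio of Kelvin numbers:  {ratio_kelvin:.3f}")
```

The Celsius numbers suggest that one temperature is "twice" the other, but the Kelvin ratio shows the actual magnitudes differ by only a few percent.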
Correlational Studies
The goal of a correlational study is to determine
whether there is a relationship between two variables
and to describe the relationship.
A correlational study simply observes the two
variables as they exist naturally.
In a correlational design, you measure variables
without manipulating any of them.
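One common way to describe the relationship between two measured variables is a correlation coefficient. The sketch below computes the Pearson correlation by hand; the choice of Pearson's r, the two variables, and the data are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical observations for five individuals (invented data).
# Both variables are simply recorded as they exist; nothing is manipulated.
hours_of_sleep = [5, 6, 7, 8, 9]
test_score = [60, 65, 70, 80, 85]


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)


r = pearson_r(hours_of_sleep, test_score)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, but because neither variable was manipulated, it does not by itself show that one variable causes the other.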
Experimental Studies
An EXPERIMENT deliberately imposes a treatment on
a group of objects or subjects in the interest of
observing the response.
The goal of an experiment is to demonstrate a cause-
and-effect relationship between two variables; that is,
to show that changing the value of one variable causes
changes to occur in a second variable.
To see the changes in one variable, the researcher usually
applies a treatment.
What is a treatment?
A treatment is something that researchers administer
to experimental units. For example, a corn field is
divided into four parts, and each part is 'treated' with a different
fertilizer to see which produces the most corn.
In an experiment, one variable is manipulated
to create treatment conditions.
A second variable is observed and measured to obtain scores for
a group of individuals in each of the treatment conditions. The
measurements are then compared to see if there are differences
between treatment conditions.
All other variables are controlled to prevent them from
influencing the results. In an experiment, the manipulated
variable is called the independent variable and the observed
variable is the dependent variable.
Parts of the Classic Experiment
Treatment or Independent Variable – is the variable that is manipulated by the researcher. In behavioral
research, the independent variable usually consists of the two (or more) treatment conditions to which
subjects are exposed. The independent variable consists of the antecedent conditions that were
manipulated prior to observing the dependent variable.
Dependent Variable – is the variable that is observed to assess the effect of the treatment.
Pretest is an initial measurement or assessment conducted before an intervention or treatment. It
establishes a baseline that helps researchers understand participants' initial status on certain variables.
By comparing pretest and posttest results, researchers can assess changes due to the intervention.
Posttest is a measurement or assessment conducted after an intervention or treatment. It helps determine
the effects or outcomes of the intervention by providing data that can be compared with the pretest results
to measure any changes in the variables being studied.
Experimental Group - The group of participants that receives the treatment or intervention in a study. This
group is exposed to the independent variable, and researchers observe and measure the effects to
determine if the intervention causes any changes.
Control Group – The group of participants that does not receive the experimental treatment or
intervention. Instead, they may receive a placebo or no intervention at all. The control group serves as a
baseline to compare against the experimental group, helping to isolate the effect of the independent
variable by showing what happens without the intervention.
Random Assignment - a process in which participants are allocated to different groups (such as
experimental and control groups) by chance rather than by choice. This method reduces bias and ensures
that each participant has an equal probability of being placed in any group, increasing the likelihood that
the groups are equivalent at the start of the experiment.
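Random assignment can be sketched as shuffling the participant list and splitting it in two. The function name `randomly_assign`, the participant labels, and the seed are hypothetical, used only for illustration.

```python
import random


def randomly_assign(participants, seed=None):
    """Split participants into two groups purely by chance.

    Returns (experimental_group, control_group). With an odd number
    of participants the control group receives the extra person.
    """
    rng = random.Random(seed)      # seed only makes the sketch repeatable
    shuffled = list(participants)  # copy so the input is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Ten hypothetical participants labelled P1..P10.
participants = [f"P{i}" for i in range(1, 11)]
experimental_group, control_group = randomly_assign(participants, seed=42)

print("Experimental:", experimental_group)
print("Control:     ", control_group)
```

Because the shuffle ignores every characteristic of the participants, each person is equally likely to land in either group.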
Quasi-Experimental
Non-experimental, or quasi-experimental, studies are similar to
experiments because they also compare groups of scores. They are used to
explore possible causal links when random assignment isn’t feasible, allowing some
degree of control but not full experimental rigor.
These studies do not use a manipulated variable to differentiate the
groups. Instead, the variable that differentiates the groups is usually a
pre-existing participant variable (such as male/female) or a time
variable (such as before/after).
Because these studies do not use the manipulation and control of true
experiments, they cannot demonstrate cause and effect relationships.
As a result, they are similar to correlational research because they
simply demonstrate and describe relationships.
Notation
The individual measurements or scores obtained for a
research participant will be identified by the letter X (or X
and Y if there are multiple scores for each individual).
The number of scores in a data set will be identified by N
for a population or n for a sample.
Summing a set of values is a common operation in
statistics and has its own notation. The Greek letter sigma,
Σ, will be used to stand for "the sum of." For example, ΣX
identifies the sum of the scores.
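The notation can be mirrored in a few lines of code. The scores below are hypothetical, and treating them as a sample (so that the count is written n) is an assumption.

```python
# Hypothetical scores: each of four individuals has an X and a Y score.
X = [3, 1, 7, 4]
Y = [2, 5, 1, 3]

n = len(X)      # number of scores in the sample
sum_X = sum(X)  # corresponds to ΣX
sum_Y = sum(Y)  # corresponds to ΣY

print(f"n = {n}, sum of X = {sum_X}, sum of Y = {sum_Y}")
```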
Order of Operations
1. All calculations within parentheses are done first.
2. Squaring or raising to other exponents is done second.
3. Multiplying and dividing are done third, and should
be completed in order from left to right.
4. Summation with the Σ notation is done next.
5. Any additional adding and subtracting is done last
and should be completed in order from left to right.
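The order of operations matters most when Σ is combined with squaring or subtraction. The sketch below, using hypothetical scores, contrasts expressions that look similar but follow different steps of the list above.

```python
# Hypothetical set of scores, invented for illustration.
X = [3, 1, 7, 4]

# ΣX²: squaring (step 2) comes before summation (step 4),
# so each score is squared first and the squares are then added.
sum_of_squares = sum(x ** 2 for x in X)    # 9 + 1 + 49 + 16

# (ΣX)²: the parentheses (step 1) force the sum to be found first,
# and the total is then squared.
square_of_sum = sum(X) ** 2                # 15 squared

# Σ(X - 1): the subtraction is inside parentheses, so 1 is
# subtracted from every score before summing.
sum_of_reduced = sum(x - 1 for x in X)     # 2 + 0 + 6 + 3

# ΣX - 1: summation (step 4) comes before the remaining
# subtraction (step 5), so 1 is subtracted from the total once.
sum_then_reduce = sum(X) - 1               # 15 - 1

print(sum_of_squares, square_of_sum, sum_of_reduced, sum_then_reduce)
```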