1 Introduction To Statistics-MPhil Lecture

INTRODUCTION TO STATISTICS

Instructor: Dr. Irum Naqvi


Meaning
 Plural Noun (Statistics)
 Statistics are numerical statements of facts in any subject of inquiry, placed in relation to each other.


Not statistics
 A goat has four legs.
 Hassan has pocket money of Rs. 1000.

Statistics
 There are 37 students in [Link] Morning and 28 students in [Link] Evening Group.


Features
 Aggregate of facts (for example, 5000 students in a school: 2000 in science, 1500 in general science, and 1500 in humanities)
 Numerically expressed (quantitative figures)
 Affected by a multiplicity of causes (for example, a 20% rise in petrol prices alongside a 20% rise in phone prices)
 Reasonable accuracy
 Collected in a systematic manner (collecting, organizing, and summarizing data)
 Enumerated or estimated (an exact count, such as the students in an MPhil class, or an estimate, such as the crowd at an Imran Khan jalsa)


Singular Noun (Statistics)
 In the singular sense, statistics means the science of statistical methods.
It covers:
 Data collection
 Presentation
 Analysis
 Interpretation of numerical data
Definition
 Statistics consist of facts and figures such as the average annual snowfall in Buffalo or the average yearly income of recent college graduates.

 These statistics are usually informative and time-saving because they condense large quantities of information into a few simple figures.

 Specifically, we use the term statistics to refer to a general field of mathematics. In this case, we are using the term statistics as a shortened version of statistical methods or statistical procedures that are used to summarize and evaluate research results in the behavioral sciences.
 Statistics are used to organize and summarize the information so that the researcher can see what happened in the research study and can communicate the results to others.

 Statistics help the researcher to answer the questions that initiated the research by determining exactly what general conclusions are justified based on the specific results that were obtained.

 The term statistics refers to a set of mathematical procedures for organizing, summarizing, and interpreting information.
Scope of Statistics

Nature
 Statistics is considered both a science and an art.

Subject Matter
 Descriptive Statistics (describing what is at hand)
 Inferential Statistics (using a sample to draw conclusions about the universe, that is, the population)

Limitations
 Studies numerical facts only
 Studies aggregates only, not individual cases
 Requires homogeneous data
 Can be used properly only by experts
 Without reference to context, results may be misinterpreted


IMPORTANT TERMS
Variables
 A variable is a characteristic or condition that can change or take on different values.

 Most research begins with a general question about the relationship between two variables for a specific group of individuals.
Data

 Data (plural) are measurements or observations. A data set is a collection of measurements or observations.
 A datum (singular) is a single measurement or observation and is commonly called a score or raw score.
 The goal of statistics is to help researchers organize and interpret the data.


Descriptive Statistics

Descriptive statistics are methods for organizing and summarizing data. For example, tables or graphs are used to organize data, and descriptive values such as the average score are used to summarize data.

A descriptive value for a population is called a parameter and a descriptive value for a sample is called a statistic.
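As a minimal sketch of the two descriptive steps, the Python below organizes raw scores into a frequency table and summarizes them with an average. The quiz scores and the helper names `frequency_table` and `summarize` are hypothetical and used only for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical quiz scores for a sample of 10 students (illustrative data only).
scores = [7, 8, 8, 9, 6, 8, 7, 10, 9, 8]

def frequency_table(values):
    """Organize raw scores into (score, frequency) rows, sorted by score."""
    counts = Counter(values)
    return sorted(counts.items())

def summarize(values):
    """Summarize raw scores with two descriptive values: the count and the average."""
    return {"n": len(values), "mean": mean(values)}

# Organize: print one row per distinct score.
for score, freq in frequency_table(scores):
    print(f"{score:>5} | {freq}")

# Summarize: because these scores come from a sample, the mean is a statistic.
print(summarize(scores))
```

If the same values described an entire population, the mean would be a parameter rather than a statistic.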
Inferential Statistics

Inferential statistics are methods for using sample data to make general conclusions (inferences) about populations.

Because a sample is typically only a part of the whole population, sample data provide only limited information about the population. As a result, sample statistics are generally imperfect representatives of the corresponding population parameters.
Parameter Vs. Statistic
 A parameter is a value, usually a numerical value, that describes a population. A parameter is usually derived from measurements of the individuals in the population.

 A statistic is a value, usually a numerical value, that describes a sample. A statistic is usually derived from measurements of the individuals in the sample.
Population
 The entire group of individuals that a researcher wishes to study is called the population.
 For example, a researcher may be interested in the relation between class size (variable 1) and academic performance (variable 2) for the population of third-grade children.
Sample
 Usually populations are so large that a researcher cannot
examine the entire group. Therefore, a sample is selected
to represent the population in a research study. The goal
is to use the results obtained from the sample to help
answer questions about the population.
Sampling Error

 The discrepancy between a sample statistic and its population parameter is called sampling error.

 Defining and measuring sampling error is a large part of inferential statistics.
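The relationship between a parameter, a statistic, and sampling error can be sketched with a small simulation. The population below is artificial (1000 simulated scores), and the seed, sample sizes, and the helper name `sampling_error` are assumptions made for this illustration only.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 1000 scores (simulated, not real data).
population = [rng.gauss(100, 15) for _ in range(1000)]
parameter = mean(population)  # population mean: a parameter

def sampling_error(sample_size):
    """Draw one random sample and return (statistic, statistic - parameter)."""
    sample = rng.sample(population, sample_size)
    statistic = mean(sample)  # sample mean: a statistic
    return statistic, statistic - parameter

# The last sample size equals N, so that sample is the whole population.
for n in (10, 100, 1000):
    stat, err = sampling_error(n)
    print(f"n={n:4d}  sample mean={stat:7.2f}  sampling error={err:+.2f}")
```

When the sample contains the entire population, the statistic equals the parameter and the sampling error vanishes. Smaller samples generally show some discrepancy, although any single sample may land close to the parameter by chance.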
Constructs and Operational Definitions
 Constructs are internal attributes or characteristics that cannot be directly observed but are useful for describing and explaining behavior.
 An operational definition identifies a measurement procedure (a set of operations) for measuring an external behavior and uses the resulting measurements as a definition and a measurement of a hypothetical construct.
 Note that an operational definition has two components. First, it describes a set of operations for measuring a construct. Second, it defines the construct in terms of the resulting measurements.
Types of Variables

Variables can be classified as discrete or continuous.

 Discrete variables (such as class size) consist of indivisible categories.
 Continuous variables (such as time or weight) are infinitely divisible into whatever units a researcher may choose. For example, time can be measured to the nearest minute, second, half-second, etc.
Real Limits
 To define the units for a continuous variable, a researcher must use real limits, which are boundaries located exactly halfway between adjacent categories.

 A score of X=8 seconds actually represents an interval bounded by the real limits 7.5 seconds and 8.5 seconds.
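The halfway rule can be written as a short calculation. In this sketch the function name `real_limits` and the `unit` parameter (the size of one measurement category) are hypothetical names chosen for illustration.

```python
def real_limits(score, unit=1.0):
    """Return the (lower, upper) real limits of a score measured to the nearest `unit`."""
    half = unit / 2  # the boundaries sit halfway to the adjacent categories
    return score - half, score + half

# Time measured to the nearest second: X = 8 stands for the interval 7.5 to 8.5.
print(real_limits(8))         # (7.5, 8.5)

# Time measured to the nearest half-second: the interval is narrower.
print(real_limits(8.0, 0.5))  # (7.75, 8.25)
```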
Measuring Variables

 To establish relationships between variables, researchers must observe the variables and record their observations. This requires that the variables be measured.

 The process of measuring a variable requires a set of categories called a scale of measurement and a process that classifies each individual into one category.
4 Types of Measurement Scales

A nominal scale is an unordered set of categories identified only by name. Measurements on a nominal scale label and categorize observations but do not make any quantitative distinctions between them. Nominal measurements only permit you to determine whether two individuals are the same or different.

An ordinal scale is an ordered set of categories. Measurements on an ordinal scale rank observations in terms of size or magnitude, so they tell you the direction of the difference between two individuals.

An interval scale is an ordered series of equal-sized categories. Interval measurements identify the direction and magnitude of a difference: equal differences between numbers on the scale reflect equal differences in magnitude. The zero point is arbitrary and does not indicate a zero amount of the variable being measured.

A ratio scale is an interval scale on which a value of zero indicates none of the variable. Ratio measurements identify the direction and magnitude of differences and also allow ratio comparisons of measurements: ratios of numbers reflect ratios of magnitude.
Differentiated based on
• Names
• Direction
• Direction and magnitude (or distance) between categories
• Magnitude
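The difference between an interval scale and a ratio scale can be illustrated with temperature, an example that is not in the lecture itself. Celsius has an arbitrary zero (interval scale), while Kelvin has a true zero (ratio scale). The variable names below are hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading (arbitrary zero) to Kelvin (true zero)."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0

# Differences are meaningful on both scales and agree with each other.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are only meaningful on the ratio scale.
ratio_c = warm_c / cool_c  # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)  # roughly 1.035

print(round(difference_c, 2), round(difference_k, 2))
print(round(ratio_c, 3), round(ratio_k, 3))
```

The Celsius ratio of 2.0 is an artifact of where the zero point happens to sit, which is why ratio comparisons are reserved for ratio scales.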
Correlational Studies

 The goal of a correlational study is to determine whether there is a relationship between two variables and to describe the relationship.

 A correlational study simply observes the two variables as they exist naturally.

 In a correlational design, you measure variables without manipulating any of them.
Experimental Studies
 An EXPERIMENT deliberately imposes a treatment on a group of objects or subjects in the interest of observing the response.
 The goal of an experiment is to demonstrate a cause-and-effect relationship between two variables; that is, to show that changing the value of one variable causes changes to occur in a second variable.
 To produce the changes in one variable, the researcher usually applies a treatment.


What is a treatment?
 A treatment is something that researchers administer to experimental units.
 Example:
◦ A corn field is divided into four parts, and each part is 'treated' with a different fertilizer to see which produces the most corn.
In an experiment, one variable is manipulated to create treatment conditions.

A second variable is observed and measured to obtain scores for a group of individuals in each of the treatment conditions. The measurements are then compared to see if there are differences between treatment conditions.

All other variables are controlled to prevent them from influencing the results. In an experiment, the manipulated variable is called the independent variable and the observed variable is the dependent variable.
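The comparison step can be sketched with the corn-field example: fertilizer type is the independent variable and yield is the dependent variable. The yield figures, the condition labels, and the helper name `condition_means` are invented for illustration and are not results from any study.

```python
from statistics import mean

# Hypothetical corn yields (dependent variable) for four fertilizer
# conditions (independent variable). All numbers are invented.
yields = {
    "fertilizer_A": [52, 55, 51],
    "fertilizer_B": [60, 62, 61],
    "fertilizer_C": [48, 50, 49],
    "fertilizer_D": [57, 56, 58],
}

def condition_means(data):
    """Return the mean score for each treatment condition."""
    return {condition: mean(scores) for condition, scores in data.items()}

means = condition_means(yields)
best = max(means, key=means.get)  # condition with the highest mean yield

for condition, value in means.items():
    print(f"{condition}: {value:.2f}")
print("Highest mean yield:", best)
```

A difference between condition means is only a description. Deciding whether it reflects a real treatment effect rather than sampling error is a question for inferential statistics.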
Parts of the Classic Experiment
 Treatment or Independent Variable – is the variable that is manipulated by the researcher. In behavioral
research, the independent variable usually consists of the two (or more) treatment conditions to which
subjects are exposed. The independent variable consists of the antecedent conditions that were
manipulated prior to observing the dependent variable.
 Dependent Variable – is the variable that is observed to assess the effect of the treatment.
 Pretest is an initial measurement or assessment conducted before an intervention or treatment. It
establishes a baseline that helps researchers understand participants' initial status on certain variables.
By comparing pretest and posttest results, researchers can assess changes due to the intervention.
 Posttest is a measurement or assessment conducted after an intervention or treatment. It helps determine
the effects or outcomes of the intervention by providing data that can be compared with the pretest results
to measure any changes in the variables being studied.
 Experimental Group - The group of participants that receives the treatment or intervention in a study. This
group is exposed to the independent variable, and researchers observe and measure the effects to
determine if the intervention causes any changes.
 Control Group – The group of participants that does not receive the experimental treatment or
intervention. Instead, they may receive a placebo or no intervention at all. The control group serves as a
baseline to compare against the experimental group, helping to isolate the effect of the independent
variable by showing what happens without the intervention.
 Random Assignment - a process in which participants are allocated to different groups (such as
experimental and control groups) by chance rather than by choice. This method reduces bias and ensures
that each participant has an equal probability of being placed in any group, increasing the likelihood that
the groups are equivalent at the start of the experiment.
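Random assignment can be sketched as a shuffle followed by a split. The participant IDs, the seed, and the helper name `randomly_assign` are hypothetical. With an odd number of participants, this particular sketch places the extra person in the control group.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants by chance and split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)          # order is now determined by chance, not choice
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical participant IDs
groups = randomly_assign(participants, seed=1)

print("Experimental:", groups["experimental"])
print("Control:     ", groups["control"])
```

Because every ordering of the shuffled list is equally likely, each participant has the same probability of landing in either group.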
Quasi-Experimental Studies
Non-experimental, or quasi-experimental, studies are similar to experiments because they also compare groups of scores. They are used to examine possible causal relationships when random assignment isn't feasible, allowing some degree of control but not full experimental rigor.

These studies do not use a manipulated variable to differentiate the groups. Instead, the variable that differentiates the groups is usually a pre-existing participant variable (such as male/female) or a time variable (such as before/after).

Because these studies do not use the manipulation and control of true experiments, they cannot demonstrate cause-and-effect relationships. As a result, they are similar to correlational research because they simply demonstrate and describe relationships.
Notation

 The individual measurements or scores obtained for a research participant will be identified by the letter X (or X and Y if there are multiple scores for each individual).

 The number of scores in a data set will be identified by N for a population or n for a sample.

 Summing a set of values is a common operation in statistics and has its own notation. The Greek letter sigma, Σ, will be used to stand for "the sum of." For example, ΣX identifies the sum of the scores.
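The notation maps directly onto a few lines of Python. The scores below are hypothetical, and the variable names simply mirror the symbols X, Y, n, and Σ.

```python
# Hypothetical scores: each of four participants has an X score and a Y score.
X = [3, 1, 7, 4]
Y = [2, 5, 1, 3]

n = len(X)           # n: number of scores in the sample
sum_X = sum(X)       # ΣX = 3 + 1 + 7 + 4
sum_Y = sum(Y)       # ΣY = 2 + 5 + 1 + 3
mean_X = sum_X / n   # the average score, ΣX / n

print("n =", n)
print("ΣX =", sum_X)
print("ΣY =", sum_Y)
print("mean of X =", mean_X)
```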
Order of Operations

1. All calculations within parentheses are done first.

2. Squaring or raising to other exponents is done second.

3. Multiplying and dividing are done third, and should be completed in order from left to right.

4. Summation with the Σ notation is done next.

5. Any additional adding and subtracting is done last, and should be completed in order from left to right.
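A short worked example shows why the order matters. The scores are hypothetical, and each expression is written so that Python performs the steps in the order the rules require.

```python
X = [3, 1, 7, 4]  # hypothetical scores

# ΣX²: squaring (rule 2) comes before summation (rule 4).
sum_of_squares = sum(x ** 2 for x in X)           # 9 + 1 + 49 + 16 = 75

# (ΣX)²: the parentheses (rule 1) force the summation first, then the squaring.
square_of_sum = sum(X) ** 2                       # 15² = 225

# Σ(X − 1)²: subtract inside the parentheses, then square, then sum.
sum_sq_deviation = sum((x - 1) ** 2 for x in X)   # 4 + 0 + 36 + 9 = 49

# ΣX − 1: summation (rule 4) comes before the leftover subtraction (rule 5).
sum_minus_one = sum(X) - 1                        # 15 − 1 = 14

print(sum_of_squares, square_of_sum, sum_sq_deviation, sum_minus_one)
```

ΣX² and (ΣX)² use the same symbols but give different values (75 versus 225), which is the most common place the order of operations causes mistakes.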
