Chapter 2
SOME IMPORTANT CONCEPTS
2.1 Introduction
In this chapter we shall first define some important basic concepts, which are
needed to study and understand the subject of statistics.
2.1.1 Population: Statistical methods are particularly useful for studying, analyzing
and learning about populations. Literally, population means the total inhabitants of a
country. But in statistics it has a wider meaning. In statistics, a population does not
necessarily mean a collection of people. It can be a collection of people or of any
kind of objects such as cows, chickens, trees, houses, books, television sets, cars,
etc. In statistics, a population is defined as:
Definition 2.1: Population is the totality or collection of all objects, items or
individuals on which observations are taken on the basis of some characteristics of
the objects in any field of enquiry.
Actually, it is the aggregate of individuals possessing some characteristic in
common.
Example 2.1.1 The population may be
i) The collection of all customers who prefer a particular brand of refrigerator;
ii) The list of all workers of a factory;
iii) The list of all employees of a firm;
iv) The list of all students of Ranada Prasad Shaha University;
v) The gross sales of all companies in Dhaka city for a particular year;
vi) The list of cows of a dairy farm;
vii) The prices of all individual houses of Chittagong City; etc.
The population of interest is usually called the target population.
Target Population. The population that is being studied is called the target
population.
Definition 2.2 Each individual or object of a population is called an
experimental unit. Observations are collected on experimental units.
Example 2.1.2 Customers, workers, employees, students, companies, cows
and houses are the experimental units of the above populations.
Example 2.1.3 An experimental unit may be
i) An employee of a firm,
ii) A student of a class,
iii) A cow of a farm,
iv) A patient of a clinic,
v) A plot of agricultural land, etc.
The bold words of the above examples denote the experimental units. A population
may be finite or infinite.
Definition 2.3 Finite Population. A population is called finite if it contains a finite
number of experimental units. All the populations cited in Example 2.1.1 are
examples of finite populations.
Definition 2.4 Infinite Population.
A population is called infinite if it contains an infinite number of experimental units.
Example 2.1.4 The following are examples of infinite populations:
i) The number of tosses required to get a head in a coin-tossing experiment,
ii) The length of life of a bulb.
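The coin-tossing case can be explored with a short simulation. The following is only a minimal sketch in Python: the fair coin, the fixed seed and the helper name `tosses_until_head` are assumptions made for illustration and are not part of the text. It shows that the number of tosses needed has no fixed upper limit, which is why the set of possible observations is not finite.

```python
import random


def tosses_until_head(rng, p_head=0.5):
    """Toss a coin until the first head appears and return the number of tosses."""
    count = 1
    while rng.random() >= p_head:  # this toss was a tail, so toss again
        count += 1
    return count


rng = random.Random(1)  # seed chosen arbitrarily so the run is reproducible
observations = [tosses_until_head(rng) for _ in range(1000)]

# Distinct numbers of tosses seen in 1000 repetitions of the experiment.
# More repetitions can always produce a larger value than any seen so far.
print(sorted(set(observations)))
```

Each repetition of the experiment gives one observation, and the experiment can in principle be repeated without end.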
2.1.2 Variable.
The variable is a very important concept in statistics.
Definition 2.5
A variable is a changeable characteristic of the experimental units under
consideration. Actually, it is the characteristic of experimental units which varies
from experimental unit to experimental unit.
It is customary to represent variables by capital letters from the end of the English
alphabet. That means variables are generally denoted by X, Y, Z, U, V, W, etc.
Example 2.1.5
The bold words of the following examples denote the variables:
i) Age of a worker,
ii) Religion of a student,
iii) Wage of a worker,
iv) Gender of a garment worker,
v) Height of a student,
vi) Income of a household,
vii) Gross profits of a company,
viii) The number of insurance policies sold by a salesperson per day, etc.
Actually, a population is named according to the characteristic of the experimental
units. For example, the ages of all workers of a factory and the incomes of all
workers of a factory are the population of age and the population of income
respectively.
2.1.3 Observation, measurement or datum.
An observation or measurement is obtained when a characteristic is measured on
an experimental unit. In other words, when a variable is measured on an experimental
unit, we get an observation or measurement. A single observation is called a datum.
Data is the plural of datum. The word datum is rarely used in statistics.
Definition 2.6 Data.
A set of observations obtained from a particular enquiry is called data or a data set.
Usually, data are the numerical results of scientific measurements. For example, it
could be the
i) incomes of 15 workers of a factory;
ii) heights of 20 students of a class;
iii) salaries of 50 employees of a firm;
iv) IQ of 10 students of a class;
v) examination marks of 20 students;
vi) ages of 25 workers of a factory, etc.
Actually, data are the raw and disorganized facts and figures in any field of enquiry.
Most of the time, decisions are based on only a portion of a population. For example,
to estimate the percentage of customers in Dhaka city who prefer a particular brand
of refrigerator, we may question only some of the customers in the city. In this case,
the population consists of all the refrigerator customers in Dhaka city, and the
sample is made up of the customers who are actually questioned. Thus, the collection
of a few elements selected from a population is called a sample.
Definition 2.7 Sample: A sample is a part of a population that is taken and
considered for study. Actually, it is a subset of the population. A representative
sample is a good sample, and selecting a representative sample is one of the
important tasks in any statistical inquiry.
Usually, a sample is a small but representative part of a population which contains a
finite number of observations. Some examples of samples are
i) Some workers of a factory,
ii) Some employees of a firm,
iii) Some students of a class,
iv) Some cows of a dairy farm,
v) Some trees of a forest,
vi) Sales of a store for some days of a month, etc.
A sample should represent the population characteristics under study. So, the selection
of a representative sample is very important for making decisions about the whole
population.
Representative Sample: A sample that represents the characteristics of the
population as closely as possible is called a representative sample.
Census and Sample survey: The collection of information from the elements of a
population or a sample is called a survey. A survey that includes every element of
the target population is called a census. Often the target population is very large.
Hence, in practice, a census is rarely taken because it is expensive and
time-consuming. In many cases, it is even impossible to identify each element of the
target population. Usually, to conduct a survey, we select a sample and collect the
required information from the elements included in that sample. We then make a
decision based on this sample information. Such a survey conducted on a sample is
called a sample survey. As an example, if we collect information on the 2013 incomes
of all families in Dhaka city, it will be referred to as a census. On the other hand, if
we collect information on the 2013 incomes of 60 families from Dhaka city, it will be
called a sample survey.
Census: A survey that includes every member of a population is called a census.
Sample survey: The technique of collecting information from a portion of the
population is called a sample survey.
2.1.4 Sampling: Sampling is the process of selecting a sample from the population.
Most of the time it is not technically or economically feasible to study the entire
population, so we must take a representative part of the population as a
sample for the purpose of analysis. Simple random sampling is an
important procedure for selecting a representative sample; a sample selected in this
way is called a random sample.
Definition 2.8: Simple random sample. A sample is called a simple random sample if
every element of the population has an equal chance of being included in the
sample.
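Drawing a simple random sample can be sketched in a few lines of Python. This is an illustration only: the population of 50 worker ID numbers, the sample size of 5 and the seed are all hypothetical. The standard-library call `random.Random.sample` selects without replacement, so no worker can appear twice and every worker has the same chance of being included.

```python
import random

# Hypothetical population: ID numbers of the 50 workers of a factory.
population = list(range(1, 51))

rng = random.Random(2024)  # seed fixed only to make the run reproducible

# Select 5 workers without replacement; each worker is equally likely to be chosen.
sample = rng.sample(population, k=5)

print(sample)
```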
Definition 2.9: Parameter. Any numerical value describing a characteristic of a
population is called a parameter.
It is customary to represent parameters by Greek letters. By tradition the arithmetic
mean of a population is denoted by the Greek letter µ (mu). Similarly, the population
variance (σ²), correlation coefficient (ρ), regression coefficient (β), proportion (π), etc.
are examples of parameters. Note that a parameter is a constant value describing
the population characteristic.
Definition 2.10 Statistic: Any numerical value describing a characteristic of a
sample is called a statistic.
A statistic is usually represented by a small letter of the English alphabet. If the
statistic is the sample arithmetic mean, it is denoted by x̄. The sample variance (s²),
correlation coefficient (r), regression coefficient (b), sample proportion (p), etc. are
examples of statistics. This means any summary value calculated from the
sample is called a statistic. Usually, these values are used to estimate the
corresponding population parameters.
The concepts of all the terms discussed so far are illustrated below with examples.
Example 2.1.6: Suppose we want to find the average sales of a certain commodity
sold in 60 shops in Chittagong Metropolitan area. Then these 60 shops will be our
population of interest. It is a finite population. Each shop of this area will be our
experimental unit. The characteristic of interest is the volume of sales.
Example 2.1.7: Suppose there are 80 students in your class. We want to find the
average height of these 80 students. Then these 80 students will be our population
of interest. It is a finite population. Each student of this class will be our experimental
unit. The characteristic of interest is height. Here height is the variable. If you collect
numerical information on the height of all the students, then the collection of heights
of 80 students will be the population data or the population of height. Suppose the
average height of these 80 students is, say, 5.7 feet. Then µ = 5.7 feet is our parameter,
since it is a characteristic of the population.
Suppose it is not possible to get the population data. In that case, we can take a
random sample of 10 students (say) to estimate the average height of the class.
Then the 10 students will constitute the sample and the collection of the heights of
10 students will be the sample data or sample. Suppose the average height of these
10 students is 5.6 feet. Then the sample arithmetic mean x̄ = 5.6 feet is a statistic
and this value is used as an estimate of the population mean µ.
Size of a population. The size of a population is the number of observations
or experimental units in it. It is usually denoted by N. In Example 2.1.7, the
population size is N = 80. It is the total number of students in the class.
Size of a sample. The size of the sample is the number of observations or
experimental units in it. It is denoted by n. In Example 2.1.7, the sample size
is n = 10. It is the number of students in the sample.
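The terms used in Example 2.1.7 can be put together in a small Python sketch. The heights below are generated artificially and are not real data; the range of 5.0 to 6.4 feet and the seed are assumptions made only so that the example runs. The sketch computes the parameter (the population mean) from all N = 80 values and a statistic (the sample mean) from a simple random sample of n = 10 values.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary seed for reproducibility

# Hypothetical heights in feet of the N = 80 students (artificial, not real data).
population = [round(rng.uniform(5.0, 6.4), 2) for _ in range(80)]
N = len(population)
mu = statistics.mean(population)  # parameter: describes the whole population

# Simple random sample of n = 10 students, drawn without replacement.
sample = rng.sample(population, k=10)
n = len(sample)
x_bar = statistics.mean(sample)  # statistic: describes the sample only

print(f"N = {N}, population mean = {mu:.2f} feet")
print(f"n = {n}, sample mean = {x_bar:.2f} feet")
```

The population mean stays the same however many samples are drawn, whereas the sample mean changes from one sample to another. This is why a parameter is a constant and a statistic is used only as an estimate of it.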
2.2 Types of Variables
According to whether a variable takes numerical or non-numerical values, it can be
classified into two categories, viz. (i) Qualitative variable, and (ii) Quantitative
variable.
Qualitative or Categorical variables. Variables that cannot be measured
numerically but can be classified into different categories are called qualitative
or categorical variables.
Definition 2.11. Qualitative variable. A variable that cannot assume a numerical
value but can be classified into two or more non-numerical categories is called a
qualitative or categorical variable. Or
A variable is called qualitative when it measures a qualitative characteristic on each
experimental unit.
A qualitative variable cannot be measured on a natural numerical scale. The
characteristics of a qualitative variable are also known as attributes. Qualitative
variables produce qualitative data that can be classified according to different
categories; hence they are often called categorical variables and the data are called
categorical data. Some examples of qualitative variables are
i) Religion of a student; ii) Gender of a patient; iii) Economic status of a person; iv)
Teaching quality of a professor; v) Efficiency of a worker; vi) Colour of a car entering
the parking lot; vii) Hair colour of a student; viii) Quality of a finished product; ix) Size
of an industry.
The bold words of the above examples are the qualitative variables.
Definition 2.12 Quantitative variable. A variable that can be measured numerically
is called a quantitative variable. Or
A variable is called a quantitative variable when it measures a quantitative
characteristic on each experimental unit by a numerical value.
Actually, it measures a numerical quantity or amount on each experimental unit.
Quantitative variables are usually denoted by capital letters from the end of the English
alphabet such as X, Y, Z, U, V, W, etc. Some examples of quantitative variables are cited now.
1. X: Number of children per family,
2. X: Price of a shirt produced by a garment factory,
3. X: Production of rod in tons produced daily by a steel mill,
4. X: Production of sugar in kg produced daily in a sugar mill,
5. X: Daily rainfall in inches in Narayanganj city during the rainy season,
6. X: Daily wage of workers of a factory,
7. X: Number of printing mistakes per page of a book,
8. X: Systolic blood pressure of a patient, etc.
Note that there are differences in the types of numerical values that quantitative
variables assume. The number of children per family, for example, can take on only the
values X = 0, 1, 2, 3, ..., whereas the daily rainfall X in inches can take on any value
greater than zero and less than some finite quantity, that is, 0 < X < b, where b is a
positive quantity. Note that the number of children is a countable
quantity, while the daily rainfall is a measurable quantity.
Hence, on the basis of whether it is countable or measurable, a quantitative variable is
again classified as (i) Discrete variable, and (ii) Continuous variable.
Definition 2.13 Discrete variable: A variable which can take only isolated values,
finite or countably infinite in number, is called a discrete variable.
In other words, there are no possible intermediate values between consecutive values
of a discrete variable. Usually, a discrete variable takes whole-number values. Sometimes, it
can also take a countable number of fractional values. The following are some examples
of discrete variables:
1. Number of defective items in a lot of 10 items,
2. Number of children per family,
3. Number of accidents per day in a busy corner of a road,
4. Number of printing mistakes per page of a book, etc.
Here, the number of defective items, the number of children per family, the number of
accidents per day and the number of printing mistakes can take only integer values. Now
we cite some examples where a discrete variable can take fractional values. The discrete
variables are underlined.
1. Size of shoes sold in a shop may be 3, 3.5, 4, 4.5, etc.
2. Size of nails in inches available in a shop may be 0.25, 0.50, 1.00, 1.25, 1.50, etc.
3. Coins in taka of a cash box may be 0.01, 0.05, 0.10, 0.25, 0.50, 1.00, 2.00, 5.00, etc.
Here the size of shoes, size of nails and coins in a cash box take fractional but isolated
values. So, they are discrete variables. The most important characteristic of discrete
variables is that they are countable.
Definition 2.14 Continuous variable: A variable that can take infinitely many values
over a certain interval or intervals is called a continuous variable.
The values of a continuous variable cannot be counted, because they are not isolated:
between any two values in the interval there are always other possible values. That is,
a continuous variable can only be measured. Some examples of continuous variables are
i) Age of a worker, ii) Systolic blood pressure of a patient, iii) Weight of an employee, iv)
Weight of a package ready to be shipped, v) Height of a salesman, vi) Monthly salary of
a worker, etc. Here the bold words are the continuous variables.
Example 2.2.1 Identify each of the following underlined variables as qualitative or
quantitative
1. The number of unregistered taxicabs in a city,
2. The number of consumers who refuse to answer a telephone survey,
3. The winning time for a horse running in a race,
4. Gender of an employee of a garment factory,
5. Ethnic origin of a candidate for a public office,
6. Brands of soft drinks sold in a café.
Solution. Variables in examples 1 and 2 are quantitative and discrete. In example 1,
the number of unregistered taxicabs is a discrete variable that can take on any of the
values X = 0, 1, 2, ..., with a maximum value depending on the number of taxicabs in
the city. Similarly, in example 2, the number of consumers is a discrete variable that
can take on any of the values X = 0, 1, 2, ..., with a maximum value depending on the
number of consumers called. In example 3, the winning time is the only quantitative and
continuous variable in the list. Variables in examples 4, 5, and 6 are qualitative
because qualitative responses would be obtained for those variables.
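The classification in this solution can be recorded as a small lookup table. The sketch below, in Python, is illustrative only: the dictionary name `variables` and the shortened variable labels are not from the text. It stores the type of each of the six variables and counts how many fall into each category.

```python
from collections import Counter

# Type of each variable in Example 2.2.1, as given in the solution.
# The second entry is the sub-type, which applies to quantitative variables only.
variables = {
    "number of unregistered taxicabs": ("quantitative", "discrete"),
    "number of consumers who refuse to answer": ("quantitative", "discrete"),
    "winning time of a horse": ("quantitative", "continuous"),
    "gender of an employee": ("qualitative", None),
    "ethnic origin of a candidate": ("qualitative", None),
    "brand of soft drink": ("qualitative", None),
}

counts = Counter(kind for kind, _ in variables.values())
print(counts["quantitative"], "quantitative,", counts["qualitative"], "qualitative")
```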
Example 2.2.2 A medical researcher wants to estimate the survival time in years of a
patient after the onset of a particular type of cancer and after a particular regimen of
radiotherapy. A sample of 50 patients who had the cancer and received the radiotherapy,
and who are no longer alive, has been selected randomly from a cancer hospital.
i) What is the population? ii) What is the sample? iii) What is the experimental unit? iv)
What is the variable to be measured? v) Is the variable qualitative, or discrete or
continuous?
Solution. i) The population of interest is the set of all patients listed in the register of
the cancer hospital who had that particular type of cancer and who died after undergoing
the particular type of radiotherapy.
ii) The 50 patients selected at random from the cancer hospital constitute the sample.
iii) Every cancer patient who died having undergone the particular type of radiotherapy
is an experimental unit.
iv) Survival times in years is the variable to be measured.
v) The variable is quantitative and continuous.
• Constant
Definition 2.15 Constant: A characteristic of the experimental units which does not
vary from experimental unit to experimental unit is called a constant.
Example 2.2.3
i) Number of legs of a cow.
ii) Number of fingers per hand of a person.
iii) Number of eyes of a cow.
iv) Number of horns of a sheep etc.
Constants are usually denoted by a, b, c, etc. In addition, π ≈ 3.1416 (often
approximated by 22/7) and e ≈ 2.718 are mathematical constants. The differences
between a variable and a constant are stated as follows.
Difference between variable and constant:
No. | Variable | Constant
1 | Changeable characteristics of experimental units are called variables. | Unchangeable characteristics of experimental units are called constants.
2 | There are various types of variables. | Constants have no types.
3 | The value of a variable changes. | The value of a constant does not change.
4 | Height, weight, age, etc. are examples of variables. | π ≈ 3.1416 and e ≈ 2.718 are examples of mathematical constants.
5 | There are scales of measurement for measuring a variable. | There is no such scale for measuring a constant.
6 | Variables are usually denoted by the last letters of the English alphabet. | Constants are denoted by a, b, c.
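The fraction 22/7 quoted alongside π is a convenient approximation rather than the exact value, which can be checked numerically. The short Python sketch below uses only the standard `math` module; the variable name `approx` is chosen for illustration.

```python
import math

approx = 22 / 7  # common fractional approximation to pi

print(f"math.pi    = {math.pi:.6f}")
print(f"22/7       = {approx:.6f}")
print(f"difference = {approx - math.pi:.6f}")
print(f"math.e     = {math.e:.6f}")
```

The two values agree to two decimal places only, so 22/7 is slightly larger than π.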
2.3 Scales of Measurement
World civilization is enriched by the idea of number and measurement. Its importance was
first felt in the physical sciences, but nowadays it has spread to nearly all branches of
knowledge. In the late nineteenth century Lord Kelvin said, "When you can measure what you are
speaking about and express it in numbers you know something about it; but when you
cannot express it in numbers your knowledge is of a meager and unsatisfactory kind".
It is the belief of some researchers that if it is researchable then it must be measurable.
If the measurement procedure in a statistical investigation is poor, the usefulness of the
findings of the investigation will be severely affected. For every researcher, it becomes
necessary to explain the variables under study as well as the level of measurement of
the selected variables during the