Statistical Models and Data Classification

The document contains a series of multiple choice questions (MCQs) related to introductory statistics concepts. The MCQs cover topics such as descriptive versus inferential statistics, types of variables (e.g. quantitative, qualitative), types of data (e.g. primary, secondary), methods of data collection, and basic statistical terminology.

Uploaded by

Saad Salman

MCQs ON INTRODUCTION

MCQ No 1.1
The science of collecting, organizing, presenting, analyzing and interpreting data to
assist in making more effective decisions is called:
(a) Statistic (b) Parameter (c) Population (d) Statistics

MCQ No 1.2
Methods of organizing, summarizing, and presenting data in an informative way are called:
(a) Descriptive statistics (b) Inferential statistics (c) Theoretical statistics (d) Applied statistics

MCQ No 1.3
The methods used to determine something about a population on the basis of a sample
are called:
(a) Inferential statistics (b) Descriptive statistics (c) Applied statistics (d) Theoretical statistics

MCQ No 1.4
When the characteristic being studied is nonnumeric, it is called a:
(a) Quantitative variable (b) Qualitative variable (c) Discrete variable (d) Continuous variable

MCQ No 1.5
When the variable studied can be reported numerically, the variable is called a:
(a) Quantitative variable (b) Qualitative variable (c) Independent variable (d) Dependent variable

MCQ No 1.6
A specific characteristic of a population is called:
(a) Statistic (b) Parameter (c) Variable (d) Sample

MCQ No 1.7
A specific characteristic of a sample is called:
(a) Variable (b) Constant (c) Parameter (d) Statistic

MCQ No 1.8
A set of all units of interest in a study is called:
(a) Sample (b) Population (c) Parameter (d) Statistic

MCQ No 1.9
A part of the population selected for study is called a:
(a) Variable (b) Data (c) Sample (d) Parameter

MCQ No 1.10
Listing of the data in order of numerical magnitude is called:
(a) Raw data (b) Arrayed data (c) Discrete data (d) Continuous data

MCQ No 1.11
Listings of the data in the form in which these are collected are known as:
(a) Secondary data (b) Raw data (c) Arrayed data (d) Qualitative data

MCQ No 1.12
Data that are collected by anybody for some specific purpose and use are called:
(a) Qualitative data (b) Primary data (c) Secondary data (d) Continuous data

MCQ No 1.13
Data which have undergone any treatment previously are called:
(a) Primary data (b) Secondary data (c) Symmetric data (d) Skewed data

MCQ No 1.14
The data obtained by conducting a survey is called:
(a) Primary data (b) Secondary data (c) Continuous data (d) Qualitative data

MCQ No 1.15
The data collected from published reports is known as:
(a) Discrete data (b) Arrayed data (c) Secondary data (d) Primary data

MCQ No 1.16
A survey in which information is collected from each and every individual of the population is
known as:
(a) Sample survey (b) Pilot survey (c) Biased survey (d) Census survey

MCQ No 1.17
Data used by an agency which originally collected them are:
(a) Primary data (b) Raw data (c) Secondary data (d) Grouped data

MCQ No 1.18
Registration is the source of:
(a) Primary data (b) Secondary data (c) Qualitative data (d) Continuous data

MCQ No 1.19
Data in the population census reports are:
(a) Ungrouped data (b) Secondary data (c) Primary data (d) Arrayed data

MCQ No 1.20
Issuing a national identity card is an example of:
(a) Sampling (b) Statistic (c) Census (d) Registration

MCQ No 1.21
A variable that assumes only some selected values in a range is called:
(a) Continuous variable (b) Quantitative variable (c) Discrete variable (d) Qualitative variable

MCQ No 1.22
A variable that assumes any value within a range is called:
(a) Discrete variable (b) Continuous variable (c) Independent variable (d) Dependent variable

MCQ No 1.23
A variable that provides the basis for estimation is called:
(a) Dependent variable (b) Independent variable (c) Continuous variable (d) Qualitative variable

MCQ No 1.24
The variable that is being predicted or estimated is called:
(a) Dependent variable (b) Independent variable (c) Discrete variable (d) Continuous variable

MCQ No 1.25
Monthly rainfall in a city during the last ten years is an example of a:
(a) Discrete variable (b) Continuous variable (c) Qualitative variable (d) Independent variable

MCQ No 1.26:
The proportion of females in a sample of 50 accounts officers is an example of a:
(a) Parameter (b) Statistic (c) Array (d) Variable
MCQ No 1.27:
Number of family members in different families in a town is an example of a:
(a) Discrete variable (b) Continuous variable (c) Dependent variable (d) Qualitative variable

MCQ No 1.28
Colours of flowers are an example of:
(a) Quantitative variable (b) Qualitative variable (c) Skewed variable (d) Symmetric variable

MCQ No 1.29
If each measurement in a data set falls into one and only one of a set of categories,
the data set is called:
(a) Quantitative (b) Qualitative (c) Continuous (d) Constant

MCQ No 1.30
Any phenomenon which is not measurable is called:
(a) Variable (b) Constant (c) Parameter (d) Attribute

MCQ No 1.31
A constant can assume values:
(a) Zero (b) One (c) Fixed (d) Not fixed

MCQ No 1.32
A value which does not change from one individual to another individual is called:
(a) Variable (b) Statistic (c) Constant (d) Array

MCQ No 1.33
In the plural sense, statistics means:
(a) Numerical data (b) Methods (c) Population data (d) Sample data

MCQ No 1.34
In the singular sense, statistics means:
(a) Methods (b) Numerical data (c) Sample data (d) Population data

MCQ No 1.35
Weight of earth is:
(a) Discrete variable (b) Qualitative variable (c) Continuous variable (d) Difficult to tell

MCQ No 1.36
The weights of students in a class are:
(a) Discrete data (b) Continuous data (c) Qualitative data (d) Constant data

MCQ No 1.37
Life of a T.V tube is a:
(a) Discrete variable (b) Continuous variable (c) Qualitative variable (d) Constant

MCQ No 1.38
Questionnaire method is used in collecting:
(a) Primary data (b) Secondary data (c) Published data (d) True data

MCQ No 1.39
Census returns are:
(a) Primary data (b) Secondary data (c) Qualitative data (d) True data
MCQ No 1.40
Students divided into different groups according to their intelligence and gender
will generate:
(a) Quantitative data (b) Qualitative data (c) Continuous data (d) Constant

MCQ No 1.41
Statistics are:
(a) Aggregate of facts and figures (b) Always true (c) Always continuous (d) Always qualitative

MCQ No 1.42
Statistics results are:
(a) Randomly true (b) Always true (c) Not true (d) True on average

MCQ No 1.43
Statistics does not study:
(a) Constant (b) Statistic (c) Parameter (d) Individual

MCQ No 1.44
A statistical population may consist of:
(a) Finite number of values (b) Infinite number of values
(c) Either of (a) and (b) (d) None of (a) and (b)

MCQ No 1.45
The only continuous variable here is:
(a) Rainfall on different days in a city (b) Number of customers entering a store on different days
(c) Number of flights landing at an airport on different days (d) None of them

MCQ No 1.46
Example of descriptive statistics is:
(a) 70% of people in Pakistan live in rural areas. (b) 50% of people are likely to vote in the national election.
(c) 20% of the bulbs produced in a factory will be defective. (d) Difficult to tell.

MCQ No 1.47
Example of inferential statistics is:
(a) Percentage of smokers in Pakistan (b) Percentage of skilled workers in a factory.
(c) Estimate of increase in prices in the next year (d) None of the above

MCQ No 1.48
Statistics are always:
(a) Exact (b) Estimated values (c) Constant (d) Population values

MCQ No 1.49
Statistics must be:
(a) Comparable (b) Not comparable (c) Discrete in nature (d) Qualitative in nature

MCQ No 1.50
Given 6 quantities, X1 through X6, the correct notation for adding quantities 3 through 6 is:

MCQ No 1.51

(a) 36 (b) 48 (c) 41 (d) 29

MCQ No 1.52

(a) Add all quantities from Y1 through Yn (b) Add all quantities from Y=2 through Yn
(c) Add all quantities from Y=2 through Y=n (d) Add all quantities from Y2 through Yn

MCQ No 1.53

MCQ No 1.54
The figure 22.25 rounded to one decimal place is:
(a) 22.3 (b) 22.1 (c) 22.2 (d) 22

MCQ No 1.55
The figure 22.15 rounded to one decimal place is:
(a) 22.2 (b) 22.1 (c) 22 (d) 22.3

MCQ No 1.56
The figure 22.26 rounded to one decimal place is:
(a) 22.2 (b) 22.3 (c) 22.1 (d) 22

MCQ No 1.57
The figure 22.24 rounded to one decimal place is:
(a) 22.2 (b) 22.3 (c) 22.1 (d) 22
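
The rounding questions in MCQ 1.54 to 1.57 turn on how a trailing 5 is handled. The source does not say which convention it intends; the sketch below assumes the round-half-to-even rule that many statistics textbooks use and prints the round-half-up result alongside it for comparison. The helper name `round_one_place` is illustrative only.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_one_place(figure: str, mode: str = ROUND_HALF_EVEN) -> Decimal:
    """Round a decimal string to one decimal place with the given rounding mode."""
    # Decimal avoids binary floating-point surprises such as 22.15 being
    # stored as slightly less than 22.15.
    return Decimal(figure).quantize(Decimal("0.1"), rounding=mode)


for figure in ("22.25", "22.15", "22.26", "22.24"):
    half_even = round_one_place(figure)                # assumed textbook convention
    half_up = round_one_place(figure, ROUND_HALF_UP)   # common school convention
    print(figure, "half-even:", half_even, "half-up:", half_up)
```

Under the half-to-even assumption a trailing 5 rounds to the nearest even digit, so the two conventions only disagree when the digit before the 5 is even.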

MCQ No 1.58
How many methods are used for the collection of data?
(a) 4 (b) 3 (c) 2 (d) 1

MCQs ON PRESENTATION OF DATA

MCQ No 2.1:
When data are classified according to a single characteristic, it is called:
(a) Quantitative classification (b) Qualitative classification
(c) Area classification (d) Simple classification

MCQ No 2.2:
Classification of data by attributes is called:
(a) Quantitative classification (b) Chronological classification
(c) Qualitative classification (d) Geographical classification

MCQ No 2.3:
Classification of data according to location or areas is called:
(a) Qualitative classification (b) Quantitative classification
(c) Geographical classification (d) Chronological classification

MCQ No 2.4:
Classification is applicable in case of:
(a) Normal characters (b) Quantitative characters (c) Qualitative characters (d) Both (b) and (c)

MCQ No 2.5:
In classification, the data are arranged according to:
(a) Similarities (b) Differences (c) Percentages (d) Ratios

MCQ No 2.6:
When data are arranged at regular intervals of time, the classification is called:
(a) Qualitative (b) Quantitative (c) Chronological (d) Geographical

MCQ No 2.7:
When an attribute has more than three levels it is called:
(a) Manifold-division (b) Dichotomy (c) One-way (d) Bivariate

MCQ No 2.8:
The series
Country      Pakistan   India   Britain   Egypt   Japan
Birth rate   45         40      10        35      10
is of the type:
(a) Discrete (b) Continuous (c) Individual (d) Time series

MCQ No 2.9:
The series
Country      Pakistan   India   Britain   Egypt   Japan
Death rate   15         16      10        12      10
is of the type:
(a) Inclusive (b) Exclusive (c) Geographical (d) Time series

MCQ No 2.10
In an array, the data are:
(a) In ascending order (b) In descending order (c) Either (a) or (b) (d) Neither (a) nor (b)

MCQ No 2.11
The tally-sheet count for each value or group is called:
(a) Class limit (b) Class width (c) Class boundary (d) Frequency

MCQ No 2.12
The frequency distribution according to individual variate values is called:
(a) Discrete frequency distribution (b) Cumulative frequency distribution
(c) Percentage frequency distribution (d) Continuous frequency distribution

MCQ No 2.13
A series arranged according to each and every item is known as:
(a) Discrete series (b) Continuous series (c) Individual series (d) Time series

MCQ No 2.14
A frequency distribution can be:
(a) Qualitative (b) Discrete (c) Continuous (d) Both (b) and (c)

MCQ No 2.15
The following frequency distribution:
X   5   15   38   47   68
f   2   4    9    3    1
is classified as a:
(a) Relative frequency distribution (b) Continuous distribution
(c) Percentage frequency distribution (d) Discrete distribution

MCQ No 2.16
Frequency distribution is often constructed with the help of:
(a) Entry table (b) Tally sheet (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ No 2.17
The data given as 3, 5, 15, 35, 70, 84, 96 will be called:
(a) Individual series (b) Discrete series (c) Continuous series (d) Time series

MCQ No 2.18
Frequency of a variable is always in:
(a) Fraction form (b) Percentage form (c) Less than form (d) Integer form

MCQ No 2.19
Data arranged in ascending or descending order of magnitude is called:
(a) Ungrouped data (b) Grouped data (c) Discrete frequency distribution (d) Arrayed data

MCQ No 2.20
The grouped data are called:
(a) Primary data (b) Secondary data (c) Raw data (d) Difficult to tell

MCQ No 2.21
A series of data with exclusive classes along with the corresponding frequencies is called:
(a) Discrete frequency distribution (b) Continuous frequency distribution
(c) Percentage frequency distribution (d) Cumulative frequency distribution
MCQ No 2.22
In an exclusive classification, the limits excluded are:
(a) Upper limits (b) Lower limits (c) Both lower and upper limits (d) Either lower or upper limits

MCQ No 2.23
The series
Weights (pounds)   15----20   20----25   25----30   30----35   35----40
No. of items       10         15         30         10         5
is categorized as:
(a) Continuous series (b) Discrete series (c) Time series (d) Geometric series

MCQ No 2.24
The series
Year               2007   2008   2009   2010   2011
Profit (000 Rs.)   7      10     16     18     22
will be called:
(a) Time series (b) Discrete series (c) Continuous series (d) Individual series

MCQ No 2.25:
The suitable formula for computing the number of classes is:
(a) 3.322 logN (b) 0.322 logN (c) 1+3.322 logN (d) 1- 3.322 logN

MCQ No 2.26:
The number of classes in a frequency distribution is obtained by dividing the range of variable by
the:
(a) Total frequency (b) Class interval (c) Mid-point (d) Relative frequency

MCQ No 2.27:
If the number of workers in a factory is 256, the number of classes will be:
(a) 8 (b) 9 (c) 10 (d) 12
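
The class-count formula in MCQ 2.25 to 2.27 (commonly known as Sturges' rule) can be checked numerically. This is a minimal sketch that assumes the logarithm is to base 10 and that the result is rounded to the nearest whole number; some textbooks round up instead. The function name `sturges_classes` is hypothetical.

```python
import math


def sturges_classes(n: int) -> int:
    """Approximate number of classes for n observations: k = 1 + 3.322 * log10(n)."""
    k = 1 + 3.322 * math.log10(n)
    # Assumption: round to the nearest integer (a textbook may round up instead).
    return round(k)


print(sturges_classes(256))  # number of classes for 256 workers
```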

MCQ No 2.28:
The largest and the smallest values of any given class of a frequency distribution are called:
(a) Class Intervals (b) Class marks (c) Class boundaries (d) Class limits

MCQ No 2.29
If there are no gaps between consecutive classes, the limits are called:
(a) Class limits (b) Class boundaries (c) Class intervals (d) Class marks

MCQ No 2.30
The extreme values used to describe the different classes in a frequency distribution are called:
(a) Class intervals (b) Class boundaries (c) Class limits (d) Cumulative frequency

MCQ No 2.31
If in a frequency table, either the lower limit of first class or the upper limit of last class is not a fixed
number, then classes are called:
(a) One-way classes (b) Two-way classes (c) Discrete classes (d) Open-end classes
MCQ No 2.32
The class boundaries can be taken when the nature of variable is:
(a) Discrete (b) Continuous (c) Both (a) and (b) (d) Qualitative
MCQ No 2.33
Class boundaries are also called:
(a) Mathematical limits (b) Arithmetic limits (c) Geometric limits (d) Qualitative limits

MCQ No 2.34
The average of lower and upper class limits is called:
(a) Class boundary (b) Class frequency (c) Class mark (d) Class limit

MCQ No 2.35
If the lower and upper class limits are 20 and 30, the midpoint of the class is:
(a) 20 (b) 25 (c) 30 (d) 50

MCQ No 2.36
A frequency distribution that contains a class with limits of "10 and under 20" would have a midpoint:
(a) 10 (b) 14.9 (c) 15 (d) 20

MCQ No 2.37
The number of workers in a factory is 128, and the maximum and minimum hourly wages are 100 and 20
respectively. For the frequency distribution of hourly wages, the class interval is:
(a) 8 (b) 9 (c) 10 (d) 80
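
The class interval asked for in MCQ 2.37 follows from dividing the range by the number of classes. The sketch below assumes the number of classes comes from the 1 + 3.322 log N rule of MCQ 2.25 (base-10 logarithm, rounded to the nearest integer); the function name `class_interval` is illustrative.

```python
import math


def class_interval(maximum: float, minimum: float, n: int) -> float:
    """Class width h = range / number of classes."""
    # Assumption: number of classes from k = 1 + 3.322 * log10(n), rounded.
    k = round(1 + 3.322 * math.log10(n))
    return (maximum - minimum) / k


# 128 workers, hourly wages between 20 and 100 (figures from MCQ 2.37).
print(class_interval(100, 20, 128))
```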

MCQ No 2.38
Width of interval h is equal to:

MCQ No 2.39
Length of interval is calculated as:
(a) The difference between upper limit and lower limit (b) The sum of upper limit and lower limit
(c) Half of the difference between upper limit and lower limit (d) Half of the sum of upper limit and lower limit

MCQ No 2.40
The class marks are 10, 12, 14, 16, 18. The first class of the distribution is:
(a) 9----12 (b) 10.5----12.5 (c) 9----11 (d) 10----12

MCQ No 2.41
If the midpoints are 10, 15, 20, 25 and 30, the last class boundary of the distribution is:
(a) 25----30 (b) 27.5----32.5 (c) 20----35 (d) 30----35
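
MCQ 2.40 and 2.41 recover classes from their midpoints. A minimal sketch, assuming equally spaced midpoints so that each class extends half a class width on either side of its class mark; `classes_from_midpoints` is a hypothetical helper name.

```python
def classes_from_midpoints(midpoints):
    """Return (lower, upper) boundaries for equally spaced class marks."""
    # Assumption: constant class width, taken from the first two midpoints.
    width = midpoints[1] - midpoints[0]
    return [(m - width / 2, m + width / 2) for m in midpoints]


print(classes_from_midpoints([10, 12, 14, 16, 18])[0])    # first class
print(classes_from_midpoints([10, 15, 20, 25, 30])[-1])   # last class
```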

MCQ No 2.42
The number of classes depends upon:
(a) Class marks (b) Frequency (c) Class interval (d) Class boundary

MCQ No 2.43
The class interval is the difference between:
(a) Two extreme values (b) Two successive frequencies
(c) Two successive upper limits (d) Two largest values
MCQ No 2.44
When the classes are 40----44, 45----49, 50----54, ... the class interval is:
(a) 4 (b) (c) 100 (d) 5

MCQ No 2.45:
A grouping of data into mutually exclusive classes showing the number of observations in each class
is called:
(a) Frequency polygon (b) Relative frequency
(c) Frequency distribution (d) Cumulative frequency

MCQ No 2.46:
The following frequency distribution
Classes     Less than 2   Less than 4   Less than 6   Less than 8   Less than 10
Frequency   2             6             16            19            20
is classified as:
(a) Inclusive classification (b) Exclusive classification
(c) Discrete classification (d) Cross classification

MCQ No 2.47:
The following frequency distribution
Classes     10----20   20----30   30----40   40----50   50----60
Frequency   2          4          6          4          2
is classified as:
(a) Exclusive classification (b) Inclusive classification
(c) Geographical classification (d) Two-way classification

MCQ No 2.48:
The following frequency distribution
Classes     0----4   5----9   10----14   15----19   20----24
Frequency   2        3        7          5          3
is classified as:
(a) Multiple classification (b) Qualitative classification
(c) Inclusive classification (d) Exclusive classification

MCQ No 2.49:
The following frequency distribution
Classes     More than 2   More than 4   More than 6   More than 8   More than 10
Frequency   2             6             16            19            20
is classified as:
(a) Geographical classification (b) Chronological classification
(c) Inclusive classification (d) Exclusive classification

MCQ No 2.50:
The class frequency divided by the total number of observations is called:
(a) Percentage frequency (b) Relative frequency
(c) Cumulative frequency (d) Bivariate frequency

MCQ No 2.51:
The relative frequency multiplied by 100 is called:
(a) Percentage frequency (b) Cumulative frequency
(c) Bivariate frequency (d) Simple frequency

MCQ No 2.52
In a relative frequency distribution, the total of the relative frequencies is:
(a) 100 (b) One (c) ∑f (d) ∑ X

MCQ No 2.53:
In a percentage frequency distribution, the total of the percentage frequencies is always equal to:
(a) 1 (b) ∑f (c) 100% (d) ∑X

MCQ No 2.54
The cumulative frequency of the first group in a "more than" cumulative frequency distribution is always equal to:
(a) 1 (b) 100 (c) ∑f (d) ∑X

MCQ No 2.55
The cumulative frequency of the last class in a "less than" cumulative frequency distribution is always equal to:
(a) ∑f (b) ∑X (c) 1 (d) 100
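
The relationships tested in MCQ 2.50 to 2.55 can be verified on a small table. The sketch below reuses the class frequencies from MCQ 2.48 (2, 3, 7, 5, 3) as sample input and uses exact fractions so the totals are not blurred by floating-point error; all variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

frequencies = [2, 3, 7, 5, 3]   # class frequencies from MCQ 2.48
total = sum(frequencies)        # this is the quantity written as sum of f

# Relative frequency = class frequency / total number of observations.
relative = [Fraction(f, total) for f in frequencies]
# Percentage frequency = relative frequency * 100.
percentage = [100 * r for r in relative]
# "Less than" cumulative frequencies accumulate from the first class down.
less_than = list(accumulate(frequencies))
# "More than" cumulative frequencies accumulate from the last class up.
more_than = list(accumulate(frequencies[::-1]))[::-1]

print("sum of relative frequencies:", sum(relative))
print("sum of percentage frequencies:", sum(percentage))
print("less than:", less_than)
print("more than:", more_than)
```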

MCQ No 2.56:
The following frequency distribution:
Classes     Less than 10   Less than 20   Less than 30   Less than 40   Less than 50
Frequency   2              6              16             19             20
is classified as:
(a) Less than cumulative frequency distribution (b) More than cumulative frequency distribution
(c) Discrete frequency distribution (d) Cumulative percentage frequency distribution

MCQ No 2.57:
The following frequency distribution
Classes     50----55   55----60   60----65   65----70   70----75
Frequency   40         36         30         16         4
is classified as:
(a) Relative frequency distribution (b) Less than cumulative frequency distribution
(c) More than cumulative frequency distribution (d) Bivariate frequency distribution

MCQ No 2.58
A frequency distribution formed considering two variables at a time is called:
(a) Univariate frequency distribution (b) Bivariate frequency distribution
(c) Trivariate frequency distribution (d) Bimodal distribution

MCQ No 2.59
The sum of the rows, or the sum of the columns, of a bivariate frequency distribution is equal to:
(a) ∑X (b) ∑fX (c) ∑(f+X) (d) ∑f

MCQ No 2.60:
The arrangement of data in rows and columns is called:
(a) Classification (b) Tabulation (c) Frequency distribution (d) Cumulative frequency distribution

MCQ No 2.61:
When the qualitative or quantitative raw data are classified according to one characteristic, the
tabulation of different groups is called:
(a) Dichotomy (b) Manifold-division (c) Bivariate (d) One-way
MCQ No 2.62
A statistical table consists of at least:
(a) Two parts (b) Three parts (c) Four parts (d) Five parts

MCQ No 2.63
In a statistical table, prefatory note is shown:
(a) Below the body (b) Box head (c) Foot note (d) Below the title

MCQ No 2.64
A source note in a statistical table is given:
(a) At the end of a table (b) In the beginning of a table
(c) In the middle of a table (d) Below the body of a table

MCQ No 2.65
In a statistical table, column captions are called:
(a) Box head (b) Stub (c) Body (d) Title

MCQ No 2.66
In a statistical table, row captions are called:
(a) Box head (b) Stub (c) Body (d) Title

MCQ No 2.67:
The headings of the rows of a table are called:
(a) Prefatory notes (b) Titles (c) Stubs (d) Captions

MCQ No 2.68:
The headings of the columns of a table are called:
(a) Stubs (b) Captions (c) Footnotes (d) Source notes

MCQ No 2.69:
The budgets of two families can be compared by:
(a) Sub-divided rectangles (b) Pie diagram (c) Both (a) and (b) (d) Histogram

MCQ No 2.70:
Total angle of the pie-chart is:
(a) 45 (b) 90 (c) 180 (d) 360

MCQ No 2.71:
Diagrams are another form of:
(a) Classification (b) Tabulation (c) Angle (d) Percentage

MCQ No 2.72
In pie diagram, the angle of a sub-sector is obtained as:

MCQ No 2.73:
A pie diagram is represented by a:
(a) Rectangle (b) Circle (c) Triangle (d) Square
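
For the pie diagram questions (MCQ 2.70 to 2.73), the angle of each sector is the component's share of the total multiplied by 360 degrees. The sketch below uses a made-up family budget purely for illustration; neither the figures nor the name `sector_angles` come from the source.

```python
def sector_angles(components):
    """Map each component to its pie-chart angle: value / total * 360 degrees."""
    total = sum(components.values())
    return {name: value * 360 / total for name, value in components.items()}


# Hypothetical monthly budget (illustrative numbers only).
budget = {"food": 50, "rent": 25, "clothing": 15, "other": 10}
angles = sector_angles(budget)
print(angles)
print("total angle:", sum(angles.values()))
```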
MCQ No 2.74:
A sector diagram is also called:
(a) Bar diagram (b) Histogram (c) Historigram (d) Pie diagram

MCQ No 2.75:
Which of the following is not a one-dimensional diagram:
(a) Simple bar diagram (b) Multiple bar diagram
(c) Component bar diagram (d) Pie diagram

MCQ No 2.76:
Which of the following is a two-dimensional diagram:
(a) Sub-divided bar (b) Percentage component bar chart
(c) Sub-divided rectangles (d) Multiple bar diagram

MCQ No 2.77:
Pie diagram represents the components of a factor by:
(a) Circles (b) Sectors (c) Angles (d) Percentages

MCQ No 2.78:
The suitable diagram to represent the data relating to the monthly expenditure on different items by a
family is:
(a) Historigram (b) Histogram (c) Multiple bar diagram (d) Pie diagram

MCQ No 2.79
A graph of time series or historical series is called:
(a) Histogram (b) Historigram (c) Frequency curve (d) Frequency polygon

MCQ No 2.80
The historigram is the graphical presentation of data which are classified:
(a) Geographically (b) Numerically (c) Qualitatively (d) According to time

MCQ No 2.81
Historigram and histogram are:
(a) Always same (b) Not same (c) Off and on same (d) Randomly same

MCQ No 2.82
A distribution in which the observations are concentrated at one end of the distribution is called a:
(a) Symmetric distribution (b) Normal distribution
(c) Skewed distribution (d) Uniform distribution

MCQ No 2.83
For graphic presentation of a frequency distribution, the paper to be used is:
(a) Carbon paper (b) Ordinary paper (c) Graph paper (d) Butter paper

MCQ No 2.84
Histogram can be drawn only for:
(a) Discrete frequency distribution (b) Continuous frequency distribution
(c) Cumulative frequency distribution (d) Relative frequency distribution

MCQ No 2.85
Histogram is a graph of:
(a) Frequency distribution (b) Time series (c) Qualitative data (d) Ogive
MCQ No 2.86
Histogram and frequency polygon are two graphical representations of:
(a) Frequency distribution (b) Class boundaries (c) Class intervals (d) Class marks

MCQ No 2.87
Frequency polygon can be drawn with the help of:
(a) Historigram (b) Histogram (c) Circle (d) Percentage

MCQ No 2.88
In a cumulative frequency polygon, the cumulative frequency of each class is plotted against:
(a) Mid-point (b) Lower class boundary (c) Upper class boundary (d) Upper class limit

MCQ No 2.89
The graph of the cumulative frequency distribution is called:
(a) Histogram (b) Frequency polygon (c) Pictogram (d) Ogive

MCQ No 2.90
When successive mid-points in a histogram are connected by straight lines, the graph is called a:
(a) Historigram (b) Ogive (c) Frequency curve (d) Frequency polygon

MCQ No 2.91
A frequency polygon is a closed figure which is:
(a) One sided (b) Two sided (c) Three sided (d) Many sided

MCQ No 2.92
An ogive can be drawn for a distribution of the:
(a) Less than type (b) More than type (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ No 2.93
The word ogive is also used for:
(a) Frequency polygon (b) Cumulative frequency polygon
(c) Frequency curve (d) Histogram

MCQ No 2.94
Cumulative frequency polygon can be used for the calculation of:
(a) Mean (b) Median (c) Mode (d) Geometric mean

MCQs ON MEASURES OF CENTRAL TENDENCY

MCQ No 3.1
Any measure indicating the centre of a set of data, arranged in an increasing or decreasing order of
magnitude, is called a measure of:
(a) Skewness (b) Symmetry (c) Central tendency (d) Dispersion

MCQ No 3.2
Scores that differ greatly from the measures of central tendency are called:
(a) Raw scores (b) The best scores (c) Extreme scores (d) Z-scores

MCQ No 3.3
The measure of central tendency listed below is:
(a) The raw score (b) The mean (c) The range (d) Standard deviation

MCQ No 3.4
The total of all the observations divided by the number of observations is called:
(a) Arithmetic mean (b) Geometric mean (c) Median (d) Harmonic mean

MCQ No 3.5
While computing the arithmetic mean of a frequency distribution, each value in a class is
considered equal to the:
(a) Class mark (b) Lower limit (c) Upper limit (d) Lower class boundary

MCQ No 3.6
Change of origin and scale is used for calculation of the:
(a) Arithmetic mean (b) Geometric mean
(c) Weighted mean (d) Lower and upper quartiles

MCQ No 3.7
The sample mean is a:
(a) Parameter (b) Statistic (c) Variable (d) Constant

MCQ No 3.8
The population mean µ is called:
(a) Discrete variable (b) Continuous variable (c) Parameter (d) Sampling unit

MCQ No 3.9
The arithmetic mean is highly affected by:
(a) Moderate values (b) Extremely small values
(c) Odd values (d) Extremely large values

MCQ No 3.10
The sample mean is calculated by the formula:
MCQ No 3.11
If a constant value is added to every observation of data, then arithmetic mean is obtained
by:
(a) Subtracting the constant (b) Adding the constant
(c) Multiplying the constant (d) Dividing the constant

MCQ No 3.12
Which of the following statements is always true?
(a) The mean has an effect on extreme scores (b) The median has an effect on extreme scores
(c) Extreme scores have an effect on the mean (d) Extreme scores have an effect on the median

MCQ No 3.13
The elimination of extreme scores at the bottom of the set has the effect of:
(a) Lowering the mean (b) Raising the mean (c) No effect (d) None of the above

MCQ No 3.14
The elimination of extreme scores at the top of the set has the effect of:
(a) Lowering the mean (b) Raising the mean (c) No effect (d) Difficult to tell

MCQ No 3.15
The sum of deviations taken from mean is:
(a) Always equal to zero (b) Sometimes equal to zero
(c) Never equal to zero (d) Less than zero

MCQ No 3.16
If X̄ = 25, which of the following will be minimum:
(a) ∑(X – 27)² (b) ∑(X – 25)² (c) ∑(X – 22)² (d) ∑(X + 25)²

MCQ No 3.17
The sum of the squares of the deviations about the mean is:
(a) Zero (b) Maximum (c) Minimum (d) All of the above

MCQ No 3.18

(a) 10 (b) 50 (c) 60 (d) 100

MCQ No 3.19
For a certain distribution, if ∑(X – 20) = 25, ∑(X – 25) = 0, and ∑(X – 35) = –25, then X̄ is
equal to:
(a) 20 (b) 25 (c) -25 (d) 35

MCQ No 3.20
The sum of the squares of the deviations of the values of a variable is least when the deviations are
measured from:
(a) Harmonic mean (b) Geometric mean (c) Median (d) Arithmetic mean

MCQ No 3.21
If X1, X2, X3, ..., Xn are n observations having arithmetic mean X̄ and if Y = 4X ± 2, then Ȳ is
equal to:
(a) 4X (b) 4 (c) 4 ± 2 (d) 4 ± 2
MCQ No 3.22
If X̄ = 100 and Y = 2X – 200, then the mean of the Y values will be:
(a) 0 (b) 2 (c) 100 (d) 200
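
Several questions above (MCQ 3.15 to 3.22) rest on three properties of the arithmetic mean: deviations from the mean sum to zero, the sum of squared deviations is smallest when taken about the mean, and a linear change of origin and scale carries over to the mean. The sketch below checks these on a small made-up data set; the data and the helper name `sum_sq_dev` are assumptions for illustration.

```python
from statistics import mean

x = [21, 23, 25, 27, 29]   # hypothetical observations with mean 25
x_bar = mean(x)


def sum_sq_dev(values, origin):
    """Sum of squared deviations of the values about a chosen origin."""
    return sum((v - origin) ** 2 for v in values)


# Property 1: deviations taken from the mean add up to zero.
sum_dev = sum(v - x_bar for v in x)

# Property 2: squared deviations are least about the mean (compare origins).
squares = {origin: sum_sq_dev(x, origin) for origin in (22, 25, 27)}

# Property 3: if Y = 2X - 200, then the mean of Y is 2 * mean(X) - 200.
y = [2 * v - 200 for v in x]

print(sum_dev, squares, mean(y))
```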

MCQ No 3.23
Step deviation method or coding method is used for computation of the:
(a) Arithmetic mean (b) Geometric mean (c) Weighted mean (d) Harmonic mean

MCQ No 3.24
If the arithmetic mean of 20 values is 10, then sum of these 20 values is:
(a) 10 (b) 20 (c) 200 (d) 20 + 10

MCQ No 3.25
Ten families have an average of 2 boys. How many boys do they have together?
(a) 2 (b) 10 (c) 12 (d) 20

MCQ No 3.26
If the arithmetic mean of the two numbers X1 and X2 is 5 and X1 = 3, then X2 is:
(a) 3 (b) 5 (c) 7 (d) 10

MCQ No 3.27
Given X1=20 and X2= -20. The arithmetic mean will be:
(a) Zero (b) Infinity (c) Impossible (d) Difficult to tell

MCQ No 3.28
The mean of 10 observations is 10. All the observations are increased by 10%. The mean of increased
observations will be:
(a) 10 (b) 1.1 (c) 10.1 (d) 11

MCQ No 3.29
The frequency distribution of the hourly wage rate of 60 employees of a paper mill is as follows:
Wage rate (Rs.)     54----56   56----58   58----60   60----62   62----64
Number of workers   10         10         20         10         10
The mean wage rate is:
(a) Rs. 58.60 (b) Rs. 59.00 (c) Rs. 57.60 (d) Rs. 57.10
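
The mean of a grouped distribution such as the one in MCQ 3.29 is usually computed by treating every value in a class as equal to the class mark (see MCQ 3.5). A minimal sketch of that calculation, with `grouped_mean` as a hypothetical helper name:

```python
def grouped_mean(classes, frequencies):
    """Mean of grouped data: sum(f * midpoint) / sum(f)."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    weighted_total = sum(f * m for f, m in zip(frequencies, midpoints))
    return weighted_total / sum(frequencies)


# Wage classes and worker counts from MCQ 3.29.
wage_classes = [(54, 56), (56, 58), (58, 60), (60, 62), (62, 64)]
workers = [10, 10, 20, 10, 10]
print(grouped_mean(wage_classes, workers))
```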

MCQ No 3.30
The sample mean of first n natural numbers is:
(a) n(n+ 1) / 2 (b) (n+ 1) / 2 (c) n/2 (d) (n+ 1) / 2

MCQ No 3.31
The mean of first 2n natural numbers is:

MCQ No 3.32
The sum of deviations is zero when deviations are taken from:
(a) Mean (b) Median (c) Mode (d) Geometric mean
MCQ No 3.33
When the values in a series are not of equal importance, we calculate the:
(a) Arithmetic mean (b) Geometric mean (c) Weighted mean (d) Mode

MCQ No 3.34
When all the values in a series occur an equal number of times, then it is not possible to calculate the:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) Weighted mean

MCQ No 3.35
The mean for a set of data obtained by assigning each data value a weight that reflects its relative
importance within the set, is called:
(a) Geometric mean (b) Harmonic mean (c) Weighted mean (d) Combined mean

MCQ No 3.36
If X̄1, X̄2, X̄3, ..., X̄k are the arithmetic means of k distributions with respective frequencies n1, n2, n3, ...,
nk, then the mean of the whole distribution X̄c is given by:
(a) ∑X̄ / ∑n (b) ∑n / ∑X̄ (c) ∑nX̄ / ∑n (d) ∑(n + X̄) / ∑n

MCQ No 3.37
The combined arithmetic mean is calculated by the formula:

MCQ No 3.38
The arithmetic mean of 10 items is 4 and the arithmetic mean of 5 items is 10. The combined arithmetic
mean is:
(a) 4 (b) 5 (c) 6 (d) 90
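
The combined arithmetic mean in MCQ 3.38 weights each group mean by its group size. A short sketch of that calculation; the function name `combined_mean` is illustrative.

```python
def combined_mean(sizes, means):
    """Combined mean of several groups: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


# 10 items with mean 4 and 5 items with mean 10 (figures from MCQ 3.38).
print(combined_mean([10, 5], [4, 10]))
```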

MCQ No 3.39
The midpoint of the values after they have been ordered from the smallest to the largest or the largest
to the smallest is called:
(a) Mean (b) Median (c) Lower quartile (d) Upper quartile

MCQ No 3.40
The first step in calculating the median of a discrete variable is to determine the:
(a) Cumulative frequencies (b) Relative weights
(c) Relative frequencies (d) Array

MCQ No 3.41
The suitable average for qualitative data is:
(a) Mean (b) Median (c) Mode (d) Geometric mean

MCQ No 3.42
Extreme scores will have the following effect on the median of an examination:
(a) They may have no effect on it (b) They may tend to raise it
(c) They may tend to lower it (d) None of the above

MCQ No 3.43
We must arrange the data before calculating:
(a) Mean (b) Median (c) Mode (d) Geometric mean
MCQ No 3.44
If the smallest observation in a data is decreased, the average which is not affected is:
(a) Mode (b) Median (c) Mean (d) Harmonic mean

MCQ No 3.45
If the data contains an extreme value, the suitable average is:
(a) Mean (b) Median (c) Weighted mean (d) Geometric mean

MCQ No 3.46
Sum of absolute deviations of the values is least when deviations are taken from:
(a) Mean (b) Mode (c) Median (d) Q3

MCQ No 3.47
The frequency distribution of the hourly wage rate of 100 employees of a paper mill is as follows:
Wage rate (Rs.)     54----56   56----58   58----60   60----62   62----64
Number of workers   20         20         20         20         20
The median wage rate is:
(a) Rs.55 (b) Rs.57 (c) Rs.56 (d) Rs.59
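
For grouped data like MCQ 3.47, the median is commonly located by linear interpolation inside the class that contains the (n/2)th observation: lower boundary + (n/2 - cumulative frequency before the class) / class frequency * class width. The sketch below assumes that textbook formula; `grouped_median` is a hypothetical helper name.

```python
def grouped_median(classes, frequencies):
    """Median of grouped data by interpolation within the median class."""
    n = sum(frequencies)
    cumulative = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, frequencies):
        if f > 0 and cumulative + f >= n / 2:
            # Interpolate within the class that holds the (n/2)th observation.
            return lower + (n / 2 - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("frequencies must contain at least one positive count")


# Wage classes and worker counts from MCQ 3.47.
wage_classes = [(54, 56), (56, 58), (58, 60), (60, 62), (62, 64)]
workers = [20, 20, 20, 20, 20]
print(grouped_median(wage_classes, workers))
```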

MCQ No 3.48
The values of the variate that divide a set of data into four equal parts after arranging the observations in
ascending order of magnitude are called:
(a) Quartiles (b) Deciles (c) Percentiles (d) Difficult to tell

MCQ No 3.49
The lower and upper quartiles of a symmetrical distribution are 40 and 60 respectively. The value of
median is:
(a) 40 (b) 50 (c) 60 (d) (60 – 40) / 2

MCQ No 3.50
If in a discrete series 75% values are less than 30, then:
(a) Q3 < 75 (b) Q3 < 30 (c) Q3 = 30 (d) Q3 > 30

MCQ No 3.51
If in a discrete series 75% values are greater than 50, then:
(a) Q1 = 50 (b) Q1 < 50 (c) Q1 > 50 (d) Q1 ≠ 50

MCQ No 3.52
If in a discrete series 25% values are greater than 75, then:
(a) Q1 > 75 (b) Q1 = 75 (c) Q3 = 75 (d) Q3 > 75

MCQ No 3.53
If in a discrete series 40% values are less than 40, then :
(a) D4 ≠ 40 (b) D4 < 40 (c) D4 > 40 (d) D4 = 40

MCQ No 3.54
If in a discrete series 15% values are greater than 40, then:
(a) P15 = 70 (b) P85 = 15 (c) P85 = 70 (d) P70 = 70

MCQ No 3.55
The middle value of an ordered series is called:
(a) Median (b) 5th decile (c) 50th percentile (d) All of the above

MCQ No 3.56
If in a discrete series 50% values are less than 50, then:
(a) Q2 = 50 (b) D5 = 50 (c) P50 = 50 (d) All of the above

MCQ No 3.57
The mode or modal value of the distribution is that value of the variate for which frequency is:
(a) Minimum (b) Maximum (c) Odd number (d) Even number

MCQ No 3.58
Suitable average for averaging the shoe sizes for children is:
(a) Mean (b) Mode (c) Median (d) Geometric mean

MCQ No 3.59
Extreme scores on an examination have the following effect on the mode:
(a) They tend to raise it (b) they tend to lower it
(c) They have no effect on it (d) difficult to tell

MCQ No 3.60
A measurement that corresponds to largest frequency in a set of data is called:
(a) Mean (b) Median (c) Mode (d) Percentile

MCQ No 3.61
Which of the following average cannot be calculated for the observations 2, 2, 4, 4, 6, 6, 8, 8, 10, 10 ?
(a) Mean (b) Median (c) Mode (d) All of the above

MCQ No 3.62
Mode of the series 0, 0, 0, 2, 2, 3, 3, 8, 10 is:
(a) 0 (b) 2 (c) 3 (d) No mode

MCQ No 3.63
A distribution with two modes is called:
(a) Unimodal (b) Bimodal (c) Multimodal (d) Normal

MCQ No 3.64
The modal letter of the word “STATISTICS” is:
(a) S (b) T (c) Both S and I (d) Both S and T

MCQ No 3.65
The mode for the following frequency distribution is:
Weekly sales of burner units 0 1 2 3 Over 3
Number of weeks 38 6 5 1 0
(a) 0 (b) 2 (c) 3 (d) No mode

MCQ No 3.66
Which of the following statements is always correct?
(a) Mean = Median = Mode (b) Arithmetic mean = Geometric mean = Harmonic mean
(c) Median = Q2 = D5 = P50 (d) Mode = 2Median - 3Mean

MCQ No 3.67
In a moderately asymmetrical series, the arithmetic mean, median and mode are related as:
(a) Mean - Mode = 3(Mean - Median) (b) Mean - Median = 2(Median - Mode)
(c) Median - Mode = (Mean - Median) / 2 (d) Mode – Median = 2Mean – 2Median

MCQ No 3.68
In a moderately skewed distribution, mean is equal to:
(a) (3Median - Mode) / 2 (b) (2Mean + Mode) / 3
(c) 3Median – 2Mean (d) 3Median - Mode

MCQ No 3.69
In a moderately asymmetrical distribution, the value of median is given by:
(a) 3Median + 2Mean (b) 2Mean + Mode
(c) (2Mean + Mode) / 3 (d) (3Median - Mode) / 2

MCQ No 3.70
For moderately skewed distribution, the value of mode is calculated as:
(a) 2Mean – 3Median (b) 3Median – 2Mean
(c) 2Mean + Mode (d) 3Median - Mode

MCQ No 3.71
In a moderately skewed distribution, Mean = 45 and Median = 30, then the value of mode is:
(a) 0 (b) 30 (c) 45 (d) 180

MCQ No 3.72
If for any frequency distribution, the median is 10 and the mode is 30, then approximate value of mean is
equal to:
(a) 0 (b) 10 (c) 30 (d) 60

MCQ No 3.73
In a moderately asymmetrical distribution, the value of mean and mode is 15 and 18 respectively. The value of
median will be:
(a) 48 (b) 18 (c) 16 (d) 15
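
MCQ Nos 3.71 to 3.73 all rest on the empirical relation for moderately skewed distributions, Mode = 3 Median − 2 Mean. The sketch below rearranges that one relation three ways; the function names are illustrative only.

```python
def mode_from(mean, median):
    """Empirical relation: Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def mean_from(median, mode):
    """Same relation solved for the mean."""
    return (3 * median - mode) / 2


def median_from(mean, mode):
    """Same relation solved for the median."""
    return (2 * mean + mode) / 3


mode_371 = mode_from(mean=45, median=30)    # 0
mean_372 = mean_from(median=10, mode=30)    # 0.0
median_373 = median_from(mean=15, mode=18)  # 16.0
print(mode_371, mean_372, median_373)
```

The relation is only approximate, so these values are estimates rather than exact results for any particular data set.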

MCQ No 3.74

(a) 2 (b) 3 (c) 1/2 (d) 1/3

MCQ No 3.75
Which of the following is correct in a positively skewed distribution?
(a) Mean = Median = Mode (b) Mean < Median < Mode
(c) Mean > Median > Mode (d) Mean + Median + Mode

MCQ No 3.76
If the values of mean, median and mode coincide in a unimodal distribution, then the distribution will
be:
(a) Skewed to the left (b) Skewed to the right (c) Multimodal (d) Symmetrical

MCQ No 3.77
A curve that tails off to the right end is called:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Both (b) and (c)

MCQ No 3.78
The sum of the deviations taken from mean is:
(a) Always equal to zero (b) Sometimes equal to zero
(c) Never equal to zero (d) Less than zero

MCQ No 3.79
If a set of data has one mode and its value is less than mean, then the distribution is called:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Normal

MCQ No 3.80
Taking the relevant root of the product of all non-zero and positive values is called:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) Combined mean

MCQ No 3.81
The best average in percentage rates and ratios is:
(a) Arithmetic mean (b) Lower and upper quartiles
(c) Geometric mean (d) Harmonic mean

MCQ No 3.82
The suitable average for computing average percentage increase in population is:
(a) Geometric mean (b) Harmonic mean (c) Combined mean (d) Population mean

MCQ No 3.83
If 10% is added to each value of a variable, the geometric mean of the new variable is increased by:
(a) 10 (b) 1/100 (c) 10% (d) 1.1

MCQ No 3.84
If each observation of a variable X is increased by 20%, then geometric mean is also increased by:
(a) 20 (b) 1/20 (c) 20% (d) 100%

MCQ No 3.85
If any value in a series is negative, then we cannot calculate the:
(a) Mean (b) Median (c) Geometric mean (d) Harmonic mean

MCQ No 3.86
Geometric mean for X1 and X2 is:

MCQ No 3.87
Geometric mean of 2, 4, 8 is:
(a) 6 (b) 4 (c) 14/3 (d) 8
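
A small sketch for MCQ No 3.87, assuming the definition of the geometric mean as the n-th root of the product of n positive values. It is computed through logarithms to avoid overflow on long series. The helper `geometric_mean` is written here for illustration and deliberately rejects zero and negative values.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n strictly positive values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty set of positive values")
    # exp(mean of logs) equals the n-th root of the product.
    return math.exp(sum(math.log(v) for v in values) / len(values))


gm = geometric_mean([2, 4, 8])
print(round(gm, 6))  # 4.0
```

Here the product is 2 × 4 × 8 = 64 and its cube root is 4.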

MCQ No 3.88
Geometric mean is suitable when the values are given as:
(a) Proportions (b) Ratios (c) Percentage rates (d) All of the above

MCQ No 3.89
If the geometric mean of two numbers X1 and X2 is 9 and X1 = 3, then X2 is equal to:
(a) 3 (b) 9 (c) 27 (d) 81

MCQ No 3.90
If the two observations are a = 2 and b = -2, then their geometric mean will be:
(a) Zero (b) Infinity (c) Impossible (d) Negative

MCQ No 3.91
Geometric mean of -4, -2 and 8 is:
(a) 4 (b) 0 (c) -2 (d) Impossible

MCQ No 3.92
The ratio of the number of items to the sum of the reciprocals of the items is called:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) Mode

MCQ No 3.93
Harmonic mean for X1 and X2 is:

MCQ No 3.94
The appropriate average for calculating the average speed of a journey is:
(a) Median (b) Arithmetic mean (c) Mode (d) Harmonic mean

MCQ No 3.95
Harmonic mean gives less weightage to:
(a) Small values (b) Large values (c) Positive values (d) Negative values

MCQ No 3.96
The harmonic mean of the values 5, 9, 11, 0, 17, 13 is:
(a) 9.5 (b) 6.2 (c) 0 (d) Impossible

MCQ No 3.97
If the harmonic mean of two numbers X1 and X2 is 6.4 and X2 = 16, then X1 is:
(a) 4 (b) 10 (c) 16 (d) 20
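
MCQ No 3.97 can be checked by solving the two-value harmonic mean H = 2·X1·X2 / (X1 + X2) for the unknown value. The sketch below assumes nonzero values; `harmonic_mean` and `other_value` are hypothetical helper names.

```python
def harmonic_mean(values):
    """Number of items divided by the sum of their reciprocals."""
    values = list(values)
    return len(values) / sum(1 / v for v in values)


def other_value(hm, known):
    """Solve hm = 2 * x * known / (x + known) for x."""
    return hm * known / (2 * known - hm)


x1 = other_value(hm=6.4, known=16)
check = harmonic_mean([x1, 16])
print(round(x1, 6), round(check, 6))  # 4.0 6.4
```

A zero in the data makes a reciprocal undefined, so in this sketch `harmonic_mean` raises `ZeroDivisionError` for such a series.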

MCQ No 3.98
If a = 5 and b = -5, then their harmonic mean is:
(a) -5 (b) 5 (c) 0 (d) ∞

MCQ No 3.99
For an open-end frequency distribution, it is not possible to find:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) All of the above

MCQ No 3.100
If all the items in a variable are non zero and non negative then:
(a) A.M > G.M > H.M (b) G.M > A.M > H.M (c) H.M > G.M > A.M (d) A.M < G.M < H.M

MCQ No 3.101
The geometric mean of a set of positive numbers X1, X2, X3, ... , Xn is less than or equal to their
arithmetic mean but is greater than or equal to their:
(a) Harmonic mean (b) Median (c) Mode (d) Lower and upper quartiles

MCQ No 3.102
Geometric mean and harmonic mean for the values 3, -11, 0, 63, -14, 100 are:
(a) 0 and 3 (b) 3 and -3 (c) 0 and 0 (d) Impossible

MCQ No 3.103
If the arithmetic mean and harmonic mean of two positive numbers are 4 and 16, then their
geometric mean will be:
(a) 4 (b) 8 (c) 16 (d) 64

MCQ No 3.104
The arithmetic mean and geometric mean of two observations are 4 and 8 respectively, then harmonic
mean of these two observations is:
(a) 4 (b) 8 (c) 16 (d) 32

MCQ No 3.105
The geometric mean and harmonic mean of two values are 8 and 16 respectively, then arithmetic
mean of values is:
(a) 4 (b) 16 (c) 24 (d) 128
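
MCQ Nos 3.103 to 3.105 use the two-value identity G.M² = A.M × H.M. The sketch below first confirms the identity on a hypothetical pair (4 and 16, chosen only for illustration) and then rearranges it as the three questions require; the helper names are assumptions.

```python
import math

# Hypothetical pair used only to confirm the identity.
a, b = 4, 16
am = (a + b) / 2          # arithmetic mean
gm = math.sqrt(a * b)     # geometric mean
hm = 2 * a * b / (a + b)  # harmonic mean
identity_holds = math.isclose(gm ** 2, am * hm)


def gm_from(am_value, hm_value):
    return math.sqrt(am_value * hm_value)


def hm_from(am_value, gm_value):
    return gm_value ** 2 / am_value


def am_from(gm_value, hm_value):
    return gm_value ** 2 / hm_value


print(identity_holds)   # True
print(gm_from(4, 16))   # 8.0  (MCQ No 3.103)
print(hm_from(4, 8))    # 16.0 (MCQ No 3.104)
print(am_from(8, 16))   # 4.0  (MCQ No 3.105)
```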

MCQ No 3.106
Which pair of averages cannot be calculated when one of the numbers in the series is zero?
(a) Geometric mean and Median (b) Harmonic mean and Mode
(c) Simple mean and Weighted mean (d) Geometric mean and Harmonic mean

MCQ No 3.107
In a given data the average which has the least value is:
(a) Mean (b) Median (c) Harmonic mean (d) Geometric mean

MCQ No 3.108
If all the values in a series are same, then:
(a) A.M = G.M = H.M (b) A.M ≠ G.M ≠ H.M (c) A.M > G.M > H.M (d) A.M < G.M < H.M

MCQ No 3.109
The averages are affected by change of:
(a) Origin (b) Scale (c) Both (a) and (b) (d) None of the above

MCQ’s of Measures of Dispersion

MCQ No 4.1
The scatter in a series of values about the average is called:
(a) Central tendency (b) Dispersion (c) Skewness (d) Symmetry

MCQ No 4.2
The measurements of spread or scatter of the individual values around the central point is called:
(a) Measures of dispersion (b) Measures of central tendency
(c) Measures of skewness (d) Measures of kurtosis

MCQ No 4.3
The measures used to calculate the variation present among the observations in the unit of the variable are
called:
(a) Relative measures of dispersion (b) Coefficient of skewness
(c) Absolute measures of dispersion (d) Coefficient of variation

MCQ No 4.4
The measures used to calculate the variation present among the observations relative to their average are
called:
(a) Coefficient of kurtosis (b) Absolute measures of dispersion
(c) Quartile deviation (d) Relative measures of dispersion

MCQ No 4.5
The degree to which numerical data tend to spread about an average value is called:
(a) Constant (b) Flatness (c) Variation (d) Skewness

MCQ No 4.6
The measures of dispersion can never be:
(a) Positive (b) Zero (c) Negative (d) Equal to 2

MCQ No 4.7
If all the scores on an examination cluster around the mean, the dispersion is said to be:
(a) Small (b) Large (c) Normal (d) Symmetrical

MCQ No 4.8
If there are many extreme scores on an examination, the dispersion is:
(a) Large (b) Small (c) Normal (d) Symmetric

MCQ No 4.9
Given below are four sets of observations. Which set has the minimum variation?
(a) 46, 48, 50, 52, 54 (b) 30, 40, 50, 60, 70 (c) 40, 50, 60, 70, 80 (d) 48, 49, 50, 51, 52

MCQ No 4.10
Which of the following is an absolute measure of dispersion?
(a) Coefficient of variation (b) Coefficient of dispersion
(c) Standard deviation (d) Coefficient of skewness

MCQ No 4.11
The measure of dispersion which uses only two observations is called:
(a) Mean (b) Median (c) Range (d) Coefficient of variation

MCQ No 4.12
The measure of dispersion which uses only two observations is called:
(a) Range (b) Quartile deviation (c) Mean deviation (d) Standard deviation

MCQ No 4.13
In quality control of manufactured items, the most common measure of dispersion is:
(a) Range (b) Average deviation (c) Standard deviation (d) Quartile deviation

MCQ No 4.14
The range of the scores 29, 3, 143, 27, 99 is:
(a) 140 (b) 143 (c) 146 (d) 70

MCQ No 4.15
If the observations of a variable X are -4, -20, -30, -44 and -36, then the value of the range will be:
(a) -48 (b) 40 (c) -40 (d) 48

MCQ No 4.16
The range of the values -5, -8, -10, 0, 6, 10 is:
(a) 0 (b) 10 (c) -10 (d) 20

MCQ No 4.17
If Y = aX ± b, where a and b are any two numbers and a ≠ 0, then the range of Y values will be:
(a) Range(X) (b) a range(X) + b (c) a range(X) – b (d) |a| range(X)
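
The range questions above (MCQ Nos 4.14 to 4.17) can be checked with a few lines. The sketch takes the range as largest value minus smallest value; the constants `a` and `b` in the transformation Y = aX + b are hypothetical and chosen only to show the effect of a negative multiplier.

```python
def data_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


scores = [29, 3, 143, 27, 99]          # MCQ No 4.14
x = [-4, -20, -30, -44, -36]           # MCQ No 4.15

a, b = -3, 7                           # hypothetical constants
y = [a * value + b for value in x]     # Y = aX + b

print(data_range(scores))  # 140
print(data_range(x))       # 40
print(data_range(y))       # 120, i.e. |a| * range(X)
```

Adding b shifts every value equally and leaves the range unchanged, while multiplying by a stretches it by |a|.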

MCQ No 4.18
If the maximum value in a series is 25 and its range is 15, the minimum value of the series is:
(a) 10 (b) 15 (c) 25 (d) 35

MCQ No 4.19
Half of the difference between upper and lower quartiles is called:
(a) Interquartile range (b) Quartile deviation (c) Mean deviation (d) Standard deviation

MCQ No 4.20
If Q3=20 and Q1=10, the coefficient of quartile deviation is:
(a) 3 (b) 1/3 (c) 2/3 (d) 1

MCQ No 4.21
Which measure of dispersion can be computed in case of open-end classes?
(a) Standard deviation (b) Range (c) Quartile deviation (d) Coefficient of variation

MCQ No 4.22
If Y = aX ± b, where a and b are any two constants and a ≠ 0, then the quartile deviation of Y values is
equal to:
(a) a Q.D(X) + b (b) |a| Q.D(X) (c) Q.D(X) – b (d) |b| Q.D(X)

MCQ No 4.23
The sum of absolute deviations is minimum if these deviations are taken from the:
(a) Mean (b) Mode (c) Median (d) Upper quartile

MCQ No 4.24
The mean deviation is minimum when deviations are taken from:
(a) Mean (b) Mode (c) Median (d) Zero

MCQ No 4.25
If Y = aX ± b, where a and b are any two numbers but a ≠ 0, then M.D(Y) is equal to:
(a) M.D(X) (b) M.D(X) ± b (c) |a| M.D(X) (d) M.D(Y) + M.D(X)

MCQ No 4.26
The mean deviation of the scores 12, 15, 18 is:
(a) 6 (b) 0 (c) 3 (d) 2
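
A minimal sketch for MCQ No 4.26, assuming the mean deviation is the average absolute deviation from the arithmetic mean. The function `mean_deviation` and its optional `center` argument are illustrative.

```python
def mean_deviation(values, center=None):
    """Average absolute deviation from `center` (the mean by default)."""
    values = list(values)
    if center is None:
        center = sum(values) / len(values)
    return sum(abs(v - center) for v in values) / len(values)


md = mean_deviation([12, 15, 18])
print(md)  # 2.0
```

The mean is 15, the absolute deviations are 3, 0 and 3, and their average is 6 / 3 = 2.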

MCQ No 4.27
Mean deviation computed from a set of data is always:
(a) Negative (b) Equal to standard deviation
(c) More than standard deviation (d) Less than standard deviation

MCQ No 4.28
The average of squared deviations from mean is called:
(a) Mean deviation (b) Variance (c) Standard deviation (d) Coefficient of variation

MCQ No 4.29
The sum of squares of the deviations is minimum, when deviations are taken from:
(a) Mean (b) Mode (c) Median (d) Zero

MCQ No 4.30
Which of the following measures of dispersion is expressed in the same units as the units of observation?
(a) Variance (b) Standard deviation
(c) Coefficient of variation (d) Coefficient of standard deviation

MCQ No 4.31
Which measure of dispersion has a different unit other than the unit of measurement of values:
(a) Range (b) Standard deviation (c) Variance (d) Mean deviation

MCQ No 4.32
Which of the following is a unit free quantity:
(a) Range (b) Standard deviation (c) Coefficient of variation (d) Arithmetic mean

MCQ No 4.33
If the dispersion is small, the standard deviation is:
(a) Large (b) Zero (c) Small (d) Negative

MCQ No 4.34
The value of standard deviation changes by a change of:
(a) Origin (b) Scale (c) Algebraic signs (d) None

MCQ No 4.35
The standard deviation of a distribution divided by the mean of the distribution and expressed as a
percentage is called:
(a) Coefficient of Standard deviation (b) Coefficient of skewness
(c) Coefficient of quartile deviation (d) Coefficient of variation

MCQ No 4.36
The positive square root of the mean of the squares of the deviations of observations from their mean is
called:
(a) Variance (b) Range (c) Standard deviation (d) Coefficient of variation

MCQ No 4.37
The variance is zero only if all observations are the:
(a) Different (b) Square (c) Square root (d) Same

MCQ No 4.38
The standard deviation is independent of:
(a) Change of origin (b) Change of scale of measurement
(c) Change of origin and scale of measurement (d) Difficult to tell

MCQ No 4.39
If there are ten values each equal to 10, then standard deviation of these values is:
(a) 100 (b) 20 (c) 10 (d) 0

MCQ No 4.40
If X and Y are independent random variables, then S.D(X ± Y) is equal to:
(a) S.D(X) ± S.D(Y) (b) Var(X) ± Var(Y) (c) (d)

MCQ No 4.41
S.D(X) = 6 and S.D(Y) = 8. If X and Y are independent random variables, then S.D(X-Y) is:
(a) 2 (b) 10 (c) 14 (d) 100

MCQ No 4.42
For two independent variables X and Y if S.D(X) = 1 and S.D(Y) = 3, then Var(3X - Y) is equal to:
(a) 0 (b) 6 (c) 18 (d) 12
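
MCQ Nos 4.41 and 4.42 both follow from the rule for independent variables, Var(aX + bY) = a² Var(X) + b² Var(Y). The sketch below applies it; the function name is an assumption made for illustration.

```python
import math


def var_linear_combination(a, var_x, b, var_y):
    """Variance of aX + bY when X and Y are independent."""
    return a ** 2 * var_x + b ** 2 * var_y


# MCQ No 4.41: S.D(X) = 6, S.D(Y) = 8, find S.D(X - Y).
sd_x_minus_y = math.sqrt(var_linear_combination(1, 6 ** 2, -1, 8 ** 2))

# MCQ No 4.42: S.D(X) = 1, S.D(Y) = 3, find Var(3X - Y).
var_3x_minus_y = var_linear_combination(3, 1 ** 2, -1, 3 ** 2)

print(sd_x_minus_y)    # 10.0
print(var_3x_minus_y)  # 18
```

Because the coefficients are squared, variances add for a difference as well as for a sum.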

MCQ No 4.43
If Y = aX ± b, where a and b are any two constants and a ≠ 0, then Var(Y) is equal to:
(a) a Var(X) (b) a Var(X) + b (c) a² Var(X) – b (d) a² Var(X)

MCQ No 4.44
If Y = aX + b, where a and b are any two numbers but a ≠ 0, then S.D(Y) is equal to:
(a) S.D(X) (b) a S.D(X) (c) |a| S.D(X) (d) a S.D(X) + b

MCQ No 4.45
The ratio of the standard deviation to the arithmetic mean expressed as a percentage is called:
(a) Coefficient of standard deviation (b) Coefficient of skewness
(c) Coefficient of kurtosis (d) Coefficient of variation

MCQ No 4.46
Which of the following statements is correct?
(a) The standard deviation of a constant is equal to unity
(b) The sum of absolute deviations is minimum if these deviations are taken from the mean.
(c) The second moment about origin equals variance
(d) The variance is a positive quantity and is expressed in the square of the units of the observations

MCQ No 4.47
Which of the following statements is false?
(a) The standard deviation is independent of change of origin
(b) If the moment coefficient of kurtosis β2 = 3, the distribution is mesokurtic or normal.
(c) A frequency curve that has the same shape on both sides of the centre line, which divides the curve into
two equal parts, is called a symmetrical distribution.
(d) Variance of the sum or difference of any two variables is equal to the sum of their
respective variances

MCQ No 4.48
If Var(X) = 25, then is equal to:
(a) 15/2 (b) 50 (c) 25 (d) 5

MCQ No 4.49
To compare the variation of two or more than two series, we use:
(a) Combined standard deviation (b) Corrected standard deviation
(c) Coefficient of variation (d) Coefficient of skewness

MCQ No 4.50
The standard deviation of -5, -5, -5, -5, 5 is:
(a) -5 (b) +5 (c) 0 (d) -25

MCQ No 4.51
Standard deviation is always calculated from:
(a) Mean (b) Median (c) Mode (d) Lower quartile

MCQ No 4.52
The mean of an examination is 69, the median is 68, the mode is 67, and the standard deviation is 3.
The measure of variation for this examination is:
(a) 67 (b) 68 (c) 69 (d) 3

MCQ No 4.53
The variance of 19, 21, 23, 25 and 27 is 8. The variance of 14, 16, 18, 20 and 22 is:
(a) Greater than 8 (b) 8 (c) Less than 8 (d) 8 - 5 = 3

MCQ No 4.54
In a set of observations the variance is 50. All the observations are increased by 100%. The variance of
the increased observations will become:
(a) 50 (b) 200 (c) 100 (d) No change

MCQ No 4.55
Three factories A, B, C have 100, 200 and 300 workers respectively. The mean of the wages is the same
in the three factories. Which of the following statements is true?
(a) There is greater variation in factory C.
(b) Standard deviation in factory A is the smallest.
(c) Standard deviation in all the three factories are equal
(d) None of the above

MCQ No 4.56
An automobile manufacturer obtains data concerning the sales of six of its dealers in the last week of
1996. The results indicate the standard deviation of their sales equals 6 autos. If this is so, the variance of
their sales equals:
(a) (b) 6 (c) (d) 36

MCQ No 4.57
If standard deviation of the values 2, 4, 6, 8 is 2.236, then standard deviation of the values 4, 8,12, 16 is:
(a) 0 (b) 4.472 (c) 4.236 (d) 2.236
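
The effect of a change of origin (MCQ No 4.53) and a change of scale (MCQ No 4.57) can be checked numerically. This sketch uses `pvariance` and `pstdev` from Python's standard `statistics` module, which divide by n, matching the figures quoted in the questions.

```python
from statistics import pstdev, pvariance

# Change of origin: subtract 5 from every value (MCQ No 4.53).
base = [19, 21, 23, 25, 27]
shifted = [v - 5 for v in base]  # 14, 16, 18, 20, 22
var_base = pvariance(base)
var_shifted = pvariance(shifted)

# Change of scale: double every value (MCQ No 4.57).
values = [2, 4, 6, 8]
doubled = [2 * v for v in values]  # 4, 8, 12, 16
sd_values = pstdev(values)
sd_doubled = pstdev(doubled)

print(var_base, var_shifted)                     # 8 8
print(round(sd_values, 3), round(sd_doubled, 3)) # 2.236 4.472
```

Shifting leaves the variance unchanged, while multiplying by 2 doubles the standard deviation and therefore quadruples the variance.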

MCQ No 4.58
Var(X) = 4 and Var(Y) =9. If X and Y are independent random variable then Var(2X + Y) is:
(a) 13 (b) 17 (c) 25 (d) -1

MCQ No 4.59
If X̄ = Rs.20, S = Rs.10, then coefficient of variation is:
(a) 45% (b) 50% (c) 60% (d) 65%

MCQ No 4.60
Which of the following measures of dispersion is independent of the units employed?
(a) Coefficient of variation (b) Quartile deviation
(c) Standard deviation (d) Range

MCQ No 4.61
In Sheppard’s correction µ2 is equal to:

MCQ No 4.62
The moments about mean are called:
(a) Raw moments (b) Central moments (c) Moments about origin (d) All of the above

MCQ No 4.63
The moments about origin are called:
(a) Moments about zero (b) Raw moments (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ No 4.64
All odd order moments about mean in a symmetrical distribution are:
(a) Positive (b) Negative (c) Zero (d) Three

MCQ No 4.65
The second moment about arithmetic mean is 16, the standard deviation will be:
(a) 16 (b) 4 (c) 2 (d) 0

MCQ No 4.66
The first and second moments about an arbitrary constant are -2 and 13 respectively. The standard deviation will
be:
(a) -2 (b) 3 (c) 9 (d) 13

MCQ No 4.67
Moment ratios β1 and β2 are:
(a) Independent of origin and scale of measurement
(b) Expressed in original unit of the data
(c) Unitless quantities
(d) Both (a) and (c)

MCQ No 4.68
The first moment about X = 0 of a distribution is 12.08. The mean is:
(a) 10.80 (b) 10.08 (c) 12.08 (d) 12.88

MCQ No 4.69
First two moments about the value 2 of a variable are 1 and 16. The variance will be:
(a) 13 (b) 15 (c) 16 (d) Difficult to tell
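
MCQ Nos 4.66 and 4.69 convert moments about an arbitrary value into the variance using µ2 = µ2′ − (µ1′)², where µ1′ and µ2′ are the first and second moments about that value. A minimal sketch, with an illustrative function name:

```python
import math


def variance_from_raw_moments(m1, m2):
    """Second central moment from the first two moments about any constant."""
    return m2 - m1 ** 2


# MCQ No 4.66: moments about an arbitrary constant are -2 and 13.
sd_466 = math.sqrt(variance_from_raw_moments(-2, 13))

# MCQ No 4.69: moments about the value 2 are 1 and 16.
var_469 = variance_from_raw_moments(1, 16)

print(sd_466)   # 3.0
print(var_469)  # 15
```

The arbitrary value itself (here 2 in MCQ No 4.69) does not enter the formula, because the variance is independent of origin.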

MCQ No 4.70
The first three moments of a distribution about the mean are 1, 4 and 0. The distribution is:
(a) Symmetrical (b) Skewed to the left (c) Skewed to the right (d) Normal

MCQ No 4.71
If the third central moment is negative, the distribution will be:
(a) Symmetrical (b) Positively skewed (c) Negatively skewed (d) Normal

MCQ No 4.72
If the third moment about mean is zero, then the distribution is:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Mesokurtic

MCQ No 4.73
Departure from symmetry is called:
(a) Second moment (b) Kurtosis (c) Skewness (d) Variation

MCQ No 4.74
In a symmetrical distribution, the coefficient of skewness will be:
(a) 0 (b) Q1 (c) Q3 (d) 1

MCQ No 4.75
The lack of uniformity or symmetry is called:
(a) Skewness (b) Dispersion (c) Kurtosis (d) Standard deviation

MCQ No 4.76
For a positively skewed distribution, mean is always:
(a) Less than the median (b) Less than the mode
(c) Greater than the mode (d) Difficult to tell

MCQ No 4.77
For a symmetrical distribution:
(a) β1 > 0 (b) β1 < 0 (c) β1 = 0 (d) β1 = 3

MCQ No 4.78
If mean=50, mode=40 and standard deviation=5, the distribution is:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Difficult to tell

MCQ No 4.79
If mean=25, median=30 and standard deviation=15, the distribution will be:
(a) Symmetrical (b) Positively skewed (c) Negatively skewed (d) Normal

MCQ No 4.80
If mean=20, median=16 and standard deviation=2, then coefficient of skewness is:
(a) 1 (b) 2 (c) 4 (d) -2

MCQ No 4.81
If mean=10, median=8 and standard deviation=6, then coefficient of skewness is:
(a) 1 (b) -1 (c) 2/6 (d) 2

MCQ No 4.82
If the sum of deviations from median is not zero, then a distribution will be:
(a) Symmetrical (b) Skewed (c) Normal (d) All of the above

MCQ No 4.83
In case of positively skewed distribution, the extreme values lie in the:
(a) Middle (b) Left tail (c) Right tail (d) Anywhere

MCQ No 4.84
Bowley's coefficient of skewness lies between:
(a) 0 and 1 (b) -1 and +1 (c) -1 and 0 (d) -2 and +2

MCQ No 4.85
In a symmetrical distribution, Q3 – Q1 = 20, median = 15. Q3 is equal to:
(a) 5 (b) 15 (c) 20 (d) 25

MCQ No 4.86
Which of the following is correct in a negatively skewed distribution?
(a) The arithmetic mean is greater than the mode
(b) The arithmetic mean is greater than the median
(c) (Q3 – Median) = (Median – Q1)
(d) (Q3 – Median) < (Median – Q1)

MCQ No 4.87
The lower and upper quartiles of a distribution are 80 and 120 respectively, while median is 100. The
shape of the distribution is:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Normal
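
The quartile-based questions on skewness can be checked with Bowley's coefficient, Sk = (Q3 + Q1 − 2 Median) / (Q3 − Q1). The sketch below applies it to MCQ No 4.87; the second set of quartiles is hypothetical and included only to show a nonzero result.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's quartile coefficient of skewness."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


# MCQ No 4.87: Q1 = 80, median = 100, Q3 = 120.
sk_487 = bowley_skewness(80, 100, 120)

# Hypothetical quartiles with a longer upper half.
sk_example = bowley_skewness(20, 25, 40)

print(sk_487)      # 0.0
print(sk_example)  # 0.5
```

A value of zero means the median sits midway between the quartiles, and a positive value means Q3 lies farther from the median than Q1 does.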

MCQ No 4.88
In a symmetrical distribution Q1 = 20 and median= 30. The value of Q3 is:
(a) 50 (b) 35 (c) 40 (d) 25

MCQ No 4.89
The degree of peakedness or flatness of a unimodal distribution is called:
(a) Skewness (b) Symmetry (c) Dispersion (d) Kurtosis

MCQ No 4.90
For a leptokurtic distribution, the relation between second and fourth central moment is:

MCQ No 4.91
For a platykurtic distribution, the relation between µ2 and µ4 is:

MCQ No 4.92
For a mesokurtic distribution, the relation between fourth and second mean moment is:

MCQ No 4.93
The second and fourth moments about mean are 4 and 48 respectively, then the distribution is:
(a) Leptokurtic (b) Platykurtic (c) Mesokurtic or normal (d) Positively skewed

MCQ No 4.94
In a mesokurtic or normal distribution, µ4 = 243. The standard deviation is:
(a) 81 (b) 27 (c) 9 (d) 3
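
MCQ Nos 4.93 and 4.94 use the moment coefficient of kurtosis β2 = µ4 / µ2², with β2 = 3 for a mesokurtic distribution. A short sketch under that definition; the names `beta2` and `kurtosis_type` are illustrative.

```python
import math


def beta2(mu2, mu4):
    """Moment coefficient of kurtosis from central moments."""
    return mu4 / mu2 ** 2


def kurtosis_type(b2):
    if b2 > 3:
        return "leptokurtic"
    if b2 < 3:
        return "platykurtic"
    return "mesokurtic"


# MCQ No 4.93: mu2 = 4 and mu4 = 48.
b2_493 = beta2(4, 48)
print(b2_493, kurtosis_type(b2_493))  # 3.0 mesokurtic

# MCQ No 4.94: mesokurtic with mu4 = 243, so mu2 squared = 243 / 3 = 81.
mu2_494 = math.sqrt(243 / 3)
sd_494 = math.sqrt(mu2_494)
print(mu2_494, sd_494)  # 9.0 3.0
```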

MCQ No 4.95
The value of β2 can be:
(a) Less than 3 (b) Greater than 3 (c) Equal to 3 (d) All of the above

MCQ No 4.96
In a normal (mesokurtic) distribution:
(a) β1=0 and β2=3 (b) β1=3 and β2=0 (c) β1=0 and β2>3 (d) β1=0 and β2<3

MCQ No 4.97
For any frequency distribution, the following empirical relation holds:
(a) Quartile deviation = Standard deviation
(b) Mean deviation = Standard deviation
(c) Standard deviation = Mean deviation = Quartile deviation
(d) All of the above

MCQ of REGRESSION AND CORRELATION

MCQ 14.1
A process by which we estimate the value of dependent variable on the basis of one or more independent
variables is called:
(a) Correlation (b) Regression (c) Residual (d) Slope

MCQ 14.2
The method of least squares dictates that we choose a regression line where the sum of the squares of
the deviations of the points from the line is:
(a) Maximum (b) Minimum (c) Zero (d) Positive

MCQ 14.3
A relationship where the flow of the data points is best represented by a curve is called:
(a) Linear relationship (b) Nonlinear relationship (c) Linear positive (d) Linear negative

MCQ 14.4
All data points falling along a straight line is called:
(a) Linear relationship (b) Non linear relationship (c) Residual (d) Scatter diagram

MCQ 14.5
The value we would predict for the dependent variable when the independent variables are all equal to zero
is called:
(a) Slope (b) Sum of residual (c) Intercept (d) Difficult to tell

MCQ 14.6
The predicted rate of response of the dependent variable to changes in the independent variable is called:
(a) Slope (b) Intercept (c) Error (d) Regression equation

MCQ 14.7
The slope of the regression line of Y on X is also called the:
(a) Correlation coefficient of X on Y (b) Correlation coefficient of Y on X
(c) Regression coefficient of X on Y (d) Regression coefficient of Y on X

MCQ 14.8
In simple linear regression, the numbers of unknown constants are:
(a) One (b) Two (c) Three (d) Four

MCQ 14.9
In simple regression equation, the numbers of variables involved are:
(a) 0 (b) 1 (c) 2 (d) 3

MCQ 14.10
If the value of any regression coefficient is zero, then two variables are:
(a) Qualitative (b) Correlation (c) Dependent (d) Independent

MCQ 14.11
The straight line graph of the linear equation Y = a+ bX, slope will be upward if:
(a) b = 0 (b) b < 0 (c) b > 0 (d) b ≠ 0

MCQ 14.12
The straight line graph of the linear equation Y = a + bX, slope will be downward if:
(a) b > 0 (b) b < 0 (c) b = 0 (d) b ≠ 0

MCQ 14.13
The straight line graph of the linear equation Y = a + bX, slope is horizontal if:
(a) b = 0 (b) b ≠ 0 (c) b = 1 (d) a = b

MCQ 14.14
If the regression line is Ŷ = 5, then the value of the regression coefficient of Y on X is:
(a) 0 (b) 0.5 (c) 1 (d) 5

MCQ 14.15
If Y = 2 - 0.2X, then the value of Y intercept is equal to:
(a) -0.2 (b) 2 (c) 0.2X (d) All of the above

MCQ 14.16
If one regression coefficient is greater than one, then the other will be:
(a) More than one (b) Equal to one (c) Less than one (d) Equal to minus one

MCQ 14.17
To determine the height of a person when his weight is given is:
(a) Correlation problem (b) Association problem (c) Regression problem (d) Qualitative problem

MCQ 14.18
The dependent variable is also called:
(a) Regression (b) Regressand (c) Continuous variable (d) Independent

MCQ 14.19
The dependent variable is also called:
(a) Regressand variable (b) Predictand variable (c) Explained variable (d) All of these

MCQ 14.20
The independent variable is also called:
(a) Regressor (b) Regressand (c) Predictand (d) Estimated

MCQ 14.21
In the regression equation Y = a+bX, the Y is called:
(a) Independent variable (b) Dependent variable (c) Continuous variable (d) None of the above

MCQ 14.22
In the regression equation X = a + bY, the X is called:
(a) Independent variable (b) Dependent variable (c) Qualitative variable (d) None of the above

MCQ 14.23
In the regression equation Y = a +bX, a is called:
(a) X-intercept (b) Y-intercept (c) Dependent variable (d) None of the above

MCQ 14.24
The regression equation always passes through:
(a) (X, Y) (b) (a, b) (c) (X̄, Ȳ) (d) (X̄, Y)

MCQ 14.25
The independent variable in a regression line is:
(a) Non-random variable (b) Random variable (c) Qualitative variable (d) None of the above

MCQ 14.26
The graph showing the paired points of (Xi, Yi) is called:
(a) Scatter diagram (b) Histogram (c) Historigram (d) Pie diagram

MCQ 14.27
The graph represents the relationship that is:
(a) Linear (b) Non linear (c) Curvilinear (d) No relation

MCQ 14.28
The graph represents the relationship that is:
(a) Linear positive (b) Linear negative (c) Non-linear (d) Curvilinear

MCQ 14.29
When regression line passes through the origin, then:
(a) Intercept is zero (b) Regression coefficient is zero (c) Correlation is zero (d) Association is zero

MCQ 14.30
When bxy is positive, then byx will be:
(a) Negative (b) Positive (c) Zero (d) One

MCQ 14.31
The correlation coefficient is the ______ of the two regression coefficients:
(a) Geometric mean (b) Arithmetic mean (c) Harmonic mean (d) Median

MCQ 14.32
When two regression coefficients bear same algebraic signs, then correlation coefficient is:
(a) Positive (b) Negative (c) According to two signs (d) Zero

MCQ 14.33
It is possible that two regression coefficients have:
(a) Opposite signs (b) Same signs (c) No sign (d) Difficult to tell

MCQ 14.34
Regression coefficient is independent of:
(a) Units of measurement (b) Scale and origin (c) Both (a) and (b) (d) None of them

MCQ 14.35
In the regression line Y = a+ bX:
(a) (b) (c) (d)

MCQ 14.36
In the regression line Y = a + bX, the following is always true:
(a) (b) (c) (d)

MCQ 14.37
The purpose of simple linear regression analysis is to:
(a) Predict one variable from another variable
(b) Replace points on a scatter diagram by a straight-line
(c) Measure the degree to which two variables are linearly associated
(d) Obtain the expected value of the independent random variable for a given value of the
dependent variable

MCQ 14.38
The sum of the difference between the actual values of Y and its values obtained from the fitted
regression line is always:
(a) Zero (b) Positive (c) Negative (d) Minimum
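
Two properties tested in this section, that the fitted line passes through (X̄, Ȳ) and that the residuals sum to zero, can be seen in a small least-squares sketch. The data below are hypothetical, and `fit_line` is an illustrative helper using the usual formulas b = Sxy / Sxx and a = Ȳ − bX̄.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
residual_sum = sum(residuals)

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
value_at_mean_x = a + b * mean_x

print(round(a, 4), round(b, 4))   # 2.2 0.6
print(round(residual_sum, 10))    # zero, up to floating-point rounding
print(round(value_at_mean_x, 10)) # 4.0, which equals the mean of ys
```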

MCQ 14.39
If all the actual and estimated values of Y are same on the regression line, the sum of squares of
error will be:
(a) Zero (b) Minimum (c) Maximum (d) Unknown

MCQ 14.40

(a) Residual (b) Difference between independent and dependent variables

MCQ 14.41
A measure of the strength of the linear relationship that exists between two variables is called:
(a) Slope (b) Intercept (c) Correlation coefficient (d) Regression equation

MCQ 14.42
When the ratio of variations in the related variables is constant, it is called:
(a) Linear correlation (b) Nonlinear correlation (c) Positive correlation (d) Negative correlation

MCQ 14.43
If both variables X and Y increase or decrease simultaneously, then the coefficient of correlation will
be:
(a) Positive (b) Negative (c) Zero (d) One

MCQ 14.44
If the points on the scatter diagram indicate that as one variable increases the other variable tends to
decrease the value of r will be:
(a) Perfect positive (b) Perfect negative (c) Negative (d) Zero

MCQ 14.45
If the points on the scatter diagram show no tendency either to increase together or decrease together
the value of r will be close to:
(a) -1 (b) +1 (c) 0.5 (d) 0

MCQ 14.46
If one item is fixed and unchangeable and the other item varies, the correlation coefficient will be:
(a) Positive (b) Negative (c) Zero (d) Undecided

MCQ 14.47
In scatter diagram, if most of the points lie in the first and third quadrants, then coefficient of
correlation is:
(a) Negative (b) Positive (c) Zero (d) All of the above

MCQ 14.48
If the two series move in reverse directions and the variations in their values are always
proportionate, it is said to be:
(a) Negative correlation (b) Positive correlation
(c) Perfect negative correlation (d) Perfect positive correlation

MCQ 14.49
If both the series move in the same direction and the variations are in a fixed proportion, correlation
between them is said to be:
(a) Perfect correlation (b) Linear correlation
(c) Nonlinear correlation (d) Perfect positive correlation

MCQ 14.50
The value of the coefficient of correlation r lies between:
(a) 0 and 1 (b) -1 and 0 (c) -1 and +1 (d) -0.5 and +0.5

MCQ 14.51
If X is measured in hours and Y is measured in minutes, then the correlation coefficient has the unit:
(a) Hours (b) Minutes (c) Both (a) and (b) (d) No unit

MCQ 14.52
The range of the regression coefficient is:
(a) -1 to +1 (b) 0 to 1 (c) -∞ to +∞ (d) 0 to ∞

MCQ 14.53
The signs of regression coefficients and correlation coefficient are always:
(a) Different (b) Same (c) Positive (d) Negative

MCQ 14.54
The arithmetic mean of the two regression coefficients is greater than or equal to:
(a) -1 (b) +1 (c) 0 (d) r

MCQ 14.55
In simple linear regression model Y = α + βX + ε where α and β are called:
(a) Estimates (b) Parameters (c) Random errors (d) Variables

MCQ 14.56
A negative regression coefficient indicates that the movement of the variables is in:
(a) Same direction (b) Opposite direction (c) Both (a) and (b) (d) Difficult to tell

MCQ 14.57
A positive regression coefficient indicates that the movement of the variables is in:
(a) Same direction (b) Opposite direction (c) Upward direction (d) Downward direction

MCQ 14.58
If the value of the regression coefficient is zero, then the two variables are called:
(a) Independent (b) Dependent (c) Both (a) and (b) (d) Difficult to tell

MCQ 14.59
The term regression was used by:
(a) Newton (b) Pearson (c) Spearman (d) Galton

MCQ 14.60
In the regression equation Y = a + bX, b is called:
(a) Slope (b) Regression coefficient (c) Intercept (d) Both (a) and (b)

MCQ 14.61
When the two regression lines are parallel to each other, then their slopes are:
(a) Zero (b) Different (c) Same (d) Positive
MCQ 14.62
The measure of change in the dependent variable corresponding to a unit change in the independent
variable is called:
(a) Slope (b) Regression coefficient (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ 14.63
In correlation problem both variables are:
(a) Equal (b) Unknown (c) Fixed (d) Random

MCQ 14.64
In the regression equation Y = a + bX, a and b are called:
(a) Constants (b) Estimates (c) Parameters (d) Both (a) and (b)

MCQ 14.65
If byx = bxy = 1 and Sx = Sy, then r will be:
(a) 0 (b) -1 (c) 1 (d) Difficult to calculate

MCQ 14.66
The correlation coefficient between X and -X is:
(a) 0 (b) 0.5 (c) 1 (d) -1

MCQ 14.67
If byx = bxy = rxy, then:
(a) Sx ≠ Sy (b) Sx = Sy (c) Sx > Sy (d) Sx < Sy

MCQ 14.68
If rxy = 0.4, then r(2x, 2y) is equal to:
(a) 0.4 (b) 0.8 (c) 0 (d) 1

MCQ 14.69
rxy is equal to:
(a) 0 (b) -1 (c) 1 (d) 0.5

MCQ 14.70
If rxy = 0.75, then correlation coefficient between u = 1.5X and v = 2Y is:
(a) 0 (b) 0.75 (c) -0.75 (d) 1.5
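
The scaling questions above (MCQ 14.66, 14.68 and 14.70) can be explored numerically. The following is a minimal sketch, assuming small made-up data; the function name pearson_r and the sample values are illustrative and not part of the original question set. It suggests that multiplying X and Y by positive constants leaves r unchanged, while pairing X with -X gives r = -1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r_xy = pearson_r(x, y)
# Rescale as in MCQ 14.70: u = 1.5X and v = 2Y.
r_uv = pearson_r([1.5 * xi for xi in x], [2 * yi for yi in y])
# Correlate X with -X, as in MCQ 14.66.
r_neg = pearson_r(x, [-xi for xi in x])
```

Comparing r_xy with r_uv shows the invariance under a change of scale; a negative scale factor would flip the sign of r.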

MCQ 14.71
If byx = -2 and rxy= -1, then bxy is equal to:
(a) -1 (b) -2 (c) 0.5 (d) -0.5

MCQ 14.72
If byx = 1.6 and bxy = 0.4, then rxy will be:
(a) 0.4 (b) 0.64 (c) 0.8 (d) -0.8

MCQ 14.73
If byx = -0.8 and bxy = -0.2, then ryx is equal to:
(a) -0.2 (b) -0.4 (c) 0.4 (d) -0.8
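
MCQ 14.72 and 14.73 rely on the standard identity r² = byx · bxy, with r carrying the common sign of the two regression coefficients. A minimal sketch of that calculation follows; the helper name r_from_slopes is an assumption introduced here for illustration.

```python
from math import copysign, sqrt

def r_from_slopes(byx, bxy):
    """Correlation coefficient from the two regression coefficients.

    Uses r^2 = byx * bxy; r takes the common sign of the two slopes.
    """
    if byx * bxy < 0:
        raise ValueError("regression coefficients must have the same sign")
    return copysign(sqrt(byx * bxy), byx)

r_72 = r_from_slopes(1.6, 0.4)     # values from MCQ 14.72
r_73 = r_from_slopes(-0.8, -0.2)   # values from MCQ 14.73
```

The same identity can be rearranged to recover one regression coefficient from r and the other, as MCQ 14.71 and 14.93 ask.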

MCQ 14.74
If = 6 – X, then r will be:
(a) 0 (b) 1 (c) -1 (d) Both (b) and (c)
MCQ 14.75
If = X + 10, then r is equal to:
(a) 1 (b) -1 (c) 1/2 (d) Difficult to tell

MCQ 14.76
If Y = -10X and X = -0.1Y, then r is equal to:
(a) 0.1 (b) 1 (c) -1 (d) 10

MCQ 14.77
If the figure +1 signifies perfect positive correlation and the figure -1 signifies a perfect negative
correlation, then the figure 0 signifies:
(a) A perfect correlation (b) Uncorrelated variables
(c) Not significant (d) Weak correlation

MCQ 14.78
A perfect positive correlation is signified by:
(a) 0 (b) -1 (c) +1 (d) -1 to +1

MCQ 14.79
If a statistics professor tells his class: "All those who got 100 on the statistics test got 20 on the
mathematics test, and all those that got 100 on the mathematics test got 20 on the statistics test", he
is saying that the correlation between the statistics test and the mathematics test is:
(a) Negative (b) Positive (c) Zero (d) Difficult to tell

MCQ 14.80
If is zero, the correlation is:
(a) Weak negative (b) High positive (c) High negative (d) None of the preceding

MCQ 14.81
If rxy = 1, then:
(a) byx = bxy (b) byx > bxy (c) byx < bxy (d) byx . bxy = 1

MCQ 14.82
The relation between the regression coefficient byx and correlation coefficient r is:

MCQ 14.83
The relation between the regression coefficient bxy and correlation coefficient r is:

MCQ 14.84
If the sum of the product of the deviation of X and Y from their means is zero, the correlation
coefficient between X and Y is:
(a) Zero (b) Maximum (c) Minimum (d) Undecided

MCQ 14.85
If the coefficient of correlation between the variables X and Y is r, the coefficient of correlation
between X2 and Y2 is:
(a) -1 (b) 1 (c) r (d) r2

MCQ 14.86
If rxy = 0.75, then rxy will be:
(a) 0.25 (b) 0.50 (c) 0.75 (d) -0.75
MCQ 14.87
If , then byx is equal to:
(a) Positive (b) Negative (c) Zero (d) One

MCQ 14.88
If , then intercept a is equal to:
(a) 0 (b) 1 (c) -1 to +1 (d) 0 to 1

MCQ 14.89
:
(a) Less than zero (b) Greater than zero (c) Equal to zero (d) Not equal to zero

MCQ 14.90
When rxy < 0, then byx and bxy will be:
(a) Zero (b) Not equal to zero (c) Less than zero (d) Greater than zero

MCQ 14.91
When rxy > 0, then byx and bxy are both:
(a) 0 (b) < 0 (c) > 0 (d) < 1

MCQ 14.92
If rxy = 0, then:
(a) byx = 0 (b) bxy = 0 (c) Both (a) and (b) (d) byx ≠ bxy

MCQ 14.93
If bxy = 0.20 and rxy = 0.50, then byx is equal to:
(a) 0.20 (b) 0.25 (c) 0.50 (d) 1.25

MCQ 14.94
A regression model may be:
(a) Linear (b) Non-linear (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ 14.95
If r is negative, we know that:
(a)
(b)
(c)
(d)
MCQ INDEX NUMBERS
MCQ No 5.1
An index number is called a simple index when it is computed from:
(a) Single variable (b) Bi-variable (c) Multiple variables (d) None of them

MCQ No 5.2
Index numbers are expressed in:
(a) Ratios (b) Squares (c) Percentages (d) Combinations

MCQ No 5.3
If all the values are of equal importance, the index numbers are called:
(a) Weighted (b) Unweighted (c) Composite (d) Value index

MCQ No 5.4
Index numbers can be used for:
(a) Forecasting (b) Fixed prices (c) Different prices (d) Constant prices

MCQ No 5.5
Index for base period is always taken as:
(a) 100 (b) One (c) 200 (d) Zero

MCQ No 5.6
When the prices of rice are to be compared, we compute:
(a) Volume index (b) Value index (c) Price index (d) Aggregative index

MCQ No 5.7
When index number is calculated for several variables, it is called:
(a) Composite index (b) Whole sale price index (c) Volume index (d) Simple index

MCQ No 5.8
How many types are used for the calculation of index numbers:
(a) 2 (b) 3 (c) 4 (d) 5

MCQ No 5.9
In chain base method, the base period is:
(a) Fixed (b) Not fixed (c) Constant (d) Zero

MCQ No 5.10
Which formula is used in chain indices?

MCQ No 5.11
Price relatives are a percentage ratio of current year price and:
(a) Base year quantity (b) Previous year quantity (c) Base year price (d) Current year quantity

MCQ No 5.12
Indices calculated by the chain base method are free from:
(a) Seasonal variations (b) Errors (c) Percentages (d) Ratios

MCQ No 5.13
The chain base indices are not suitable for:
(a) Long range comparisons (b) Short range comparisons (c) Percentages (d) Ratios
MCQ No 5.14
An index number that can serve many purposes is called:
(a) General purpose index (b) Special purpose index
(c) Cost of living index (d) None of them

MCQ No 5.15
Another name of consumer's price index number is:
(a) Whole-sale price index number (b) Cost of living index
(c) Sensitive (d) Composite

MCQ No 5.16
Consumer price index indicates:
(a) Rise (b) Fall (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ No 5.17
Purchasing power of money can be assessed through:
(a) Simple index (b) Fisher’s index (c) Consumer price index (d) Volume index

MCQ No 5.18
Cost of living at two different cities can be compared with the help of:
(a) Value index (b) Consumer price index (c) Volume index (d) Un-weighted index

MCQ No 5.19
Consumer price index numbers are obtained by:
(a) Laspeyre's formula (b) Fisher ideal formula
(c) Marshall Edgeworth formula (d) Paasche's formula

MCQ No 5.20
Laspeyre's index = 110, Paasche's index = 108, then Fisher's Ideal index is equal to:
(a) 110 (b) 108 (c) 100 (d) 109
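
Fisher's ideal index is the geometric mean of the Laspeyres and Paasche indices (see MCQ No 5.44). A minimal sketch of the arithmetic behind MCQ No 5.20, with the function name fisher_ideal introduced here as an assumption:

```python
from math import sqrt

def fisher_ideal(laspeyres, paasche):
    """Fisher's ideal index: geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres * paasche)

fisher = fisher_ideal(110, 108)   # values from MCQ No 5.20
fisher_rounded = round(fisher)    # nearest whole number
```

Because it is a geometric mean, the result always lies between the two input indices.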

MCQ No 5.21
Most commonly used index number is:
(a) Volume index number (b) Value index number
(c) Price index number (d) Simple index number

MCQ No 5.22
For consumer price index, price quotations are collected from:
(a) Fair price shops (b) Government depots (c) Retailers (d) Whole-sale dealers

MCQ No 5.23
Price relatives computed by chain base method are called:
(a) Price relatives (b) Chain indices (c) Link relatives (d) None of them

MCQ No 5.24
Consumer price index are obtained by:
(a) Paasche's formula (b) Fisher's ideal formula
(c) Marshall Edgeworth formula (d) Family budget method formula

MCQ No 5.25
The aggregative expenditure method and family budget method always give:
(a) Different results (b) Approximate results (c) Same results (d) None of them
MCQ No 5.26
In fixed base method, the base period should be:
(a) Far away (b) Abnormal (c) Unreliable (d) Normal

MCQ No 5.27
If all the values are not of equal importance the index number is called:
(a) Simple (b) Unweighted (c) Weighted (d) None

MCQ No 5.28
Which of the following formulas satisfies the time reversal test?

MCQ No 5.29
When the price of a year is divided by the price of a particular year, we get:
(a) Simple relative (b) Link relative (c) (a) and (b) both (d) None of them

MCQ No 5.30
When the price of a year is divided by the price of the preceding year, we get:
(a) Value index (b) Link relative (c) Simple relative (d) None of them

MCQ No 5.31
The most appropriate average in averaging the price relatives is:
(a) Median (b) Harmonic mean (c) Arithmetic mean (d) Geometric mean

MCQ No 5.32
In constructing index number geometric mean relatives are:
(a) Non-reversible (b) Reciprocal (c) Reversible (d) None of them

MCQ No 5.33
The general purchasing power of the currency of a country is determined by:
(a) Retail price index (b) Volume index (c) Composite index (d) Whole-sale price index

MCQ No 5.34
What type of index number can help the government to formulate its price policies and to take
appropriate economic measures to control prices:
(a) Whole sale price index (b) Consumer's price (c) Quantity (d) None of them

MCQ No 5.35
The most suitable average in chain base method is:
(a) Arithmetic mean (b) Median (c) Mode (d) Geometric mean

MCQ No 5.36
Base year quantities weights are used in:
(a) Laspeyre's method (b) Paasche's method (c) Fisher's ideal method (d) Difficult to tell

MCQ No 5.37
Chain process is used to make comparisons of price index numbers in:
(a) Price relative (b) Link relative (c) Simple relative (d) None of the above

MCQ No 5.38
In the computation of consumer price index numbers, we use:
(a) Aggregate expenditure method (b) Family budget method
(c) Chain base method (d) Both (a) and (b)
MCQ No 5.39
The Federal Bureau of Statistics prepares:
(a) The wholesale price index (b) The consumer price index
(c) The sensitive price indicator (d) All of the above

MCQ No 5.40
While computing a weighted index, the current period quantities are used in the:
(a) Laspeyre's method (b) Paasche's method
(c) Marshall Edgeworth method (d) Fisher's ideal method
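
The difference between base-period and current-period weights (MCQ No 5.36 and 5.40) can be sketched as follows. All prices and quantities are hypothetical, and the function names are assumptions made for this illustration only.

```python
def laspeyres_index(p0, p1, q0):
    """Price index weighted by base-period quantities q0."""
    numerator = sum(p * q for p, q in zip(p1, q0))
    denominator = sum(p * q for p, q in zip(p0, q0))
    return 100 * numerator / denominator

def paasche_index(p0, p1, q1):
    """Price index weighted by current-period quantities q1."""
    numerator = sum(p * q for p, q in zip(p1, q1))
    denominator = sum(p * q for p, q in zip(p0, q1))
    return 100 * numerator / denominator

# Hypothetical data for two commodities.
base_prices = [2, 5]
current_prices = [3, 6]
base_quantities = [10, 4]
current_quantities = [8, 5]

laspeyres = laspeyres_index(base_prices, current_prices, base_quantities)
paasche = paasche_index(base_prices, current_prices, current_quantities)
```

The two functions differ only in which period's quantities serve as weights, which is the distinction these questions test.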

MCQ No 5.41
The best method to measure the relative change in prices of commodities is:
(a) Quantity index number (b) Value index number
(c) Volume index number (d) Price index number

MCQ No 5.42
When the base year values are used as weights, the weighted average of relatives price index
number is the same as the:
(a) Laspeyre's index (b) Paasche's index (c) Simple aggregative index (d) Quantity index

MCQ No 5.43
To measure the relative change in purchasing a specified basket of goods and services between two
periods for a certain locality for fixed income group of people, we can use:
(a) Consumer price index (b) Paasche's price index (c) Cost of living index (d) Both (a) and (c)

MCQ No 5.44
Fisher's ideal index number is the geometric mean of the:
(a) Laspeyre's and Marshall Edgeworth indices
(b) Laspeyre's and Paasche's indices
(c) Paasche's and Marshall Edgeworth indices
(d) All of the above

MCQ No 5.45
A number that measures a relative change in a single variable with respect to a base is called:
(a) Good index number (b) Composite index number
(c) Simple index number (d) Quantity index number

MCQ No 5.46
A number that measures an average relative change in a group of related variables with respect to
a base is called:
(a) Simple index number (b) Composite index number
(c) Price index number (d) Quantity index number

MCQ No 5.47
An index number constructed to measure the relative change in the price of an item or a group of
items is called:
(a) Quantity index number (b) Price index number (c) Volume index number (d) Difficult to tell

MCQ No 5.48
When relative change is measured for a fixed period, it is called:
(a) Chain base method (b) Fixed base method
(c) Simple aggregative method (d) Cost of living Index method
MCQ No 5.49
The ratio of a sum of prices in the current period to the sum of prices in the base period, expressed as a
percentage, is called:
(a) Simple price index number
(b) Simple aggregative price index number
(c) Weighted aggregative price index number
(d) Quantity index number

MCQ No 5.50
An index that measures the average relative change in group of variables keeping in view the relative
importance of the variables is called:
(a) Simple index number (b) Composite index number
(c) Weighted index number (d) Price index number

MCQ No 5.51
Link relative of current year is equal to:

MCQ No 5.52
Simple average of relatives is equal to:

MCQ No 5.53
Paasche's price index number is also called:
(a) Base year weighted (b) Current year weighted
(c) Simple aggregative index (d) Consumer price index

MCQ No 5.54
Laspeyre's price index number is also called:
(a) Base year weighted (b) Current year weighted
(c) Cost of living index (d) Simple aggregative index

MCQ No 5.55
Index number having downward bias is:
(a) Laspeyre's index (b) Paasche’s index
(c) Fisher's ideal index (d) Marshall Edgeworth index

MCQ No 5.56
Index number having upward bias is:
(a) Laspeyre's index (b) Paasche's index (c) Fisher's ideal index (d) Marshall Edgeworth index

MCQ No 5.57
Marshall Edgeworth price index was proposed by:
(a) One English economist (b) Two English economists
(c) Three English economists (d) Many English economists
MCQ No 5.58
Index number calculated by Fisher's formula is ideal because it satisfies:
(a) Circular test (b) Factor reversal test (c) Time reversal test (d) All of the above
MCQ No 5.59
The test which is not obeyed by any of the weighted index numbers unless the weights are constant:
(a) Circular test (b) Time reversal test (c) Factor reversal test (d) None of them
MCQ PROBABILITY
MCQ 6.1
When the possible outcomes of an experiment are equally likely to occur, we apply:
(a) Relative probability (b) Subjective probability
(c) Conditional probability (d) Classical probability

MCQ 6.2
A number between 0 and 1 that is used to measure uncertainty is called:
(a) Random variable (b) Trial (c) Simple event (d) Probability

MCQ 6.3
Probability lies between:
(a) -1 and +1 (b) 0 and 1 (c) 0 and n (d) 0 and ∞

MCQ 6.4
Probability can be expressed as:
(a) Ratio (b) Fraction (c) Percentage (d) All of the above

MCQ 6.5
The probability based on the concept of relative frequency is called:
(a) Empirical probability (b) Statistical probability (c) Both (a) and (b) (d) Neither (a) nor (b)

MCQ 6.6
The probability of an event cannot be:
(a) Equal to zero (b) Greater than zero (c) Equal to one (d) Less than zero

MCQ 6.7
A measure of the chance that an uncertain event will occur:
(a) An experiment (b) An event (c) A probability (d) A trial

MCQ 6.8
A graphical device used to list all possibilities of a sequence of outcomes in systematic way is
called:
(a) Probability histogram (b) Venn diagram (c) Pie diagram (d) Tree diagram

MCQ 6.9
A random experiment contains:
(a) At least one outcome (b) At least two outcomes
(c) At most one outcome (d) At most two outcomes

MCQ 6.10
The probability of all possible outcomes of a random experiment is always equal to:
(a) One (b) Zero (c) Infinity (d) All of the above

MCQ 6.11
The outcome of tossing a coin is a:
(a) Mutually exclusive event (b) Compound event (c) Certain event (d) Simple event

MCQ 6.12
The result of no interest of an experiment is called:
(a) Constant (b) Event (c) Failure (d) Success

MCQ 6.13
A set of all possible outcomes of an experiment is called:
(a) Combination (b) Sample point (c) Sample space (d) Compound event
MCQ 6.14
The numbers of counting rules that are useful in determining the number of outcomes in an
experiment are:
(a) One (b) Two (c) Three (d) Four

MCQ 6.15
Events having no experimental outcomes in common are called:
(a) Equally likely events (b) Exhaustive events
(c) Mutually exclusive events (d) Independent events

MCQ 6.16
A set of outcomes formed after some additional information is called:
(a) Sample space (b) Reduced sample space (c) Null set (d) Random experiment

MCQ 6.17
The probability associated with the reduced sample space is called:
(a) Conditional probability (b) Statistical probability
(c) Mathematical probability (d) Subjective probability

MCQ 6.18
An arrangement of objects without regard to order is called:
(a) Permutation (b) Combination (c) Random experiment (d) Sample point

MCQ 6.19
The number of permutations of a set of n things, taken r at a time with n ≥ r, is given by:

MCQ 6.20
If three candidates are selected to attend a course from the ten candidates, the number of ways of selecting
the candidates is an example of:
(a) Combination (b) Permutation (c) Reduced sample space (d) Both (a) and (b)

MCQ 6.21
When each outcome of a sample space is as likely to occur as any other, the outcomes are called:
(a) Exhaustive (b) Mutually exclusive (c) Equally likely (d) Not mutually exclusive

MCQ 6.22
If A is any event in S and Ā is its complement, then P(Ā) is equal to:
(a) 1 (b) 0 (c) 1- A (d) 1 - P(A)

MCQ 6.23
When certainty is involved in a situation, its probability is equal to:
(a) Zero (b) Between -1 and +1 (c) Between 0 and 1 (d) One

MCQ 6.24
Which of the following cannot be taken as probability of an event?
(a) 0 (b) 0.5 (c) 1 (d) -1

MCQ 6.25
If an event contains more than one sample point, it is called a:
(a) Simple event (b) Compound event (c) Impossible event (d) Certain event
MCQ 6.26
When the occurrence of one event has no effect on the probability of the occurrence of another
event, the events are called:
(a) Independent (b) Dependent (c) Mutually exclusive (d) Equally likely

MCQ 6.27
A particular result of an experiment is called:
(a) Trial (b) Simple event (c) Compound event (d) Outcome

MCQ 6.28
A collection of one or more outcomes of an experiment is called:
(a) Event (b) Outcome (c) Sample point (d) None of the above

MCQ 6.29
A process that leads to the occurrence of one and only one of several possible observations is
called:
(a) Random experiment (b) Random variable (c) Experiment (d) Probability distribution

MCQ 6.30
Which statement is false?
(a) The classical definition applies when there are n equally likely outcomes to an experiment
(b) The empirical definition applies when the number of times an event happens is divided by the number
of observations.
(c) A subjective probability is based on whatever information is available
(d) The general rule of addition is used when the events are mutually exclusive

MCQ 6.31
The term 'sample space' is used for:
(a) All possible outcomes (b) All possible coins (c) Probability (d) Sample

MCQ 6.32
The term 'event' is used for:
(a) Time (b) A sub-set of the sample space
(c) Probability (d) Total number of outcomes.

MCQ 6.33
The six faces of the die are called equally likely if the die is:
(a) Small (b) Fair (c) Six-faced (d) Round

MCQ 6.34
If we toss a coin and P(H) = 2P(T), then probability of head is equal to:
(a) 0 (b) 1/2 (c) 1/3 (d) 2/3

MCQ 6.35
A letter is chosen at random from the word "Statistics". The probability of getting a vowel is:
(a) 1/10 (b) 2/10 (c) 3/10 (d) 4/10

MCQ 6.36
An arrangement in which the order of the objects selected from a specific pool of objects is important
is called:
(a) Combination (b) Permutation (c) Factorial (d) Sample space
MCQ 6.37
Two books are to be selected at random without replacement out of four books. The number of possible
selections is:
(a) 4 (b) 2 (c) 6 (d) 3

MCQ 6.38
Three books of different colours are to be arranged in a book-shelf. The possible arrangements
are:
(a) 3 (b) 1 (c) 6 (d) 2

MCQ 6.39
If a sample space S = {1, 2}, the number of all possible sub-sets is:
(a) 2 (b) 1 (c) 3 (d) 4

MCQ 6.40
When a die and a coin are rolled together, all possible outcomes are:
(a) 6 (b) 2 (c) 36 (d) 12

MCQ 6.41
When two coins are tossed, the possible outcomes are:
(a) 2 (b) 4 (c) 1 (d) None of them

MCQ 6.42
If three coins are tossed, the possible outcomes are:
(a) 8 (b) 3 (c) 1 (d) None of them

MCQ 6.43
If n coins are tossed, the possible outcomes are:
(a) n (b) 2 (c) 2^n (d) All of them

MCQ 6.44
If two dice are rolled, the possible outcomes are:
(a) 6 (b) 36 (c) 1 (d) Difficult to answer

MCQ 6.45
When n dice are rolled, the possible outcomes are:
(a) 6^n (b) 6 (c) 1 (d) 18

MCQ 6.46
When one card is selected at random from a pack of 52 playing cards, the possible selections are:
(a) 104 (b) 52 (c) 520 (d) 2704

MCQ 6.47
Two cards are selected at random with replacement from a pack of 52 playing cards. The possible
outcomes are:
(a) 52 x 52 (b) 52 (c) 1326 (d) 2

MCQ 6.48
A bag contains 4 white and 2 black balls of the same size and weight, and two balls are selected at
random without replacement, the possible selections are:
(a) 6 (b) 4 (c) 36 (d) 15

MCQ 6.49
Two balls are selected at random with replacement from a bag containing 3 red, 3 black and 2 green
balls. The possible outcomes are:
(a) 8 (b) 64 (c) 16 (d) 2
MCQ 6.50
Five cards are selected at random from a pack of 52 cards with replacement. The possible
combinations are:
(a) 52 (b) (52)^5 (c) 52 x 52 (d) (5)^52

MCQ 6.51
The digits 1, 2, 3, 4, 5 are the roll numbers of 5 students. These roll numbers are written on the paper
slips and two paper slips are selected at random without replacement. The possible combinations are:
(a) 5 (b) 2 (c) 25 (d) 10
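
The counting questions in this block (MCQ 6.37 to 6.51) can be checked by enumeration. The sketch below uses only the Python standard library; the variable names are illustrative assumptions, and treating "with replacement" draws as ordered sequences is an assumption about how the questions intend outcomes to be counted.

```python
from itertools import combinations, product
from math import comb, factorial

# Selections without replacement, order ignored (combinations).
slips = [1, 2, 3, 4, 5]
slip_pairs = list(combinations(slips, 2))        # MCQ 6.51: 2 slips out of 5
book_pairs = comb(4, 2)                          # MCQ 6.37: 2 books out of 4
ball_pairs = comb(6, 2)                          # MCQ 6.48: 2 balls out of 6

# Arrangements, order matters (permutations).
book_arrangements = factorial(3)                 # MCQ 6.38: 3 books on a shelf

# Repeated trials: outcomes multiply.
three_coins = list(product("HT", repeat=3))      # MCQ 6.42: three coins
die_and_coin = list(product(range(1, 7), "HT"))  # MCQ 6.40: a die and a coin
two_cards_with_replacement = 52 ** 2             # MCQ 6.47: ordered draws
```

Listing the outcomes explicitly, as with three_coins, is practical only for small sample spaces; for larger ones the closed-form counts are the usual route.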

MCQ 6.52
Which is the impossible event when a die is rolled:
(a) 2 or 3 (b) 5 or 6 (c) 1 (d) 0 or 7

MCQ 6.53
The probability of drawing any one spade card is:
(a) 1/13 (b) 1/4 (c) 4/13 (d) 1/52

MCQ 6.54
A balanced die is rolled, the probability of getting an odd number is:
(a) 1/2 (b) 1/4 (c) 1/6 (d) 1/36

MCQ 6.55
Two fair dice are rolled. The probability of throwing an odd sum is:
(a) 1 (b) 1/2 (c) 1/6 (d) 1/36

MCQ 6.56
Given P(A) = 0.4, P(B) = 0.5 and P(A⋃B) = 0.9, then:
(a) A and B are not mutually exclusive events (b) A and B are equally likely events
(c) A and B are independent events (d) A and B are mutually exclusive events

MCQ 6.57
If P(B/A) = 0.50 and P(A⋂B) = 0.40, then P(A) will be equal to:
(a) 0.40 (b) 0.50 (c) 0.80 (d) 1

MCQ 6.58
Which of the following statements is incorrect:
⋃ ⋂ ⋃ ⋂
⋂ ⋃ ⋂⋃

MCQ 6.59
If P(A/B) = P(A) and P(B/A)=P(B), then A and B are:
(a) Mutually exclusive (b) Dependent (c) Equally likely (d) Independent

MCQ 6.60
A fair coin is tossed 100 times, the expected number of heads is:
(a) 100 (b) 50 (c) 30 (d) 60

MCQ 6.61
When two dice are rolled, the maximum total on the two faces of the dice will be:
(a) 6 (b) 36 (c) 12 (d) 2
MCQ 6.62
A random sample of 200 random digits is selected from a random number table. Expected number of
zeros in the sample is:
(a) Zero (b) 10 (c) 20 (d) 5

MCQ 6.63
Six digits are selected at random again and again from a random number table and the even digits are
counted each time. In most of the cases, the number of even digits will be:
(a) 2 (b) 3 (c) 4 (d) 6

MCQ 6.64
Two events A and B are called mutually exclusive if:
(a) A⋃B = Φ (b) A⋂B = Φ (c) A⋂B = S (d) A⋂B = 1

MCQ 6.65
If A and B are two mutually exclusive events, then:
(a) P(A⋂B) = 0 (b) P(A⋂B) = 1 (c) P(A⋃B) = 0 (d) P(A⋂B) = S

MCQ 6.66
When A and B are two non-empty and mutually exclusive events, then:
(a) P(A⋃B) = P(A).P(B) (b) P(A⋃B) = P(A) + P(B)
(c) P(A⋂B) = P(A).P(B) (d) P(A⋂B) = P(A)+P(B)

MCQ 6.67
The two events A and B are called not mutually exclusive events if:
(a) A⋂B = Φ (b) A⋂B ≠ Φ (c) A⋃B = Φ (d) A⋂B = zero

MCQ 6.68
If A and B are disjoint events then the statement which is always true is:
(a) P(A/B) = 0 (b) P(A⋃B) = 0 (c) P(A⋂B) = 1 (d) P(A) = P(B)

MCQ 6.69
The events A, B and C are called exhaustive events if:
(a) A⋃B⋃C = S (b) A⋂B⋂C = S (c) A⋃B⋃C = Φ (d) A⋃B⋃C = Zero

MCQ 6.70
If A and B are not-mutually exclusive events, then:
(a) P(A⋃B) + P(A⋂B) = P(A) + P(B) (b) P(A⋃B) = P(A) + P(B)
(c) P(A⋃B) = P(A).P(B) (d) P(A⋂B) = P(A) + P(B)

MCQ 6.71
If Ā is the complement of the event A, then:
(a) A⋃Ā = S (b) A⋂Ā = S (c) A⋃Ā = Φ (d) P(A) = P(Ā)

MCQ 6.72
If A1, A2, A3, ..., Ak are k mutually exclusive events, then:
(a) P(A1⋃A2⋃A3⋃ ...⋃Ak ) = P(A1)+P(A2)+P(A3)+...+ P(Ak)
(b) P(A1⋃A2⋃A3⋃ ...⋃Ak ) > 1
(c) P(A1⋂A2⋂A3⋂ ...⋂Ak ) = 1
(d) P(A1⋂A2⋂A3⋂ ...⋂Ak ) = P(A1⋃A2⋃A3⋃ ...⋃Ak )

MCQ 6.73
If A is an empty set and B is a non-empty set then:
(a) A⋂B = S (b) A⋂B = B (c) A⋃B = B (d) P(A) = P(B)
MCQ 6.74
If A is an empty set and S is the sample space then:
(a) P(A⋃S) = P(S) (b) P(A⋃S) = P(Φ) (c) P(A⋂S) = 1 (d) P(A⋃S) = Zero

MCQ 6.75
If A and B are independent events, then:
(a) P(A⋃B) = P(A).P(B) (b) P(A⋂B) = P(A).P(B)
(c) P(A⋂B) = P(A)+P(B) (d) P(A) = P(B)

MCQ 6.76
If A and B are two independent events, then:
(a) P(A/B) = P(A) (b) P(A) = P(B) (c) P(A) < P(B) (d) P(A/B) = P(B/A)

MCQ 6.77
A and B are two independent events. Which one of these equations is false?
(a) P(A⋂ ) = P(A)P( ) (b) P( ⋂ ) = P( ⋂ )
(c) P( ⋂ ) = P( )P( ) (d) P(A⋃B) = P(A)P(B)

MCQ 6.78
The conditional probability of the event A when event B has occurred is denoted by:
(a) P(A + B) (b) P(A - B) (c) P(A/B) (d) P( )

MCQ 6.79
If A and B are any two events, then P(A/B) + P(Ā/B) is equal to:
(a) 0 (b) 0.25 (c) 0.5 (d) 1

MCQ 6.80
If A is an arbitrary event, then P(A/A) is equal to :
(a) Zero (b) One (c) Infinity (d) Less than one

MCQ 6.81
If A and B are any two events, then P(Ā/B) is equal to:
(a) P(A/B) (b) 1- P(A/B) (c) 1+ P(A/B) (d) P(Ā⋂B)

MCQ 6.82
If A and B are any two events, then P(A⋃ ):
(a) 1+P(A⋂B) (b) 1-P(A⋃B) (c) 1- P(A⋂B) (d) P(A)+P(B)

MCQ 6.83
If A and B are any two events, then P( ⋂ ):
(a) 1-P(A⋃B) (b) 1-P(A⋂B) (c) 1-P( ⋂B) (d) 1-P(A⋂ )

MCQ 6.84
Which of the following statements is correct?
⋂ ⋃ ⋂ ⋃ ⋂ ⋃⋂⋂⋂⋃
⋂ ⋂ ⋃ ⋃ ⋂ ⋂⋃ ⋂

MCQ 6.85
If A and B are two mutually exclusive and exhaustive events and P(A)=2P(B), then P(B) is equal to:
(a) 1/2 (b) 2/3 (c) 1/3 (d) 1/4

MCQ 6.86
Two coins are tossed. Probability of getting head on the first coin is:
(a) 2/4 (b) 1 (c) Zero (d) 4
MCQ 6.87
A die and a coin are tossed together. Probability of getting head on the coin is:
(a) 6/12 (b) 6 (c) 12 (d) Zero

MCQ 6.88
A fair die is rolled. Probability of getting even face given that face is less than 5 is given by:
(a) 1/2 (b) 5 (c) 2 (d) 6

MCQ 6.89
Two coins are tossed. The probability that both faces will be matching given by:
(a) 1/4 (b) 1/2 (c) 1 (d) Zero

MCQ 6.90
Two coins are tossed. Probability of getting two heads given that there is at least one head is given
by:
(a) 1/2 (b) 1/3 (c) 1/4 (d) 2/3
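
Conditional probability questions such as MCQ 6.89 and 6.90 can be worked by restricting the sample space to the outcomes where the condition holds (the reduced sample space of MCQ 6.16). The following is a minimal sketch assuming two fair coins; the helper prob and its parameters are names introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coins: all four outcomes are equally likely.
sample_space = list(product("HT", repeat=2))

def prob(event, given=lambda outcome: True):
    """P(event | given), computed by counting equally likely outcomes."""
    reduced = [o for o in sample_space if given(o)]
    favourable = [o for o in reduced if event(o)]
    return Fraction(len(favourable), len(reduced))

# MCQ 6.90: two heads, given that there is at least one head.
p_two_heads_given_a_head = prob(lambda o: o == ("H", "H"),
                                given=lambda o: "H" in o)
# MCQ 6.89: both faces match (no condition).
p_matching = prob(lambda o: o[0] == o[1])
```

Using Fraction keeps the results as exact ratios rather than rounded decimals.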

MCQ 6.91
A fair die is rolled. Probability of getting more than 4 or less than 3 is given by:
(a) 2/3 (b) 1/3 (c) 1/2 (d) 4/3

MCQ 6.92
A fair die is rolled. Probability of getting even face or face more than 4 is:
(a) 1/3 (b) 2/3 (c) 1/2 (d) 5/6
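
MCQ 6.92 involves two events that are not mutually exclusive, so the general addition rule of MCQ 6.70 applies: P(A⋃B) = P(A) + P(B) - P(A⋂B). A small sketch with Python sets, assuming a fair six-faced die; the names even and more_than_4 are illustrative.

```python
from fractions import Fraction

faces = {1, 2, 3, 4, 5, 6}
even = {f for f in faces if f % 2 == 0}   # event A: even face
more_than_4 = {f for f in faces if f > 4} # event B: face greater than 4

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(faces))

# Direct count of the union versus the general addition rule.
p_union = prob(even | more_than_4)
p_by_rule = prob(even) + prob(more_than_4) - prob(even & more_than_4)
```

The face 6 belongs to both events, which is why the intersection term must be subtracted once.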

MCQ 6.93
Two dice are rolled. Probability of getting similar faces is:
(a) 5/36 (b) 1/6 (c) 1/3 (d) 1/2

MCQ 6.94
Two dice are rolled. Probability of getting total less than 4 or total more than 10 is given by:
(a) 10/36 (b) 4/36 (c) 1/36 (d) 14/36

MCQ 6.95
Two dice are rolled. Probability of getting a total of 4 given that both faces are similar is:
(a) 5/36 (b) 1/36 (c) 4/36 (d) 1/6

MCQ 6.96
If A and B are two not-independent events, then the probability that both A and B will happen
together is:
(a) P(A⋂B) = P(A)P(B/A) (b) P(A⋂B) = P(A)P(B)
(c) P(A⋂B) = P(A)+P(B) (d) P(A⋂B) = P(A)

MCQ 6.97
If A and B are two dependent events, then:
(a) P(A) P(B/A) = P(B)P(A/B) (b) P(A/B) = P(B/A)
(c) P(A/B) = P(A) (d) P(A) = P(B)

MCQ 6.98
Which one is true?
MCQ 6.99

(a) 1/5 (b) 2/5 (c) 3/5 (d) 1

MCQ 6.100

(a) 7/10 (b) 1/10 (c) 3/10 (d) 1

MCQ 6.101
Given P(A) = 2/3, P(B) = 3/8 and P(A⋂B) = 1/4, then A and B are:
(a) Independent (b) Dependent (c) Mutually exclusive (d) Equally likely
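
A quick way to test for independence, as in MCQ 6.75 and 6.101, is to compare P(A)·P(B) with P(A⋂B). The sketch below reads the third probability in MCQ 6.101 as P(A⋂B), which is an assumption about the garbled notation in the source; the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(2, 3)
p_b = Fraction(3, 8)
p_a_and_b = Fraction(1, 4)   # assumed to be P(A⋂B)

# Events are independent exactly when the product rule holds.
product_of_marginals = p_a * p_b
independent = (product_of_marginals == p_a_and_b)
```

Exact fractions avoid the rounding issues that a floating-point comparison could introduce.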
MCQ BINOMIAL AND HYPERGEOMETRIC DISTRIBUTIONS

MCQ 8.1
A Bernoulli trial has:
(a) At least two outcomes (b) At most two outcomes
(c) Two outcomes (d) Fewer than two outcomes

MCQ 8.2
The two mutually exclusive outcomes in a Bernoulli trial are usually called:
(a) Success and failure (b) Variable and constant
(c) Mean and variance (d) With and without replacement

MCQ 8.3
Nature of the binomial random variable X is:
(a) Quantitative (b) Qualitative (c) Discrete (d) Continuous

MCQ 8.4
In a binomial probability distribution, the sum of probability of failure and probability of success is
always:
(a) Zero (b) Less than 0.5 (c) Greater than 0.5 (d) One

MCQ 8.5
In a binomial experiment, the successive trials are:
(a) Dependent (b) Independent (c) Mutually exclusive (d) Fixed

MCQ 8.6
The parameters of the binomial distribution are:
(a) n and p (b) p and q (c) np and nq (d) np and npq

MCQ 8.7
The range of binomial distribution is:
(a) 0 to n (b) 0 to ∞ (c) -1 to +1 (d) 0 to 1

MCQ 8.8
The mean and standard deviation of the binomial probability distribution are respectively:
(a) np and npq (b) np and √(npq) (c) np and nq (d) n and p

MCQ 8.9
In a binomial experiment with three trials, the variable can take:
(a) 2 values (b) 3 values (c) 4 values (d) 5 values

MCQ 8.10
The shape of the binomial probability distribution depends upon the values of its:
(a) Mean (b) Variance (c) Parameters (d) Quartiles

MCQ 8.11
In a binomial distribution the number of trials is:
(a) Very large (b) Very small (c) Fixed (d) Not fixed

MCQ 8.12
In a binomial probability distribution, relation between mean and variance is:
(a) Mean < Variance (b) Mean = Variance
(c) Mean > Variance (d) Difficult to tell
MCQ 8.13
In binomial distribution when n = 1, then it becomes:
(a) Hypergeometric distribution (b) Normal distribution
(c) Uniform distribution (d) Bernoulli distribution

MCQ 8.14
The mean of a binomial distribution depends on:
(a) Number of trials (b) Probability of success
(c) Probability of failure (d) Number of trials and probability of success

MCQ 8.15
The variance of a binomial distribution depends on:
(a) Number of trials (b) Probability of success
(c) Probability of failure (d) All of the above

MCQ 8.16
Which of the following is not a property of a binomial experiment?
(a) Probability of success remains constant
(b) n is fixed
(c) Successive trials are dependent
(d) It has two parameters

MCQ 8.17
The binomial probability distribution is symmetrical when:
(a) p = q (b) p < q (c) p > q (d) np > npq

MCQ 8.18
The binomial distribution is negatively skewed if:
(a) p < 1/2 (b) p = 1/2 (c) p > 1/2 (d) p = 1

MCQ 8.19
In a binomial probability distribution, the skewness is positive for:
(a) p < 1/2 (b) p = 1/4 (c) np = npq (d) np = nq

MCQ 8.20
Which of the following statements is false?
(a) Expected value of a constant
(b) In a binomial distribution the standard deviation is always less than its variance
(c) In a binomial distribution the mean is always greater than its variance
(d) In binomial experiment the probability of success remains constant from trial to trial

MCQ 8.21
If a binomial probability distribution has parameters (n, p)= (5, 0.6), the probability of x = 3.5 is:
(a) 0 (b) 1 (c) 0.6 (d) 0.4

MCQ 8.22
In a binomial experiment n= 4, P(x=2) = 216/625 and P(x=3) = 216/625. P(x=-2) is:
(a) 216/625 (b) 1 (c) 0.6 (d) Difficult to tell

MCQ 8.23
If n = 6 and p= 0.9 then the value of P(x=7) is:
(a) Zero (b) Less than zero (c) More than zero (d) One
MCQ 8.24
In a binomial probability distribution, coefficient of skewness = 0, it means that the
distribution is:
(a) Symmetrical (b) Skewed to the left (c) Skewed to the right (d) Highly skewed

MCQ 8.25
For a binomial distribution with n = 10, p = 0.5, the probability of zero or more successes is:
(a) 1 (b) 0.5 (c) 0.25 (d) 0.75

MCQ 8.26
In a binomial distribution, the mean, median and mode coincide when:
(a) p < 1/2 (b) p > ½ (c) p ≠ 1/2 (d) p = 1/2

MCQ 8.27
In which distribution does the probability of success remain constant from trial to trial?
(a) Hypergeometric distribution (b) Binomial distribution
(c) Sampling distribution (d) Frequency distribution

MCQ 8.28
In a binomial experiment when n = 5, the maximum number of successes will be:
(a) 0 (b) 2.5 (c) 4 (d) 5

MCQ 8.29
In a binomial experiment when n = 10, the minimum number of successes will be:
(a) 0 (b) 5 (c) 10 (d) 11

MCQ 8.30
If n = 10 and p = 0.6, then P(x ≥ 0) is:
(a) 0.5 (b) 0.6 (c) 1.0 (d) 1.2

MCQ 8.31
A random variable X has a binomial distribution with n = 4, the standard deviation of X is:
(a) 4 pq (b) 2 (c) 4 p (d) 4 (q+p)

MCQ 8.32
In a multiple choice test there are five possible answers to each of 20 questions. If a candidate
guesses the correct answer each time, the mean number of correct answers is:
(a) 4 (b) 5 (c) 1/5 (d) 20

MCQ 8.33
If three coins are tossed, the probability of two heads is:
(a) 1/8 (b) 3/8 (c) 2/3 (d) 0

MCQ 8.34
Random variable X has a binomial distribution with n = 8 and p = ½. The most probable value of X is:
(a) 2 (b) 3 (c) 4 (d) 5

MCQ 8.35
The value of second moment about the mean in a binomial distribution is 36. The value of the
standard deviation of a binomial distribution is:
(a) 36 (b) 6 (c) 1/36 (d) 1/6
MCQ 8.36
For a binomial probability distribution, the expected frequency of x successes in N experiments is:

MCQ 8.37
In a binomial frequency distribution 100 (1/5 + 4/5)^5, the parameters n and p are respectively:
(a) (5, 1/5) (b) (1/5, 4/5) (c) (100, 4/5) (d) (5, 4/5)

MCQ 8.38
For a binomial frequency distribution 100 (1/5 + 4/5)^5, the mean is:
(a) 1/5 (b) 4/5 (c) 5 (d) 4

MCQ 8.39
For a binomial distribution (1/3 + 2/3)^18, the standard deviation of the binomial distribution will
be:
(a) 2 (b) 4 (c) 6 (d) 12

MCQ 8.40
The hypergeometric distribution has:
(a) One parameter (b) Two parameters (c) Three parameters (d) Four parameters

MCQ 8.41
The parameters of the hypergeometric distribution are:
(a) N, n, p (b) N, n, np (c) N, n, k (d) n and p

MCQ 8.42
Nature of the Hypergeometric random variable is:
(a) Continuous (b) Discrete (c) Qualitative (d) Quantitative

MCQ 8.43
In hypergeometric distribution, the successive trials are:
(a) Independent (b) Dependent (c) Very large (d) Very small

MCQ 8.44
In a hypergeometric distribution, the probability of success:
(a) Remains constant from trial to trial
(b) Does not remain constant from trial to trial
(c) Equal to probability of failure
(d) Less than probability of failure

MCQ 8.45
If in a hypergeometric distribution N = 10, k = 5 and n = 4; then the probability of failure is:
(a) 2 (b) 0.5 (c) 1 (d) 0.25

MCQ 8.46
The range of hypergeometric distribution is:
(a) 0 to n (b) 0 to k (c) 0 to N (d) 0 to n or k (whichever is less)

MCQ 8.47
The number of trials in hypergeometric distribution is:
(a) Not fixed (b) Fixed (c) Large (d) Small
MCQ 8.48
The probability of a success changes from trial to trial in:
(a) Binomial distribution (b) Hypergeometric distribution
(c) Normal distribution (d) Frequency distribution

MCQ 8.49
The mean of the hypergeometric distribution is:

MCQ 8.50
The standard deviation of the hypergeometric distribution is:

MCQ 8.51
In hypergeometric probability distribution, the relation between mean and variance is:
(a) Mean > variance (b) Mean < Variance (c) Mean = Variance (d) Mean = 2Variance

MCQ 8.52
Which of the following is the property of hypergeometric experiment?
(a) p remains constant from trial to trial
(b) Successive trials are independent
(c) Sampling is performed without replacement
(d) n is not fixed

MCQ 8.53
Hypergeometric distribution reduces to binomial distribution when:
(a) N = n (b) n → ∞ (c) N → ∞ (d) N < n

MCQ 8.54
In a hypergeometric distribution N=6, n=4 and k=3, then the mean is equal to:
(a) 2 (b) 4 (c) 6 (d) 24

MCQ 8.55
Given N = 11, n = 5, k = 7; P(x ≥ 1) equals:
(a) 1 (b) 1/66 (c) 65/66 (d) None of the above

MCQ 8.56
Given N =12, n =5, k= 4; P(x ≤ 4) equals:
(a) Less than one (b) Exactly one (c) More than one (d) Between 0.5 and 1

1.(c) 2.(a) 3.(c) 4.(d) 5.(b) 6.(a) 7.(a) 8.(b) 9.(c) 10.(c) 11.(c) 12.(c) 13.(d) 14.(d) 15.(d)
16.(c) 17.(a) 18.(c) 19.(a) 20.(b) 21.(a) 22.(c) 23.(a) 24.(a) 25.(a) 26.(d) 27.(b) 28.(d) 29.(a) 30.(c)
31.(b) 32.(a) 33.(b) 34.(c) 35.(b) 36.(c) 37.(d) 38.(d) 39.(a) 40.(c) 41.(c) 42.(b) 43.(b) 44.(b) 45.(b)
46.(d) 47.(b) 48.(b) 49.(a) 50.(b) 51.(a) 52.(c) 53.(c) 54.(a) 55.(a) 56.(b)
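
Several of the numerical answers above can be checked directly from the probability formulas. The following is a minimal sketch, assuming the standard binomial and hypergeometric probability functions; the helper names binomial_pmf and hypergeom_pmf are illustrative and do not come from the question bank.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial(n, p) variable; zero for values outside 0..n or non-integers."""
    if x != int(x) or x < 0 or x > n:
        return 0.0
    x = int(x)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def hypergeom_pmf(x, N, n, k):
    """P(X = x) when n items are drawn without replacement from N items, k of them successes."""
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# MCQ 8.33: three coins, exactly two heads
p_two_heads = binomial_pmf(2, 3, 0.5)

# MCQ 8.21 and 8.23: values the variable cannot take have probability zero
p_fractional = binomial_pmf(3.5, 5, 0.6)
p_too_many = binomial_pmf(7, 6, 0.9)

# MCQ 8.39: standard deviation sqrt(npq) with n = 18, p = 2/3, q = 1/3
sd_binomial = (18 * (2 / 3) * (1 / 3)) ** 0.5

# MCQ 8.54: hypergeometric mean n*k/N with N = 6, n = 4, k = 3
mean_hypergeometric = 4 * 3 / 6

# MCQ 8.55: N = 11, n = 5, k = 7; only 4 failures exist, so x = 0 is impossible
p_at_least_one = 1 - hypergeom_pmf(0, 11, 5, 7)

print(p_two_heads, p_fractional, p_too_many)
print(sd_binomial, mean_hypergeometric, p_at_least_one)
```

The guard clauses return zero for impossible values, which is the point of questions such as 8.21, 8.22 and 8.23.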
MCQ NORMAL DISTRIBUTION
MCQ 10.1
The range of normal distribution is:
(a) 0 to n (b) 0 to ∞ (c) -1 to +1 (d) -∞ to +∞

MCQ 10.2
In normal distribution:
(a) Mean = Median = Mode (b) Mean < Median < Mode
(c) Mean> Median > Mode (d) Mean ≠ Median ≠ Mode

MCQ 10.3
Which of the following is true for the normal curve:
(a) Symmetrical (b) Unimodal (c) Bell-shaped (d) All of the above

MCQ 10.4
In a normal curve, the ordinate is highest at:
(a) Mean (b) Variance (c) Standard deviation (d) Q1

MCQ 10.5
The parameters of the normal distribution are:
(a) µ and σ2 (b) µ and σ (c) np and nq (d) n and p

MCQ 10.6
The shape of the normal curve depends upon the value of:
(a) Standard deviation (b) Q1 (c) Mean deviation (d) Quartile deviation

MCQ 10.7
The normal distribution is a proper probability distribution of a continuous random variable, the total area
under the curve f(x) is:
(a) Equal to one (b) Less than one (c) More than one (d) Between -1 and +1

MCQ 10.8
In a normal probability distribution of a continuous random variable, the value of standard deviation is:
(a) Zero (b) Less than zero (c) Greater than zero (d) None of the above

MCQ 10.9
In a normal curve, the highest point on the curve occurs at the mean, µ, which is also the:
(a) Median and mode (b) Geometric mean and harmonic mean
(c) Lower and upper quartiles (d) Variance and standard deviation

MCQ 10.10
The normal curve is symmetrical and for symmetrical distribution, the values of all odd order moments
about mean will always be:
(a) 1 (b) 0.5 (c) 0.25 (d) 0

MCQ 10.11
If , the points of inflection of normal distribution are:
(a) (b) (c) (d)

MCQ 10.12
In normal probability distribution for a continuous random variable, the value of a mean deviation is
approximately equal to:
(a) 2/3 (b) 2/3 σ (c) 4/5 (d) 4/5 σ
MCQ 10.13
In a normal distribution whose mean is µ and standard deviation σ, the value of quartile deviation is
approximately:
(a) 4/5 (b) 4/5 σ (c) 2/3 σ (d) 2/3

MCQ 10.14
In a normal distribution, the lower and upper quartiles are equidistant from the mean and are at a distance of:
(a) 0.7979 (b) 0.7979 σ (c) 0.6745 (d) 0.6745 σ

MCQ 10.15
The value of e is approximately equal to:
(a) 2.7183 (b) 2.1783 (c) 2.8173 (d) 2.1416

MCQ 10.16
The value of π is approximately equal to:
(a) 3.4116 (b) 3.1416 (c) 3.1614 (d) 3.6416

MCQ 10.17
If , the standard normal variate is distributed as:
(a) (b) (c) (d)

MCQ 10.18
The coefficient of skewness of a normal distribution is:
(a) Positive (b) Negative (c) Zero (d) Three

MCQ 10.19
The total area of the normal probability density function is equal to:
(a) 0 (b) 0.5 (c) 1 (d) 0.25

MCQ 10.20
In a standard normal distribution, the value of mode is:
(a) Equal to zero (b) Less than zero (c) Greater than zero (d) Exactly one

MCQ 10.21
The normal probability density function curve is symmetrical about the mean, µ, i.e. the area to the right of
the mean is the same as the area to the left of the mean. This means that P(X<µ) =P(X>µ) is equal to:
(a) 0 (b) 1 (c) 0.5 (d) 0.25

MCQ 10.22
The skewness and kurtosis of the normal distribution are respectively:
(a) Zero and zero (b) Zero and one (c) One and zero (d) One and one

MCQ 10.23
In a normal curve µ ± 0.6745σ covers:
(a) 50% area (b) 68.27% area (c) 95.45% area (d) 99.73% area

MCQ 10.24
The lower and upper quartiles for a standardized normal variate are respectively:
(a) -0.6745σ and 0.6745σ (b) -0.6745 σ and 0.6745
(c) 0.7979σ and 0.7979σ (d) -0.7979 and 0.7979

MCQ 10.25
The maximum ordinate of a normal curve is at:
(a) X = µ (b) X = µ + σ (c) X = µ - 2σ (d) X = σ2
MCQ 10.26
The value of the standard deviation σ of a normal distribution is always:
(a) Equal to zero (b) Greater than zero (c) Less than zero (d) Equal to 0.5

MCQ 10.27
If X~N(100, 64), then standard deviation σ is:
(a) 100 (b) 64 (c) 8 (d) 100 - 64 = 36

MCQ 10.28
If , the coefficient of variation is equal to:
(a) Zero (b) One (c) Infinity (d) Hundred percent

MCQ 10.29
The points of inflection of the standard normal distribution lie at:
(a) -1 and 0 (b) 0 and 1 (c) -1 and +1 (d) µ and σ

MCQ 10.30
If , then µ4 is equal to:
(a) 0 (b) 1 (c) 3 (d) σ4

MCQ 10.31
The value of second moment about the mean in a normal distribution is 5. The fourth moment about
the mean in the distribution is:
(a) 5 (b) 15 (c) 25 (d) 75

MCQ 10.32
If X is a normal random variable having mean µ, then E|X - µ| is equal to:
(a) Variance (b) Standard deviation (c) Quartile deviation (d) Mean deviation

MCQ 10.33
If X is a normal random variable having mean µ, then E(X - µ)2 is equal to:
(a) σ2 (b) σ (c) 3σ4 (d) β1

MCQ 10.34
Which of the following is possible in normal distribution?
(a) σ < 0 (b) σ = 0 (c) σ > 0 (d) σ > n

MCQ 10.35
The range of standard normal distribution is:
(a) 0 to n (b) 0 to ∞ (c) 0 to k (d) -∞ to +∞

MCQ 10.36
In the normal distribution, the value of the maximum ordinate is equal to:

MCQ 10.37
The value of the ordinate at points of inflection of the normal curve is equal to:

MCQ 10.38
If , then β2 is equal to:
(a) 0 (b) 3 (c) 3σ4 (d) σ2
MCQ 10.39
Pearson’s constants for a normal distribution with mean µ and variance σ2 are:
(a) β1=0, β2=0, γ1=0, γ2=0 (b) β1=0, β2=1, γ1=1, γ2=3
(c) β1=0, β2=3, γ1=0, γ2=0 (d) β1=3, β2=0, γ1=0, γ2=0

MCQ 10.40
The value of maximum ordinate in standard normal distribution is equal to:

MCQ 10.41
A random variable X is normally distributed with µ = 70 and σ2 = 25. The third moment about arithmetic
mean is:
(a) Zero (b) Less than zero (c) Greater than zero (d) None of the above

MCQ 10.42
For the standard normal distribution, P(Z > mean) is:
(a) More than 0.5 (b) Less than 0.5 (c) Equal to 0.5 (d) Difficult to tell

MCQ 10.43
Given a standardized normal distribution (with a mean of zero and a standard deviation of one),
P(Z < variance) is equal to:
(a) 0.8413 (b) 0.3413 (c) 0.1587 (d) 0.5000

MCQ 10.44
The area to the left of (µ+σ) for a normal distribution is approximately equal to:
(a) 0.16 (b) 0.34 (c) 0.50 (d) 0.84

MCQ 10.45
The median of a normal distribution corresponds to a value of Z is:
(a) 0 (b) 1 (c) 0.5 (d) -0.5

MCQ 10.46
The mean and standard deviation of the standard normal distribution are respectively:
(a) 0 and 1 (b) 1 and 0 (c) µ and σ2 (d) π and e

MCQ 10.47
In a standard normal distribution, the area to the left of Z = 1 is:
(a) 0.6413 (b) 0.7413 (c) 0.8413 (d) 0.3413

MCQ 10.48
The semi-inter quartile range for a standard normal random variable Z is:
(a) 0.6745 (b) 0.6745 σ (c) 0.7979 (d) 0.7979 σ

MCQ 10.49
If , then µ4 is equal to:
(a) 3 (b) 3 σ (c) 3 σ2 (d) 3 σ4

MCQ 10.50
If , then β2 is equal to:
(a) 0 (b) 3 (c) 3 σ4 (d) σ4/3

MCQ 10.51
P(µ-σ < X < µ+σ) is equal to:
(a) 0.5000 (b) 0.6827 (c) 0.9545 (d) 0.9973
MCQ 10.52
In a normal curve µ ± 2σ covers:
(a) 50% area (b) 68.27% area (c) 95.45% area (d) 99.73% area

MCQ 10.53
If X is N(µ, σ²), the percentage of the area contained within the limits µ ± 3σ is:
(a) 50% (b) 68.27% (c) 95.45% (d) 99.73%

MCQ 10.54
Most of the area under the normal curve with parameters µ and σ lies between:
(a) µ - 0.5σ and µ + 0.5σ (b) µ - σ and µ + σ
(c) µ - 2σ and µ + 2σ (d) µ - 3σ and µ + 3σ

MCQ 10.55
The probability density function of the standard normal distribution is:

MCQ 10.56
The equation of the normal frequency distribution is:

MCQ 10.57
If X is N(µ,σ2) and if Y =a + bX, then mean and variance of Y are respectively:
(a) µ and σ2 (b) a + µ and bσ2 (c) a + bµ and σ2 (d) a + bµ and b2σ2

MCQ 10.58
For a normal distribution with mean µ and standard deviation σ:
(a) Approximately 5% of values are outside the range (µ - 2σ) to (µ + 2σ)
(b) Approximately 5% of values are greater than (µ + 2σ)
(c) Approximately 5% of values are outside the range (µ - σ) to (µ + σ)
(d) Approximately 5% of values are less than (µ - 3σ)

MCQ 10.59
The normal probability distribution with mean np and variance npq may be used to approximate the
binomial distribution if n ≥ 50 and both np and nq are:
(a) Greater than 5 (b) Less than 5 (c) Equal to 5 (d) Difficult to tell

MCQ 10.60
In a normal distribution Q1 = 20 and Q3 = 40, then mean is equal to:
(a) 20 (b) 30 (c) 40 (d) 60

MCQ 10.61
If Z is a standard normal variate, then P(-1.645 ≤ Z ≤ +1.645) is equal to:
(a) 0.90 (b) 0.95 (c) 0.98 (d) 0.99

MCQ 10.62
If Z is a standard normal variate, then P(-2.33 ≤ Z ≤ +2.33) is equal to:
(a) 0.4901 (b) 0.6827 (c) 0.9545 (d) 0.9802

MCQ 10.63
If Z is a standard normal variate, then P(- 2.575 ≤ Z ≤ +2.575) is equal to:
(a) 0.9951 (b) 0.99 (c) 0.4951 (d) 0.4949
MCQ 10.64
If Z is a standard normal variate, then P[|Z| < 1.96] is equal to:
(a) 0.0250 (b) 0.4750 (c) 0.95 (d) 0.9750

MCQ 10.65
For a normal distribution with µ = 10, σ = 2, the probability of a value greater than 10 is:
(a) 0.1915 (b) 0.3085 (c) 0.6915 (d) 0.5000

MCQ 10.66
Given a random variable X which is normally distributed with a mean and variance both equal to 100.
The value of mean deviation is approximately equal to:
(a) 7 (b) 8 (c) 8.5 (d) 9

MCQ 10.67
If X is a normal variate with mean 50 and standard deviation 3. The value of quartile
deviation is approximately equal to:
(a) 1 (b) 1.5 (c) 2 (d) 2.5

MCQ 10.68
In a normal distribution mean is 100 and standard deviation is 10. The values of points of inflection
are:
(a) 100 and 110 (b) 80 and 120 (c) 90 and 110 (d) None of the above

MCQ 10.69
If X is a normal variate with mean 20 and variance 16. The respective values of β1 and β2 are:
(a) 0 and 3 (b) 3 and 1 (c) 0.5 and 1 (d) 3 and 3

MCQ 10.70
If X is N(100; 5), the fourth central moment is:
(a) 65 (b) 75 (c) 85 (d) 100

MCQ 10.71
A normal distribution has the mean µ=200. If 70 percent of the area under the curve lies to the left
of 220, the area to the right of 220 is:
(a) 0.3 (b) 0.5 (c) 0.2 (d) 0.7

MCQ 10.72
Given a normal distribution with µ = 100 and σ2 = 100, the area to the left of 100 is:
(a) One (b) Equal to 0.5 (c) Less than 0.5 (d) Greater than 0.5

MCQ 10.73
If a normal distribution with µ = 200 have P(X > 225) = 0.1587, then P(X < 175) equal to:
(a) 0.3413 (b) 0.8413 (c) 0.1587 (d) 0.5000

MCQ 10.74
A random variable has a normal distribution with the mean µ = 400. If 80 percent of the area under
the curve lies to the left of 500, the area between 400 and 500 is:
(a) 0.5 (b) 0.2 (c) 0.3 (d) Zero
MCQ 10.75
If Y = 5X+ 10 and X is N(10, 25), then mean of Y is:
(a) 50 (b) 60 (c) 70 (d) 135
MCQ 10.76
If X is a normal random variable with mean µ = 50 and standard deviation σ = 7, if Y = X – 7 then standard
deviation of Y is:
(a) 7 (b) 14 (c) 0 (d) 49
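
The areas and constants quoted in the normal distribution questions can be reproduced with the Python standard library. This is a small sketch, assuming Python 3.8 or later (for statistics.NormalDist); the helper name central_area is illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def central_area(z):
    """Area under the standard normal curve between -z and +z."""
    return Z.cdf(z) - Z.cdf(-z)

area_1_sigma = central_area(1)      # area within mu ± sigma
area_2_sigma = central_area(2)      # area within mu ± 2 sigma
area_3_sigma = central_area(3)      # area within mu ± 3 sigma
area_196 = central_area(1.96)       # P(|Z| < 1.96)
left_of_one = Z.cdf(1)              # area to the left of Z = 1
upper_quartile_z = Z.inv_cdf(0.75)  # upper quartile of Z

# Linear transformation Y = 5X + 10 with X ~ N(10, 25), i.e. sigma = 5
X = NormalDist(mu=10, sigma=5)
Y = X * 5 + 10  # NormalDist supports scaling and shifting by constants

print(round(area_1_sigma, 4), round(area_2_sigma, 4), round(area_3_sigma, 4))
print(round(area_196, 4), round(left_of_one, 4), round(upper_quartile_z, 4))
print(Y.mean, Y.stdev)
```

The transformed variable has mean a + bµ and standard deviation |b|σ, which matches the rule tested in MCQ 10.57 and 10.75.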
Introduction
Statistics: collection of data, presentation of data, analysis of data and
interpretation of data for research projects.
Types of Statistics
Descriptive statistics
If a business analyst is using data gathered on a
group to describe or reach conclusions about that
same group, the statistics are called descriptive
statistics.
Inferential statistics
If a researcher gathers data from a sample and uses the
statistics generated to reach conclusions about the
population from which the sample was taken, the
statistics are inferential statistics.
Population:
The collection of all individuals, items or data under consideration
in a statistical study.
Sample:
That part of the population from which information is collected.
Parameter: A numerical measure calculated from the population.
Statistic: A numerical measure calculated from a sample.
Variable:
A characteristic that varies with an individual or an object, is called
a variable.
For example, age is a variable as it varies from person to
person. A variable can assume a number of values. The given
set of all possible values from which the variable takes on a
value is called its Domain. If for a given problem, the domain
of a variable contains only one value, then the variable is
referred to as a constant.
Qualitative variable.
If the characteristic is non-numerical such as education, sex,
eye-color, quality, intelligence, poverty, satisfaction, etc. the
variable is referred to as a qualitative variable. A qualitative
characteristic is also called an attribute
Quantitative variable
A variable is called a quantitative variable when a
characteristic can be expressed numerically such as age,
weight, income or number of children.

A quantitative variable may be classified as discrete or continuous.

Discrete variable
A discrete variable is one that can take only a discrete set of
integers or whole numbers, that is, the values are taken by
jumps or breaks. A discrete variable represents count data
such as the number of persons in a family, the number of
rooms in a house, the number of deaths in an accident, the
income of an individual, etc.
Continuous variable
A variable is called a continuous variable if it can take on any value, fractional or
integral, within a given interval, i.e. its domain is an interval with all
possible values without gaps. A continuous variable represents
measurement data such as the age of a person, the height of a
plant, the weight of a commodity, the temperature at a place,
etc.
A variable, whether countable or measurable, is
generally denoted by some symbol such as X or Y, and Xi or
Xj represents the ith or jth value of the variable. The
subscript i or j is replaced by a number such as 1, 2, 3 …
when referring to a particular value.
Data
Data can be defined as a systematic record of a particular quantity. It is the different values of that quantity
represented together in a set. It is a collection of facts and figures to be used for a specific purpose such as a
survey or analysis. When arranged in an organized form, data can be called information.
Qualitative Data:
They represent some characteristics or attributes. They depict descriptions that may be observed but cannot
be computed or calculated. For example, data on attributes such as intelligence, honesty, wisdom,
cleanliness, and creativity collected using the students of your class as a sample would be classified as
qualitative. They are more exploratory than conclusive in nature.
Quantitative Data:
These can be measured and not simply observed. They can be numerically represented and calculations can
be performed on them. For example, data on the number of students playing different sports from your class
gives an estimate of how many of the total students play which sport. This information is numerical and can
be classified as quantitative.
Data Collection
Primary Data
These are the data that are collected for the first time by an investigator for a specific purpose. Primary data
are ‘pure’ in the sense that no statistical operations have been performed on them and they are original. An
example of primary data is the Census of Pakistan.
Sources of Primary Data
i) Direct Personal Investigation.
ii) Indirect Investigation.
iii) Collection through Questionnaires.
iv) Collection through Enumerators.
v) Collection through Local Sources.

Secondary Data
They are the data that are sourced from some place that originally collected them. This means that this kind
of data has already been collected by some researchers or investigators in the past and is available either in
published or unpublished form. This information is impure as statistical operations may have been
performed on it already. An example is the information available on the websites of the Government of Pakistan
and the Department of Finance, or in other repositories, books, journals, etc.

Class Limit
Corresponding to a class interval, the class limits may be defined as the minimum value and the maximum
value the class interval may contain.
The minimum value is known as the lower class limit (LCL) and the maximum value is known as the upper
class limit (UCL).
Class Boundary
Class boundaries may be defined as the actual class limit of a class interval.
For overlapping classification or mutually exclusive classification that excludes the upper class limits like
10–20, 20–30, 30–40 … etc. the class boundaries coincide with the class limits.
This is usually done for a continuous variable. However, for non-overlapping or mutually inclusive
classification that includes both the class limits like 0–9, 10–19, 20–29 … which is usually applicable for a
discrete variable, we have
𝐿𝐶𝐵 = 𝐿𝐶𝐿 − 𝐷/2
𝑈𝐶𝐵 = 𝑈𝐶𝐿 + 𝐷/2
Where D is the difference between the LCL of the next class interval and the UCL of the given class
interval.

For the data presented in the above table, LCB of the first class interval and the corresponding UCB
Apart from the class limit and class boundary, let us look at the midpoint of a class interval.

Mid-Point or Mid Value or Class Mark

Corresponding to a class interval, this may be defined as the total of the two class limits or class boundaries
divided by 2.
In other words, in a class interval, the mid-point or mid value may be defined as the arithmetic mean or average of
the two class limits or the two class boundaries.
Thus, we have
$\text{Mid-point} = \dfrac{LCL + UCL}{2}$
$\text{Mid-point} = \dfrac{LCB + UCB}{2}$
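
The boundary and mid-point formulas can be sketched in a few lines. The example below assumes the inclusive classes 0–9, 10–19, 20–29 mentioned in the text; the function names class_boundaries and midpoint are illustrative.

```python
def class_boundaries(class_limits):
    """Convert inclusive class limits [(LCL, UCL), ...] into class boundaries.

    D is the gap between the UCL of one class and the LCL of the next class.
    """
    d = class_limits[1][0] - class_limits[0][1]
    return [(lcl - d / 2, ucl + d / 2) for lcl, ucl in class_limits]

def midpoint(lower, upper):
    """Average of the two class limits or of the two class boundaries."""
    return (lower + upper) / 2

limits = [(0, 9), (10, 19), (20, 29)]
boundaries = class_boundaries(limits)
midpoints_from_limits = [midpoint(lo, hi) for lo, hi in limits]
midpoints_from_boundaries = [midpoint(lo, hi) for lo, hi in boundaries]

print(boundaries)
print(midpoints_from_limits, midpoints_from_boundaries)
```

Both routes give the same mid-points, because the boundaries widen each class by the same amount on either side.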

Example 1
Tally marks are often used to make a frequency distribution table. For example, let’s say you survey a
number of households and find out how many pets they own. The results are 3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0,
1, 3, 1, 2, 1, 1, and 3. Looking at that string of numbers boggles the eye; a frequency distribution table will
make the data easier to understand.
To make the frequency distribution table, first write the categories in one column (number of pets):

Next, tally the numbers in each category (from the results above). For example, the number zero appears
four times in the list, so put four tally marks “||||”:

Finally, count up the tally marks and write the frequency in the final column. The frequency is just the
total. You have four tally marks for “0”, so put 4 in the last column:
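
The tallying steps of Example 1 can be sketched with the pet counts listed above. This is only an illustration using the standard library Counter; the variable names are assumptions.

```python
from collections import Counter

# Number of pets reported by each of the 20 surveyed households
pets = [3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3]

frequency = Counter(pets)

# One row per category: value, tally marks, frequency
for value in sorted(frequency):
    print(value, "|" * frequency[value], frequency[value])
```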
Example 2
The list of IQ scores is: 118, 123, 124, 125, 127, 128, 129, 130, 130, 133, 136, 138, 141, 142, 149, 150,
and 154. The class interval is 8.
Tally the numbers in each category from the above. For example four numbers exist between 118-125, so
put four tally marks “||||”:

IQ TALLY NUMBER
118-125 ||||
126-133 |||| |
134-141 |||
142-149 ||
150-157 ||

Finally, count up the tally marks and write the frequency in the final column.
IQ TALLY NUMBER
118-125 |||| 4
126-133 |||| | 6
134-141 ||| 3
142-149 || 2
150-157 || 2
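
The grouping in Example 2 can be reproduced as follows. This is a minimal sketch, assuming the classes start at the smallest score and each class covers 8 consecutive whole numbers, as in the table above.

```python
iq = [118, 123, 124, 125, 127, 128, 129, 130, 130,
      133, 136, 138, 141, 142, 149, 150, 154]
width = 8  # class interval

counts = {}
for lower in range(min(iq), max(iq) + 1, width):
    upper = lower + width - 1  # inclusive upper class limit
    counts[(lower, upper)] = sum(lower <= score <= upper for score in iq)

for (lower, upper), f in counts.items():
    print(f"{lower}-{upper}", "|" * f, f)
```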
Presentation of Data
Relative frequency
If the frequency of a class is divided by the sum of frequencies, we get
what is called a relative frequency. If we calculate the relative
frequencies for all the classes, we get the relative frequency
distribution. The total of the relative frequencies is equal to 1.
RELATIVE FREQUENCY TABLE
Weights (kilograms)   Frequency   Relative frequency
55-57                 4           4/40 = 0.10
58-60                 6           6/40 = 0.15
61-63                 14          14/40 = 0.35
64-66                 12          12/40 = 0.30
67-69                 4           4/40 = 0.10
Total                 40          1.00
The relative frequencies are also called proportions and in
discussion on probability we shall call them probabilities of the
classes. The idea of relative frequencies is helpful in understanding
the basic lessons on probability. It is also used in the normal
distribution and other probability distributions where the total area
under the curve is unity.
Percentage relative frequency distribution
If a relative frequency is multiplied by 100, we get
percentage relative frequency. If all the relative frequencies
are converted into percentage relative frequencies, we get
percentage relative frequency distribution or simply
percentage frequency distribution.
PERCENTAGE RELATIVE FREQUENCY TABLE
Weights (kilograms)   Frequency   Percentage relative frequency
55-57                 4           4/40 x 100 = 10
58-60                 6           6/40 x 100 = 15
61-63                 14          14/40 x 100 = 35
64-66                 12          12/40 x 100 = 30
67-69                 4           4/40 x 100 = 10
Total                 40          100
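
As a quick check of the two tables, the relative and percentage relative frequencies can be computed from the class frequencies. The sketch below uses the weights data given above; the variable names are illustrative.

```python
classes = ["55-57", "58-60", "61-63", "64-66", "67-69"]
freq = [4, 6, 14, 12, 4]

total = sum(freq)
relative = [f / total for f in freq]      # proportions that sum to 1
percentage = [100 * r for r in relative]  # percentages that sum to 100

for c, f, r, p in zip(classes, freq, relative, percentage):
    print(f"{c:8} {f:3d} {r:6.2f} {p:6.1f}")
```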
Cumulative frequency distribution
For cumulative frequency distribution, the class limits are
converted into class boundaries. Cumulative frequency of a class
is the total of all frequencies up to that class.
'Less than' cumulative frequencies
The cumulative frequency of the class 57.5-60.5 is 4+6=10 and the
cumulative frequency of the class 60.5-63.5 is 4+6+14=24. This
means that there are 10 observations less than 60.5 and there are 24
observations less than 63.5. These are called 'less than' cumulative
frequencies.
CUMULATIVE FREQUENCY TABLE (Less Than)
Weights (kilograms)   Frequency   Class Boundaries   Weights Less Than   'Less Than' Cumulative Frequency
55-57                 4           54.5-57.5          less than 57.5      4
58-60                 6           57.5-60.5          less than 60.5      4+6 = 10
61-63                 14          60.5-63.5          less than 63.5      10+14 = 24
64-66                 12          63.5-66.5          less than 66.5      24+12 = 36
67-69                 4           66.5-69.5          less than 69.5      36+4 = 40
'More than' cumulative frequencies
If we calculate the cumulative frequencies from the bottom, we get what are called 'more than' cumulative
frequencies. Thus there are 4 observations more than 66.5, there are 4+12=16 observations more than 63.5
and there are 4+12+14=30 observations more than 60.5.

CUMULATIVE FREQUENCY TABLE (More Than)

Weights (kilograms)   Frequency   Class Boundaries   Weights More Than   'More Than' Cumulative Frequency
55-57                 4           54.5-57.5          more than 54.5      36+4 = 40
58-60                 6           57.5-60.5          more than 57.5      30+6 = 36
61-63                 14          60.5-63.5          more than 60.5      16+14 = 30
64-66                 12          63.5-66.5          more than 63.5      4+12 = 16
67-69                 4           66.5-69.5          more than 66.5      4
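
Both kinds of cumulative frequency are running totals, taken from the top for 'less than' and from the bottom for 'more than'. A minimal sketch using the weights data above and itertools.accumulate from the standard library:

```python
from itertools import accumulate

freq = [4, 6, 14, 12, 4]
lower_boundaries = [54.5, 57.5, 60.5, 63.5, 66.5]
upper_boundaries = [57.5, 60.5, 63.5, 66.5, 69.5]

# Running total from the first class downwards
less_than = list(accumulate(freq))

# Running total from the last class upwards, restored to the original order
more_than = list(accumulate(reversed(freq)))[::-1]

for ub, cf in zip(upper_boundaries, less_than):
    print("less than", ub, cf)
for lb, cf in zip(lower_boundaries, more_than):
    print("more than", lb, cf)
```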

Histogram
Histogram is a graph of the frequency distribution in which classes with class boundaries are taken on X-
axis with a suitable breadth of class and adjacent bars are erected to show the frequencies. The height of the
bars is in proportion to the size of the frequency. For uniform intervals, we take a suitable breadth for
classes.

For unequal intervals we have to adjust the frequency. If the interval becomes double, then frequency is
divided by 2 so that the area of the bar is in proportion to the areas of other bars. Histogram is a very simple
and very important graph of the frequency distribution. This graph makes the base for other graphs. If we
take the frequencies on Y-axis, we get frequency histogram, the total area of which is equal to the total
frequency. If we take relative frequencies on the Y-axis, the total area of the histogram is unity, if we take
the percentage frequencies on Y-axis, we get percentage frequency histogram, and the total area of the
histogram will be 100.

Weights 57.5 60.5 63.5 66.5 69.5

Frequency 4 6 14 12 4

[Figure: Histogram. X-axis: Weights (57.5, 60.5, 63.5, 66.5, 69.5); Y-axis: Frequency (0 to 16).]

Frequency polygon
Frequency polygon is a graph of the frequency distribution in which the frequencies are plotted against the
midpoints of the classes. The plotted points are joined together to get the frequency polygon.
Midpoints (xi)   Frequency (f)
74.5             9
94.5             10
114.5            17
134.5            10
154.5            5
174.5            4
194.5            5
[Figure: Frequency Polygon. X-axis: midpoints (74.5 to 194.5); Y-axis: frequency (0 to 20).]

FREQUENCY CURVE
In frequency curve the points are not joined together by straight lines. The free-hand drawing method of
drawing curve is used and we get the frequency curve as shown in fig.2.10. We can draw the frequency
curve on the frequency polygon or we can draw the curve on the separate sheet of paper.
Midpoints (xi)   Frequency (f)
74.5             9
94.5             10
114.5            17
134.5            10
154.5            5
174.5            4
194.5            5
[Figure: Frequency Curve. X-axis: midpoints (74.5 to 194.5); Y-axis: frequency (0 to 20).]

CUMULATIVE FREQUENCY POLYGON OR OGIVE

In cumulative frequency polygon, the cumulative frequencies are plotted against the upper class boundaries.
This graph can be used to interpolate the values of median, quartiles and other partition values. The word
ogive polygon is also used for cumulative frequency polygon.
Weights (kilograms)   Frequency   Class Boundaries   Less Than   'Less Than' Cumulative Frequency
55-57                 4           54.5-57.5          57.5        4
58-60                 6           57.5-60.5          60.5        10
61-63                 14          60.5-63.5          63.5        24
64-66                 12          63.5-66.5          66.5        36
67-69                 4           66.5-69.5          69.5        40

[Figure: 'Less Than' Cumulative Frequency polygon. X-axis: upper class boundaries (57.5 to 69.5); Y-axis: cumulative frequency (0 to 50).]

Measure of Central Tendency
Arithmetic Mean
Arithmetic Mean or simply Mean: “A value obtained by
dividing the sum of all the observations by the number of
observations is called the arithmetic mean.”
For Ungrouped Data
$\text{Mean} = \dfrac{\text{Sum of all observations}}{\text{Number of observations}}$
$\bar{x} = \dfrac{\sum x_i}{n}$
For Grouped Data
$\bar{x} = \dfrac{\sum fx}{\sum f}$
Weight (grams)   Midpoints (xi)   Frequency (f)   fx
65----84         74.5             09              670.5
85----104        94.5             10              945.0
105----124       114.5            17              1946.5
125----144       134.5            10              1345.0
145----164       154.5            05              772.5
165----184       174.5            04              698.0
185----204       194.5            05              972.5
Total                             60              7350

Solution:
$\bar{x} = \dfrac{7350}{60} = 122.5$
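
The grouped-data mean is the sum of the fx column divided by the total frequency. A short sketch using the midpoints and frequencies from the weight table above:

```python
midpoints = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
freq = [9, 10, 17, 10, 5, 4, 5]

fx = [f * x for f, x in zip(freq, midpoints)]  # the fx column
sum_f = sum(freq)
sum_fx = sum(fx)
mean = sum_fx / sum_f

print(sum_f, sum_fx, mean)
```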

Mode
Mode is an appropriate average in the case of qualitative data. For example, when someone speaks of the
opinion of an average person, he is probably referring to the most
frequently expressed opinion, which is the modal opinion.
Mode in case of Ungrouped Data:
“A value that occurs most frequently in a data set is called the mode.”
xi: 2, 3, 8, 4, 6, 3, 2, 5, 3.
Mode = 3 (Answer).
Mode in case of Grouped Data:
“A value which has the largest frequency in a set of data is called the mode.”
$\text{Mode} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
l = lower class boundary of the modal class
h = class interval
fm = frequency of the modal class
f1 = frequency of the class preceding the modal class
f2 = frequency of the class following the modal class
Class boundaries   Midpoints (xi)   Frequency (fi)   Cumulative frequency (c.f)
29.5---39.5        34.5             8                8
39.5---49.5        44.5             87               95
49.5---59.5        54.5             190              285
59.5---69.5        64.5             304              589
69.5---79.5        74.5             211              800
79.5---89.5        84.5             85               885
89.5---99.5        94.5             20               905
TOTAL                               905
$\text{Mode} = 59.5 + \dfrac{304 - 190}{(304 - 190) + (304 - 211)} \times 10$
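
The grouped-data mode formula can be evaluated directly for the figures used above (l = 59.5, fm = 304, f1 = 190, f2 = 211, h = 10). The function name grouped_mode is illustrative only.

```python
def grouped_mode(l, fm, f1, f2, h):
    """Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h."""
    return l + (fm - f1) / ((fm - f1) + (fm - f2)) * h

mode = grouped_mode(l=59.5, fm=304, f1=190, f2=211, h=10)
print(round(mode, 2))
```

The result falls inside the modal class 59.5 to 69.5, as it must.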
Median
Median: “When the observations are arranged in ascending or descending order, the value that divides the
distribution into two equal parts is called the median.”
Ungrouped data:
$\text{Median} = \left(\dfrac{n+1}{2}\right)\text{th term}$
If the position is fractional, for example 5.5, interpolate between the neighbouring terms:
$\text{Median} = 5\text{th term} + 0.5\,(6\text{th term} - 5\text{th term})$
Grouped data:
$\text{Median} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - c\right)$
n/2 = median term
l = lower class boundary of the median class
h = class interval
f = frequency of the median class
c = cumulative frequency of the class preceding the median class

Class boundaries   Midpoints (xi)   Frequency (fi)   Cumulative frequency (c.f)
29.5---39.5        34.5             8                8
39.5---49.5        44.5             87               95
49.5---59.5        54.5             190              285
59.5---69.5        64.5             304              589
69.5---79.5        74.5             211              800
79.5---89.5        84.5             85               885
89.5---99.5        94.5             20               905
TOTAL                               905

$\text{Median} = 59.5 + \dfrac{10}{304}\left(\dfrac{905}{2} - 285\right) \approx 65$ (Answer)
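
The grouped-data median can be traced step by step: find the class in which the cumulative frequency first reaches n/2, then interpolate inside it. The sketch below uses the frequency table above; the function name grouped_median is an assumption for illustration.

```python
def grouped_median(lower_boundaries, freq, h):
    """Median = l + (h / f) * (n / 2 - c) for a grouped frequency distribution."""
    n = sum(freq)
    target = n / 2
    c = 0  # cumulative frequency of the classes before the current one
    for l, f in zip(lower_boundaries, freq):
        if c + f >= target:  # the median class has been reached
            return l + (h / f) * (target - c)
        c += f
    raise ValueError("empty distribution")

lower_boundaries = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]
freq = [8, 87, 190, 304, 211, 85, 20]

median = grouped_median(lower_boundaries, freq, h=10)
print(round(median, 2))
```

Here n/2 = 452.5 lies in the class 59.5 to 69.5, whose preceding cumulative frequency is 285.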
Quartiles
Q1, Q2, Q3
Divide ranked scores into four equal parts
Ungrouped Data
$Q_j = \dfrac{j(n+1)}{4}\text{th observation}$
Grouped Data
$Q_j = l + \dfrac{h}{f}\left(\dfrac{jn}{4} - c\right)$
l = lower boundary of the class containing Q2 or Q3, i.e. the class corresponding to the cumulative
frequency in which 2n/4 or 3n/4 lies
h = class interval size of the class containing Q2 or Q3
f = frequency of the class containing Q2 or Q3
n = number of values, or the total frequency
c = cumulative frequency of the class preceding the class containing Q2 or Q3
Deciles
D1, D2, D3, D4, D5, D6, D7, D8, D9
Divide ranked data into ten equal parts
Ungrouped Data
$D_j = \dfrac{j(n+1)}{10}\text{th observation}$
Grouped Data
$D_j = l + \dfrac{h}{f}\left(\dfrac{jn}{10} - c\right)$
l = lower boundary of the class containing D2 or D9, i.e.
the class corresponding to the cumulative frequency in which
2n/10 or 9n/10 lies
h = class interval size of the class containing D2 or D9
f = frequency of the class containing D2 or D9
n = number of values, or the total frequency
c = cumulative frequency of the class preceding the class containing D2 or D9
Class boundaries   Midpoints (xi)   Frequency (fi)
29.5---39.5        34.5             8
39.5---49.5        44.5             87
49.5---59.5        54.5             190
59.5---69.5        64.5             304
69.5---79.5        74.5             211
79.5---89.5        84.5             85
89.5---99.5        94.5             20
TOTAL                               905
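
Quartiles, deciles and percentiles for grouped data all share the form l + (h/f)(jn/parts − c); only the divisor changes (4, 10 or 100). The following sketch applies one helper to the frequency table above; the name partition_value and its parameters are illustrative assumptions.

```python
def partition_value(j, parts, lower_boundaries, freq, h):
    """j-th partition value: parts=4 for quartiles, 10 for deciles, 100 for percentiles."""
    n = sum(freq)
    target = j * n / parts
    c = 0  # cumulative frequency of the classes before the current one
    for l, f in zip(lower_boundaries, freq):
        if c + f >= target:  # class containing the required partition value
            return l + (h / f) * (target - c)
        c += f
    raise ValueError("target lies beyond the total frequency")

lower_boundaries = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]
freq = [8, 87, 190, 304, 211, 85, 20]

q1 = partition_value(1, 4, lower_boundaries, freq, h=10)
q2 = partition_value(2, 4, lower_boundaries, freq, h=10)
q3 = partition_value(3, 4, lower_boundaries, freq, h=10)
d9 = partition_value(9, 10, lower_boundaries, freq, h=10)
p50 = partition_value(50, 100, lower_boundaries, freq, h=10)

print(round(q1, 2), round(q2, 2), round(q3, 2), round(d9, 2))
```

The second quartile and the fiftieth percentile both coincide with the median, since each corresponds to the n/2-th position.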

Percentiles
P1, P2, P3, …, P99
Divide ranked data into hundred equal parts
Ungrouped Data
$P_j = \dfrac{j(n+1)}{100}\text{th observation}$
Grouped Data
$P_j = l + \dfrac{h}{f}\left(\dfrac{jn}{100} - c\right)$
l = lower boundary of the class containing P2 or P99, i.e. the class corresponding to the cumulative
frequency in which 2n/100 or 99n/100 lies
h = class interval size of the class containing P2 or P99
f = frequency of the class containing P2 or P99
n = number of values, or the total frequency
c = cumulative frequency of the class preceding the class containing P2 or P99
Symmetrical distribution
Both tails extend an equal distance from the centre; the data are distributed in a balanced form.
Mode = Median =Mean
Positive Skew
Its tail is longer towards right side
Mode < Median <Mean

Negative Skew
its tail is longer towards left Side
Mode > Median >Mean

Measurement of skewness
Karl Pearson's coefficient of skewness

$sk = \frac{\text{Mean} - \text{Mode}}{S}$
$sk = \frac{3(\text{Mean} - \text{Median})}{S}$
The answer lies between +3 and -3.
If the answer is 0, the distribution is symmetrical.
Bowley's formula
$sk = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
The answer lies between +1 and -1.
If the answer is 0, the distribution is symmetrical.
Geometric Mean
The geometric mean, G, of a set of n positive values X1, X2… Xn is defined as the positive nth root of
their product.
G.M FOR UNGROUPED DATA
$G = \sqrt[n]{X_1 X_2 \cdots X_n}$ (where $X_i > 0$)
Taking logarithms to the base 10, we get
$\log G = \frac{1}{n}\left(\log X_1 + \log X_2 + \cdots + \log X_n\right) = \frac{\sum \log X}{n}$
$G = \operatorname{antilog}\left(\frac{\sum \log X}{n}\right)$
Example:
Find the geometric mean of numbers:
45, 32, 37, 46, 39, 36, 41, 48, 36.
$G = \sqrt[9]{45 \times 32 \times 37 \times 46 \times 39 \times 36 \times 41 \times 48 \times 36}$

X       log X
45      1.6532
32      1.5052
37      1.5682
46      1.6628
39      1.5911
36      1.5563
41      1.6128
48      1.6812
36      1.5563
Total   14.3870

$\log G = \frac{\sum \log X}{n} = \frac{14.3870}{9} = 1.5986$
Hence G = antilog 1.5986 = 39.68
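The logarithmic route can be reproduced in a few lines. This is a small sketch under the definitions above; the helper name `geometric_mean` is an illustrative choice, and base-10 logarithms are used only to mirror the worked example.

```python
import math


def geometric_mean(values):
    """Geometric mean via base-10 logarithms: G = antilog(sum(log X) / n)."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log


data = [45, 32, 37, 46, 39, 36, 41, 48, 36]
g = geometric_mean(data)
```

For the nine values of the example, `g` agrees with the hand calculation of 39.68 to two decimal places.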

G.M GROUPED DATA

In case of a frequency distribution having k classes with midpoints X1, X2, …, Xk and the corresponding
frequencies f1, f2, …, fk (such that Σfi = n), the geometric mean is given by
$G = \sqrt[n]{X_1^{f_1} X_2^{f_2} \cdots X_k^{f_k}}$

Each value of X thus has to be multiplied by itself f times, and the whole procedure becomes quite a
formidable task!
In terms of logarithms, the formula becomes
$\log G = \frac{1}{n}\left(f_1 \log X_1 + f_2 \log X_2 + \cdots + f_k \log X_k\right) = \frac{\sum f \log X}{n}$

Mileage Rating   No. of Cars (f)   Class-mark (midpoint) X   log X    f log X
30.0 - 32.9      2                 31.45                     1.4976   2.9952
33.0 - 35.9      4                 34.45                     1.5372   6.1488
36.0 - 38.9      14                37.45                     1.5735   22.0290
39.0 - 41.9      8                 40.45                     1.6069   12.8552
42.0 - 44.9      2                 43.45                     1.6380   3.2760
Total            30                                                   47.3042

G = antilog (47.3042/30) = antilog 1.5768 = 37.74

Measures of Dispersion
Sometimes when two or more different data sets are to be compared using measure of central tendency or
averages, we get the same results.
Dispersion: “The variability (spread) that exists between the values of a data set is called dispersion.”
OR “The extent to which the observations are spread around an average is called dispersion or scatter.”
There are two types of measure of dispersion
 Absolute Measure of Dispersion
 Relative Measure of Dispersion
Absolute Measure of Dispersion
“An absolute measure of dispersion measures the variability in terms of the same units as the data.”
E.g. if the units of the data are Rs, meters, kg, etc., the units of the measures of dispersion will also be Rs,
meters, kg, etc.
The common Absolute measures of dispersion are:
 Range
 Quartile Deviation or Semi Inter-Quartile Range
 Average Deviation or Mean Deviation
 Standard Deviation
Relative Measure of Dispersion
“A relative measure of dispersion compares the variability of two or more data sets independently of the
units of measurement.”
“A relative measure of dispersion expresses the absolute measure of dispersion relative to the relevant
average, and is often multiplied by 100”, i.e.
$\text{Relative dispersion} = \frac{\text{Absolute dispersion}}{\text{Average}} \times 100$

This is a pure number and independent of the units in which the data has been expressed. It is used for the
purpose to compare the dispersion of a data with the dispersion of another data.
The common relative measures of dispersion are:
 Coefficient of Dispersion or Coefficient of Range
 Coefficient of Quartile Deviation
 Coefficient of Mean Deviation
 Coefficient of Standard Deviation or Coefficient of Variation (C.V)
Standard Deviation:
“The positive square root of variance is called as standard deviation”.
For ungrouped data
$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

X       (x − x̄)   (x − x̄)²
4       0         0
6       +2        4
2       –2        4
0       –4        16
3       –1        1
5       +1        1
8       +4        16
Total             42

Here x̄ = 28/7 = 4, so
$S = \sqrt{\frac{42}{7}} = \sqrt{6} = 2.45$
For Grouped data
$S = \sqrt{\frac{\sum f x^2}{n} - \left(\frac{\sum f x}{n}\right)^2}$

Life (in Hundreds of Hours)   No. of Bulbs f   Mid-point x   fx       fx²
0 – 5                         4                2.5           10.0     25.0
5 – 10                        9                7.5           67.5     506.25
10 – 20                       38               15.0          570.0    8550.0
20 – 40                       33               30.0          990.0    29700.0
40 and over                   16               50.0          800.0    40000.0
Total                         100                            2437.5   78781.25

$S = \sqrt{\frac{78781.25}{100} - \left(\frac{2437.5}{100}\right)^2}$
= 13.9 hundred hours
= 1390 hours
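Both standard-deviation formulas are easy to verify by machine. The sketch below assumes the two data sets shown above (the seven ungrouped values and the bulb-life table, with 50 taken as the midpoint of the open class as in the table); the function names are illustrative.

```python
import math


def population_std(values):
    """S = sqrt(sum((x - mean)^2) / n) for ungrouped data."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def grouped_std(midpoints, freqs):
    """S = sqrt(sum(f x^2)/n - (sum(f x)/n)^2) for grouped data."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return math.sqrt(sum_fx2 / n - (sum_fx / n) ** 2)


s_ungrouped = population_std([4, 6, 2, 0, 3, 5, 8])
s_grouped = grouped_std([2.5, 7.5, 15.0, 30.0, 50.0], [4, 9, 38, 33, 16])
```

The first result is about 2.45 and the second about 13.9 (hundred hours), matching the hand calculations.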
Variance
The square of the standard deviation is called the variance.
Ungrouped data
$S^2 = \frac{\sum (x - \bar{x})^2}{n}$
Grouped data
$S^2 = \frac{\sum f x^2}{n} - \left(\frac{\sum f x}{n}\right)^2$
Coefficient of variation
It is a pure number without unit.
it is used to compare variation in two or more data sets given in different units.
The coefficient of variation is obtained by dividing the standard deviation by the mean and expressed in
percentage.
$\text{Coefficient of variation} = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100$ OR $\text{C.V.} = \frac{S}{\bar{X}} \times 100$
Less variation = more consistent, and more variation = less consistent.
Correlation
Correlation is a measure of the degree of relatedness of variables. It can help a business researcher
determine, for example, whether the stocks of two airlines rise and fall in any related manner. For a sample
of pairs of data, correlation analysis can yield a numerical value that represents the degree of relatedness of
the two stock prices over time.
$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$

r represents linear correlation coefficient for a sample.

n represents the number of pairs of data present.
Σ denotes the addition of the items indicated.
∑x denotes the sum of all x-values.
∑x2 indicates that each x-value should be squared and then those squares added.
(∑x)2 indicates that the x-values should be added and the total then squared.
∑xy indicates that each x-value should be first multiplied by its corresponding y-value. After obtaining
all such products, find their sum.
Coefficient of correlation
+1 Perfectly Positive
+1 to 0.5 Strong Positive
0.5 Moderate Positive
0.5 to 0 Weak Positive
0 No correlation
0 To -0.5 Weak Negative
-0.5 Moderate Negative
-0.5 to -1 Strong Negative
-1 Perfectly Negative

Rank correlation
Sometimes the actual measurements or counts of individual objects are either not available or cannot be
assessed accurately. The objects are then arranged in order according to some characteristic of interest. Such
an ordered arrangement is called a ranking, and the order given to an individual or object is called its rank. The
correlation between such sets of rankings is known as rank correlation.
A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to
assess the significance of the relation between them. It describes the relationship between rankings of
different ordinal variables or different rankings of the same variable, where a "ranking" is the assignment of
the ordering labels "first", "second", "third", etc. to different observations of a particular variable.
Spearman's Rank Correlation:
$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$
Judge X   Judge Y   Judge Z   dxy = X−Y   dxz = X−Z   dyz = Y−Z   d²xy   d²xz   d²yz

5 1 6 4 -1 -5 16 1 25
2 7 4 -5 -2 3 25 4 9
6 6 9 0 -3 -3 0 9 9
8 10 8 -2 0 2 4 0 4
1 4 1 -3 0 3 9 0 9
7 5 2 2 5 3 4 25 9
4 3 3 1 1 0 1 1 0
9 8 10 1 -1 -2 1 1 4
3 2 5 1 -2 -3 1 4 9
10 9 7 1 3 2 1 9 4
Total 62 54 82

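With n = 10 and the totals of the three d² columns:
$r_{xy} = 1 - \frac{6(62)}{10(10^2 - 1)} = 1 - \frac{372}{990} = 0.624$
$r_{xz} = 1 - \frac{6(54)}{10(10^2 - 1)} = 1 - \frac{324}{990} = 0.673$
$r_{yz} = 1 - \frac{6(82)}{10(10^2 - 1)} = 1 - \frac{492}{990} = 0.503$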
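The three coefficients for the judges' table can be computed directly from the ranks. This is a minimal sketch assuming untied ranks, as in the table; `spearman_rho` is an illustrative name.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for untied ranks: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


judge_x = [5, 2, 6, 8, 1, 7, 4, 9, 3, 10]
judge_y = [1, 7, 6, 10, 4, 5, 3, 8, 2, 9]
judge_z = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]

r_xy = spearman_rho(judge_x, judge_y)
r_xz = spearman_rho(judge_x, judge_z)
r_yz = spearman_rho(judge_y, judge_z)
```

On these ranks judges X and Z agree most closely (about 0.67), and judges Y and Z least (about 0.50).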

Rank Correlation for Tie

For Each tie:
Add 1/12(𝑡3 − 𝑡) into ∑d2

X     Rank   Tie calculation

10    7      Tie at 10: (6+7+8)/3 = 7
15    4
24    1
10    7
12    5
22    2.5    Tie at 22: (2+3)/2 = 2.5
22    2.5
10    7

Multiple correlation
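An estimate of the combined influence of two or more variables on the observed (dependent) variable.
$R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{23} r_{13}}{1 - r_{23}^2}}$
$R_{2.31} = \sqrt{\frac{r_{23}^2 + r_{21}^2 - 2 r_{12} r_{23} r_{13}}{1 - r_{31}^2}}$
$R_{3.12} = \sqrt{\frac{r_{31}^2 + r_{32}^2 - 2 r_{12} r_{23} r_{13}}{1 - r_{12}^2}}$

where
$r_{12} = \frac{n\sum X_1 X_2 - (\sum X_1)(\sum X_2)}{\sqrt{n\sum X_1^2 - (\sum X_1)^2}\,\sqrt{n\sum X_2^2 - (\sum X_2)^2}}$
$r_{13} = \frac{n\sum X_1 X_3 - (\sum X_1)(\sum X_3)}{\sqrt{n\sum X_1^2 - (\sum X_1)^2}\,\sqrt{n\sum X_3^2 - (\sum X_3)^2}}$
$r_{23} = \frac{n\sum X_2 X_3 - (\sum X_2)(\sum X_3)}{\sqrt{n\sum X_2^2 - (\sum X_2)^2}\,\sqrt{n\sum X_3^2 - (\sum X_3)^2}}$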

X1      X2    X3    X1²   X2²   X3²     X1X2   X1X3   X2X3

3       16    90    9     256   8100    48     270    1440
5       10    72    25    100   5184    50     360    720
6       7     54    36    49    2916    42     324    378
8       4     42    64    16    1764    32     336    168
12      3     30    144   9     900     36     360    90
14      2     12    196   4     144     28     168    24
Total   48    42    300   474   434     19008  236    1818   2820
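The column totals of this table are all that is needed to evaluate the simple correlations and the multiple correlation coefficient. The following is a sketch under that assumption; the helper `pearson_from_sums` and the variable names are illustrative, not part of the notes.

```python
import math


def pearson_from_sums(n, sum_a, sum_b, sum_aa, sum_bb, sum_ab):
    """Pearson r from totals: (n*Sab - Sa*Sb) / (sqrt(n*Saa - Sa^2) * sqrt(n*Sbb - Sb^2))."""
    numerator = n * sum_ab - sum_a * sum_b
    denominator = (math.sqrt(n * sum_aa - sum_a ** 2)
                   * math.sqrt(n * sum_bb - sum_b ** 2))
    return numerator / denominator


n = 6                                   # number of observations in the table
s1, s2, s3 = 48, 42, 300                # sums of X1, X2, X3
s11, s22, s33 = 474, 434, 19008         # sums of squares
s12, s13, s23 = 236, 1818, 2820         # sums of cross products

r12 = pearson_from_sums(n, s1, s2, s11, s22, s12)
r13 = pearson_from_sums(n, s1, s3, s11, s33, s13)
r23 = pearson_from_sums(n, s2, s3, s22, s33, s23)

# Multiple correlation of X1 on X2 and X3
R1_23 = math.sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2))
```

For these totals r12 is about -0.891, r13 about -0.969, r23 about 0.961, and R1.23 about 0.98.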
Partial correlation
A partial correlation measures the degree of linear relationship between any two variables in a multivariate
problem under the condition that any common relationship with all other variables has been removed.
If X1, X2 and X3 are three variables, then the correlation between X1 and X2 after removing the effect of X3
from X1 and X2 is the partial correlation r12.3.

$r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}}$

$r_{13.2} = \frac{r_{13} - r_{12} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}$

$r_{23.1} = \frac{r_{23} - r_{12} r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}}$
Regression
Regression is a statistical method used in finance, investing, and other disciplines that attempts to determine
the strength and character of the relationship between one dependent variable (usually denoted by Y) and a
series of other variables (known as independent variables)
Simple linear regression
A statistical method that allows us to summarize and study relationships between two continuous
(quantitative) variables:
 One variable, denoted x, is regarded as the predictor, explanatory, or independent variable.
 The other variable, denoted y, is regarded as the response, outcome, or dependent variable.
Because the other terms are used less frequently today, we'll use the "predictor" and "response" terms to
refer to the variables encountered in this course. The other terms are mentioned only to make you aware of
them should you encounter them. Simple linear regression gets its adjective "simple," because it concerns
the study of only one predictor variable. In contrast, multiple linear regression, which we study later in this
course, gets its adjective "multiple," because it concerns the study of two or more predictor variables.
For Population:
$Y_i = \alpha + \beta X_i + E_i$
For Sample:
$y_i = a + b x_i + e_i$
The estimated regression line is generally written as:
$\hat{Y}_i = a + b x_i$
$\sum e_i = 0$
By using the method of least squares we obtain the following two equations:
$\sum y_i = na + b\sum x_i$
$\sum x_i y_i = a\sum x_i + b\sum x_i^2$
Alternate Method:
Y dependent
X independent

$b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}$

$a = \bar{Y} - b\bar{X}$
X dependent
Y independent

$b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2}$

$a = \bar{X} - b\bar{Y}$

a = 1.47, b = 2.831
X   Y   XY   X²   Ŷ = 1.47 + 2.831X   y − Ŷ   (y − Ŷ)²   Y²
5 16 80 25 15.625 0.375 0.140625 256
6 19 114 36 18.456 0.544 0.295936 361
8 23 184 64 24.118 -1.118 1.249924 529
10 28 280 100 29.78 -1.78 3.1684 784
12 36 432 144 35.442 0.558 0.311364 1296
13 41 533 169 38.273 2.727 7.436529 1681
15 44 660 225 43.935 0.065 0.004225 1936
16 45 720 256 46.766 -1.766 3.118756 2025
17 50 850 289 49.597 0.403 0.162409 2500
Total 102 302 3853 1308 301.992 0.008 15.88817 11368
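The values a = 1.47 and b = 2.831 used in the table follow from the alternate-method formulas with Y as the dependent variable. A short sketch, assuming the nine (X, Y) pairs of the table; `least_squares_line` is an illustrative name.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line Y-hat = a + b*X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


xs = [5, 6, 8, 10, 12, 13, 15, 16, 17]
ys = [16, 19, 23, 28, 36, 41, 44, 45, 50]

a, b = least_squares_line(xs, ys)
fitted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, fitted)]
```

With unrounded coefficients the residuals sum to zero up to floating-point error; the small total of 0.008 in the table comes from rounding a and b.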
Standard deviation of regression or standard error of estimate

$S_{y.x} = \sqrt{\frac{\sum (y - \hat{Y})^2}{n - 2}}$
Alternate Method
$S_{y.x} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}}$

Multiple regression Model

Multiple regression is an extension of simple linear regression. I
For Population:

For Sample:

Multiple regression for two regression:

By using Method of least square we obtain following two equati

STA301 – Statistics and Probability

[Figure: Discrete Uniform Distribution. Probability of winning (1/1000 for each number) plotted against lottery number X, from 000 to 999.]

INTERPRETATION

It reflects the fact that winning lottery numbers are selected by a random procedure which makes all numbers equally
likely to be selected. The point to be kept in mind is that whenever we have a situation where the various outcomes are
equally likely, and of a form such that we have a random variable X with values 0, 1, 2, … or, as in the above example,
0000, 0001, …, 9999, we will be dealing with the discrete uniform distribution.

BINOMIAL DISTRIBUTION

The binomial distribution is a very important discrete probability distribution. It was discovered by James Bernoulli
about the year 1700.We illustrate this distribution with the help of the following example:

EXAMPLE

Suppose that we toss a fair coin 5 times, and we are interested in determining the probability distribution of X, where X
represents the number of heads that we obtain.
We note that in tossing a fair coin 5 times:
 every toss results in either a head or a tail,
 the probability of heads (denoted by p) is equal to ½ every time (in other words, the probability
of heads remains constant),
 every throw is independent of every other throw, and
 the total number of tosses i.e. 5 is fixed in advance.
The above four points represent the four basic and vitally important PROPERTIES of a binomial experiment.

PROPERTIES OF A BINOMIAL EXPERIMENT

 Every trial results in a success or a failure.
 The successive trials are independent.
 The probability of success, p, remains constant from trial to trial.
 The number of trials, n, is fixed in advance.

Virtual University of Pakistan 206

LECTURE NO. 28
 Binomial Distribution
 Fitting a Binomial Distribution to Real Data
 An Introduction to the Hyper geometric Distribution
The binomial distribution is a very important discrete probability distribution. We illustrate this distribution with the
help of the following example:

EXAMPLE

Suppose that we toss a fair coin 5 times, and we are interested in determining the probability distribution of X, where X
represents the number of heads that we obtain. We note that in tossing a fair coin 5 times:
 Every toss results in either a head or a tail,
 The probability of heads (denoted by p) is equal to ½ every time (in other words, the probability of heads
remains constant),
 Every throw is independent of every other throw, and
 The total number of tosses i.e. 5 is fixed in advance.
The above four points represents the four basic and vitally important PROPERTIES of binomial experiment. Now, in 5
tosses of the coin, there can be 0, 1, 2, 3, 4 or 5 heads, and the no. of heads is thus a random variable which can take one
of these six values. In order to compute the probabilities of these X-values, the formula is:

Binomial Distribution

$P(X = x) = \binom{n}{x} p^x q^{n-x}$
where
n = the total no. of trials
p = probability of success in each trial
q = probability of failure in each trial (i.e. q = 1 - p)
x = no. of successes in n trials.
x = 0, 1, 2, … n

The binomial distribution has two parameters, n and p. In this example, n = 5 since the coin was thrown 5 times, p = ½
since it is a fair coin, and q = 1 – p = 1 – ½ = ½. Hence

$P(X = x) = \binom{5}{x}\left(\frac{1}{2}\right)^x\left(\frac{1}{2}\right)^{5-x}$

Putting x = 0:

$P(X = 0) = \binom{5}{0}\left(\frac{1}{2}\right)^0\left(\frac{1}{2}\right)^{5-0} = \frac{5!}{0!\,5!}(1)\left(\frac{1}{2}\right)^5 = \frac{1}{32}$

Putting x = 1:

$P(X = 1) = \binom{5}{1}\left(\frac{1}{2}\right)^1\left(\frac{1}{2}\right)^{5-1} = \frac{5!}{1!\,4!}\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)^4 = 5\left(\frac{1}{2}\right)^5 = \frac{5}{32}$

Similarly, we have:

$P(X = 2) = \binom{5}{2}\left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^{5-2} = \frac{10}{32}$

$P(X = 3) = \binom{5}{3}\left(\frac{1}{2}\right)^3\left(\frac{1}{2}\right)^{5-3} = \frac{10}{32}$

$P(X = 4) = \binom{5}{4}\left(\frac{1}{2}\right)^4\left(\frac{1}{2}\right)^{5-4} = \frac{5}{32}$

$P(X = 5) = \binom{5}{5}\left(\frac{1}{2}\right)^5\left(\frac{1}{2}\right)^{5-5} = \frac{1}{32}$
Hence, the binomial distribution for this particular example is as follows. Binomial Distribution in the case of tossing a
fair coin five times:

Number of Heads Probability

X P(x)
0 1/32
1 5/32
2 10/32
3 10/32
4 5/32
5 1/32
Total 32/32 = 1
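The whole table can be generated from the binomial formula. Below is a minimal sketch using exact fractions so that the results can be compared with the entries above; `binomial_pmf` is an illustrative helper, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n = 5
p = Fraction(1, 2)  # fair coin
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]
total = sum(distribution)
```

`distribution` holds 1/32, 5/32, 10/32, 10/32, 5/32 and 1/32 (fractions are stored in lowest terms, so 10/32 appears as 5/16), and `total` is exactly 1.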

Graphical Representation of the above binomial distribution:

[Figure: line chart of P(x) (vertical axis, 2/32 to 10/32) against X = 0, 1, …, 5.]

The next question is: What about the mean and the standard deviation of this distribution? We can calculate them just as
before, using the formulas

Mean of X = E(X) = ΣXP(X)
Var(X) = ΣX² P(X) – [ΣXP(X)]²

but it has been mathematically proved that for a binomial distribution given by

$P(X = x) = \binom{n}{x} p^x q^{n-x}$

E(X) = np
and Var(X) = npq
so that
$\text{S.D.}(X) = \sqrt{npq}$
For the above example, n = 5, p = ½ and q = ½
Hence
Mean = E(X) = np = 5(½) = 2.5
and $\text{S.D.}(X) = \sqrt{npq} = \sqrt{5\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)} = \sqrt{\frac{5}{4}} = 1.12$
We would have got exactly the same answers if we had applied the LENGTHIER procedure:
E(X) = ΣXP(X) and Var(X) = ΣX² P(X) – [ΣXP(X)]²
Graphical Representation of the Mean and Standard Deviation of the Binomial Distribution (n=5, p=1/2)

[Figure: line chart of P(x) against X = 0, 1, …, 5, with E(X) and S.D.(X) = 1.12 marked on the X-axis.]
WHAT DOES THIS MEAN?

What this means is that if 5 fair coins are tossed an INFINITE no. of times, sometimes we will get no head out of 5,
sometimes 1 head, … sometimes all 5 heads. But on the AVERAGE we should expect to get 2.5 heads in 5 tosses of the
coin, or a total of 25 heads in 50 tosses of the coin. And 1.12 gives a measure of the possible variability in the various
numbers of heads that can be obtained in 5 tosses. (As you know, in this problem, the number of heads can range from 0
to 5. Had the coin been tossed 10 times, the no. of heads possible would vary from 0 to 10 and the standard deviation
would probably have been different.)

Coefficient of Variation:
$\text{C.V.} = \frac{\sigma}{\mu} \times 100 = \frac{1.12}{2.5} \times 100 = 44.8\%$

Note that the binomial distribution is not always symmetrical as in the above example. It will be symmetrical only when
p = q = ½ (as in the above example).
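[Figure: symmetric binomial distribution for p = q = ½; P(x) against X = 0, 1, …, 5.]
It is skewed to the right if p < q:
[Figure: positively skewed binomial distribution; P(x) against X = 0, 1, …, 7.]
It is skewed to the left if p > q:
[Figure: negatively skewed binomial distribution; P(x) against X = 0, 1, …, 7.]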
But the degree of Skewness (or asymmetry) decreases as n increases. Next, we consider the Fitting of a
Binomial Distribution to Real Data. We illustrate this concept with the help of the following example:

EXAMPLE

The following data has been obtained by tossing a LOADED die 5 times, and noting the number of times that we
obtained a six. Fit a binomial distribution to this data.

No. of Sixes 0 1 2 3 4 5 Total

Frequency 12 56 74 39 18 1 200

SOLUTION

To fit a binomial distribution, we need to find n and p.

Here n = 5, the largest x-value.
To find p, we use the relationship x̄ = np.

The rationale of this step is that, as indicated in the last lecture, the mean of a binomial probability distribution is equal
to np, i.e.
μ = np
But, here, we are not dealing with a probability distribution i.e. the entire population of all possible sets of throws of a
loaded die --- we only have a sample of throws at our disposal.
As such, μ is not available to us, and all we can do is to replace it by its estimate x̄.
Hence, our equation becomes x̄ = np.
Now, we have:
$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{0 + 56 + 148 + 117 + 72 + 5}{200} = \frac{398}{200} = 1.99$
Using the relationship x̄ = np, we get 5p = 1.99 or p = 0.398. This value of p seems to indicate clearly that the die is not
fair at all! (Had it been a fair die, the probability of getting a six would have been 1/6 i.e. 0.167; a value of p = 0.398 is
very different from 0.167.) Letting the random variable X represent the number of sixes, the above calculations yield
the fitted binomial distribution as
$b(x; 5, 0.398) = \binom{5}{x}(0.398)^x(0.602)^{5-x}$
Hence the probabilities and expected frequencies are calculated as below:

No. of Sixes (x)   Probability f(x)                                              Expected frequency
0                  $\binom{5}{0} q^5 = (0.602)^5$ = 0.07907                      15.8
1                  $\binom{5}{1} q^4 p = 5(0.602)^4(0.398)$ = 0.26136            52.3
2                  $\binom{5}{2} q^3 p^2 = 10(0.602)^3(0.398)^2$ = 0.34559       69.1
3                  $\binom{5}{3} q^2 p^3 = 10(0.602)^2(0.398)^3$ = 0.22847       45.7
4                  $\binom{5}{4} q p^4 = 5(0.602)(0.398)^4$ = 0.07553            15.1
5                  $\binom{5}{5} p^5 = (0.398)^5$ = 0.00998                      2.0
Total              1.00000                                                       200.0
In the above table, the expected frequencies are obtained by multiplying each of the probabilities by 200.
In the entire above procedure, we are assuming that the given frequency distribution has the characteristics of
the fitted theoretical binomial distribution. Comparing the observed frequencies with the expected frequencies, we
obtain:

No. of Sixes (x)   Observed Frequency (fo)   Expected Frequency (fe)
0 12 15.8
1 56 52.3
2 74 69.1
3 39 45.7
4 18 15.1
5 1 2.0
Total 200 200.0
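The fitting procedure (estimate p from the sample mean, then multiply each fitted probability by the total frequency) can be sketched as follows. The variable names are illustrative; the observed frequencies are those of the example.

```python
from math import comb

observed = {0: 12, 1: 56, 2: 74, 3: 39, 4: 18, 5: 1}  # no. of sixes -> frequency
n_trials = 5                                           # throws per set

total = sum(observed.values())
sample_mean = sum(x * f for x, f in observed.items()) / total
p_hat = sample_mean / n_trials                         # from x-bar = n * p
q_hat = 1 - p_hat

probabilities = [comb(n_trials, x) * p_hat ** x * q_hat ** (n_trials - x)
                 for x in range(n_trials + 1)]
expected = [total * prob for prob in probabilities]
```

Rounded to one decimal place, the expected frequencies come out as 15.8, 52.3, 69.1, 45.7, 15.1 and 2.0.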

The graphical representation of the observed frequencies as well as the expected frequencies is as follows:
[Figure: Graphical Representation of the Observed and Expected Frequencies. Frequency (vertical axis) against X = 0, 1, …, 5, with the observed and expected frequencies plotted together.]
The above graph quite clearly indicates that there is not much discrepancy between the observed and the expected
frequencies. Hence, we can say that it is a reasonably good fit.
There is a procedure known as the Chi-Square Test of Goodness of Fit which enables us to determine in a formal,
mathematical manner whether or not the theoretical distribution fits the observed distribution reasonably well. This test
comes under the realm of Inferential Statistics --- that area which we will deal with during the last 15 lectures of this
course. Let us consider a real-life application of the binomial distribution:

AN EXAMPLE FROM INDUSTRY

Suppose that the past record indicates that the proportion of defective articles produced by a factory is 7%. And
suppose that a law NEWLY instituted in this particular country states that there should not be more than 5% defective.
Suppose that the factory-owner makes the statement that his machinery has been overhauled so that the number of
defectives has DECREASED.
In order to examine this claim, the relevant government department decides to send an inspector to examine a sample of
20 items.

What is the probability that the inspector will find 2 or more defective items in his sample (so that a fine will be
imposed on the factory)?

SOLUTION

The first step is to identify the NATURE of the situation, If we study this problem closely, we realize that we are
dealing with a binomial experiment because of the fact that all four properties of a binomial experiment are being
fulfilled:

PROPERTIES OF A BINOMIAL EXPERIMENT

 Every item selected will either be defective (i.e. success) or not defective (i.e. failure)
 Every item drawn is independent of every other item
 The probability of obtaining a defective item i.e. 7% is the same (constant) for all items. (This probability
figure is according to relative frequency definition of probability.
 The number of items drawn is fixed in advance i.e. 20 hence; we are in a position to apply the binomial
formula

$P(X = x) = \binom{n}{x} p^x q^{n-x}$

Substituting n = 20 and p = 0.07, we obtain:

$P(X = x) = \binom{20}{x}(0.07)^x(0.93)^{20-x}$

Now

$P(X \ge 2) = 1 - P(X < 2)$

$= 1 - [P(X = 0) + P(X = 1)]$

$= 1 - \left[\binom{20}{0}(0.07)^0(0.93)^{20} + \binom{20}{1}(0.07)^1(0.93)^{19}\right]$

$= 1 - \left[(0.93)^{20} + 20(0.07)(0.93)^{19}\right]$

$= 1 - (0.234 + 0.353)$
$= 0.413$
$= 41.3\%$
Hence the probability is SUBSTANTIAL i.e. more than 40% that the inspector will find two or more defective articles
among the 20 that he will inspect. In other words, there is CONSIDERABLE chance that the factory will be fined.
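The inspector's probability is a one-line complement calculation. A brief sketch, assuming n = 20 and p = 0.07 as in the example; `binomial_pmf` is an illustrative helper.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 20, 0.07
p_zero = binomial_pmf(0, n, p)            # no defective item in the sample
p_one = binomial_pmf(1, n, p)             # exactly one defective item
p_two_or_more = 1 - (p_zero + p_one)      # P(X >= 2)
```

The three values are roughly 0.234, 0.353 and 0.413 respectively.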
The point to be realized is that, generally speaking, whenever we are dealing with a ‘success / failure’ situation, we are
dealing with what can be a binomial experiment. (For EXAMPLE, if we are interested in determining any of the
following proportions, we are dealing with a BINOMIAL situation:
 Proportion of smokers in a city: smoker → success, non-smoker → failure.
 Proportion of literates in a community (literacy rate): literate → success, illiterate → failure.
 Proportion of males in a city (sex ratio).)

HYPERGEOMETRIC PROBABILITY DISTRIBUTION

There are many experiments in which the condition of independence is violated and the probability of success does not
remain constant for all trials. Such experiments are called hyper geometric experiments.
In other words, a hyper geometric experiment has the following properties:

PROPERTIES OF HYPERGEOMETRIC EXPERIMENT

 The outcomes of each trial may be classified into one of two categories, success and failure.
 The probability of success changes on each trial.
 The successive trials are not independent.
 The experiment is repeated a fixed number of times.
The number of success, X in a hyper geometric experiment is called a hyper geometric random variable and its
probability distribution is called the hyper geometric distribution. When the hyper geometric random variable X
assumes a value x, the hyper geometric probability distribution is given by the formula

$P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$

where
N = number of units in the population,
n = number of units in the sample, and
k = number of successes in the population.
The hyper geometric probability distribution has three parameters N, n and k.
The hyper geometric probability distribution is appropriate when

 a random sample of size n is drawn WITHOUT REPLACEMENT from a finite population of N units;

 k of the units are of one kind (classified as success) and the remaining N – k of another kind (classified as
failure).

LECTURE NO. 29
 Hyper geometric
Distribution (in some detail)
 Poisson Distribution
 Limiting Approximation to the Binomial
 Poisson Process
 Continuous Uniform Distribution

In the last lecture, we began the discussion of the HYPERGEOMETRIC PROBABILITY DISTRIBUTION. We now
consider this distribution in some detail. As indicated in the last lecture, there are many experiments in which the
condition of independence is violated and the probability of success does not remain constant for all trials. Such
experiments are called hyper geometric experiments. In other words, a hyper geometric experiment has the following
properties:

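PROPERTIES OF HYPERGEOMETRIC EXPERIMENT

 The outcomes of each trial may be classified into one of two categories, success and failure.
 The probability of success changes on each trial.
 The successive trials are not independent.
 The experiment is repeated a fixed number of times.

The hyper geometric probability distribution is given by

$P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$

where
N = number of units in the population,
n = number of units in the sample, and
k = number of successes in the population.
The hyper geometric probability distribution has three parameters N, n and k.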
 The hyper geometric probability distribution is appropriate when
 a random sample of size n is drawn WITHOUT REPLACEMENT from a finite population of N units;
 k of the units are of one kind (classified as success) and the remaining N – k of another kind (classified as
failure).

EXAMPLE

The names of 5 men and 5 women are written on slips of paper and placed in a hat. Four names are drawn. What is the
probability that 2 are men and 2 are women? Let us regard ‘men’ as success. Then X will denote the number of
men. We have N = 5 + 5 = 10 names to be drawn from. Also, n = 4 (since we are drawing a sample of size 4 out of a
‘population’ of size 10). In addition, k = 5 (since there are 5 men in the population of 10). In this problem, the possible
values of X are 0, 1, 2, 3, 4 (i.e. 0 to n). The hyper geometric distribution is given by

$P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$

Since N = 10, k = 5 and n = 4, in this problem the hyper geometric distribution is given by

$P(X = x) = \frac{\binom{5}{x}\binom{5}{4-x}}{\binom{10}{4}}$

and the required probability, i.e. P(X = 2), is

$P(X = 2) = \frac{\binom{5}{2}\binom{5}{2}}{\binom{10}{4}} = \frac{(10)(10)}{210} = \frac{10}{21}$
In other words, the probability is a little less than 50% that two of the four names drawn will be those of MEN. In the
above example, just as we have computed the probability of X = 2, we could also have computed the probabilities of X
= 0, X = 1, X = 3 and X = 4 (i.e. the probabilities of having zero, one, three OR four men among the four names
drawn).The students are encouraged to compute these probabilities on their own, to check that the sum of these
probabilities is 1, and to draw the line chart of this distribution.
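As a check on the exercise suggested above, the full distribution can be tabulated with exact fractions. This is a small sketch assuming N = 10, n = 4 and k = 5 from the example; `hypergeom_pmf` is an illustrative name.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, n, k):
    """P(X = x) = C(k, x) * C(N - k, n - x) / C(N, n)."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))


N, n, k = 10, 4, 5   # 10 names, 4 drawn, 5 of the names are men
distribution = {x: hypergeom_pmf(x, N, n, k) for x in range(n + 1)}
total = sum(distribution.values())
```

The probabilities are 5/210, 50/210, 100/210, 50/210 and 5/210 for X = 0, 1, 2, 3, 4 (stored in lowest terms), and they sum to exactly 1.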
Additionally, the students are encouraged to think about the centre, spread and shape of the distribution. Next, we
consider some important PROPERTIES of the Hyper
geometric Distribution:

PROPERTIES OF THE HYPERGEOMETRIC DISTRIBUTION

 The mean and the variance of the hyper geometric probability distribution are
$\mu = n\,\frac{k}{N}$ and $\sigma^2 = n \cdot \frac{k}{N} \cdot \frac{N-k}{N} \cdot \frac{N-n}{N-1}$.
 If N becomes indefinitely large, the hyper geometric probability distribution tends to the BINOMIAL
probability distribution.
The above property will be best understood with reference to the following important points:
 There are two ways of drawing a sample from a population, sampling with replacement, and sampling
without replacement.
 Also, a sample can be drawn from either a finite population or an infinite population.
This leads to the following bivariate table. With reference to sampling, the various possible situations are:

                        Population
Sampling                Finite     Infinite
With replacement
Without replacement
The point to be understood is that, whenever we are sampling with replacement, the population remains undisturbed
(because any element that is drawn at any one draw, is re-placed into the population before the next draw).Hence, we
can say that the various trials (i.e. draws) are independent, and hence we can use the binomial formula. On the other
hand, when we are sampling without replacement from a finite population, the constitution of the population changes at
every draw (because any element that is drawn at any one draw is not re-placed into the population before the next
draw). Hence, we cannot say that the various trials are independent, and hence the formula that is appropriate in this
particular situation is the hyper geometric formula. But, if the population size is much larger than the sample size (so
that we can regard it as an ‘infinite’ population), then we note that, although we are not re-placing any element that has
been drawn back into the population, the population remains almost undisturbed. As such, we can assume that the
various trials (i.e. draws) are independent, and, once again, we can apply the binomial formula.
In this regard, the generally accepted rule is that the binomial formula can be applied when we are drawing a sample
from a finite population without replacement and the sample size n is not more than 5 percent of the population size N,
or, to put it in another way, when n < 0.05 N.
When n is greater than 5 percent of N, the hyper geometric formula should be used.

Next, we discuss the Poisson Distribution.

POISSON DISTRIBUTION

The Poisson distribution is named after the French mathematician Siméon Denis Poisson (1781-1840), who published
its derivation in the year 1837. THE POISSON DISTRIBUTION ARISES IN THE FOLLOWING TWO
SITUATIONS:
 It is a limiting approximation to the binomial distribution, when p, the probability of success, is very small
but n, the number of trials, is so large that the product np = μ is of a moderate size;
 It is a distribution in its own right, obtained by considering a POISSON PROCESS where events occur randomly
over a specified interval of time or space or length.
Such random events might be the number of typing errors per page in a book, the number of traffic accidents in a
particular city in a 24-hour period, etc.
With regard to the first situation, if we assume that n goes to infinity and p approaches zero in such a way that μ = np
remains constant, then the limiting form of the binomial probability distribution is

$\lim_{n \to \infty,\; p \to 0} b(x; n, p) = \frac{e^{-\mu}\mu^x}{x!}, \quad x = 0, 1, 2, \ldots, \infty$
where e = 2.71828.
The Poisson distribution has only one parameter μ > 0.
The parameter μ may be interpreted as the mean of the distribution.
Although the theoretical requirement is that n should tend to infinity, and p should tend to zero, but in PRACTICE,
generally, most statisticians use the Poisson approximation to the binomial when
p is 0.05 or less,
& n is 20 or more,
but in fact, the LARGER n is and the SMALLER p is, the better will be the approximation. We illustrate this particular
application of the Poisson distribution with the help of the following example:

EXAMPLE

Two hundred passengers have made reservations for an airplane flight. If the probability that a passenger who has a
reservation will not show up is 0.01, what is the probability that exactly three will not show up?

SOLUTION

Let us regard a “no show” as success. Then this is essentially a binomial experiment with n = 200 and p = 0.01. Since p
is very small and n is considerably large, we shall apply the Poisson distribution, using
μ = np = (200) (0.01) = 2.
Therefore, if X represents the number of successes (not showing up), we have

$$P(X = 3) = \frac{e^{-2}\,2^{3}}{3!} = \frac{(0.1353)(8)}{3 \cdot 2 \cdot 1} = 0.1804,$$

since

$$e^{-2} = \frac{1}{(2.71828)^{2}} = 0.1353.$$


POISSON PROCESS

A Poisson process may be defined as a physical process governed at least in part by some random mechanism.
Stated differently, a Poisson process represents a situation where events occur randomly over a specified interval of time
or space or length. Such random events might be the number of taxicab arrivals at an intersection per day; the number
of traffic deaths per month in a city; the number of radioactive particles emitted in a given period; the number of flaws
per unit length of some material; the number of typing errors per page in a book; etc.
The formula valid in the case of a Poisson process is:

Virtual University of Pakistan 217

$$P(X = x) = \frac{e^{-\lambda t}\,(\lambda t)^{x}}{x!},$$

where
λ = average number of occurrences of the outcome of interest per unit of time,
t = number of time-units under consideration, and
x = number of occurrences of the outcome of interest in t units of time.
We illustrate this concept with the help of the following example:

EXAMPLE

Telephone calls are being placed through a certain exchange at random times on the average of four per minute.
Assuming a Poisson Process, determine the probability that in a 15-second interval, there are 3 or more calls.

SOLUTION

Step-1: Identify the unit of time:

In this problem we take a minute as the unit of time.

Step-2: Identify λ, the average number of occurrences of the outcome of interest per unit of time.
In this problem we have the information that, on the average, 4 calls are received per minute, hence:
λ = 4
Step-3: Identify t, the number of time-units under consideration. In this problem, we are interested in a 15-second
interval, and since 15 seconds are equal to 15/60 = ¼ minutes, i.e. 1/4 units of time, therefore t = 1/4
Step-4: Compute λt: In this problem,
λ = 4, &
t = 1/4,
Hence:
λt = 4 × ¼ = 1
Step-5: Apply the Poisson formula

$$P(X = x) = \frac{e^{-\lambda t}\,(\lambda t)^{x}}{x!}$$

In this problem λt = 1, and since we are interested in 3 or more calls in a 15-second interval, therefore

$$\begin{aligned}
P(X \ge 3) &= 1 - P(X < 3) \\
&= 1 - [P(X=0) + P(X=1) + P(X=2)] \\
&= 1 - \sum_{x=0}^{2} \frac{e^{-1}(1)^{x}}{x!} \\
&= 1 - 0.3679 \sum_{x=0}^{2} \frac{1}{x!} \qquad (\text{since } e^{-1} = 0.3679) \\
&= 1 - 0.91975 = 0.08025
\end{aligned}$$
Hence the probability is only 8% (i.e. a very low probability) that in a 15-second interval, the telephone exchange
receives 3 or more calls.
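The five steps above can be sketched in a few lines of code. This is an illustrative sketch only; the function name `poisson_process_pmf` is an assumption and does not come from the lecture.

```python
from math import exp, factorial

def poisson_process_pmf(x, lam, t):
    """P(X = x) for a Poisson process with rate lam per unit time, observed for t units."""
    m = lam * t
    return exp(-m) * m**x / factorial(x)

lam = 4          # Step 2: four calls per minute
t = 15 / 60      # Step 3: 15 seconds expressed in minutes

# Step 5: P(X >= 3) = 1 - [P(X=0) + P(X=1) + P(X=2)]
p_less_than_3 = sum(poisson_process_pmf(x, lam, t) for x in range(3))
p_three_or_more = 1 - p_less_than_3
print(round(p_three_or_more, 4))
```

The result agrees with the hand computation up to the rounding of e⁻¹ to 0.3679.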

PROPERTIES OF THE POISSON DISTRIBUTION

Some of the main properties of the Poisson distribution are given below:
• If the random variable X has a Poisson distribution with parameter μ, then its mean and variance are given
by E(X) = μ and Var(X) = μ. (In other words, we can say that the mean of the Poisson distribution is equal to its variance.)
• The shape of the Poisson distribution is positively skewed. The distribution tends to be symmetrical as μ
becomes larger and larger.
Comparing the Poisson distribution with the binomial, we note that, whereas the binomial distribution can be
symmetric, positively skewed, or negatively skewed (depending on whether p = 1/2, p < 1/2, or p > 1/2), the Poisson
distribution can never be negatively skewed.
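These properties can be illustrated numerically. The sketch below is not from the lecture; the helper name `poisson_moments` and the truncation at 200 terms are assumptions chosen for illustration. It builds the Poisson probabilities by the recurrence P(X = k + 1) = P(X = k) · μ/(k + 1) and computes the mean, the variance and the moment-based skewness for several values of μ.

```python
from math import exp

def poisson_moments(mu, terms=200):
    """Mean, variance and skewness from the first `terms` Poisson probabilities."""
    probs = []
    p = exp(-mu)                  # P(X = 0)
    for k in range(terms):
        probs.append(p)
        p *= mu / (k + 1)         # P(X = k + 1) obtained from P(X = k)
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    third = sum((k - mean) ** 3 * pk for k, pk in enumerate(probs))
    return mean, var, third / var ** 1.5

for mu in (0.5, 2, 10, 25):
    mean, var, skew = poisson_moments(mu)
    print(f"mu={mu}: mean={mean:.3f}, variance={var:.3f}, skewness={skew:.3f}")
```

In each case the mean and variance should coincide, and the skewness should stay positive while shrinking as μ grows, which is the tendency towards symmetry described above.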


FITTING OF A POISSON DISTRIBUTION TO REAL DATA

Just as we discussed the fitting of the binomial distribution to real data in the last lecture, the Poisson distribution can
also be fitted to real-life data. The procedure is very similar to the one described in the case of the fitting of the
binomial distribution: The population mean μ is replaced by the sample mean X̄, and the probabilities of the various
values of X are computed using the Poisson formula. The chi-square test of goodness of fit enables us to determine
whether or not it is a good fit, i.e. whether or not the discrepancy between the expected frequencies and the observed
frequencies is small.
Next, we discuss some important mathematical points regarding the Poisson distribution.
1) The Poisson approximation to the binomial formula works well when
n ≥ 20 and p ≤ 0.05.
2) Suppose that the Poisson is used to approximate the binomial which, in turn, is being used to
approximate the hypergeometric. Then the Poisson is being used to approximate the hypergeometric.
Putting the two approximation conditions together, the rule of thumb is that the Poisson distribution can be
used to approximate the hypergeometric distribution when n ≤ 0.05N, n ≥ 20, and p ≤ 0.05.
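Returning to the fitting procedure described at the start of this section, the following sketch shows the mechanics on a small frequency table. The observed frequencies are hypothetical numbers chosen only for illustration, and all variable names are assumptions. For simplicity the sketch does not pool classes with small expected frequencies and does not treat the last class as "4 or more", both of which a full goodness-of-fit test would normally do.

```python
from math import exp, factorial

# Hypothetical observed frequency table: value of X -> observed frequency.
observed = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}

total = sum(observed.values())
# The sample mean replaces the unknown population mean mu.
sample_mean = sum(x * f for x, f in observed.items()) / total

def poisson_pmf(x, mu):
    """Poisson probability of x occurrences when the mean is mu."""
    return exp(-mu) * mu**x / factorial(x)

# Expected frequency = total number of observations x Poisson probability.
expected = {x: total * poisson_pmf(x, sample_mean) for x in observed}

# Chi-square statistic comparing observed and expected frequencies.
chi_square = sum((observed[x] - expected[x]) ** 2 / expected[x] for x in observed)

for x in observed:
    print(x, observed[x], round(expected[x], 2))
print("chi-square statistic:", round(chi_square, 3))
```

A small chi-square statistic indicates that the observed and expected frequencies are close; the formal decision requires comparing it with the appropriate chi-square table value.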
This brings us to the end of the discussion of some of the most important and well-known univariate discrete probability
distributions. We now begin the discussion of some of the well-known univariate continuous probability distributions.
There are different types of continuous distributions, e.g. the uniform distribution, the normal distribution,
and the exponential distribution. Each one has its own shape and its own mathematical properties. In this
course, we will discuss the uniform distribution and the normal distribution.
We begin with the continuous UNIFORM DISTRIBUTION (also known as the RECTANGULAR DISTRIBUTION).

UNIFORM DISTRIBUTION

A random variable X is said to be uniformly distributed if its density function is defined as

$$f(x) = \frac{1}{b-a}, \qquad a \le x \le b$$

The graph of this distribution is as follows

[Figure: graph of f(x) against X, a horizontal line at height 1/(b – a) between X = a and X = b]
The above function is a proper probability density function because of the fact that:
i) Since a < b, therefore f(x) > 0
ii)
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{a}^{b} \frac{1}{b-a}\,dx = \frac{1}{b-a}\Big[x\Big]_{a}^{b} = \frac{b-a}{b-a} = 1$$
Since the shape of the distribution is like that of a rectangle, therefore the total area of this distribution can also be
obtained from the simple formula:
$$\text{Area under the uniform distribution} = \text{Area of the rectangle} = (\text{Base}) \times (\text{Height}) = (b-a)\left(\frac{1}{b-a}\right) = 1$$
The distribution derives its name from the fact that its density is constant or uniform over the interval [a, b] and is 0
elsewhere. It is also called the rectangular distribution because its total probability is confined to a rectangular region
with base equal to (b – a) and height equal to 1/(b – a). The parameters of this distribution are a and b.

PROPERTIES OF THE UNIFORM DISTRIBUTION

Let X have the uniform distribution over [a, b]. Then its mean is

$$\mu = \frac{a+b}{2}$$

and its variance is

$$\sigma^{2} = \frac{(b-a)^{2}}{12}$$

The uniform probability distribution provides a model for continuous random variables that are evenly distributed over
a certain interval. That is, a uniform random variable is one that is just as likely to assume a value in one interval as it
is to assume a value in any other interval of equal size. There is no clustering of values around any value. Instead, there
is an even spread over the entire region of possible values. As far as the real-life application of the uniform
distribution is concerned, the point to be noted is that, for continuous random variables there is an infinite number of
values in the sample space, but in some cases, the values may appear to be equally likely.

EXAMPLE-1

If a short exists in a 5 meter stretch of electrical wire, it may have an equal probability of being in any particular 1
centimeter segment along the line.

EXAMPLE-2

If a safety inspector plans to choose a time at random during the 4 afternoon work-hours to pay a surprise visit to a
certain area of a plant, then each 1 minute time-interval in this 4 work-hour period will have an equally likely chance of
being selected for the visit. Also, the uniform distribution arises in the study of rounding off errors, etc.

LECTURE NO. 30
• Normal Distribution
• Mathematical Definition
• Important Properties
• The Standard Normal Distribution
• Direct Use of the Area Table
• Inverse Use of the Area Table
• Normal Approximation to the Binomial Distribution

The normal distribution was discovered in 1733. The normal distribution has a bell-shaped curve of the type shown
below:

[Figure: bell-shaped normal curve over a horizontal axis running from −∞ to ∞]
Let us begin its detailed discussion by considering its formal MATHEMATICAL DEFINITION, and its main
PROPERTIES.

NORMAL DISTRIBUTION

A continuous random variable is said to be normally distributed with mean μ and standard deviation σ if its probability
density function is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty,$$

where π ≈ 3.1416 (≈ 22/7) and e ≈ 2.71828.

For any particular value of μ and any particular value of σ, giving different values to x, we obtain a set of ordered
pairs (x, f(x)) that yield the bell-shaped curve given above. The formula of the normal distribution defines a FAMILY of
distributions depending on the values of the two parameters μ and σ (as these are the two values that determine the
shape of the distribution).
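The way the ordered pairs (x, f(x)) are generated can be sketched in code. This is a minimal illustration; the function name `normal_pdf` and the parameter values μ = 0, σ = 1 are assumptions chosen for the demonstration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))

mu, sigma = 0.0, 1.0      # illustrative parameter values
for x in (-3, -2, -1, 0, 1, 2, 3):
    print(x, round(normal_pdf(x, mu, sigma), 4))
```

Plotting these pairs for a fine grid of x-values would trace out the bell-shaped curve; changing `mu` or `sigma` produces another member of the same family.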

PROPERTIES OF THE NORMAL DISTRIBUTION

Property No. 1
It can be mathematically proved that, for the normal distribution N(μ, σ²), μ represents the mean, and σ represents the
standard deviation of the normal distribution. A change in the mean μ shifts the distribution to the left or to the right
along the x-axis:

[Figure: three normal curves along the X-axis with means μ1 < μ2 < μ3 (σ constant)]

The different values of the standard deviation σ (which is a measure of dispersion) determine the flatness or
peakedness of the normal curve. In other words, a change in the standard deviation σ flattens the curve or compresses it
while leaving its centre in the same position:

[Figure: three normal curves with a common centre and standard deviations σ1 < σ2 < σ3 (μ constant)]

Property No. 2

The normal curve is asymptotic to the x-axis as x → ±∞.

Property No. 3

Because of the symmetry of the normal curve, 50% of the area is to the right of a vertical line erected at the mean, and
50% is to the left. (Since the total area under the normal curve from −∞ to +∞ is unity, therefore the area to the left of μ
is 0.5 and the area to the right of μ is also 0.5.)

Property No. 4

The density function attains its maximum value at x = μ and falls off symmetrically on each side of μ. This is why the
mean, median and mode of the normal distribution are all equal to μ.

[Figure: normal curve over the axis from −∞ to ∞, with its peak marked Mean = Median = Mode]

Property No. 5

Since the normal distribution is absolutely symmetrical, μ₃, the third moment about the mean, is zero.

Property No. 6

For the normal distribution, it can be mathematically proved that μ₄ = 3σ⁴.

Property No. 7
The moment ratios of the normal distribution come out to be 0 and 3 respectively:

$$\beta_1 = \frac{\mu_3^{2}}{\mu_2^{3}} = \frac{0^{2}}{(\sigma^{2})^{3}} = 0, \qquad \beta_2 = \frac{\mu_4}{\mu_2^{2}} = \frac{3\sigma^{4}}{(\sigma^{2})^{2}} = 3$$

NOTE
Because, for the normal distribution, β₂ comes out to be 3, this value has been taken as a
criterion for measuring the kurtosis of any distribution: The amount of peakedness of the normal curve has been taken
as a standard, and we say that this particular distribution is mesokurtic. Any distribution for which β₂ is greater than 3
is more peaked than the normal curve, and is called leptokurtic; any distribution for which β₂ is less than 3 is less
peaked than the normal curve, and is called platykurtic.

Property No. 8
No matter what the values of μ and σ are, areas under the normal curve remain in certain fixed proportions within a
specified number of standard deviations on either side of μ.
For the normal distribution:
• The interval μ ± σ will always contain 68.26% of the total area.

[Figure: normal curve with area 0.6826 between μ – 1σ and μ + 1σ, and area 0.1587 in each tail]

• The interval μ ± 2σ will always contain 95.44% of the total area.

[Figure: normal curve with area 0.9544 between μ – 2σ and μ + 2σ, and area 0.0228 in each tail]

• The interval μ ± 3σ will always contain 99.73% of the total area.

[Figure: normal curve with area 0.9973 between μ – 3σ and μ + 3σ, and area 0.00135 in each tail]

Combining the above three results, we have:

[Figure: normal curve with the axis marked at μ – 3σ, μ – 2σ, μ – σ, μ, μ + σ, μ + 2σ and μ + 3σ, showing the central areas 68.26% and 95.44%]

At this point, the students are reminded of the Empirical Rule that was discussed during the first part of this course,
that on descriptive statistics. You will recall that, in the case of any approximately symmetric hump-shaped frequency
distribution, approximately 68% of the data-values lie between X̄ ± S, approximately 95% between X̄ ± 2S, and
approximately 100% between X̄ ± 3S. You can now recognize the similarity between the empirical rule and the
property given above. (In case a distribution is absolutely normal, the areas in the above-mentioned ranges are 68.26%,
95.44% and 99.73%; in case a distribution is approximately normal, the areas in these ranges will be approximately equal
to these percentages.)
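The fixed proportions in Property No. 8 can be reproduced with the error function from Python's standard library. This is a minimal sketch; the function name `area_within` is an assumption. It relies on the identity that the area within k standard deviations of the mean equals erf(k/√2) for every normal distribution.

```python
from math import erf, sqrt

def area_within(k):
    """Area under any normal curve between mu - k*sigma and mu + k*sigma."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The computed areas can differ from 68.26% and 95.44% by one unit in the last digit, because those figures are obtained by doubling four-decimal table entries (2 × 0.3413 and 2 × 0.4772).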

Property No. 9
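The normal curve contains points of inflection (where the direction of concavity changes) which are
equidistant from the mean. Their coordinates on the XY-plane are

$$\left(\mu - \sigma,\ \frac{1}{\sigma\sqrt{2\pi e}}\right) \quad \text{and} \quad \left(\mu + \sigma,\ \frac{1}{\sigma\sqrt{2\pi e}}\right)$$

respectively.

[Figure: normal curve with the points of inflection marked above μ – σ and μ + σ]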
Next, we consider the concept of the Standard Normal Distribution:

THE STANDARD NORMAL DISTRIBUTION

A normal distribution whose mean is zero and whose standard deviation is 1 is known as the standard normal
distribution.

[Figure: standard normal curve centred at 0, with the axis marked at −1, 0 and 1 (σ = 1)]


This distribution has a very important role in computing areas under the normal curve. The reason is that the
mathematical equation of the normal distribution is so complicated that it is not possible to find areas under the normal
curve by ordinary integration. Areas under the normal curve have to be found by the more advanced method of
numerical integration. The point to be noted is that areas under the normal curve have been computed for that particular
normal distribution whose mean is zero and whose standard deviation is equal to 1, i.e. the standard normal distribution.

Areas under the Standard Normal Curve

Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0159 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2518 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4430 0.4441
1.6 0.4452 0.4463 0.4474 0.4485 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4762 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4865 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4980 0.4980 0.4981
2.9 0.4981 0.4982 0.4983 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.49865 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.49903 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
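As noted above, the entries of this table are obtained by numerical integration. The sketch below shows one way of doing so, using Simpson's rule; this particular method and the function names are assumptions for illustration, not necessarily how the printed table was produced. Each table entry is the area under the standard normal curve between 0 and z.

```python
from math import exp, pi, sqrt

def std_normal_pdf(z):
    """Density of the standard normal distribution N(0, 1)."""
    return exp(-0.5 * z * z) / sqrt(2 * pi)

def area_0_to_z(z, steps=1000):
    """Simpson's rule estimate of the area under the standard normal curve from 0 to z."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of strips
    h = z / steps
    total = std_normal_pdf(0) + std_normal_pdf(z)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # interior points alternate weights 4 and 2
        total += weight * std_normal_pdf(i * h)
    return total * h / 3

for z in (1.0, 1.96, 2.5):
    print(z, round(area_0_to_z(z), 4))
```

The three values can be compared with the table entries in row 1.0 (column 0.00), row 1.9 (column 0.06) and row 2.5 (column 0.00).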

In any problem involving the normal distribution, the generally established procedure is that the normal distribution
under consideration is converted to the standard normal distribution. This process is called standardization.

THE PROCESS OF STANDARDIZATION
The standardization formula for converting N(μ, σ) to N(0, 1) is:

$$Z = \frac{X - \mu}{\sigma}$$

If X is N(μ, σ), then Z is N(0, 1). In other words, the standardization formula given above converts our normal
distribution to the one whose mean is 0 and whose standard deviation is equal to 1.

[Figure: standard normal curve centred at 0, with the axis marked at −1, 0 and 1 (σ = 1)]
We illustrate this concept with the help of an interesting example:

EXAMPLE

The length of life for an automatic dishwasher is approximately normally distributed with a mean life of 3.5 years and a
standard deviation of 1.0 years. If this type of dishwasher is guaranteed for 12 months, what fraction of the sales will
require replacement?
SOLUTION

Since 12 months equal one year, we need to compute the fraction or proportion of dishwashers that will cease to
function before a time-span of one year. In other words, we need to find the probability that a dishwasher fails before
one year.

[Figure: normal curve of X with mean 3.5, showing the area to the left of X = 1.0]

In order to find this area we need to standardize the normal distribution, i.e. to convert N(3.5, 1) to N(0, 1):

The method is

$$Z = \frac{X - \mu}{\sigma} = \frac{X - 3.5}{1.0}$$

The X-value representing the warranty period is 1.0, so

$$Z = \frac{1.0 - 3.5}{1.0} = \frac{-2.5}{1} = -2.5$$

[Figure: the X-axis (−∞, 1.0, 3.5) shown above the corresponding Z-axis (−∞, −2.5, 0)]
Now we need to find the area under the normal curve from Z = −∞ to Z = −2.5. Looking at the area table of the standard
normal distribution, we find that Area from 0 to 2.5 = 0.4938

[Figure: standard normal curve with area 0.4938 between 0 and 2.5]

Hence: The area from Z = 2.5 to ∞ is 0.5 − 0.4938 = 0.0062

[Figure: standard normal curve with area 0.0062 to the right of 2.5]

But, this means that the area from −∞ to −2.5 is also 0.0062, as shown in the following figure:

[Figure: standard normal curve with area 0.0062 to the left of −2.5]

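The standardization and table lookup for the dishwasher example can be cross-checked with the `NormalDist` class from Python's standard `statistics` module (Python 3.8 or later). This is an illustrative sketch; the variable names are assumptions.

```python
from statistics import NormalDist

mu, sigma = 3.5, 1.0        # mean life and standard deviation, in years
warranty = 1.0              # the 12-month guarantee, in years

z = (warranty - mu) / sigma                       # standardization: Z = (X - mu) / sigma
p_standard = NormalDist().cdf(z)                  # area to the left of z under N(0, 1)
p_direct = NormalDist(mu, sigma).cdf(warranty)    # same area computed without standardizing

print(z, round(p_standard, 4), round(p_direct, 4))
```

Both routes give the same lower-tail area, which agrees with the table-based value 0.0062 to four decimal places.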

This means that the probability of a dishwasher lasting less than a year is 0.0062, i.e. 0.62% (even less than
1%). Hence, the owner of the factory should be quite happy with the decision of placing a twelve-month guarantee on the
dishwasher!
Next, we discuss the Inverse use of the Table of Areas under the Normal Curve. In the above example, we
were required to find a certain area against a given x-value. In some situations, we are confronted with just the opposite:
we are given certain areas, and we are required to find the corresponding x-values. We illustrate this point with the
help of the following example:

EXAMPLE

The heights of applicants to the police force in a certain country are normally distributed with mean 170 cm and
standard deviation 3.8 cm. If 1000 persons apply for being inducted into the police force, and it has been decided that
not more than 70% of these applicants will be accepted (and the shortest 30% of the applicants are to be rejected), what
is the minimum acceptable height for the police force?

SOLUTION:
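We have:

[Figure: normal curve of X with mean 170 and standard deviation 3.8, over an axis from −∞ to ∞]

We need to compute the x-value to the left of which there exists 30% area.

[Figure: the same curve with 30% of the area to the left of the required x-value, 20% between it and 170, and 50% to the right of 170]

The standardization formula can be re-written as

$$X = \mu + \sigma Z$$

The Z value to the left of which there exists 30% area is obtained as follows.

[Figure: standard normal curve with area 0.5 to the left of 0, area 0.2 between 0 and z, and area 0.3 to the right of z]

By studying the figures inside the body of the area table of the standard normal distribution, we find that:
• The area between z = 0 and z = 0.52 is 0.1985, and
• The area between z = 0 and z = 0.53 is 0.2019.
Since 0.1985 is closer to 0.2000 than 0.2019, hence 0.52 is taken as the appropriate z-value.

[Figure: standard normal curve with area 0.5 to the left of 0, area 0.2 between 0 and 0.52, and area 0.3 to the right of 0.52]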

But, we are interested not in the upper 30% but the lower 30% of the applicants.
Hence, we have:

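[Figure: standard normal curve with area 0.3 to the left of −0.52, area 0.2 between −0.52 and 0, and area 0.5 to the right of 0]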

Since the normal distribution is absolutely symmetrical, hence the z-value to the left of which there exists 30% area (on
the left-hand-side of the mean) will be at exactly the same distance from the mean as the z-value to the right of which
there exists 30% area (on the right-hand-side of the mean).
Substituting z = −0.52 in the standardization formula, we obtain:
X = 170 + 3.8 Z
= 170 + 3.8 (−0.52)
= 170 − 1.976
= 168.024 ≈ 168 cm
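The inverse use of the area table corresponds to evaluating the inverse of the cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from Python's standard `statistics` module (Python 3.8 or later); the variable names are assumptions made for illustration.

```python
from statistics import NormalDist

mu, sigma = 170.0, 3.8               # mean and standard deviation of heights, in cm

z = NormalDist().inv_cdf(0.30)       # z-value with 30% of the area to its left
min_height = mu + sigma * z          # X = mu + sigma * Z

print(round(z, 4), round(min_height, 2))
```

Because the computed z-value is not rounded to two decimals as in the table lookup, the resulting height differs slightly from 168.024 but still rounds to 168 cm.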
Hence, the minimum acceptable height for the police force is 168 cm.
Just as binomial, Poisson and other discrete distributions can be fitted to real-life data, the normal distribution can
also be FITTED to real data.
This can be done by equating μ to X̄, the mean computed from the observed frequency distribution (based on sample
data), and σ to S, the standard deviation of the observed frequency distribution. Of course, this should be done only if
we are reasonably sure that the shape of the observed frequency distribution is quite similar to that of the normal
distribution. As indicated in the case of the fitting of the binomial distribution to real data, in order to decide whether
or not our fitted normal distribution is a reasonably good fit, the proper statistical procedure is the Chi-square Test of
Goodness of Fit.

NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

The probability for a binomial random variable X to take the value x is

$$f(x) = \binom{n}{x}\, p^{x} q^{n-x},$$

for 0 ≤ x ≤ n and q + p = 1.
The above formula becomes cumbersome to apply if n is LARGE. In such a situation, the binomial distribution can be
quite closely approximated by the normal distribution, as long as neither p nor q is close to zero. As a rule of thumb,
the normal distribution provides a reasonable approximation to the binomial distribution if both np and nq are equal to
or greater than 5, i.e.
np ≥ 5 and nq ≥ 5

EXAMPLE:

Suppose that past records indicate that, in a particular province of an under-developed country, the death rate from
Malaria is 20%. Find the probability that in a particular village of that particular province, the number of deaths is
between 70 and 80 (inclusive) out of a total of 500 patients of Malaria.

SOLUTION:

Regarding ‘death from Malaria’ as success, we have

n = 500
and p = 0.20.

It is obvious that it is very cumbersome to apply the binomial formula in order to compute P(70 ≤ X ≤ 80).
In this problem,
np = 500(0.2) = 100 ≫ 5, and nq = 500(0.8) = 400 ≫ 5,

therefore we can happily apply the normal approximation to the binomial distribution. In order to apply the normal
approximation to the binomial, we need to keep in mind the following two points:
1) The first point is: The mean and variance of the binomial distribution valid in our problem will be regarded as the
mean and variance of the normal distribution that will be used to approximate the binomial distribution.
In this problem, we have:

$$\mu = np = 500 \times 0.20 = 100 \quad \text{and} \quad \sigma^{2} = npq = 500 \times 0.20 \times 0.80 = 80$$

Hence

$$\sigma = \sqrt{npq} = \sqrt{80} = 8.94$$
2) The second important point is:

We need to apply a correction that is known as the Continuity Correction. The rationale for this correction is as follows:
The binomial distribution is essentially a discrete distribution whereas the normal distribution is a continuous
distribution, i.e.:

[Figure: BINOMIAL DISTRIBUTION]

[Figure: NORMAL DISTRIBUTION]

In applying the normal approximation to the binomial, we have the following situation:

[Figure: THE NORMAL DISTRIBUTION SUPERIMPOSED ON THE BINOMIAL DISTRIBUTION]

But, the question arises: “How can a set of distinct vertical lines be replaced by a continuous curve?”
In order to overcome this problem, what we do is to replace every integral value x of our binomial random variable by
an interval x − 0.5 to x + 0.5. By doing so, we will have the following situation:
The x-value 70 is replaced by the interval 69.5 - 70.5,
The x-value 71 is replaced by the interval 70.5 - 71.5,
The x-value 72 is replaced by the interval 71.5 - 72.5,
..................
The x-value 80 is replaced by the interval 79.5 - 80.5.
Hence:
Applying the continuity correction,
P(70 ≤ X ≤ 80)
is replaced by
P(69.5 < X < 80.5).
Accordingly, the area that we need to compute is the area under the normal curve between the values 69.5 and 80.5.
It is left to the students to compute this area, and thus determine the required probability. (This computation
involves a few steps.)

By doing so, the students will find that, in that particular village of that province, the probability that the number of
deaths from Malaria in a sample of 500 lies between 70 and 80 (inclusive) is approximately 0.014, i.e. about 1½%.
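The computation left to the students can be checked with a short sketch. The code below is illustrative only (variable names are assumptions). It evaluates the normal approximation with the continuity correction using `NormalDist` from Python's standard `statistics` module, and, for comparison, sums the exact binomial probabilities for x = 70, 71, ..., 80.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.20
q = 1 - p
mu = n * p                       # mean of the approximating normal distribution
sigma = sqrt(n * p * q)          # its standard deviation, sqrt(80)

# Normal approximation with the continuity correction: area between 69.5 and 80.5.
approx_dist = NormalDist(mu, sigma)
normal_approx = approx_dist.cdf(80.5) - approx_dist.cdf(69.5)

# Exact binomial probability P(70 <= X <= 80), summed term by term.
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(70, 81))

print(round(normal_approx, 4), round(exact, 4))
```

The two figures should be of the same order (roughly one to one and a half percent), which illustrates how closely the normal curve tracks the binomial probabilities in this case.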
This brings us to the end of the second part of this course, i.e. Probability Theory.
In the next lecture, we will begin the third and last portion of this course, i.e. Inferential Statistics, that area
of Statistics which enables us to draw conclusions about various phenomena on the basis of data collected on a sample
basis.

Lesson Plan - Pivot Table
No ratings yet
Lesson Plan - Pivot Table
8 pages
Data Sheet Apaq c130 RTD en
No ratings yet
Data Sheet Apaq c130 RTD en
3 pages
Application ID 8394A00221 Do Id 8394A002211: Customer Name Nischay Nischay
No ratings yet
Application ID 8394A00221 Do Id 8394A002211: Customer Name Nischay Nischay
2 pages