Statistical Models and Data Classification
MCQ No 1.1
The science of collecting, organizing, presenting, analyzing and interpreting data to
assist in making more effective decisions is called:
(a) Statistic (b) Parameter (c) Population (d) Statistics
MCQ No 1.2
Methods of organizing, summarizing, and presenting data in an informative way are called:
(a) Descriptive statistics (b) Inferential statistics (c) Theoretical statistics (d) Applied statistics
MCQ No 1.3
The methods used to determine something about a population on the basis of a sample
are called:
(a) Inferential statistics (b) Descriptive statistics (c) Applied statistics (d) Theoretical statistics
MCQ No 1.4
When the characteristic being studied is nonnumeric, it is called a:
(a) Quantitative variable (b) Qualitative variable (c) Discrete variable (d) Continuous variable
MCQ No 1.5
When the variable studied can be reported numerically, the variable is called a:
(a) Quantitative variable (b) Qualitative variable (c) Independent variable (d) Dependent variable
MCQ No 1.6
A specific characteristic of a population is called:
(a) Statistic (b) Parameter (c) Variable (d) Sample
MCQ No 1.7
A specific characteristic of a sample is called:
(a) Variable (b) Constant (c) Parameter (d) Statistic
MCQ No 1.8
A set of all units of interest in a study is called:
(a) Sample (b) Population (c) Parameter (d) Statistic
MCQ No 1.9
A part of the population selected for study is called a:
(a) Variable (b) Data (c) Sample (d) Parameter
MCQ No 1.10
Listing of the data in order of numerical magnitude is called:
(a) Raw data (b) Arrayed data (c) Discrete data (d) Continuous data
MCQ No 1.11
Listings of the data in the form in which these are collected are known as:
(a) Secondary data (b) Raw data (c) Arrayed data (d) Qualitative data
MCQ No 1.12
Data that are collected by anybody for some specific purpose and use are called:
(a) Qualitative data (b) Primary data (c) Secondary data (d) Continuous data
MCQ No 1.13
Data that have undergone any previous treatment are called:
(a) Primary data (b) Secondary data (c) Symmetric data (d) Skewed data
MCQ No 1.14
The data obtained by conducting a survey is called:
(a) Primary data (b) Secondary data (c) Continuous data (d) Qualitative data
MCQ No 1.15
The data collected from published reports is known as:
(a) Discrete data (b) Arrayed data (c) Secondary data (d) Primary data
MCQ No 1.16
A survey in which information is collected from each and every individual of the population is
known as:
(a) Sample survey (b) Pilot survey (c) Biased survey (d) Census survey
MCQ No 1.17
Data used by an agency which originally collected them are:
(a) Primary data (b) Raw data (c) Secondary data (d) Grouped data
MCQ No 1.18
Registration is the source of:
(a) Primary data (b) Secondary data (c) Qualitative data (d) Continuous data
MCQ No 1.19
Data in the population census reports are:
(a) Ungrouped data (b) Secondary data (c) Primary data (d) Arrayed data
MCQ No 1.20
Issuing a national identity card is an example of:
(a) Sampling (b) Statistic (c) Census (d) Registration
MCQ No 1.21
A variable that assumes only some selected values in a range is called:
(a) Continuous variable (b) Quantitative variable (c) Discrete variable (d) Qualitative variable
MCQ No 1.22
A variable that assumes any value within a range is called:
(a) Discrete variable (b) Continuous variable (c) Independent variable (d) Dependent variable
MCQ No 1.23
A variable that provides the basis for estimation is called:
(a) Dependent variable (b) Independent variable (c) Continuous variable (d) Qualitative variable
MCQ No 1.24
The variable that is being predicted or estimated is called:
(a) Dependent variable (b) Independent variable (c) Discrete variable (d) Continuous variable
MCQ No 1.25
Monthly rainfall in a city during the last ten years is an example of a:
(a) Discrete variable (b) Continuous variable (c) Qualitative variable (d) Independent variable
MCQ No 1.26:
The proportion of females in a sample of 50 accounts officers is an example of a:
(a) Parameter (b) Statistic (c) Array (d) Variable
MCQ No 1.27:
Number of family members in different families in a town is an example of a:
(a) Discrete variable (b) Continuous variable (c) Dependent variable (d) Qualitative variable
MCQ No 1.28
Colours of flowers are an example of:
(a) Quantitative variable (b) Qualitative variable (c) Skewed variable (d) Symmetric variable
MCQ No 1.29
If each measurement in a data set falls into one and only one of a set of categories,
the data set is called:
(a) Quantitative (b) Qualitative (c) Continuous (d) Constant
MCQ No 1.30
Any phenomenon which is not measurable is called:
(a) Variable (b) Constant (c) Parameter (d) Attribute
MCQ No 1.31
A constant can assume values:
(a) Zero (b) One (c) Fixed (d) Not fixed
MCQ No 1.32
A value which does not change from one individual to another individual is called:
(a) Variable (b) Statistic (c) Constant (d) Array
MCQ No 1.33
In the plural sense, statistics means:
(a) Numerical data (b) Methods (c) Population data (d) Sample data
MCQ No 1.34
In the singular sense, statistics means:
(a) Methods (b) Numerical data (c) Sample data (d) Population data
MCQ No 1.35
Weight of earth is:
(a) Discrete variable (b) Qualitative variable (c) Continuous variable (d) Difficult to tell
MCQ No 1.36
Weights of students in a class are:
(a) Discrete data (b) Continuous data (c) Qualitative data (d) Constant data
MCQ No 1.37
Life of a T.V tube is a:
(a) Discrete variable (b) Continuous variable (c) Qualitative variable (d) Constant
MCQ No 1.38
Questionnaire method is used in collecting:
(a) Primary data (b) Secondary data (c) Published data (d) True data
MCQ No 1.39
Census returns are:
(a) Primary data (b) Secondary data (c) Qualitative data (d) True data
MCQ No 1.40
Students divided into different groups according to their intelligence and gender
will generate:
(a) Quantitative data (b) Qualitative data (c) Continuous data (d) Constant
MCQ No 1.41
Statistics are:
(a) Aggregate of facts and figures (b) Always true (c) Always continuous (d) Always qualitative
MCQ No 1.42
Statistics results are:
(a) Randomly true (b) Always true (c) Not true (d) True on average
MCQ No 1.44
A statistical population may consist of:
(a) Finite number of values (b) Infinite number of values
(c) Either of (a) and (b) (d) None of (a) and (b)
MCQ No 1.45
The only continuous variable here is:
(a) Rainfall on different days in a city (b) Number of customers entering a store on different days
(c) Number of flights landing on an airport on different days (d) None of them
MCQ No 1.46
Example of descriptive statistics is:
(a) 70% people in Pakistan live in rural areas. (b) 50% people are likely to vote in the national
election (c) 20% of the bulbs produced in a factory will be defective (d) Difficult to tell.
MCQ No 1.47
Example of inferential statistics is:
(a) Percentage of smokers in Pakistan (b) Percentage of skilled workers in a factory.
(c) Estimate of increase in prices in the next year (d) None of the above
MCQ No 1.48
Statistics are always:
(a) Exact (b) Estimated values (c) Constant (d) Population values
MCQ No 1.49
Statistics must be:
(a) Comparable (b) Not comparable (c) Discrete in nature (d) Qualitative in nature
MCQ No 1.50
Given 6 quantities, X1 through X6, the correct notation for adding quantities 3 through 6 is:
MCQ No 1.51
(a) Add all quantities from Y1 through Yn (b) Add all quantities from Y=2 through Yn
(c) Add all quantities from Y=2 through Y=n (d) Add all quantities from Y2 through Yn
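For reference, the two sums discussed in MCQ 1.50 and MCQ 1.51 can be written in standard sigma notation as sketched below. The index name i is an assumption, since the original notation did not survive in the text.

```latex
\sum_{i=3}^{6} X_i = X_3 + X_4 + X_5 + X_6,
\qquad
\sum_{i=2}^{n} Y_i = Y_2 + Y_3 + \cdots + Y_n
```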
MCQ No 1.53
MCQ No 1.54
The figure 22.25 rounded to one decimal place is:
(a) 22.3 (b) 22.1 (c) 22.2 (d) 22
MCQ No 1.55
The figure 22.15 rounded to one decimal place is:
(a) 22.2 (b) 22.1 (c) 22 (d) 22.3
MCQ No 1.56
The figure 22.26 rounded to one decimal place is:
(a) 22.2 (b) 22.3 (c) 22.1 (d) 22
MCQ No 1.57
The figure 22.24 rounded to one decimal place is:
(a) 22.2 (b) 22.3 (c) 22.1 (d) 22
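The four rounding questions above can be checked with a short sketch. It assumes the round-half-to-even convention (a figure ending in exactly 5 is rounded to the nearest even digit); the questions themselves do not name a rule, so treat that choice as an assumption. The helper name round_one_decimal is made up for this illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_one_decimal(figure: str) -> Decimal:
    """Round a decimal string to one place using round-half-to-even."""
    # Decimal is used instead of float because 22.15 has no exact binary
    # representation, which would make the tie case behave unpredictably.
    return Decimal(figure).quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)


for figure in ("22.25", "22.15", "22.26", "22.24"):
    print(figure, "->", round_one_decimal(figure))
```

Under a round-half-up convention the first figure would round differently, so the convention has to be fixed before answering.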
MCQ No 1.58
How many methods are used for the collection of data?
(a) 4 (b) 3 (c) 2 (d) 1
MCQs ON PRESENTATION OF DATA
MCQ No 2.1:
When data are classified according to a single characteristic, it is called:
(a) Quantitative classification (b) Qualitative classification
(c) Area classification (d) Simple classification
MCQ No 2.2:
Classification of data by attributes is called:
(a) Quantitative classification (b) Chronological classification
(c) Qualitative classification (d) Geographical classification
MCQ No 2.3:
Classification of data according to location or areas is called:
(a) Qualitative classification (b) Quantitative classification
(c) Geographical classification (d) Chronological classification
MCQ No 2.4:
Classification is applicable in case of:
(a) Normal characters (b) Quantitative characters (c) Qualitative characters (d) Both (b) and (c)
MCQ No 2.5:
In classification, the data are arranged according to:
(a) Similarities (b) Differences (c) Percentages (d) Ratios
MCQ No 2.6:
When data are arranged at regular intervals of time, the classification is called:
(a) Qualitative (b) Quantitative (c) Chronological (d) Geographical
MCQ No 2.7:
When an attribute has more than three levels it is called:
(a) Manifold-division (b) Dichotomy (c) One-way (d) Bivariate
MCQ No 2.8:
The series
Country Pakistan India Britain Egypt Japan
Birth rate 45 40 10 35 10
is of the type:
(a) Discrete (b) Continuous (c) Individual (d) Time series
MCQ No 2.9:
The series
Country Pakistan India Britain Egypt Japan
Death rate 15 16 10 12 10
is of the type:
(a) Inclusive (b) Exclusive (c) Geographical (d) Time series
MCQ No 2.10
In an array, the data are:
(a) In ascending order (b) In descending order (c) Either (a) or (b) (d) Neither (a) nor (b)
MCQ No 2.11
The tally-sheet count for each value or group is called:
(a) Class limit (b) Class width (c) Class boundary (d) Frequency
MCQ No 2.12
The frequency distribution according to individual variate values is called:
(a) Discrete frequency distribution (b) Cumulative frequency distribution
(c) Percentage frequency distribution (d) Continuous frequency distribution
MCQ No 2.13
A series arranged according to each and every item is known as:
(a) Discrete series (b) Continuous series (c) Individual series (d) Time series
MCQ No 2.14
A frequency distribution can be:
(a) Qualitative (b) Discrete (c) Continuous (d) Both (b) and (c)
MCQ No 2.15
The following frequency distribution:
X 5 15 38 47 68
f 2 4 9 3 1
is classified as:
(a) Relative frequency distribution (b) Continuous distribution
(c) Percentage frequency distribution (d) Discrete distribution
MCQ No 2.16
Frequency distribution is often constructed with the help of:
(a) Entry table (b) Tally sheet (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ No 2.17
The data given as 3, 5, 15, 35, 70, 84, 96 will be called as:
(a) Individual series (b) Discrete series (c) Continuous series (d) Time series
MCQ No 2.18
Frequency of a variable is always in:
(a) Fraction form (b) Percentage form (c) Less than form (d) Integer form
MCQ No 2.19
Data arranged in ascending or descending order of magnitude is called:
(a) Ungrouped data (b) Grouped data (c) Discrete frequency distribution (d) Arrayed data
MCQ No 2.20
The grouped data are called:
(a) Primary data (b) Secondary data (c) Raw data (d) Difficult to tell
MCQ No 2.21
A series of data with exclusive classes along with the corresponding frequencies is called:
(a) Discrete frequency distribution (b) Continuous frequency distribution
(c) Percentage frequency distribution (d) Cumulative frequency distribution
MCQ No 2.22
In an exclusive classification, the limits excluded are:
(a) Upper limits (b) Lower limits (c) Both lower and upper limits (d) Either lower or upper limits
MCQ No 2.23
The series
Weights(pounds) 15----20 20----25 25----30 30----35 35----40
No. of items 10 15 30 10 5
is categorized as:
(a) Continuous series (b) Discrete series (c) Time series (d) Geometric series
MCQ No 2.24
The series
Year 2007 2008 2009 2010 2011
Profit (000 Rs.) 7 10 16 18 22
will be called as:
(a) Time series (b) Discrete series (c) Continuous series (d) Individual series
MCQ No 2.25:
The suitable formula for computing the number of classes is:
(a) 3.322 logN (b) 0.322 logN (c) 1+3.322 logN (d) 1- 3.322 logN
MCQ No 2.26:
The number of classes in a frequency distribution is obtained by dividing the range of variable by
the:
(a) Total frequency (b) Class interval (c) Mid-point (d) Relative frequency
MCQ No 2.27:
If the number of workers in a factory is 256, the number of classes will be:
(a) 8 (b) 9 (c) 10 (d) 12
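The rule of thumb 1 + 3.322 log N listed in MCQ 2.25 (commonly known as Sturges' rule, with the logarithm taken to base 10) can be evaluated directly. The sketch below is a minimal illustration; rounding the result to the nearest whole number is an assumption, and the function name sturges_classes is hypothetical.

```python
import math


def sturges_classes(n: int) -> int:
    """Approximate number of classes for n observations: 1 + 3.322 * log10(n)."""
    k = 1 + 3.322 * math.log10(n)
    # A class count must be a whole number; rounding to nearest is assumed here.
    return round(k)


print(sturges_classes(256))
```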
MCQ No 2.28:
The largest and the smallest values of any given class of a frequency distribution are called:
(a) Class Intervals (b) Class marks (c) Class boundaries (d) Class limits
MCQ No 2.29
If there are no gaps between consecutive classes, the limits are called:
(a) Class limits (b) Class boundaries (c) Class intervals (d) Class marks
MCQ No 2.30
The extreme values used to describe the different classes in a frequency distribution are called:
(a) Class intervals (b) Class boundaries (c) Class limits (d) Cumulative frequency
MCQ No 2.31
If in a frequency table, either the lower limit of first class or the upper limit of last class is not a fixed
number, then classes are called:
(a) One-way classes (b) Two-way classes (c) Discrete classes (d) Open-end classes
MCQ No 2.32
The class boundaries can be taken when the nature of variable is:
(a) Discrete (b) Continuous (c) Both (a) and (b) (d) Qualitative
MCQ No 2.33
Class boundaries are also called:
(a) Mathematical limits (b) Arithmetic limits (c) Geometric limits (d) Qualitative limits
MCQ No 2.34
The average of lower and upper class limits is called:
(a) Class boundary (b) Class frequency (c) Class mark (d) Class limit
MCQ No 2.35
If the lower and upper class limits are 20 and 30, the midpoint of the class is:
(a) 20 (b) 25 (c) 30 (d) 50
MCQ No 2.36
A frequency distribution that contains a class with limits of "10 and under 20" would have a midpoint:
(a) 10 (b) 14.9 (c) 15 (d) 20
MCQ No 2.37
The number of workers in a factory is 128, and the maximum and minimum hourly wages are 100 and 20
respectively. For the frequency distribution of hourly wages, the class interval is:
(a) 8 (b) 9 (c) 10 (d) 80
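One way to work through a question like MCQ 2.37 is to estimate the number of classes with 1 + 3.322 log N and then divide the range by that count. The sketch below assumes that approach and rounds the class count to the nearest whole number; the function name class_width is hypothetical.

```python
import math


def class_width(n: int, maximum: float, minimum: float) -> float:
    """Class interval = range / number of classes (classes from 1 + 3.322 log10 n)."""
    classes = round(1 + 3.322 * math.log10(n))  # assumed rounding to nearest
    return (maximum - minimum) / classes


print(class_width(128, 100, 20))
```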
MCQ No 2.38
Width of interval h is equal to:
MCQ No 2.39
Length of interval is calculated as:
(a) The difference between upper limit and lower limit (b) The sum of upper limit and lower limit
(c) Half of the difference between upper limit and lower limit (d) Half of the sum of upper limit and lower limit
MCQ No 2.40
The class marks are given below:
10,12,14,16,18. The first class of the distribution is:
(a) 9----12 (b) 10.5----12.5 (c) 9----11 (d) 10----12
MCQ No 2.41
If the midpoints are 10, 15, 20, 25 and 30, the last class of the distribution, in terms of class boundaries, is:
(a) 25----30 (b) 27.5----32.5 (c) 20----35 (d) 30----35
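Recovering classes from class marks, as in MCQ 2.40 and 2.41, amounts to stepping half a class width to either side of each midpoint. The sketch below assumes equally spaced class marks, so the common width equals the gap between the first two marks; classes_from_marks is a hypothetical helper name.

```python
def classes_from_marks(marks):
    """Return (lower, upper) pairs for equally spaced class marks."""
    width = marks[1] - marks[0]  # assumes every class has the same width
    return [(m - width / 2, m + width / 2) for m in marks]


print(classes_from_marks([10, 12, 14, 16, 18])[0])   # first class
print(classes_from_marks([10, 15, 20, 25, 30])[-1])  # last class
```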
MCQ No 2.42
The number of classes depends upon:
(a) Class marks (b) Frequency (c) Class interval (d) Class boundary
MCQ No 2.43
The class interval is the difference between:
(a) Two extreme values (b) Two successive frequencies
(c) Two successive upper limits (d) Two largest values
MCQ No 2.44
When the classes are 40----44, 45----49, 50----54, ... the class interval is:
(a) 4 (b) (c) 100 (d) 5
MCQ No 2.45:
A grouping of data into mutually exclusive classes showing the number of observations in each class
is called:
(a) Frequency polygon (b) Relative frequency
(c) Frequency distribution (d) Cumulative frequency
MCQ No 2.46:
The following frequency distribution
Classes     Less than 2   Less than 4   Less than 6   Less than 8   Less than 10
Frequency   2             6             16            19            20
is classified as:
(a) Inclusive classification (b) Exclusive classification
(c) Discrete classification (d) Cross classification
MCQ No 2.47:
The following frequency distribution
Classes 10----20 20----30 30----40 40----50 50----60
Frequency 2 4 6 4 2
is classified as:
(a) Exclusive classification (b) Inclusive classification
(c) Geographical classification (d) Two-way classification
MCQ No 2.48:
The following frequency distribution
Classes 0----4 5----9 10----14 15----19 20----24
Frequency 2 3 7 5 3
is classified as:
(a) Multiple classification (b) Qualitative classification
(c) Inclusive classification (d) Exclusive classification
MCQ No 2.49:
The following frequency distribution
Classes     More than 2   More than 4   More than 6   More than 8   More than 10
Frequency   2             6             16            19            20
is classified as:
(a) Geographical classification (b) Chronological classification
(c) Inclusive classification (d) Exclusive classification
MCQ No 2.50:
The class frequency divided by the total number of observations is called:
(a) Percentage frequency (b) Relative frequency
(c) Cumulative frequency (d) Bivariate frequency
MCQ No 2.51:
The relative frequency multiplied by 100 is called:
(a) Percentage frequency (b) Cumulative frequency
(c) Bivariate frequency (d) Simple frequency
MCQ No 2.52
In a relative frequency distribution, the total of the relative frequencies is:
(a) 100 (b) One (c) ∑f (d) ∑ X
MCQ No 2.53:
In a percentage frequency distribution, the total of the percentage frequencies is always equal to:
(a) 1 (b) ∑f (c) 100% (d) ∑X
MCQ No 2.54
The cumulative frequency of first group in more than cumulative frequency distribution is always equal to:
(a) 1 (b) 100 (c) ∑f (d) ∑X
MCQ No 2.55
The cumulative frequency of last class in less than cumulative frequency distribution is always equal to:
(a) ∑f (b) ∑X (c) 1 (d) 100
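The totals asked about in MCQ 2.52 to 2.55 can be verified numerically. The sketch below reuses the class frequencies from MCQ 2.47 purely as sample input; the variable names are made up for this illustration.

```python
from itertools import accumulate

frequencies = [2, 4, 6, 4, 2]  # class frequencies taken from MCQ 2.47
total = sum(frequencies)       # this is the quantity written as sum of f

relative = [f / total for f in frequencies]    # class frequency / total
percentage = [100 * r for r in relative]       # relative frequency x 100

# "Less than" cumulation runs from the first class downward;
# "more than" cumulation runs from the last class upward.
less_than = list(accumulate(frequencies))
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(sum(relative), sum(percentage))
print(less_than[-1], more_than[0], total)
```

The last "less than" value and the first "more than" value both equal the total frequency, because each of them counts every observation.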
MCQ No 2.56:
The following frequency distribution:
Classes     Less than 10   Less than 20   Less than 30   Less than 40   Less than 50
Frequency   2              6              16             19             20
is classified as:
(a) Less than cumulative frequency distribution (b) More than cumulative frequency distribution
(c) Discrete frequency distribution (d) Cumulative percentage frequency distribution
MCQ No 2.57:
The following frequency distribution
Classes 50----55 55----60 60----65 65----70 70----75
Frequency 40 36 30 16 4
is classified as:
(a) Relative frequency distribution (b) Less than cumulative frequency distribution
(c) More than cumulative frequency distribution (d) Bivariate frequency distribution
MCQ No 2.58
A frequency distribution formed considering two variables at a time is called:
(a) Univariate frequency distribution (b) Bivariate frequency distribution
(c) Trivariate frequency distribution (d) Bimodal distribution
MCQ No 2.59
The sum of rows, or the sum of columns, of a bivariate frequency distribution is equal to:
(a) ∑X (b) ∑fX (c) ∑(f+X) (d) ∑f
MCQ No 2.60:
The arrangement of data in rows and columns is called:
(a) Classification (b) Tabulation (c) Frequency distribution (d) Cumulative frequency distribution
MCQ No 2.61:
When the qualitative or quantitative raw data are classified according to one characteristic, the
tabulation of different groups is called:
(a) Dichotomy (b) Manifold-division (c) Bivariate (d) One-way
MCQ No 2.62
A statistical table consists of at least:
(a) Two parts (b) Three parts (c) Four parts (d) Five parts
MCQ No 2.63
In a statistical table, prefatory note is shown:
(a) Below the body (b) Box head (c) Foot note (d) Below the title
MCQ No 2.64
A source note in a statistical table is given:
(a) At the end of a table (b) In the beginning of a table
(c) In the middle of a table (d) Below the body of a table
MCQ No 2.65
In a statistical table, column captions are called:
(a) Box head (b) Stub (c) Body (d) Title
MCQ No 2.66
In a statistical table, row captions are called:
(a) Box head (b) Stub (c) Body (d) Title
MCQ No 2.67:
The headings of the rows of a table are called:
(a) Prefatory notes (b) Titles (c) Stubs (d) Captions
MCQ No 2.68:
The headings of the columns of a table are called:
(a) Stubs (b) Captions (c) Footnotes (d) Source notes
MCQ No 2.69:
The budgets of two families can be compared by:
(a) Sub-divided rectangles (b) Pie diagram (c) Both (a) and (b) (d) Histogram
MCQ No 2.70:
Total angle of the pie-chart is:
(a) 45 (b) 90 (c) 180 (d) 360
MCQ No 2.71:
Diagrams are another form of:
(a) Classification (b) Tabulation (c) Angle (d) Percentage
MCQ No 2.72
In pie diagram, the angle of a sub-sector is obtained as:
MCQ No 2.73:
A pie diagram is represented by a:
(a) Rectangle (b) Circle (c) Triangle (d) Square
MCQ No 2.74:
A sector diagram is also called:
(a) Bar diagram (b) Histogram (c) Historigram (d) Pie diagram
MCQ No 2.75:
Which of the following is not a one-dimensional diagram:
(a) Simple bar diagram (b) Multiple bar diagram
(c) Component bar diagram (d) Pie diagram
MCQ No 2.76:
Which of the following is a two-dimensional diagram:
(a) Sub-divided bar (b) Percentage component bar chart
(c) Sub-divided rectangles (d) Multiple bar diagram
MCQ No 2.77:
Pie diagram represents the components of a factor by:
(a) Circles (b) Sectors (c) Angles (d) Percentages
MCQ No 2.78:
The suitable diagram to represent the data relating to the monthly expenditure on different items by a
family is:
(a) Historigram (b) Histogram (c) Multiple bar diagram (d) Pie diagram
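For a pie diagram of family expenditure, each item's sector angle is usually taken as its share of the total multiplied by 360 degrees. The sketch below illustrates that relation with a made-up budget; the item names and amounts are hypothetical and do not come from the questions.

```python
# Hypothetical monthly expenditure (any currency unit); not from the source.
budget = {"Food": 50, "Rent": 25, "Clothing": 15, "Other": 10}

total = sum(budget.values())
# Sector angle = (component / total) * 360 degrees
angles = {item: amount / total * 360 for item, amount in budget.items()}

for item, angle in angles.items():
    print(f"{item}: {angle:.1f} degrees")
```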
MCQ No 2.79
A graph of time series or historical series is called:
(a) Histogram (b) Historigram (c) Frequency curve (d) Frequency polygon
MCQ No 2.80
The historigram is the graphical presentation of data which are classified:
(a) Geographically (b) Numerically (c) Qualitatively (d) According to time
MCQ No 2.81
Historigram and histogram are:
(a) Always same (b) Not same (c) Off and on same (d) Randomly same
MCQ No 2.82
A distribution in which the observations are concentrated at one end of the distribution is called a:
(a) Symmetric distribution (b) Normal distribution
(c) Skewed distribution (d) Uniform distribution
MCQ No 2.83
For graphic presentation of a frequency distribution, the paper to be used is:
(a) Carbon paper (b) Ordinary paper (c) Graph paper (d) Butter paper
MCQ No 2.84
Histogram can be drawn only for:
(a) Discrete frequency distribution (b) Continuous frequency distribution
(c) Cumulative frequency distribution (d) Relative frequency distribution
MCQ No 2.85
Histogram is a graph of:
(a) Frequency distribution (b) Time series (c) Qualitative data (d) Ogive
MCQ No 2.86
Histogram and frequency polygon are two graphical representations of:
(a) Frequency distribution (b) Class boundaries (c) Class intervals (d) Class marks
MCQ No 2.87
Frequency polygon can be drawn with the help of:
(a) Historigram (b) Histogram (c) Circle (d) Percentage
MCQ No 2.88
In a cumulative frequency polygon, the cumulative frequency of each class is plotted against:
(a) Mid-point (b) Lower class boundary (c) Upper class boundary (d) Upper class limit
MCQ No 2.89
The graph of the cumulative frequency distribution is called:
(a) Histogram (b) Frequency polygon (c) Pictogram (d) Ogive
MCQ No 2.90
When successive mid-points in a histogram are connected by straight lines, the graph is called a:
(a) Historigram (b) Ogive (c) Frequency curve (d) Frequency polygon
MCQ No 2.91
A frequency polygon is a closed figure which is:
(a) One sided (b) Two sided (c) Three sided (d) Many sided
MCQ No 2.92
An ogive can be drawn for a distribution of:
(a) Less than type (b) More than type (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ No 2.93
The word ogive is also used for:
(a) Frequency polygon (b) Cumulative frequency polygon
(c) Frequency curve (d) Histogram
MCQ No 2.94
Cumulative frequency polygon can be used for the calculation of:
(a) Mean (b) Median (c) Mode (d) Geometric mean
MCQs ON MEASURES OF CENTRAL TENDENCY
MCQ No 3.1
Any measure indicating the centre of a set of data, arranged in an increasing or decreasing order of
magnitude, is called a measure of:
(a) Skewness (b) Symmetry (c) Central tendency (d) Dispersion
MCQ No 3.2
Scores that differ greatly from the measures of central tendency are called:
(a) Raw scores (b) The best scores (c) Extreme scores (d) Z-scores
MCQ No 3.3
The measure of central tendency listed below is:
(a) The raw score (b) The mean (c) The range (d) Standard deviation
MCQ No 3.4
The total of all the observations divided by the number of observations is called:
(a) Arithmetic mean (b) Geometric mean (c) Median (d) Harmonic mean
MCQ No 3.5
While computing the arithmetic mean of a frequency distribution, each value in a class is
considered equal to:
(a) Class mark (b) Lower limit (c) Upper limit (d) Lower class boundary
MCQ No 3.6
Change of origin and scale is used for calculation of the:
(a) Arithmetic mean (b) Geometric mean
(c) Weighted mean (d) Lower and upper quartiles
MCQ No 3.7
The sample mean is a:
(a) Parameter (b) Statistic (c) Variable (d) Constant
MCQ No 3.8
The population mean µ is called:
(a) Discrete variable (b) Continuous variable (c) Parameter (d) Sampling unit
MCQ No 3.9
The arithmetic mean is highly affected by:
(a) Moderate values (b) Extremely small values
(c) Odd values (d) Extremely large values
MCQ No 3.10
The sample mean is calculated by the formula:
MCQ No 3.11
If a constant value is added to every observation of data, then arithmetic mean is obtained
by:
(a) Subtracting the constant (b) Adding the constant
(c) Multiplying the constant (d) Dividing the constant
MCQ No 3.12
Which of the following statements is always true?
(a) The mean has an effect on extreme scores (b) The median has an effect on extreme scores
(c) Extreme scores have an effect on the mean (d) Extreme scores have an effect on the median
MCQ No 3.13
The elimination of extreme scores at the bottom of the set has the effect of:
(a) Lowering the mean (b) Raising the mean (c) No effect (d) None of the above
MCQ No 3.14
The elimination of extreme scores at the top of the set has the effect of:
(a) Lowering the mean (b) Raising the mean (c) No effect (d) Difficult to tell
MCQ No 3.15
The sum of deviations taken from mean is:
(a) Always equal to zero (b) Sometimes equal to zero
(c) Never equal to zero (d) Less than zero
MCQ No 3.16
If X̄ = 25, which of the following will be minimum:
(a) ∑(X – 27)² (b) ∑(X – 25)² (c) ∑(X – 22)² (d) ∑(X + 25)²
MCQ No 3.17
The sum of the squares of the deviations about mean is:
(a) Zero (b) Maximum (c) Minimum (d) All of the above
MCQ No 3.18
MCQ No 3.19
For a certain distribution, if ∑(X – 20) = 25, ∑(X – 25) = 0, and ∑(X – 35) = –25, then X̄ is
equal to:
(a) 20 (b) 25 (c) -25 (d) 35
MCQ No 3.20
The sum of the squares of the deviations of the values of a variable is least when the deviations are
measured from:
(a) Harmonic mean (b) Geometric mean (c) Median (d) Arithmetic mean
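Two properties of the arithmetic mean recur in MCQ 3.15 to 3.20: deviations from the mean sum to zero, and the sum of squared deviations is smallest when measured from the mean. The sketch below checks both on a small made-up data set whose mean is 25; the data and the helper name sum_sq_dev are assumptions for illustration.

```python
from statistics import mean


def sum_sq_dev(values, origin):
    """Sum of squared deviations of the values from a chosen origin."""
    return sum((x - origin) ** 2 for x in values)


values = [20, 22, 25, 28, 30]  # hypothetical data with mean 25
centre = mean(values)

print(sum(x - centre for x in values))  # deviations from the mean
for origin in (27, 25, 22, -25):
    print(origin, sum_sq_dev(values, origin))
```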
MCQ No 3.21
If X1, X2, X3, ..., Xn are n observations having arithmetic mean X̄ and if Y = 4X ± 2, then Ȳ is
equal to:
(a) 4X (b) 4 (c) 4 ± 2 (d) 4 ± 2
MCQ No 3.22
If X̄ = 100 and Y = 2X – 200, then the mean of the Y values will be:
(a) 0 (b) 2 (c) 100 (d) 200
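A change of origin and scale carries straight through to the mean: if Y = aX + b, then the mean of Y is a times the mean of X plus b. The sketch below checks this for Y = 2X – 200 using made-up X values chosen so that their mean is 100.

```python
from statistics import mean

x = [90, 100, 110]             # hypothetical values with mean 100
y = [2 * v - 200 for v in x]   # Y = 2X - 200 applied to each value

print(mean(x), mean(y))
```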
MCQ No 3.23
Step deviation method or coding method is used for computation of the:
(a) Arithmetic mean (b) Geometric mean (c) Weighted mean (d) Harmonic mean
MCQ No 3.24
If the arithmetic mean of 20 values is 10, then sum of these 20 values is:
(a) 10 (b) 20 (c) 200 (d) 20 + 10
MCQ No 3.25
Ten families have an average of 2 boys. How many boys do they have together?
(a) 2 (b) 10 (c) 12 (d) 20
MCQ No 3.26
If the arithmetic mean of two numbers X1 and X2 is 5 and X1 = 3, then X2 is:
(a) 3 (b) 5 (c) 7 (d) 10
MCQ No 3.27
Given X1=20 and X2= -20. The arithmetic mean will be:
(a) Zero (b) Infinity (c) Impossible (d) Difficult to tell
MCQ No 3.28
The mean of 10 observations is 10. All the observations are increased by 10%. The mean of increased
observations will be:
(a) 10 (b) 1.1 (c) 10.1 (d) 11
MCQ No 3.29
The frequency distribution of the hourly wage rate of 60 employees of a paper mill is as follows:
Wage rate (Rs.) 54----56 56----58 58----60 60----62 62----64
Number of workers 10 10 20 10 10
The mean wage rate is:
(a) Rs. 58.60 (b) Rs. 59.00 (c) Rs. 57.60 (d) Rs. 57.10
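The mean of a grouped distribution such as the one in MCQ 3.29 is usually computed by treating every value in a class as equal to the class mark, then taking the frequency-weighted average. A minimal sketch using the wage data from the question (variable names are made up):

```python
classes = [(54, 56), (56, 58), (58, 60), (60, 62), (62, 64)]  # wage rate (Rs.)
workers = [10, 10, 20, 10, 10]                                # frequencies

# Class mark = average of the lower and upper class limits
marks = [(lower + upper) / 2 for lower, upper in classes]

grouped_mean = sum(f * m for f, m in zip(workers, marks)) / sum(workers)
print(grouped_mean)
```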
MCQ No 3.30
The sample mean of first n natural numbers is:
(a) n(n+ 1) / 2 (b) (n+ 1) / 2 (c) n/2 (d) (n+ 1) / 2
MCQ No 3.31
The mean of first 2n natural numbers is:
MCQ No 3.32
The sum of deviations is zero when deviations are taken from:
(a) Mean (b) Median (c) Mode (d) Geometric mean
MCQ No 3.33
When the values in a series are not of equal importance, we calculate the:
(a) Arithmetic mean (b) Geometric mean (c) Weighted mean (d) Mode
MCQ No 3.34
When all the values in a series occur an equal number of times, then it is not possible to calculate the:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) Weighted mean
MCQ No 3.35
The mean for a set of data obtained by assigning each data value a weight that reflects its relative
importance within the set, is called:
(a) Geometric mean (b) Harmonic mean (c) Weighted mean (d) Combined mean
MCQ No 3.36
If X̄1, X̄2, X̄3, ..., X̄k are the arithmetic means of k distributions with respective frequencies n1, n2, n3, ...,
nk, then the mean of the whole distribution X̄c is given by:
(a) ∑X̄ / ∑n (b) ∑n / ∑X̄ (c) ∑nX̄ / ∑n (d) ∑(n + X̄) / ∑n
MCQ No 3.37
The combined arithmetic mean is calculated by the formula:
MCQ No 3.38
The arithmetic mean of 10 items is 4 and the arithmetic mean of 5 items is 10. The combined arithmetic
mean is:
(a) 4 (b) 5 (c) 6 (d) 90
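A combined mean weights each group mean by its group size. The sketch below applies that idea to the figures in MCQ 3.38; the function name combined_mean is hypothetical.

```python
def combined_mean(sizes, means):
    """Weighted average of group means, using group sizes as weights."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


print(combined_mean([10, 5], [4, 10]))
```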
MCQ No 3.39
The midpoint of the values after they have been ordered from the smallest to the largest or the largest
to the smallest is called:
(a) Mean (b) Median (c) Lower quartile (d) Upper quartile
MCQ No 3.40
The first step in calculating the median of a discrete variable is to determine the:
(a) Cumulative frequencies (b) Relative weights
(c) Relative frequencies (d) Array
MCQ No 3.41
The suitable average for qualitative data is:
(a) Mean (b) Median (c) Mode (d) Geometric mean
MCQ No 3.42
Extreme scores will have the following effect on the median of an examination:
(a) They may have no effect on it (b) They may tend to raise it
(c) They may tend to lower it (d) None of the above
MCQ No 3.43
We must arrange the data before calculating:
(a) Mean (b) Median (c) Mode (d) Geometric mean
MCQ No 3.44
If the smallest observation in a data is decreased, the average which is not affected is:
(a) Mode (b) Median (c) Mean (d) Harmonic mean
MCQ No 3.45
If the data contains an extreme value, the suitable average is:
(a) Mean (b) Median (c) Weighted mean (d) Geometric mean
MCQ No 3.46
Sum of absolute deviations of the values is least when deviations are taken from:
(a) Mean (b) Mode (c) Median (d) Q3
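The property behind MCQ 3.46 can be illustrated numerically by comparing the sum of absolute deviations about the median with the same sum about the mean. The data below are made up and include one extreme value; sum_abs_dev is a hypothetical helper name.

```python
from statistics import mean, median


def sum_abs_dev(values, origin):
    """Sum of absolute deviations of the values from a chosen origin."""
    return sum(abs(x - origin) for x in values)


values = [1, 2, 3, 10, 50]  # hypothetical data with one extreme value

print("about median:", sum_abs_dev(values, median(values)))
print("about mean:  ", sum_abs_dev(values, mean(values)))
```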
MCQ No 3.47
The frequency distribution of the hourly wages rate of 100 employees of a paper mill is as follows:
Wage rate (Rs.) 54----56 56----58 58----60 60----62 62----64
Number of workers 20 20 20 20 20
The median wage rate is:
(a) Rs.55 (b) Rs.57 (c) Rs.56 (d) Rs.59
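For grouped data, the median is commonly interpolated inside the class that contains the (n/2)th observation: lower boundary + (width / class frequency) × (n/2 – cumulative frequency before the class). The sketch below assumes that textbook formula and applies it to the wage data in MCQ 3.47; grouped_median is a hypothetical function name.

```python
def grouped_median(classes, freqs):
    """Interpolated median of a continuous frequency distribution."""
    half = sum(freqs) / 2
    cumulative = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:
            # The median class is the first whose cumulative frequency reaches n/2.
            return lower + (upper - lower) / f * (half - cumulative)
        cumulative += f


classes = [(54, 56), (56, 58), (58, 60), (60, 62), (62, 64)]  # wage rate (Rs.)
workers = [20, 20, 20, 20, 20]
print(grouped_median(classes, workers))
```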
MCQ No 3.48
The values of the variate that divide a set of data into four equal parts after arranging the observations in
ascending order of magnitude are called:
(a) Quartiles (b) Deciles (c) Percentiles (d) Difficult to tell
MCQ No 3.49
The lower and upper quartiles of a symmetrical distribution are 40 and 60 respectively. The value of
median is:
(a) 40 (b) 50 (c) 60 (d) (60 – 40) / 2
MCQ No 3.50
If in a discrete series 75% values are less than 30, then:
(a) Q3 < 75 (b) Q3 < 30 (c) Q3 = 30 (d) Q3 > 30
MCQ No 3.51
If in a discrete series 75% values are greater than 50, then:
(a) Q1 = 50 (b) Q1 < 50 (c) Q1 > 50 (d) Q1 ≠ 50
MCQ No 3.52
If in a discrete series 25% values are greater than 75, then:
(a) Q1 > 75 (b) Q1 = 75 (c) Q3 = 75 (d) Q3 > 75
MCQ No 3.53
If in a discrete series 40% values are less than 40, then :
(a) D4 ≠ 40 (b) D4 < 40 (c) D4 > 40 (d) D4 = 40
MCQ No 3.54
If in a discrete series 15% values are greater than 70, then:
(a) P15 = 70 (b) P85 = 15 (c) P85 = 70 (d) P70 = 70
MCQ No 3.55
The middle value of an ordered series is called:
(a) Median (b) 5th decile (c) 50th percentile (d) All the above
MCQ No 3.56
If in a discrete series 50% values are less than 50, then:
(a) Q2 = 50 (b) D5 = 50 (c) P50 = 50 (d) All of the above
MCQ No 3.57
The mode or modal value of the distribution is that value of the variate for which frequency is:
(a) Minimum (b) Maximum (c) Odd number (d) Even number
MCQ No 3.58
Suitable average for averaging the shoe sizes for children is:
(a) Mean (b) Mode (c) Median (d) Geometric mean
MCQ No 3.59
Extreme scores on an examination have the following effect on the mode:
(a) They tend to raise it (b) They tend to lower it
(c) They have no effect on it (d) Difficult to tell
MCQ No 3.60
A measurement that corresponds to largest frequency in a set of data is called:
(a) Mean (b) Median (c) Mode (d) Percentile
MCQ No 3.61
Which of the following average cannot be calculated for the observations 2, 2, 4, 4, 6, 6, 8, 8, 10, 10 ?
(a) Mean (b) Median (c) Mode (d) All of the above
MCQ No 3.62
Mode of the series 0, 0, 0, 2, 2, 3, 3, 8, 10 is:
(a) 0 (b) 2 (c) 3 (d) No mode
MCQ No 3.63
A distribution with two modes is called:
(a) Unimodal (b) Bimodal (c) Multimodal (d) Normal
MCQ No 3.64
The modal letter of the word “STATISTICS” is:
(a) S (b) T (c) Both S and I (d) Both S and T
MCQ No 3.65
The mode for the following frequency distribution is:
Weekly sales of burner units 0 1 2 3 Over 3
Number of weeks 38 6 5 1 0
(a) 0 (b) 2 (c) 3 (d) No mode
MCQ No 3.66
Which of the following statements is always correct?
(a) Mean = Median = Mode (b) Arithmetic mean = Geometric mean = Harmonic mean
(c) Median = Q2 = D5 = P50 (d) Mode = 2Median - 3Mean
MCQ No 3.67
In a moderately symmetrical series, the arithmetic mean, median and mode are related as:
(a) Mean - Mode = 3(Mean - Median) (b) Mean - Median = 2(Median - Mode)
(c) Median - Mode = (Mean - Median) / 2 (d) Mode – Median = 2Mean – 2Median
MCQ No 3.68
In a moderately skewed distribution, mean is equal to:
(a) (3Median - Mode) / 2 (b) (2Mean + Mode) / 3
(c) 3Median – 2Mean (d) 3Median - Mode
MCQ No 3.69
In a moderately asymmetrical distribution, the value of median is given by:
(a) 3Median + 2Mean (b) 2Mean + Mode
(c) (2Mean + Mode) / 3 (d) (3Median - Mode) / 2
MCQ No 3.70
For moderately skewed distribution, the value of mode is calculated as:
(a) 2Mean – 3Median (b) 3Median – 2Mean
(c) 2Mean + Mode (d) 3Median - Mode
MCQ No 3.71
In a moderately skewed distribution, Mean = 45 and Median = 30, then the value of mode is:
(a) 0 (b) 30 (c) 45 (d) 180
MCQ No 3.72
If for any frequency distribution, the median is 10 and the mode is 30, then approximate value of mean is
equal to:
(a) 0 (b) 10 (c) 30 (d) 60
MCQ No 3.73
In a moderately asymmetrical distribution, the value of mean and mode is 15 and 18 respectively. The value of
median will be:
(a) 48 (b) 18 (c) 16 (d) 15
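The numerical questions above (MCQ 3.71 to 3.73) all rest on the empirical relation Mode = 3 Median - 2 Mean for moderately skewed distributions. A small sketch of the three rearrangements follows; the function names are illustrative only, and the relation is an approximation rather than an exact identity.

```python
def mode_from(mean, median):
    """Empirical relation: Mode is approximately 3*Median - 2*Mean."""
    return 3 * median - 2 * mean


def mean_from(median, mode):
    """Same relation solved for the mean."""
    return (3 * median - mode) / 2


def median_from(mean, mode):
    """Same relation solved for the median."""
    return (2 * mean + mode) / 3


print(mode_from(mean=45, median=30))    # values from MCQ 3.71
print(mean_from(median=10, mode=30))    # values from MCQ 3.72
print(median_from(mean=15, mode=18))    # values from MCQ 3.73
```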
MCQ No 3.74
MCQ No 3.75
Which of the following is correct in a positively skewed distribution?
(a) Mean = Median = Mode (b) Mean < Median < Mode
(c) Mean > Median > Mode (d) Mean + Median + Mode
MCQ No 3.76
If the values of mean, median and mode coincide in a unimodal distribution, then the distribution will
be:
(a) Skewed to the left (b) Skewed to the right (c) Multimodal (d) Symmetrical
MCQ No 3.77
A curve that tails off to the right end is called:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Both (b) and (c)
MCQ No 3.78
The sum of the deviations taken from mean is:
(a) Always equal to zero (b) Sometimes equal to zero
(c) Never equal to zero (d) Less than zero
MCQ No 3.79
If a set of data has one mode and its value is less than mean, then the distribution is called:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Normal
MCQ No 3.80
Taking the relevant root of the product of all non-zero and positive values is called:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) Combined mean
MCQ No 3.81
The best average in percentage rates and ratios is:
(a) Arithmetic mean (b) Lower and upper quartiles
(c) Geometric mean (d) Harmonic mean
MCQ No 3.82
The suitable average for computing average percentage increase in population is:
(a) Geometric mean (b) Harmonic mean (c) Combined mean (d) Population mean
MCQ No 3.83
If 10% is added to each value of a variable, the geometric mean of the new variable is increased by:
(a) 10 (b) 1/100 (c) 10% (d) 1.1
MCQ No 3.84
If each observation of a variable X is increased by 20%, then geometric mean is also increased by:
(a) 20 (b) 1/20 (c) 20% (d) 100%
MCQ No 3.85
If any value in a series is negative, then we cannot calculate the:
(a) Mean (b) Median (c) Geometric mean (d) Harmonic mean
MCQ No 3.86
Geometric mean for X1 and X2 is:
MCQ No 3.87
Geometric mean of 2, 4, 8 is:
(a) 6 (b) 4 (c) 14/3 (d) 8
MCQ No 3.88
Geometric mean is suitable when the values are given as:
(a) Proportions (b) Ratios (c) Percentage rates (d) All of the above
MCQ No 3.89
If the geometric mean of two numbers X1 and X2 is 9 and X1 = 3, then X2 is equal to:
(a) 3 (b) 9 (c) 27 (d) 81
MCQ No 3.90
If the two observations are a = 2 and b = -2, then their geometric mean will be:
(a) Zero (b) Infinity (c) Impossible (d) Negative
MCQ No 3.91
Geometric mean of -4, -2 and 8 is:
(a) 4 (b) 0 (c) -2 (d) Impossible
MCQ No 3.92
The ratio among the number of items and the sum of reciprocals of items is called:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) Mode
MCQ No 3.93
Harmonic mean for X1 and X2 is:
MCQ No 3.94
The appropriate average for calculating the average speed of a journey is:
(a) Median (b) Arithmetic mean (c) Mode (d) Harmonic mean
MCQ No 3.95
Harmonic mean gives less weightage to:
(a) Small values (b) Large values (c) Positive values (d) Negative values
MCQ No 3.96
The harmonic mean of the values 5, 9, 11, 0, 17, 13 is:
(a) 9.5 (b) 6.2 (c) 0 (d) Impossible
MCQ No 3.97
If the harmonic mean of two numbers X1 and X2 is 6.4 and X2 = 16, then X1 is:
(a) 4 (b) 10 (c) 16 (d) 20
MCQ No 3.98
If a = 5 and b = -5, then their harmonic mean is:
(a) -5 (b) 5 (c) 0 (d) ∞
MCQ No 3.99
For an open-end frequency distribution, it is not possible to find:
(a) Arithmetic mean (b) Geometric mean (c) Harmonic mean (d) All of the above
MCQ No 3.100
If all the items in a variable are non zero and non negative then:
(a) A.M > G.M > H.M (b) G.M > A.M > H.M (c) H.M > G.M > A.M (d) A.M < G.M < H.M
MCQ No 3.101
The geometric mean of a set of positive numbers X1, X2, X3, ... , Xn is less than or equal to their
arithmetic mean but is greater than or equal to their:
(a) Harmonic mean (b) Median (c) Mode (d) Lower and upper quartiles
MCQ No 3.102
Geometric mean and harmonic mean for the values 3, -11, 0, 63, -14, 100 are:
(a) 0 and 3 (b) 3 and -3 (c) 0 and 0 (d) Impossible
MCQ No 3.103
If the arithmetic mean and harmonic mean of two positive numbers are 4 and 16, then their
geometric mean will be:
(a) 4 (b) 8 (c) 16 (d) 64
MCQ No 3.104
The arithmetic mean and geometric mean of two observations are 4 and 8 respectively, then harmonic
mean of these two observations is:
(a) 4 (b) 8 (c) 16 (d) 32
MCQ No 3.105
The geometric mean and harmonic mean of two values are 8 and 16 respectively, then arithmetic
mean of values is:
(a) 4 (b) 16 (c) 24 (d) 128
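The relations among the three means used in MCQ 3.100 to 3.105 can be checked numerically. The sketch below uses Python's standard `statistics` module on a pair of positive values chosen for illustration (2 and 8 are not taken from the questions). It checks the ordering A.M >= G.M >= H.M and the two-value identity G.M squared = A.M x H.M.

```python
import math
from statistics import geometric_mean, harmonic_mean, mean

values = [2, 8]  # illustrative positive values only

am = mean(values)
gm = geometric_mean(values)
hm = harmonic_mean(values)
print(am, gm, hm)

# For positive values the means are ordered A.M >= G.M >= H.M.
print(am >= gm >= hm)

# For exactly two positive values, G.M squared equals A.M times H.M.
print(math.isclose(gm ** 2, am * hm))

# The geometric mean of 2, 4, 8 (MCQ 3.87) is the cube root of 64.
gm_248 = geometric_mean([2, 4, 8])
print(round(gm_248, 6))
```

Only positive inputs are used here because the geometric and harmonic means are not meaningful for zero or negative values, which is the point of MCQ 3.90, 3.91 and 3.96.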
MCQ No 3.106
Which pair of averages cannot be calculated when one of numbers in the series is zero?
(a) Geometric mean and Median (b) Harmonic mean and Mode
(c) Simple mean and Weighted mean (d) Geometric mean and Harmonic mean
MCQ No 3.107
In a given data the average which has the least value is:
(a) Mean (b) Median (c) Harmonic mean (d) Geometric mean
MCQ No 3.108
If all the values in a series are same, then:
(a) A.M = G.M = H.M (b) A.M ≠ G.M ≠ H.M (c) A.M > G.M > H.M (d) A.M < G.M < H.M
MCQ No 3.109
The averages are affected by change of:
(a) Origin (b) Scale (c) Both (a) and (b) (d) None of the above
MCQ’s of Measures of Dispersion
MCQ No 4.1
The scatter in a series of values about the average is called:
(a) Central tendency (b) Dispersion (c) Skewness (d) Symmetry
MCQ No 4.2
The measurements of spread or scatter of the individual values around the central point is called:
(a) Measures of dispersion (b) Measures of central tendency
(c) Measures of skewness (d) Measures of kurtosis
MCQ No 4.3
The measures used to calculate the variation present among the observations in the unit of the variable are
called:
(a) Relative measures of dispersion (b) Coefficient of skewness
(c) Absolute measures of dispersion (d) Coefficient of variation
MCQ No 4.4
The measures used to calculate the variation present among the observations relative to their average are
called:
(a) Coefficient of kurtosis (b) Absolute measures of dispersion
(c) Quartile deviation (d) Relative measures of dispersion
MCQ No 4.5
The degree to which numerical data tend to spread about an average value is called:
(a) Constant (b) Flatness (c) Variation (d) Skewness
MCQ No 4.6
The measures of dispersion can never be:
(a) Positive (b) Zero (c) Negative (d) Equal to 2
MCQ No 4.7
If all the scores on examination cluster around the mean, the dispersion is said to be:
(a) Small (b) Large (c) Normal (d) Symmetrical
MCQ No 4.8
If there are many extreme scores on an examination, the dispersion is:
(a) Large (b) Small (c) Normal (d) Symmetric
MCQ No 4.9
Given below are four sets of observations. Which set has the minimum variation?
(a) 46, 48, 50, 52, 54 (b) 30, 40, 50, 60, 70 (c) 40, 50, 60, 70, 80 (d) 48, 49, 50, 51, 52
MCQ No 4.10
Which of the following is an absolute measure of dispersion?
(a) Coefficient of variation (b) Coefficient of dispersion
(c) Standard deviation (d) Coefficient of skewness
MCQ No 4.11
The measure of dispersion which uses only two observations is called:
(a) Mean (b) Median (c) Range (d) Coefficient of variation
MCQ No 4.12
The measure of dispersion which uses only two observations is called:
(a) Range (b) Quartile deviation (c) Mean deviation (d) Standard deviation
MCQ No 4.13
In quality control of manufactured items, the most common measure of dispersion is:
(a) Range (b) Average deviation (c) Standard deviation (d) Quartile deviation
MCQ No 4.14
The range of the scores 29, 3, 143, 27, 99 is:
(a) 140 (b) 143 (c) 146 (d) 70
MCQ No 4.15
If the observations of a variable X are, -4, -20, -30, -44 and -36, then the value of the range will be:
(a) -48 (b) 40 (c) -40 (d) 48
MCQ No 4.16
The range of the values -5, -8, -10, 0, 6, 10 is:
(a) 0 (b) 10 (c) -10 (d) 20
MCQ No 4.17
If Y = aX ± b, where a and b are any two numbers and a ≠ 0, then the range of Y values will be:
(a) Range(X) (b) a range(X) + b (c) a range(X) – b (d) |a| range(X)
MCQ No 4.18
If the maximum value in a series is 25 and its range is 15, the minimum value of the series is:
(a) 10 (b) 15 (c) 25 (d) 35
MCQ No 4.19
Half of the difference between upper and lower quartiles is called:
(a) Interquartile range (b) Quartile deviation (c) Mean deviation (d) Standard deviation
MCQ No 4.20
If Q3=20 and Q1=10, the coefficient of quartile deviation is:
(a) 3 (b) 1/3 (c) 2/3 (d) 1
MCQ No 4.21
Which measure of dispersion can be computed in case of open-end classes?
(a) Standard deviation (b) Range (c) Quartile deviation (d) Coefficient of variation
MCQ No 4.22
If Y = aX ± b, where a and b are any two constants and a ≠ 0, then the quartile deviation of Y values is
equal to:
(a) a Q.D(X) + b (b) |a| Q.D(X) (c) Q.D(X) – b (d) |b| Q.D(X)
MCQ No 4.23
The sum of absolute deviations is minimum if these deviations are taken from the:
(a) Mean (b) Mode (c) Median (d) Upper quartile
MCQ No 4.24
The mean deviation is minimum when deviations are taken from:
(a) Mean (b) Mode (c) Median (d) Zero
MCQ No 4.25
If Y = aX ± b, where a and b are any two numbers but a ≠ 0, then M.D(Y) is equal to:
(a) M.D(X) (b) M.D(X) ± b (c) |a| M.D(X) (d) M.D(Y) + M.D(X)
MCQ No 4.26
The mean deviation of the scores 12, 15, 18 is:
(a) 6 (b) 0 (c) 3 (d) 2
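The mean deviation in MCQ 4.26, and the fact from MCQ 4.23 and 4.24 that absolute deviations are smallest about the median, can be illustrated with a short sketch. The helper `mean_deviation` and the second data set `skewed` are hypothetical and serve only as examples.

```python
from statistics import mean, median


def mean_deviation(values, center):
    """Average absolute deviation of the values from a chosen center."""
    return sum(abs(v - center) for v in values) / len(values)


# Scores from MCQ 4.26, deviations taken from the mean.
scores = [12, 15, 18]
md_scores = mean_deviation(scores, mean(scores))
print(md_scores)

# An illustrative skewed set: deviations about the median are smaller.
skewed = [1, 2, 10]
md_about_mean = mean_deviation(skewed, mean(skewed))
md_about_median = mean_deviation(skewed, median(skewed))
print(md_about_median < md_about_mean)
```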
MCQ No 4.27
Mean deviation computed from a set of data is always:
(a) Negative (b) Equal to standard deviation
(c) More than standard deviation (d) Less than standard deviation
MCQ No 4.28
The average of squared deviations from mean is called:
(a) Mean deviation (b) Variance (c) Standard deviation (d) Coefficient of variation
MCQ No 4.29
The sum of squares of the deviations is minimum, when deviations are taken from:
(a) Mean (b) Mode (c) Median (d) Zero
MCQ No 4.30
Which of the following measures of dispersion is expressed in the same units as the units of observation?
(a) Variance (b) Standard deviation
(c) Coefficient of variation (d) Coefficient of standard deviation
MCQ No 4.31
Which measure of dispersion has a different unit other than the unit of measurement of values:
(a) Range (b) Standard deviation (c) Variance (d) Mean deviation
MCQ No 4.32
Which of the following is a unit free quantity:
(a) Range (b) Standard deviation (c) Coefficient of variation (d) Arithmetic mean
MCQ No 4.33
If the dispersion is small, the standard deviation is:
(a) Large (b) Zero (c) Small (d) Negative
MCQ No 4.34
The value of standard deviation changes by a change of:
(a) Origin (b) Scale (c) Algebraic signs (d) None
MCQ No 4.35
The standard deviation of a distribution divided by the mean of the distribution and expressed as a
percentage is called:
(a) Coefficient of Standard deviation (b) Coefficient of skewness
(c) Coefficient of quartile deviation (d) Coefficient of variation
MCQ No 4.36
The positive square root of the mean of the squares of the deviations of observations from their mean is
called:
(a) Variance (b) Range (c) Standard deviation (d) Coefficient of variation
MCQ No 4.37
The variance is zero only if all observations are the:
(a) Different (b) Square (c) Square root (d) Same
MCQ No 4.38
The standard deviation is independent of:
(a) Change of origin (b) Change of scale of measurement
(c) Change of origin and scale of measurement (d) Difficult to tell
MCQ No 4.39
If there are ten values each equal to 10, then standard deviation of these values is:
(a) 100 (b) 20 (c) 10 (d) 0
MCQ No 4.40
If X and Y are independent random variables, then S.D(X ± Y) is equal to:
(a) S.D(X) ± S.D(Y) (b) Var(X) ± Var(Y) (c) (d)
MCQ No 4.41
S.D(X) = 6 and S.D(Y) = 8. If X and Y are independent random variables, then S.D(X-Y) is:
(a) 2 (b) 10 (c) 14 (d) 100
MCQ No 4.42
For two independent variables X and Y if S.D(X) = 1 and S.D(Y) = 3, then Var(3X - Y) is equal to:
(a) 0 (b) 6 (c) 18 (d) 12
MCQ No 4.43
If Y = aX ± b, where a and b are any two constants and a ≠ 0, then Var(Y) is equal to:
(a) a Var(X) (b) a Var(X) + b (c) a² Var(X) – b (d) a² Var(X)
MCQ No 4.44
If Y = aX + b, where a and b are any two numbers but a ≠ 0, then S.D(Y) is equal to:
(a) S.D(X) (b) a S.D(X) (c) |a| S.D(X) (d) a S.D(X) + b
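The effect of a linear transformation Y = aX + b on the range, variance and standard deviation (MCQ 4.17, 4.43 and 4.44) can be verified on a small data set. The sketch below uses the population formulas from Python's `statistics` module; the constants a = -2 and b = 5 are arbitrary choices for illustration.

```python
import math
from statistics import pstdev, pvariance


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


x = [2, 4, 6, 8]
a, b = -2, 5  # arbitrary illustrative constants, a != 0
y = [a * v + b for v in x]

# Adding b shifts the origin and leaves dispersion unchanged;
# multiplying by a rescales dispersion by |a| (variance by a squared).
print(math.isclose(pvariance(y), a ** 2 * pvariance(x)))
print(math.isclose(pstdev(y), abs(a) * pstdev(x)))
print(data_range(y) == abs(a) * data_range(x))
```

A negative value of a is used on purpose, to show why the answers involve |a| rather than a.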
MCQ No 4.45
The ratio of the standard deviation to the arithmetic mean expressed as a percentage is called:
(a) Coefficient of standard deviation (b) Coefficient of skewness
(c) Coefficient of kurtosis (d) Coefficient of variation
MCQ No 4.46
Which of the following statements is correct?
(a) The standard deviation of a constant is equal to unity
(b) The sum of absolute deviations is minimum if these deviations are taken from the mean.
(c) The second moment about origin equals variance
(d) The variance is positive quantity and is expressed in square of the units of the observations
MCQ No 4.47
Which of the following statements is false?
(a) The standard deviation is independent of change of origin
(b) If the moment coefficient of kurtosis β2 = 3, the distribution is mesokurtic or normal.
(c) If the frequency curve has the same shape on both sides of the centre line which divides the curve into
two equal parts, is called a symmetrical distribution.
(d) Variance of the sum or difference of any two variables is equal to the sum of their
respective variances
MCQ No 4.48
If Var(X) = 25, then is equal to:
(a) 15/2 (b) 50 (c) 25 (d) 5
MCQ No 4.49
To compare the variation of two or more than two series, we use
(a) Combined standard deviation (b) Corrected standard deviation
(c) Coefficient of variation (d) Coefficient of skewness
MCQ No 4.50
The standard deviation of -5, -5, -5, -5, 5 is:
(a) -5 (b) +5 (c) 0 (d) -25
MCQ No 4.51
Standard deviation is always calculated from:
(a) Mean (b) Median (c) Mode (d) Lower quartile
MCQ No 4.52
The mean of an examination is 69, the median is 68, the mode is 67, and the standard deviation is 3.
The measure of variation for this examination is:
(a) 67 (b) 68 (c) 69 (d) 3
MCQ No 4.53
The variance of 19, 21, 23, 25 and 27 is 8. The variance of 14, 16, 18, 20 and 22 is:
(a) Greater than 8 (b) 8 (c) Less than 8 (d) 8 - 5 = 3
MCQ No 4.54
In a set of observations the variance is 50. All the observations are increased by 100%. The variance of
the increased observations will become:
(a) 50 (b) 200 (c) 100 (d) No change
MCQ No 4.55
Three factories A, B, C have 100, 200 and 300 workers respectively. The mean of the wages is the same
in the three factories. Which of the following statements is true?
(a) There is greater variation in factory C.
(b) Standard deviation in factory A is the smallest.
(c) Standard deviation in all the three factories are equal
(d) None of the above
MCQ No 4.56
An automobile manufacturer obtains data concerning the sales of six of its dealers in the last week of
1996. The results indicate the standard deviation of their sales equals 6 autos. If this is so, the variance of
their sales equals:
(a) (b) 6 (c) (d) 36
MCQ No 4.57
If standard deviation of the values 2, 4, 6, 8 is 2.236, then standard deviation of the values 4, 8, 12, 16 is:
(a) 0 (b) 4.472 (c) 4.236 (d) 2.236
MCQ No 4.58
Var(X) = 4 and Var(Y) = 9. If X and Y are independent random variables, then Var(2X + Y) is:
(a) 13 (b) 17 (c) 25 (d) -1
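For independent X and Y, questions such as MCQ 4.41, 4.42 and 4.58 rely on Var(aX + bY) = a² Var(X) + b² Var(Y). A minimal sketch of that rule follows; the function name `var_linear_combination` is hypothetical, and the rule as coded assumes independence (zero covariance).

```python
import math


def var_linear_combination(a, var_x, b, var_y):
    """Var(aX + bY) when X and Y are independent (covariance is zero)."""
    return a ** 2 * var_x + b ** 2 * var_y


# MCQ 4.41: S.D(X) = 6, S.D(Y) = 8, find S.D(X - Y).
sd_diff = math.sqrt(var_linear_combination(1, 6 ** 2, -1, 8 ** 2))

# MCQ 4.42: S.D(X) = 1, S.D(Y) = 3, find Var(3X - Y).
var_3x_minus_y = var_linear_combination(3, 1 ** 2, -1, 3 ** 2)

# MCQ 4.58: Var(X) = 4, Var(Y) = 9, find Var(2X + Y).
var_2x_plus_y = var_linear_combination(2, 4, 1, 9)

print(sd_diff, var_3x_minus_y, var_2x_plus_y)
```

Note that the variances add even for a difference, because the coefficient -1 is squared.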
MCQ No 4.59
If X̄ = Rs.20 and S = Rs.10, then the coefficient of variation is:
(a) 45% (b) 50% (c) 60% (d) 65%
MCQ No 4.60
Which of the following measures of dispersion is independent of the units employed?
(a) Coefficient of variation (b) Quartile deviation
(c) Standard deviation (d) Range
MCQ No 4.61
In Sheppard’s correction µ2 is equal to:
MCQ No 4.62
The moments about mean are called:
(a) Raw moments (b) Central moments (c) Moments about origin (d) All of the above
MCQ No 4.63
The moments about origin are called:
(a) Moments about zero (b) Raw moments (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ No 4.64
All odd order moments about mean in a symmetrical distribution are:
(a) Positive (b) Negative (c) Zero (d) Three
MCQ No 4.65
The second moment about arithmetic mean is 16, the standard deviation will be:
(a) 16 (b) 4 (c) 2 (d) 0
MCQ No 4.66
The first and second moments about an arbitrary constant are -2 and 13 respectively. The standard deviation will
be:
(a) -2 (b) 3 (c) 9 (d) 13
MCQ No 4.67
Moment ratios β1 and β2 are:
(a) Independent of origin and scale of measurement
(b) Expressed in original unit of the data
(c) Unit less quantities
(d) Both (a) and (c)
MCQ No 4.68
The first moment about X = 0 of a distribution is 12.08. The mean is:
(a) 10.80 (b) 10.08 (c) 12.08 (d) 12.88
MCQ No 4.69
First two moments about the value 2 of a variable are 1 and 16. The variance will be:
(a) 13 (b) 15 (c) 16 (d) Difficult to tell
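Questions such as MCQ 4.66 and 4.69 convert moments about an arbitrary value into the variance using µ2 = µ2' - (µ1')², where µ1' and µ2' are the first two moments about that value. A short sketch of the conversion, with illustrative function names:

```python
import math


def variance_from_moments(m1, m2):
    """Second central moment from the first two moments about any value."""
    return m2 - m1 ** 2


def mean_from_moment(reference, m1):
    """Mean = reference value + first moment about that value."""
    return reference + m1


# MCQ 4.66: moments about an arbitrary constant are -2 and 13.
sd_466 = math.sqrt(variance_from_moments(-2, 13))

# MCQ 4.69: moments about the value 2 are 1 and 16.
var_469 = variance_from_moments(1, 16)
mean_469 = mean_from_moment(2, 1)

print(sd_466, var_469, mean_469)
```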
MCQ No 4.70
The first three moments of a distribution about the mean are 1, 4 and 0. The distribution is:
(a) Symmetrical (b) Skewed to the left (c) Skewed to the right (d) Normal
MCQ No 4.71
If the third central moment is negative, the distribution will be:
(a) Symmetrical (b) Positively skewed (c) Negatively skewed (d) Normal
MCQ No 4.72
If the third moment about mean is zero, then the distribution is:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Mesokurtic
MCQ No 4.73
Departure from symmetry is called:
(a) Second moment (b) Kurtosis (c) Skewness (d) Variation
MCQ No 4.74
In a symmetrical distribution, the coefficient of skewness will be:
(a) 0 (b) Q1 (c) Q3 (d) 1
MCQ No 4.75
The lack of uniformity or symmetry is called:
(a) Skewness (b) Dispersion (c) Kurtosis (d) Standard deviation
MCQ No 4.76
For a positively skewed distribution, mean is always:
(a) Less than the median (b) Less than the mode
(c) Greater than the mode (d) Difficult to tell
MCQ No 4.77
For a symmetrical distribution:
(a) β1 > 0 (b) β1 < 0 (c) β1 = 0 (d) β1 = 3
MCQ No 4.78
If mean=50, mode=40 and standard deviation=5, the distribution is:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Difficult to tell
MCQ No 4.79
If mean=25, median=30 and standard deviation=15, the distribution will be:
(a) Symmetrical (b) Positively skewed (c) Negatively skewed (d) Normal
MCQ No 4.80
If mean=20, median=16 and standard deviation=2, then coefficient of skewness is:
(a) 1 (b) 2 (c) 4 (d) -2
MCQ No 4.81
If mean=10, median=8 and standard deviation=6, then coefficient of skewness is:
(a) 1 (b) -1 (c) 2/6 (d) 2
MCQ No 4.82
If the sum of deviations from median is not zero, then a distribution will be:
(a) Symmetrical (b) Skewed (c) Normal (d) All of the above
MCQ No 4.83
In case of positively skewed distribution, the extreme values lie in the:
(a) Middle (b) Left tail (c) Right tail (d) Anywhere
MCQ No 4.84
Bowley's coefficient of skewness lies between:
(a) 0 and 1 (b) -1 and +1 (c) -1 and 0 (d) -2 and +2
MCQ No 4.85
In a symmetrical distribution, Q3 – Q1 = 20, median = 15. Q3 is equal to:
(a) 5 (b) 15 (c) 20 (d) 25
MCQ No 4.86
Which of the following is correct in a negatively skewed distribution?
(a) The arithmetic mean is greater than the mode
(b) The arithmetic mean is greater than the median
(c) (Q3 – Median) = (Median – Q1)
(d) (Q3 – Median) < (Median – Q1)
MCQ No 4.87
The lower and upper quartiles of a distribution are 80 and 120 respectively, while median is 100. The
shape of the distribution is:
(a) Positively skewed (b) Negatively skewed (c) Symmetrical (d) Normal
MCQ No 4.88
In a symmetrical distribution Q1 = 20 and median= 30. The value of Q3 is:
(a) 50 (b) 35 (c) 40 (d) 25
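The quartile-based questions above (MCQ 4.85 to 4.88) can be checked with Bowley's coefficient of skewness, (Q3 + Q1 - 2 Median) / (Q3 - Q1), which is zero when the quartiles are equidistant from the median. The helper names below are illustrative only.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's quartile coefficient of skewness."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


def q3_if_symmetric(q1, median):
    """In a symmetrical distribution Q3 - Median equals Median - Q1."""
    return 2 * median - q1


# MCQ 4.87: Q1 = 80, Median = 100, Q3 = 120.
sk_487 = bowley_skewness(80, 100, 120)

# MCQ 4.88: symmetrical, Q1 = 20, Median = 30.
q3_488 = q3_if_symmetric(20, 30)

print(sk_487, q3_488)
```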
MCQ No 4.89
The degree of peakedness or flatness of a unimodal distribution is called:
(a) Skewness (b) Symmetry (c) Dispersion (d) Kurtosis
MCQ No 4.90
For a leptokurtic distribution, the relation between second and fourth central moment is:
MCQ No 4.91
For a platykurtic distribution, the relation between the second and fourth central moments is:
MCQ No 4.92
For a mesokurtic distribution, the relation between fourth and second mean moment is:
MCQ No 4.93
The second and fourth moments about mean are 4 and 48 respectively, then the distribution is:
(a) Leptokurtic (b) Platykurtic (c) Mesokurtic or normal (d) Positively skewed
MCQ No 4.94
In a mesokurtic or normal distribution, µ4 = 243. The standard deviation is:
(a) 81 (b) 27 (c) 9 (d) 3
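MCQ 4.93 and 4.94 both use the moment coefficient of kurtosis β2 = µ4 / µ2², with β2 = 3 for a mesokurtic (normal) distribution. A minimal sketch of the calculation follows; the function names and the tolerance argument are assumptions made for illustration.

```python
import math


def beta2(mu2, mu4):
    """Moment coefficient of kurtosis: fourth central moment over mu2 squared."""
    return mu4 / mu2 ** 2


def kurtosis_type(b2, tol=1e-9):
    """Classify a distribution by comparing beta2 with 3."""
    if abs(b2 - 3) <= tol:
        return "mesokurtic"
    return "leptokurtic" if b2 > 3 else "platykurtic"


# MCQ 4.93: mu2 = 4 and mu4 = 48.
b2_493 = beta2(4, 48)
print(b2_493, kurtosis_type(b2_493))

# MCQ 4.94: mesokurtic with mu4 = 243, so mu2 squared = 243 / 3.
mu2_494 = math.sqrt(243 / 3)
sd_494 = math.sqrt(mu2_494)
print(sd_494)
```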
MCQ No 4.95
The value of β2 can be:
(a) Less than 3 (b) Greater than 3 (c) Equal to 3 (d) All of the above
MCQ No 4.96
In a normal (mesokurtic) distribution:
(a) β1=0 and β2=3 (b) β1=3 and β2=0 (c) β1=0 and β2>3 (d) β1=0 and β2<3
MCQ No 4.97
For any frequency distribution, the following empirical relation holds:
(a) Quartile deviation = Standard deviation
(b) Mean deviation = Standard deviation
(c) Standard deviation = Mean deviation = Quartile deviation
(d) All of the above
MCQ of REGRESSION AND CORRELATION
MCQ 14.1
A process by which we estimate the value of dependent variable on the basis of one or more independent
variables is called:
(a) Correlation (b) Regression (c) Residual (d) Slope
MCQ 14.2
The method of least squares dictates that we choose a regression line where the sum of the squares of
deviations of the points from the line is:
(a) Maximum (b) Minimum (c) Zero (d) Positive
MCQ 14.3
A relationship where the flow of the data points is best represented by a curve is called:
(a) Linear relationship (b) Nonlinear relationship (c) Linear positive (d) Linear negative
MCQ 14.4
All data points falling along a straight line is called:
(a) Linear relationship (b) Non linear relationship (c) Residual (d) Scatter diagram
MCQ 14.5
The value we would predict for the dependent variable when the independent variables are all equal to zero
is called:
(a) Slope (b) Sum of residual (c) Intercept (d) Difficult to tell
MCQ 14.6
The predicted rate of response of the dependent variable to changes in the independent variable is called:
(a) Slope (b) Intercept (c) Error (d) Regression equation
MCQ 14.7
The slope of the regression line of Y on X is also called the:
(a) Correlation coefficient of X on Y (b) Correlation coefficient of Y on X
(c) Regression coefficient of X on Y (d) Regression coefficient of Y on X
MCQ 14.8
In simple linear regression, the numbers of unknown constants are:
(a) One (b) Two (c) Three (d) Four
MCQ 14.9
In simple regression equation, the numbers of variables involved are:
(a) 0 (b) 1 (c) 2 (d) 3
MCQ 14.10
If the value of any regression coefficient is zero, then two variables are:
(a) Qualitative (b) Correlation (c) Dependent (d) Independent
MCQ 14.11
The straight line graph of the linear equation Y = a+ bX, slope will be upward if:
(a) b = 0 (b) b < 0 (c) b > 0 (d) b ≠ 0
MCQ 14.12
The straight line graph of the linear equation Y = a + bX, slope will be downward if:
(a) b > 0 (b) b < 0 (c) b = 0 (d) b ≠ 0
MCQ 14.13
The straight line graph of the linear equation Y = a + bX, slope is horizontal if:
(a) b = 0 (b) b ≠ 0 (c) b = 1 (d) a = b
MCQ 14.14
If the regression line is Ŷ = 5, then the value of the regression coefficient of Y on X is:
(a) 0 (b) 0.5 (c) 1 (d) 5
MCQ 14.15
If Y = 2 - 0.2X, then the value of Y intercept is equal to:
(a) -0.2 (b) 2 (c) 0.2X (d) All of the above
MCQ 14.16
If one regression coefficient is greater than one, then the other will be:
(a) More than one (b) Equal to one (c) Less than one (d) Equal to minus one
MCQ 14.17
To determine the height of a person when his weight is given is:
(a) Correlation problem (b) Association problem (c) Regression problem (d) Qualitative problem
MCQ 14.18
The dependent variable is also called:
(a) Regression (b) Regressand (c) Continuous variable (d) Independent
MCQ 14.19
The dependent variable is also called:
(a) Regressand variable (b) Predictand variable (c) Explained variable (d) All of these
MCQ 14.20
The independent variable is also called:
(a) Regressor (b) Regressand (c) Predictand (d) Estimated
MCQ 14.21
In the regression equation Y = a+bX, the Y is called:
(a) Independent variable (b) Dependent variable (c) Continuous variable (d) None of the above
MCQ 14.22
In the regression equation X = a + bY, the X is called:
(a) Independent variable (b) Dependent variable (c) Qualitative variable (d) None of the above
MCQ 14.23
In the regression equation Y = a +bX, a is called:
(a) X-intercept (b) Y-intercept (c) Dependent variable (d) None of the above
MCQ 14.24
The regression equation always passes through:
(a) (X, Y) (b) (a, b) (c) (X̄, Ȳ) (d) (X̄, Y)
MCQ 14.25
The independent variable in a regression line is:
(a) Non-random variable (b) Random variable (c) Qualitative variable (d) None of the above
MCQ 14.26
The graph showing the paired points of (Xi, Yi) is called:
(a) Scatter diagram (b) Histogram (c) Historigram (d) Pie diagram
MCQ 14.27
The graph represents the relationship that is:
(a) Linear (b) Non linear (c) Curvilinear (d) No relation
MCQ 14.28
The graph represents the relationship that is:
(a) Linear positive (b) Linear negative (c) Non-linear (d) Curvilinear
MCQ 14.29
When regression line passes through the origin, then:
(a) Intercept is zero (b) Regression coefficient is zero (c) Correlation is zero (d) Association is zero
MCQ 14.30
When bxy is positive, then byx will be:
(a) Negative (b) Positive (c) Zero (d) One
MCQ 14.31
The correlation coefficient is the ______ of the two regression coefficients:
(a) Geometric mean (b) Arithmetic mean (c) Harmonic mean (d) Median
MCQ 14.32
When two regression coefficients bear same algebraic signs, then correlation coefficient is:
(a) Positive (b) Negative (c) According to two signs (d) Zero
MCQ 14.33
It is possible that two regression coefficients have:
(a) Opposite signs (b) Same signs (c) No sign (d) Difficult to tell
MCQ 14.34
Regression coefficient is independent of:
(a) Units of measurement (b) Scale and origin (c) Both (a) and (b) (d) None of them
MCQ 14.35
In the regression line Y = a+ bX:
(a) (b) (c) (d)
MCQ 14.36
In the regression line Y = a + bX, the following is always true:
(a) (b) (c) (d)
MCQ 14.37
The purpose of simple linear regression analysis is to:
(a) Predict one variable from another variable
(b) Replace points on a scatter diagram by a straight-line
(c) Measure the degree to which two variables are linearly associated
(d) Obtain the expected value of the independent random variable for a given value of the
dependent variable
MCQ 14.38
The sum of the differences between the actual values of Y and its values obtained from the fitted
regression line is always:
(a) Zero (b) Positive (c) Negative (d) Minimum
MCQ 14.39
If all the actual and estimated values of Y are same on the regression line, the sum of squares of
error will be:
(a) Zero (b) Minimum (c) Maximum (d) Unknown
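Several properties tested above (the least-squares line passes through the point of means, and its residuals sum to zero) can be checked on a toy data set. The sketch below fits Y = a + bX with the textbook formulas b = Sxy / Sxx and a = Ȳ - bX̄; the data and the name `fit_line` are made up for illustration.

```python
import math
from statistics import mean


def fit_line(xs, ys):
    """Least-squares intercept a and slope b for the line Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)

# Residuals: actual Y minus the value given by the fitted line.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print(round(a, 4), round(b, 4))
print(abs(sum(residuals)) < 1e-9)                    # residuals sum to zero
print(math.isclose(a + b * mean(xs), mean(ys)))      # passes through the means
```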
MCQ 14.40
MCQ 14.41
A measure of the strength of the linear relationship that exists between two variables is called:
(a) Slope (b) Intercept (c) Correlation coefficient (d) Regression equation
MCQ 14.42
When the ratio of variations in the related variables is constant, it is called:
(a) Linear correlation (b) Nonlinear correlation (c) Positive correlation (d) Negative correlation
MCQ 14.43
If both variables X and Y increase or decrease simultaneously, then the coefficient of correlation will
be:
(a) Positive (b) Negative (c) Zero (d) One
MCQ 14.44
If the points on the scatter diagram indicate that as one variable increases the other variable tends to
decrease the value of r will be:
(a) Perfect positive (b) Perfect negative (c) Negative (d) Zero
MCQ 14.45
If the points on the scatter diagram show no tendency either to increase together or decrease together
the value of r will be close to:
(a) -1 (b) +1 (c) 0.5 (d) 0
MCQ 14.46
If one item is fixed and unchangeable and the other item varies, the correlation coefficient will be:
(a) Positive (b) Negative (c) Zero (d) Undecided
MCQ 14.47
In scatter diagram, if most of the points lie in the first and third quadrants, then coefficient of
correlation is:
(a) Negative (b) Positive (c) Zero (d) All of the above
MCQ 14.48
If the two series move in reverse directions and the variations in their values are always
proportionate, it is said to be:
(a) Negative correlation (b) Positive correlation
(c) Perfect negative correlation (d) Perfect positive correlation
MCQ 14.49
If both the series move in the same direction and the variations are in a fixed proportion, correlation
between them is said to be:
(a) Perfect correlation (b) Linear correlation
(c) Nonlinear correlation (d) Perfect positive correlation
MCQ 14.50
The value of the coefficient of correlation r lies between:
(a) 0 and 1 (b) -1 and 0 (c) -1 and +1 (d) -0.5 and +0.5
MCQ 14.51
If X is measured in hours and Y is measured in minutes, then correlation coefficient has the unit:
(a) Hours (b) Minutes (c) Both (a) and (b) (d) No unit
MCQ 14.52
The range of regression coefficient is:
(a) -1 to +1 (b) 0 to 1 (c) -∞ to +∞ (d) 0 to ∞
MCQ 14.53
The signs of regression coefficients and correlation coefficient are always:
(a) Different (b) Same (c) Positive (d) Negative
MCQ 14.54
The arithmetic mean of the two regression coefficients is greater than or equal to:
(a) -1 (b) +1 (c) 0 (d) r
MCQ 14.55
In simple linear regression model Y = α + βX + ε where α and β are called:
(a) Estimates (b) Parameters (c) Random errors (d) Variables
MCQ 14.56
Negative regression coefficient indicates that the movement of the variables is in:
(a) Same direction (b) Opposite direction (c) Both (a) and (b) (d) Difficult to tell
MCQ 14.57
Positive regression coefficient indicates that the movement of the variables is in:
(a) Same direction (b) Opposite direction (c) Upward direction (d) Downward direction
MCQ 14.58
If the value of regression coefficient is zero, then the two variables are called:
(a) Independent (b) Dependent (c) Both (a) and (b) (d) Difficult to tell
MCQ 14.59
The term regression was used by:
(a) Newton (b) Pearson (c) Spearman (d) Galton
MCQ 14.60
In the regression equation Y = a + bX, b is called:
(a) Slope (b) Regression coefficient (c) Intercept (d) Both (a) and (b)
MCQ 14.61
When the two regression lines are parallel to each other, then their slopes are:
(a) Zero (b) Different (c) Same (d) Positive
MCQ 14.62
The measure of change in dependent variable corresponding to a unit change in independent
variable is called:
(a) Slope (b) Regression coefficient (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ 14.63
In correlation problem both variables are:
(a) Equal (b) Unknown (c) Fixed (d) Random
MCQ 14.64
In the regression equation Y = a + bX, a and b are called:
(a) Constants (b) Estimates (c) Parameters (d) Both (a) and (b)
MCQ 14.65
If byx = bxy = 1 and Sx = Sy, then r will be:
(a) 0 (b) -1 (c) 1 (d) Difficult to calculate
MCQ 14.66
The correlation coefficient between X and -X is:
(a) 0 (b) 0.5 (c) 1 (d) -1
MCQ 14.67
If byx = bxy = rxy, then:
(a) Sx ≠ Sy (b) Sx = Sy (c) Sx > Sy (d) Sx < Sy
MCQ 14.68
If rxy = 0.4, then r(2x, 2y) is equal to:
(a) 0.4 (b) 0.8 (c) 0 (d) 1
MCQ 14.69
rxy is equal to:
(a) 0 (b) -1 (c) 1 (d) 0.5
MCQ 14.70
If rxy = 0.75, then correlation coefficient between u = 1.5X and v = 2Y is:
(a) 0 (b) 0.75 (c) -0.75 (d) 1.5
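A minimal sketch related to MCQ 14.66 to 14.70: the correlation coefficient does not change when each variable is multiplied by a positive constant, and it equals -1 for X against -X. The data below are hypothetical and chosen only for illustration; pearson_r is an illustrative helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_xy = pearson_r(x, y)
# Rescale both variables by positive constants: u = 1.5X, v = 2Y.
r_uv = pearson_r([1.5 * a for a in x], [2 * b for b in y])
# Correlate X with its own negative.
r_neg = pearson_r(x, [-a for a in x])

print(r_xy, r_uv, r_neg)
```

With any non-constant data, r_xy and r_uv should agree up to rounding error, while r_neg should be -1.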
MCQ 14.71
If byx = -2 and rxy= -1, then bxy is equal to:
(a) -1 (b) -2 (c) 0.5 (d) -0.5
MCQ 14.72
If byx = 1.6 and bxy = 0.4, then rxy will be:
(a) 0.4 (b) 0.64 (c) 0.8 (d) -0.8
MCQ 14.73
If byx = -0.8 and bxy = -0.2, then ryx is equal to:
(a) -0.2 (b) -0.4 (c) 0.4 (d) -0.8
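The questions above rely on the relation r² = byx · bxy, with r taking the common sign of the two regression coefficients. A short sketch of that arithmetic follows; the function names are illustrative assumptions, not part of any library.

```python
import math

def r_from_slopes(byx, bxy):
    """Correlation coefficient from the two regression coefficients.

    Uses r**2 = byx * bxy; r carries the common sign of byx and bxy.
    """
    if byx * bxy < 0:
        raise ValueError("byx and bxy must have the same sign")
    magnitude = math.sqrt(byx * bxy)
    return -magnitude if byx < 0 else magnitude

def other_slope(r, b):
    """Given r and one regression coefficient b, return the other one."""
    return r * r / b

print(r_from_slopes(1.6, 0.4))    # values from MCQ 14.72
print(r_from_slopes(-0.8, -0.2))  # values from MCQ 14.73
print(other_slope(-1, -2))        # values from MCQ 14.71
```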
MCQ 14.74
If Y = 6 – X, then r will be:
(a) 0 (b) 1 (c) -1 (d) Both (b) and (c)
MCQ 14.75
If Y = X + 10, then r is equal to:
(a) 1 (b) -1 (c) 1/2 (d) Difficult to tell
MCQ 14.76
If Y = -10X and X = -0.1Y, then r is equal to:
(a) 0.1 (b) 1 (c) -1 (d) 10
MCQ 14.77
If the figure +1 signifies perfect positive correlation and the figure -1 signifies a perfect negative
correlation, then the figure 0 signifies:
(a) A perfect correlation (b) Uncorrelated variables
(c) Not significant (d) Weak correlation
MCQ 14.78
A perfect positive correlation is signified by:
(a) 0 (b) -1 (c) +1 (d) -1 to +1
MCQ 14.79
If a statistics professor tells his class: "All those who got 100 on the statistics test got 20 on the
mathematics test, and all those that got 100 on the mathematics test got 20 on the statistics test", he
is saying that the correlation between the statistics test and the mathematics test is:
(a) Negative (b) Positive (c) Zero (d) Difficult to tell
MCQ 14.80
If is zero, the correlation is:
(a) Weak negative (b) High positive (c) High negative (d) None of the preceding
MCQ 14.81
If rxy = 1, then:
(a) byx = bxy (b) byx > bxy (c) byx < bxy (d) byx . bxy = 1
MCQ 14.82
The relation between the regression coefficient byx and correlation coefficient r is:
MCQ 14.83
The relation between the regression coefficient bxy and correlation coefficient r is:
MCQ 14.84
If the sum of the product of the deviation of X and Y from their means is zero, the correlation
coefficient between X and Y is:
(a) Zero (b) Maximum (c) Minimum (d) Undecided
MCQ 14.85
If the coefficient of correlation between the variables X and Y is r, the coefficient of correlation
between X2 and Y2 is:
(a) -1 (b) 1 (c) r (d) r2
MCQ 14.86
If rxy = 0.75, then ryx will be:
(a) 0.25 (b) 0.50 (c) 0.75 (d) -0.75
MCQ 14.87
If , then byx is equal to:
(a) Positive (b) Negative (c) Zero (d) One
MCQ 14.88
If , then intercept a is equal to:
(a) 0 (b) 1 (c) -1 to +1 (d) 0 to 1
MCQ 14.89
:
(a) Less than zero (b) Greater than zero (c) Equal to zero (d) Not equal to zero
MCQ 14.90
When rxy < 0, then byx and bxy will be:
(a) Zero (b) Not equal to zero (c) Less than zero (d) Greater than zero
MCQ 14.91
When rxy > 0, then byx and bxy are both:
(a) 0 (b) < 0 (c) > 0 (d) < 1
MCQ 14.92
If rxy = 0, then:
(a) byx = 0 (b) bxy = 0 (c) Both (a) and (b) (d) byx ≠ bxy
MCQ 14.93
If bxy = 0.20 and rxy = 0.50, then byx is equal to:
(a) 0.20 (b) 0.25 (c) 0.50 (d) 1.25
MCQ 14.94
A regression model may be:
(a) Linear (b) Non-linear (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ 14.95
If r is negative, we know that:
(a)
(b)
(c)
(d)
MCQ INDEX NUMBERS
MCQ No 5.1
An index number is called a simple index when it is computed from:
(a) Single variable (b) Bi-variable (c) Multiple variables (d) None of them
MCQ No 5.2
Index numbers are expressed in:
(a) Ratios (b) Squares (c) Percentages (d) Combinations
MCQ No 5.3
If all the values are of equal importance, the index numbers are called:
(a) Weighted (b) Unweighted (c) Composite (d) Value index
MCQ No 5.4
Index numbers can be used for:
(a) Forecasting (b) Fixed prices (c) Different prices (d) Constant prices
MCQ No 5.5
Index for base period is always taken as:
(a) 100 (b) One (c) 200 (d) Zero
MCQ No 5.6
When the prices of rice are to be compared, we compute:
(a) Volume index (b) Value index (c) Price index (d) Aggregative index
MCQ No 5.7
When index number is calculated for several variables, it is called:
(a) Composite index (b) Whole sale price index (c) Volume index (d) Simple index
MCQ No 5.8
How many types are used for the calculation of index numbers:
(a) 2 (b) 3 (c) 4 (d) 5
MCQ No 5.9
In chain base method, the base period is:
(a) Fixed (b) Not fixed (c) Constant (d) Zero
MCQ No 5.10
Which formula is used in chain indices?
MCQ No 5.11
Price relatives are a percentage ratio of current year price and:
(a) Base year quantity (b) Previous year quantity (c) Base year price (d) Current year quantity
MCQ No 5.12
Indices calculated by the chain base method are free from:
(a) Seasonal variations (b) Errors (c) Percentages (d) Ratios
MCQ No 5.13
The chain base indices are not suitable for:
(a) Long range comparisons (b) Short range comparisons (c) Percentages (d) Ratios
MCQ No 5.14
An index number that can serve many purposes is called:
(a) General purpose index (b) Special purpose index
(c) Cost of living index (d) None of them
MCQ No 5.15
Another name of consumer's price index number is:
(a) Whole-sale price index number (b) Cost of living index
(c) Sensitive (d) Composite
MCQ No 5.16
Consumer price index indicates:
(a) Rise (b) Fall (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ No 5.17
Purchasing power of money can be assessed through:
(a) Simple index (b) Fisher’s index (c) Consumer price index (d) Volume index
MCQ No 5.18
Cost of living at two different cities can be compared with the help of:
(a) Value index (b) Consumer price index (c) Volume index (d) Un-weighted index
MCQ No 5.19
Consumer price index numbers are obtained by:
(a) Laspeyre's formula (b) Fisher ideal formula
(c) Marshall Edgeworth formula (d) Paasche's formula
MCQ No 5.20
Laspeyre's index = 110, Paasche's index = 108, then Fisher's Ideal index is equal to:
(a) 110 (b) 108 (c) 100 (d) 109
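Fisher's ideal index is the geometric mean of the Laspeyres and Paasche indices, so the arithmetic in MCQ No 5.20 can be checked with a short sketch. The function name fisher_index is an illustrative assumption.

```python
import math

def fisher_index(laspeyres, paasche):
    """Fisher's ideal index: geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres * paasche)

fisher = fisher_index(110, 108)
print(round(fisher))
```

The result is slightly below 109 because 110 × 108 = 11880, while 109² = 11881.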
MCQ No 5.21
Most commonly used index number is:
(a) Volume index number (b) Value index number
(c) Price index number (d) Simple index number
MCQ No 5.22
For consumer price index, price quotations are collected from:
(a) Fair price shops (b) Government depots (c) Retailers (d) Whole-sale dealers
MCQ No 5.23
Price relatives computed by chain base method are called:
(a) Price relatives (b) Chain indices (c) Link relatives (d) None of them
MCQ No 5.24
Consumer price index are obtained by:
(a) Paasche's formula (b) Fisher's ideal formula
(c) Marshall Edgeworth formula (d) Family budget method formula
MCQ No 5.25
The aggregative expenditure method and family budget method always give:
(a) Different results (b) Approximate results (c) Same results (d) None of them
MCQ No 5.26
In fixed base method, the base period should be:
(a) Far away (b) Abnormal (c) Unreliable (d) Normal
MCQ No 5.27
If all the values are not of equal importance the index number is called:
(a) Simple (b) Unweighted (c) Weighted (d) None
MCQ No 5.28
Which of the following formulas satisfies the time reversal test?
MCQ No 5.29
When the price of a year is divided by the price of a particular year, we get:
(a) Simple relative (b) Link relative (c) (a) and (b) both (d) None of them
MCQ No 5.30
When the price of a year is divided by the price of the preceding year, we get:
(a) Value index (b) Link relative (c) Simple relative (d) None of them
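The difference between simple (fixed base) relatives, link relatives, and chain indices can be sketched as follows. The price series is hypothetical, and the helper names are illustrative assumptions.

```python
def simple_relatives(prices, base=0):
    """Each price as a percentage of the fixed base period's price."""
    return [100.0 * p / prices[base] for p in prices]

def link_relatives(prices):
    """Each price as a percentage of the preceding period's price."""
    return [100.0 * cur / prev for prev, cur in zip(prices, prices[1:])]

def chain_indices(prices):
    """Chain the link relatives back to the first period (index 100)."""
    indices = [100.0]
    for link in link_relatives(prices):
        indices.append(indices[-1] * link / 100.0)
    return indices

# Hypothetical prices for three consecutive years.
prices = [50, 55, 66]

print(simple_relatives(prices))
print(link_relatives(prices))
print(chain_indices(prices))
```

For a single commodity the chain indices reproduce the fixed base relatives; the two methods generally differ once several commodities are averaged.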
MCQ No 5.31
The most appropriate average in averaging the price relatives is:
(a) Median (b) Harmonic mean (c) Arithmetic mean (d) Geometric mean
MCQ No 5.32
In constructing index number geometric mean relatives are:
(a) Non-reversible (b) Reciprocal (c) Reversible (d) None of them
MCQ No 5.33
The general purchasing power of the currency of a country is determined by:
(a) Retail price index (b) Volume index (c) Composite index (d) Whole-sale price index
MCQ No 5.34
What type of index number can help the government to formulate its price policies and to take
appropriate economic measures to control prices:
(a) Whole sale price index (b) Consumer's price (c) Quantity (d) None of them
MCQ No 5.35
The most suitable average in chain base method is:
(a) Arithmetic mean (b) Median (c) Mode (d) Geometric mean
MCQ No 5.36
Base year quantities weights are used in:
(a) Laspeyre's method (b) Paasche's method (c) Fisher's ideal method (d) Difficult to tell
MCQ No 5.37
Chain process is used to make comparisons of price index numbers in:
(a) Price relative (b) Link relative (c) Simple relative (d) None of the above
MCQ No 5.38
In the computation of consumer price index numbers, we use:
(a) Aggregate expenditure method (b) Family budget method
(c) Chain base method (d) Both (a) and (b)
MCQ No 5.39
The Federal Bureau of Statistics prepares:
(a) The wholesale price index (b) The consumer price index
(c) The sensitive price indicator (d) All of the above
MCQ No 5.40
While computing a weighted index, the current period quantities are used in the:
(a) Laspeyre's method (b) Paasche's method
(c) Marshall Edgeworth method (d) Fisher's ideal method
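As a sketch of how the choice of weights enters the formulas: the Laspeyres index weights prices by base period quantities, and the Paasche index weights them by current period quantities. The prices and quantities below are hypothetical, and the function names are illustrative assumptions.

```python
def laspeyres_index(p0, p1, q0):
    """Weighted aggregative price index using base period quantities q0."""
    numerator = sum(p * q for p, q in zip(p1, q0))
    denominator = sum(p * q for p, q in zip(p0, q0))
    return 100.0 * numerator / denominator

def paasche_index(p0, p1, q1):
    """Weighted aggregative price index using current period quantities q1."""
    numerator = sum(p * q for p, q in zip(p1, q1))
    denominator = sum(p * q for p, q in zip(p0, q1))
    return 100.0 * numerator / denominator

# Hypothetical data for two commodities.
p0, q0 = [10, 20], [5, 2]   # base period prices and quantities
p1, q1 = [12, 25], [4, 3]   # current period prices and quantities

laspeyres = laspeyres_index(p0, p1, q0)
paasche = paasche_index(p0, p1, q1)
print(round(laspeyres, 2), round(paasche, 2))
```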
MCQ No 5.41
The best method to measure the relative change in prices of commodities is:
(a) Quantity index number (b) Value index number
(c) Volume index number (d) Price index number
MCQ No 5.42
When the base year values are used as weights, the weighted average of relatives price index
number is the same as the:
(a) Laspeyre's index (b) Paasche's index (c) Simple aggregative index (d) Quantity index
MCQ No 5.43
To measure the relative change in purchasing a specified basket of goods and services between two
periods for a certain locality for fixed income group of people, we can use:
(a) Consumer price index (b) Paasche's price index (c) Cost of living index (d) Both (a) and (c)
MCQ No 5.44
Fisher's ideal index number is the geometric mean of the:
(a) Laspeyre's and Marshall Edgeworth indices
(b) Laspeyre's and Paasche's indices
(c) Paasche's and Marshall Edgeworth indices (d) All of the above
MCQ No 5.45
A number that measures a relative change in a single variable with respect to a base is called:
(a) Good index number (b) Composite index number
(c) Simple index number (d) Quantity index number
MCQ No 5.46
A number that measures an average relative change in a group of related variables with respect to
a base is called:
(a) Simple index number (b) Composite index number
(c) Price index number (d) Quantity index number
MCQ No 5.47
An index number constructed to measure the relative change in the price of an item or a group of
items is called:
(a) Quantity index number (b) Price index number (c) Volume index number (d) Difficult to tell
MCQ No 5.48
When relative change is measured for a fixed period, it is called:
(a) Chain base method (b) Fixed base method
(c) Simple aggregative method (d) Cost of living Index method
MCQ No 5.49
The ratio of a sum of prices in the current period to the sum of prices in the base period, expressed as a
percentage is called:
(a) Simple price index number
(b) Simple aggregative price index number
(c) Weighted aggregative price index number
(d) Quantity index number
MCQ No 5.50
An index that measures the average relative change in group of variables keeping in view the relative
importance of the variables is called:
(a) Simple index number (b) Composite index number
(c) Weighted index number (d) Price index number
MCQ No 5.51
Link relative of current year is equal to:
MCQ No 5.52
Simple average of relatives is equal to:
MCQ No 5.53
Paasche's price index number is also called:
(a) Base year weighted (b) Current year weighted
(c) Simple aggregative index (d) Consumer price index
MCQ No 5.54
Laspeyre's price index number is also called:
(a) Base year weighted (b) Current year weighted
(c) Cost of living index (d) Simple aggregative index
MCQ No 5.55
Index number having downward bias is:
(a) Laspeyre's index (b) Paasche’s index
(c) Fisher's ideal index (d) Marshall Edgeworth index
MCQ No 5.56
Index number having upward bias is:
(a) Laspeyre's index (b) Paasche's index (c) Fisher's ideal index (d) Marshall Edgeworth index
MCQ No 5.57
Marshall Edgeworth price index was proposed by:
(a) One English economist (b) Two English economists
(c) Three English economists (d) Many English economists
MCQ No 5.58
The index number calculated by Fisher's formula is ideal because it satisfies:
(a) Circular test (b) Factor reversal test (c) Time reversal test (d) All of the above
MCQ No 5.59
The test which is not obeyed by any of the weighted index numbers unless the weights are constant:
(a) Circular test (b) Time reversal test (c) Factor reversal test (d) None of them
MCQ PROBABILITY
MCQ 6.1
When the possible outcomes of an experiment are equally likely to occur, then we apply:
(a) Relative probability (b) Subjective probability
(c) Conditional probability (d) Classical probability
MCQ 6.2
A number between 0 and 1 that is used to measure uncertainty is called:
(a) Random variable (b) Trial (c) Simple event (d) Probability
MCQ 6.3
Probability lies between:
(a) -1 and +1 (b) 0 and 1 (c) 0 and n (d) 0 and ∞
MCQ 6.4
Probability can be expressed as:
(a) Ratio (b) Fraction (c) Percentage (d) All of the above
MCQ 6.5
The probability based on the concept of relative frequency is called:
(a) Empirical probability (b) Statistical probability (c) Both (a) and (b) (d) Neither (a) nor (b)
MCQ 6.6
The probability of an event cannot be:
(a) Equal to zero (b) Greater than zero (c) Equal to one (d) Less than zero
MCQ 6.7
A measure of the chance that an uncertain event will occur:
(a) An experiment (b) An event (c) A probability (d) A trial
MCQ 6.8
A graphical device used to list all possibilities of a sequence of outcomes in systematic way is
called:
(a) Probability histogram (b) Venn diagram (c) Pie diagram (d) Tree diagram
MCQ 6.9
A random experiment contains:
(a) At least one outcome (b) At least two outcomes
(c) At most one outcome (d) At most two outcomes
MCQ 6.10
The probability of all possible outcomes of a random experiment is always equal to:
(a) One (b) Zero (c) Infinity (d) All of the above
MCQ 6.11
The outcome of tossing a coin is a:
(a) Mutually exclusive event (b) Compound event (c) Certain event (d) Simple event
MCQ 6.12
The result of no interest of an experiment is called:
(a) Constant (b) Event (c) Failure (d) Success
MCQ 6.13
A set of all possible outcomes of an experiment is called:
(a) Combination (b) Sample point (c) Sample space (d) Compound event
MCQ 6.14
The numbers of counting rules that are useful in determining the number of outcomes in an
experiment are:
(a) One (b) Two (c) Three (d) Four
MCQ 6.15
Events having no experimental outcomes in common are called:
(a) Equally likely events (b) Exhaustive events
(c) Mutually exclusive events (d) Independent events
MCQ 6.16
A set of outcomes formed after some additional information is called:
(a) Sample space (b) Reduced sample space (c) Null set (d) Random experiment
MCQ 6.17
The probability associated with the reduced sample space is called:
(a) Conditional probability (b) Statistical probability
(c) Mathematical probability (d) Subjective probability
MCQ 6.18
An arrangement of objects without regard to order is called:
(a) Permutation (b) Combination (c) Random experiment (d) Sample point
MCQ 6.19
The number of permutations of a set of n things, taken r at a time with n ≥ r, is given by:
MCQ 6.20
If three candidates are selected to attend a course from the ten candidates, the number of ways of selecting
the candidates is an example of:
(a) Combination (b) Permutation (c) Reduced sample space (d) Both (a) and (b)
MCQ 6.21
When each outcome of a sample space is as likely to occur as any other, the outcomes are called:
(a) Exhaustive (b) Mutually exclusive (c) Equally likely (d) Not mutually exclusive
MCQ 6.22
If A is any event in S and Ā is its complement, then P(Ā) is equal to:
(a) 1 (b) 0 (c) 1 - A (d) 1 - P(A)
MCQ 6.23
When certainty is involved in a situation, its probability is equal to:
(a) Zero (b) Between -1 and +1 (c) Between 0 and 1 (d) One
MCQ 6.24
Which of the following cannot be taken as probability of an event?
(a) 0 (b) 0.5 (c) 1 (d) -1
MCQ 6.25
If an event contains more than one sample points, it is called a:
(a) Simple event (b) Compound event (c) Impossible event (d) Certain event
MCQ 6.26
When the occurrence of one event has no effect on the probability of the occurrence of another
event, the events are called:
(a) Independent (b) Dependent (c) Mutually exclusive (d) Equally likely
MCQ 6.27
A particular result of an experiment is called:
(a) Trial (b) Simple event (c) Compound event (d) Outcome
MCQ 6.28
A collection of one or more outcomes of an experiment is called:
(a) Event (b) Outcome (c) Sample point (d) None of the above
MCQ 6.29
A process that leads to the occurrence of one and only one of several possible observations is
called:
(a) Random experiment (b) Random variable (c) Experiment (d) Probability distribution
MCQ 6.30
Which statement is false?
(a) The classical definition applies when there are n equally likely outcomes to an experiment
(b) The empirical definition occurs when the number of times an event happens is divided by the number
of observations.
(c) A subjective probability is based on whatever information is available
(d) The general rule of addition is used when the events are mutually exclusive
MCQ 6.31
The term 'sample space' is used for:
(a) All possible outcomes (b) All possible coins (c) Probability (d) Sample
MCQ 6.32
The term 'event' is used for:
(a) Time (b) A sub-set of the sample space
(c) Probability (d) Total number of outcomes.
MCQ 6.33
The six faces of the die are called equally likely if the die is:
(a) Small (b) Fair (c) Six-faced (d) Round
MCQ 6.34
If we toss a coin and P(H) = 2P(T), then probability of head is equal to:
(a) 0 (b) 1/2 (c) 1/3 (d) 2/3
MCQ 6.35
A letter is chosen at random from the word "Statistics". The probability of getting a vowel is:
(a) 1/10 (b) 2/10 (c) 3/10 (d) 4/10
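Under the classical definition, the probability in MCQ 6.35 is the number of favourable letters divided by the total number of letters. A small sketch of that count, assuming every letter position is equally likely to be chosen; vowel_probability is an illustrative helper.

```python
from fractions import Fraction

def vowel_probability(word):
    """Probability that a randomly chosen letter of `word` is a vowel."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    vowels = [ch for ch in letters if ch in "aeiou"]
    return Fraction(len(vowels), len(letters))

p_vowel = vowel_probability("Statistics")
print(p_vowel)
```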
MCQ 6.36
An arrangement in which the order of the objects selected from a specific pool of objects is important
is called:
(a) Combination (b) Permutation (c) Factorial (d) Sample space
MCQ 6.37
Two books are to be selected at random without replacement out of four books. The number of possible
selections is:
(a) 4 (b) 2 (c) 6 (d) 3
MCQ 6.38
Three books of different colours are to be arranged in a book-shelf. The possible arrangements
are:
(a) 3 (b) 1 (c) 6 (d) 2
MCQ 6.39
If a sample S = {1, 2}, the number of all possible sub-sets are:
(a) 2 (b) 1 (c) 3 (d) 4
MCQ 6.40
When a die and a coin are rolled together, all possible outcomes are:
(a) 6 (b) 2 (c) 36 (d) 12
MCQ 6.41
When two coins are tossed, the possible outcomes are:
(a) 2 (b) 4 (c) 1 (d) None of them
MCQ 6.42
If three coins are tossed, the possible outcomes are:
(a) 8 (b) 3 (c) 1 (d) None of them
MCQ 6.43
If n coins are tossed, the possible outcomes are:
(a) n (b) 2 (c) 2^n (d) All of them
MCQ 6.44
If two dice are rolled, the possible outcomes are:
(a) 6 (b) 36 (c) 1 (d) Difficult to answer
MCQ 6.45
When n dice are rolled, the possible outcomes are:
(a) 6^n (b) 6 (c) 1 (d) 18
MCQ 6.46
When one card is selected at random from a pack of 52 playing cards, the possible selections are:
(a) 104 (b) 52 (c) 520 (d) 2704
MCQ 6.47
Two cards are selected at random with replacement from a pack of 52 playing cards. The possible
outcomes are:
(a) 52 x 52 (b) 52 (c) 1326 (d) 2
MCQ 6.48
A bag contains 4 white and 2 black balls of the same size and weight, and two balls are selected at
random without replacement, the possible selections are:
(a) 6 (b) 4 (c) 36 (d) 15
MCQ 6.49
Two balls are selected at random with replacement from a bag containing 3 red, 3 black and 2 green
balls. The possible outcomes are:
(a) 8 (b) 64 (c) 16 (d) 2
MCQ 6.50
Five cards are selected at random from a pack of 52 cards with replacement. The possible
combinations are:
(a) 52 (b) (52)^5 (c) 52 x 52 (d) (5)^52
MCQ 6.51
The digits 1, 2, 3, 4, 5 are the roll numbers of 5 students. These roll numbers are written on the paper
slips and two paper slips are selected at random without replacement. The possible combinations are:
(a) 5 (b) 2 (c) 25 (d) 10
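The counting questions above (MCQ 6.37 to 6.51) use three rules: powers for ordered selection with replacement, permutations when order matters, and combinations when it does not. A brief sketch using Python's standard math.comb and math.perm (available from Python 3.8); the variable names are illustrative.

```python
import math

# Ordered outcomes with replacement: n choices repeated r times gives n**r.
two_cards_with_replacement = 52 ** 2
n_coins = 3
coin_outcomes = 2 ** n_coins

# Arrangements where order matters: permutations.
book_arrangements = math.perm(3, 3)   # 3 books on a shelf

# Selections without regard to order, without replacement: combinations.
books_2_of_4 = math.comb(4, 2)
balls_2_of_6 = math.comb(6, 2)
slips_2_of_5 = math.comb(5, 2)

print(two_cards_with_replacement, coin_outcomes, book_arrangements)
print(books_2_of_4, balls_2_of_6, slips_2_of_5)
```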
MCQ 6.52
Which is the impossible event when a die is rolled:
(a) 2 or 3 (b) 5 or 6 (c) 1 (d) 0 or 7
MCQ 6.53
The probability of drawing any one spade card is:
(a) 1/13 (b) 1/4 (c) 4/13 (d) 1/52
MCQ 6.54
A balanced die is rolled, the probability of getting an odd number is:
(a) 1/2 (b) 1/4 (c) 1/6 (d) 1/36
MCQ 6.55
Two fair dice are rolled. The probability of throwing an odd sum is:
(a) 1 (b) 1/2 (c) 1/6 (d) 1/36
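Dice probabilities such as the one in MCQ 6.55 can be checked by listing the 36 equally likely outcomes and counting the favourable ones. A minimal sketch, assuming fair dice; the helper name probability is illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    """Classical probability: favourable outcomes over all outcomes."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

p_odd_sum = probability(lambda o: (o[0] + o[1]) % 2 == 1)
p_similar = probability(lambda o: o[0] == o[1])
print(p_odd_sum, p_similar)
```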
MCQ 6.56
Given P(A) = 0.4, P(B) = 0.5 and P(A⋃B) = 0.9, then:
(a) A and B are not mutually exclusive events (b) A and B are equally likely events
(c) A and B are independent events (d) A and B are mutually exclusive events
MCQ 6.57
If P(B/A) = 0.50 and P(A⋂B) = 0.40, then P(A) will be equal to:
(a) 0.40 (b) 0.50 (c) 0.80 (d) 1
MCQ 6.58
Which of the following statements is incorrect:
⋃ ⋂ ⋃ ⋂
⋂ ⋃ ⋂⋃
MCQ 6.59
If P(A/B) = P(A) and P(B/A)=P(B), then A and B are:
(a) Mutually exclusive (b) Dependent (c) Equally likely (d) Independent
MCQ 6.60
A fair coin is tossed 100 times, the expected number of heads is:
(a) 100 (b) 50 (c) 30 (d) 60
MCQ 6.61
When two dice are rolled, the maximum total on the two faces of the dice will be:
(a) 6 (b) 36 (c) 12 (d) 2
MCQ 6.62
A random sample of 200 random digits is selected from a random number table. Expected number of
zeros in the sample is:
(a) Zero (b) 10 (c) 20 (d) 5
MCQ 6.63
Six digits are selected at random again and again from a random number table and the even digits are
counted each time. In most of the cases, the number of even digits will be:
(a) 2 (b) 3 (c) 4 (d) 6
MCQ 6.64
Two events A and B are called mutually exclusive if:
(a) A⋃B = Φ (b) A⋂B = Φ (c) A⋂B = S (d) A⋂B = 1
MCQ 6.65
If A and B are two mutually exclusive events, then:
(a) P(A⋂B) = 0 (b) P(A⋂B) = 1 (c) P(A⋃B) = 0 (d) P(A⋂B) = S
MCQ 6.66
When A and B are two non-empty and mutually exclusive events, then:
(a) P(A⋃B) = P(A).P(B) (b) P(A⋃B) = P(A) + P(B)
(c) P(A⋂B) = P(A).P(B) (d) P(A⋂B) = P(A)+P(B)
MCQ 6.67
The two events A and B are called not mutually exclusive events if:
(a) A⋂B = Φ (b) A⋂B ≠ Φ (c) A⋃B = Φ (d) A⋂B = zero
MCQ 6.68
If A and B are disjoint events then the statement which is always true is:
(a) P(A/B) = 0 (b) P(A⋃B) = 0 (c) P(A⋂B) = 1 (d) P(A) = P(B)
MCQ 6.69
The events A, B and C are called exhaustive events if:
(a) A⋃B⋃C = S (b) A⋂B⋂C = S (c) A⋃B⋃C = Φ (d) A⋃B⋃C = Zero
MCQ 6.70
If A and B are not-mutually exclusive events, then:
(a) P(A⋃B) + P(A⋂B) = P(A) + P(B) (b) P(A⋃B) = P(A) + P(B)
(c) P(A⋃B) = P(A).P(B) (d) P(A⋂B) = P(A) + P(B)
MCQ 6.71
If an event Ā is the complement of the event A, then:
(a) A⋃Ā = S (b) A⋂Ā = S (c) A⋃Ā = Φ (d) P(A) = P(Ā)
MCQ 6.72
If A1, A2, A3, ..., Ak are k mutually exclusive events, then:
(a) P(A1⋃A2⋃A3⋃ ...⋃Ak ) = P(A1)+P(A2)+P(A3)+...+ P(Ak)
(b) P(A1⋃A2⋃A3⋃ ...⋃Ak ) > 1
(c) P(A1⋂A2⋂A3⋂ ...⋂Ak ) = 1
(d) P(A1⋂A2⋂A3⋂ ...⋂Ak ) = P(A1⋃A2⋃A3⋃ ...⋃Ak )
MCQ 6.73
If A is an empty set and B is a non-empty set then:
(a) A⋂B = S (b) A⋂B = B (c) A⋃B = B (d) P(A) = P(B)
MCQ 6.74
If A is an empty set and S is the sample space then:
(a) P(A⋃S) = P(S) (b) P(A⋃S) = P(Φ) (c) P(A⋂S) = 1 (d) P(A⋃S) = Zero
MCQ 6.75
If A and B are independent events, then:
(a) P(A⋃B) = P(A).P(B) (b) P(A⋂B) = P(A).P(B)
(c) P(A⋂B) = P(A)+P(B) (d) P(A) = P(B)
MCQ 6.76
If A and B are two independent events, then:
(a) P(A/B) = P(A) (b) P(A) = P(B) (c) P(A) < P(B) (d) P(A/B) = P(B/A)
MCQ 6.77
A and B are two independent events. Which one of these equations is false?
(a) P(A⋂ ) = P(A)P( ) (b) P( ⋂ ) = P( ⋂ )
(c) P( ⋂ ) = P( )P( ) (d) P(A⋃B) = P(A)P(B)
MCQ 6.78
The conditional probability of the event A when event B has occurred is denoted by:
(a) P(A + B) (b) P(A - B) (c) P(A/B) (d) P( )
MCQ 6.79
If A and B are any two events, then P(A/B) + P(Ā/B) is equal to:
(a) 0 (b) 0.25 (c) 0.5 (d) 1
MCQ 6.80
If A is an arbitrary event, then P(A/A) is equal to :
(a) Zero (b) One (c) Infinity (d) Less than one
MCQ 6.81
If A and B are any two events, then P(Ā/B) is equal to:
(a) P(A/B) (b) 1 - P(A/B) (c) 1 + P(A/B) (d) P(Ā⋂B)
MCQ 6.82
If A and B are any two events, then P(A⋃ ):
(a) 1+P(A⋂B) (b) 1-P(A⋃B) (c) 1- P(A⋂B) (d) P(A)+P(B)
MCQ 6.83
If A and B are any two events, then P(Ā⋂B̄) is equal to:
(a) 1 - P(A⋃B) (b) 1 - P(A⋂B) (c) 1 - P(Ā⋂B) (d) 1 - P(A⋂B̄)
MCQ 6.84
Which of the following statements is correct?
⋂ ⋃ ⋂ ⋃ ⋂ ⋃⋂⋂⋂⋃
⋂ ⋂ ⋃ ⋃ ⋂ ⋂⋃ ⋂
MCQ 6.85
If A and B are two mutually exclusive and exhaustive events and P(A)=2P(B), then P(B) is equal to:
(a) 1/2 (b) 2/3 (c) 1/3 (d) 1/4
MCQ 6.86
Two coins are tossed. Probability of getting head on the first coin is:
(a) 2/4 (b) 1 (c) Zero (d) 4
MCQ 6.87
A die and a coin are tossed together. Probability of getting head on the coin is:
(a) 6/12 (b) 6 (c) 12 (d) Zero
MCQ 6.88
A fair die is rolled. Probability of getting even face given that face is less than 5 is given by:
(a) 1/2 (b) 5 (c) 2 (d) 6
MCQ 6.89
Two coins are tossed. The probability that both faces will be matching given by:
(a) 1/4 (b) 1/2 (c) 1 (d) Zero
MCQ 6.90
Two coins are tossed. Probability of getting two heads given that there is at least one head is given
by:
(a) 1/2 (b) 1/3 (c) 1/4 (d) 2/3
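Conditional probabilities such as the one in MCQ 6.90 can be worked out on the reduced sample space: keep only the outcomes consistent with the given information, then count the favourable ones among them. A sketch for two fair coins; the helper name conditional is an illustrative assumption.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coins: HH, HT, TH, TT.
sample_space = list(product("HT", repeat=2))

def conditional(event, given):
    """P(event | given), computed on the reduced sample space."""
    reduced = [o for o in sample_space if given(o)]
    favourable = [o for o in reduced if event(o)]
    return Fraction(len(favourable), len(reduced))

# Two heads, given that there is at least one head.
p_two_heads = conditional(lambda o: o == ("H", "H"), lambda o: "H" in o)

# Matching faces, with no extra information (condition always true).
p_matching = conditional(lambda o: o[0] == o[1], lambda o: True)
print(p_two_heads, p_matching)
```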
MCQ 6.91
A fair die is rolled. Probability of getting more than 4 or less than 3 is given by:
(a) 2/3 (b) 1/3 (c) 1/2 (d) 4/3
MCQ 6.92
A fair die is rolled. Probability of getting even face or face more than 4 is:
(a) 1/3 (b) 2/3 (c) 1/2 (d) 5/6
MCQ 6.93
Two dice are rolled. Probability of getting similar faces is:
(a) 5/36 (b) 1/6 (c) 1/3 (d) 1/2
MCQ 6.94
Two dice are rolled. Probability of getting total less than 4 or total more than 10 is given by:
(a) 10/36 (b) 4/36 (c) 1/36 (d) 14/36
MCQ 6.95
Two dice are rolled. Probability of getting a total of 4 given that both faces are similar is:
(a) 5/36 (b) 1/36 (c) 4/36 (d) 1/6
MCQ 6.96
If A and B are two not-independent events, then the probability that both A and B will happen
together is:
(a) P(A⋂B) = P(A)P(B/A) (b) P(A⋂B) = P(A)P(B)
(c) P(A⋂B) = P(A)+P(B) (d) P(A⋂B) = P(A)
MCQ 6.97
If A and B are two dependent events, then:
(a) P(A) P(B/A) = P(B)P(A/B) (b) P(A/B) = P(B/A)
(c) P(A/B) = P(A) (d) P(A) = P(B)
MCQ 6.98
Which one is true?
MCQ 6.99
MCQ 6.100
MCQ 6.101
Given P(A) = 2/3, P(B) = 3/8 and P(A⋂B) = 1/4, then A and B are:
(a) Independent (b) Dependent (c) Mutually exclusive (d) Equally likely
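For MCQ 6.101, independence can be checked by comparing the product P(A)P(B) with the joint probability. The sketch below takes the third given quantity, 1/4, to be the joint probability of A and B, which is an assumption about the notation; the function names are illustrative.

```python
from fractions import Fraction

def are_independent(p_a, p_b, p_a_and_b):
    """Events are independent when P(A and B) = P(A) * P(B)."""
    return p_a * p_b == p_a_and_b

def are_mutually_exclusive(p_a_and_b):
    """Events are mutually exclusive when P(A and B) = 0."""
    return p_a_and_b == 0

p_a, p_b, p_ab = Fraction(2, 3), Fraction(3, 8), Fraction(1, 4)

independent = are_independent(p_a, p_b, p_ab)
exclusive = are_mutually_exclusive(p_ab)
p_union = p_a + p_b - p_ab   # general addition rule
print(independent, exclusive, p_union)
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.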
MCQ BINOMIAL AND HYPERGEOMETRIC DISTRIBUTIONS
MCQ 8.1
A Bernoulli trial has:
(a) At least two outcomes (b) At most two outcomes
(c) Two outcomes (d) Fewer than two outcomes
MCQ 8.2
The two mutually exclusive outcomes in a Bernoulli trial are usually called:
(a) Success and failure (b) Variable and constant
(c) Mean and variance (d) With and without replacement
MCQ 8.3
Nature of the binomial random variable X is:
(a) Quantitative (b) Qualitative (c) Discrete (d) Continuous
MCQ 8.4
In a binomial probability distribution, the sum of probability of failure and probability of success is
always:
(a) Zero (b) Less than 0.5 (c) Greater than 0.5 (d) One
MCQ 8.5
In a binomial experiment, the successive trials are:
(a) Dependent (b) Independent (c) Mutually exclusive (d) Fixed
MCQ 8.6
The parameters of the binomial distribution are:
(a) n and p (b) p and q (c) np and nq (d) np and npq
MCQ 8.7
The range of binomial distribution is:
(a) 0 to n (b) 0 to ∞ (c) -1 to +1 (d) 0 to 1
MCQ 8.8
The mean and standard deviation of the binomial probability distribution are respectively:
(a) np and npq (b) np and √(npq) (c) np and nq (d) n and p
MCQ 8.9
In a binomial experiment with three trials, the variable can take:
(a) 2 values (b) 3 values (c) 4 values (d) 5 values
MCQ 8.10
The shape of the binomial probability distribution depends upon the values of its:
(a) Mean (b) Variance (c) Parameters (d) Quartiles
MCQ 8.11
In binomial distribution the numbers of trials are:
(a) Very large (b) Very small (c) Fixed (d) Not fixed
MCQ 8.12
In a binomial probability distribution, relation between mean and variance is:
(a) Mean < Variance (b) Mean = Variance
(c) Mean > Variance (d) Difficult to tell
MCQ 8.13
In binomial distribution when n = 1, then it becomes:
(a) Hypergeometric distribution (b) Normal distribution
(c) Uniform distribution (d) Bernoulli distribution
MCQ 8.14
The mean of a binomial distribution depends on:
(a) Number of trials (b) Probability of success
(c) Probability of failure (d) Number of trials and probability of success
MCQ 8.15
The variance of a binomial distribution depends on:
(a) Number of trials (b) Probability of success
(c) Probability of failure (d) All of the above
MCQ 8.16
Which of the following is not a property of a binomial experiment?
(a) Probability of success remains constant
(b) n is fixed
(c) Successive trials are dependent
(d) It has two parameters
MCQ 8.17
The binomial probability distribution is symmetrical when:
(a) p = q (b) p < q (c) p > q (d) np > npq
MCQ 8.18
The binomial distribution is negatively skewed if:
(a) p < 1/2 (b) p = 1/2 (c) p > 1/2 (d) p = 1
MCQ 8.19
In a binomial probability distribution, the skewness is positive for:
(a) p < 1/2 (b) p = 1/4 (c) np = npq (d) np = nq
MCQ 8.20
Which of the following statements is false?
(a) Expected value of a constant
(b) In a binomial distribution the standard deviation is always less than its variance
(c) In a binomial distribution the mean is always greater than its variance
(d) In binomial experiment the probability of success remains constant from trial to trial
MCQ 8.21
If a binomial probability distribution has parameters (n, p)= (5, 0.6), the probability of x = 3.5 is:
(a) 0 (b) 1 (c) 0.6 (d) 0.4
MCQ 8.22
In a binomial experiment n= 4, P(x=2) = 216/625 and P(x=3) = 216/625. P(x=-2) is:
(a) 216/625 (b) 1 (c) 0.6 (d) Difficult to tell
MCQ 8.23
If n = 6 and p= 0.9 then the value of P(x=7) is:
(a) Zero (b) Less than zero (c) More than zero (d) One
MCQ 8.24
In a binomial probability distribution, coefficient of skewness = 0, it means that the
distribution is:
(a) Symmetrical (b) Skewed to the left (c) Skewed to the right (d) Highly skewed
MCQ 8.25
For a binomial distribution with n = 10, p = 0.5, the probability of zero or more successes is:
(a) 1 (b) 0.5 (c) 0.25 (d) 0.75
MCQ 8.26
In a binomial distribution, the mean, median and mode coincide when:
(a) p < 1/2 (b) p > 1/2 (c) p ≠ 1/2 (d) p = 1/2
MCQ 8.27
In which distribution does the probability of success remain constant from trial to trial?
(a) Hypergeometric distribution (b) Binomial distribution
(c) Sampling distribution (d) Frequency distribution
MCQ 8.28
In a binomial experiment when n = 5, the maximum number of successes will be:
(a) 0 (b) 2.5 (c) 4 (d) 5
MCQ 8.29
In a binomial experiment when n = 10, the minimum number of successes will be:
(a) 0 (b) 5 (c) 10 (d) 11
MCQ 8.30
If n = 10 and p = 0.6, then P(x ≥ 0) is:
(a) 0.5 (b) 0.6 (c) 1.0 (d) 1.2
MCQ 8.31
A random variable X has a binomial distribution with n = 4, the standard deviation of X is:
(a) 4 pq (b) 2 (c) 4 p (d) 4 (q+p)
MCQ 8.32
In a multiple choice test there are five possible answers to each of 20 questions. If a candidate
guesses the correct answer each time, the mean number of correct answers is:
(a) 4 (b) 5 (c) 1/5 (d) 20
MCQ 8.33
If three coins are tossed, the probability of two heads is:
(a) 1/8 (b) 3/8 (c) 2/3 (d) 0
MCQ 8.34
Random variable X has a binomial distribution with n = 8 and p = 1/2. The most probable value of X is:
(a) 2 (b) 3 (c) 4 (d) 5
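Several of the questions above (for example MCQ 8.21, 8.23, 8.33 and 8.34) can be checked directly from the binomial probability function P(X = x) = C(n, x) p^x q^(n-x), which is zero for any x that is not a whole number between 0 and n. A minimal sketch using exact fractions; binomial_pmf is an illustrative helper.

```python
import math
from fractions import Fraction

def binomial_pmf(n, p, x):
    """P(X = x) for X ~ Binomial(n, p); zero outside 0, 1, ..., n."""
    if x != int(x) or x < 0 or x > n:
        return Fraction(0)
    x = int(x)
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

half = Fraction(1, 2)

p_two_heads = binomial_pmf(3, half, 2)                 # three coins, two heads
p_fractional = binomial_pmf(5, Fraction(3, 5), 3.5)    # x is not a whole number
p_too_many = binomial_pmf(6, Fraction(9, 10), 7)       # x exceeds n
most_probable = max(range(9), key=lambda x: binomial_pmf(8, half, x))
total = sum(binomial_pmf(10, Fraction(3, 5), x) for x in range(11))

print(p_two_heads, p_fractional, p_too_many, most_probable, total)
```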
MCQ 8.35
The value of second moment about the mean in a binomial distribution is 36. The value of the
standard deviation of a binomial distribution is:
(a) 36 (b) 6 (c) 1/36 (d) 1/6
MCQ 8.36
For a binomial probability distribution, the expected frequency of x successes in N experiments is:
MCQ 8.37
In a binomial frequency distribution 100 (1/5 + 4/5)^5, the parameters n and p are respectively:
(a) (5, 1/5) (b) (1/5, 4/5) (c) (100, 4/5) (d) (5, 4/5)
MCQ 8.38
For a binomial frequency distribution 100 (1/5 + 4/5)^5, the mean is:
(a) 1/5 (b) 4/5 (c) 5 (d) 4
MCQ 8.39
For a binomial distribution (1/3 + 2/3)18, the standard deviation of the binomial distribution will
be:
(a) 2 (b) 4 (c) 6 (d) 12
MCQ 8.40
The hypergeometric distribution has:
(a) One parameter (b) Two parameters (c) Three parameters (d) Four parameters
MCQ 8.41
The parameters of the hypergeometric distribution are:
(a) N, n, p (b) N, n, np (c) N, n, k (d) n and p
MCQ 8.42
Nature of the Hypergeometric random variable is:
(a) Continuous (b) Discrete (c) Qualitative (d) Quantitative
MCQ 8.43
In hypergeometric distribution, the successive trials are:
(a) Independent (b) Dependent (c) Very large (d) Very small
MCQ 8.44
In a hypergeometric distribution, the probability of success:
(a) Remains constant from trial to trial
(b) Does not remain constant from trial to trial
(c) Equal to probability of failure
(d) Less than probability of failure
MCQ 8.45
If in a hypergeometric distribution N = 10, k = 5 and n = 4; then the probability of failure is:
(a) 2 (b) 0.5 (c) 1 (d) 0.25
MCQ 8.46
The range of hypergeometric distribution is:
(a) 0 to n (b) 0 to k (c) 0 to N (d) 0 to n or k (whichever is less)
MCQ 8.47
The number of trials in hypergeometric distribution is:
(a) Not fixed (b) Fixed (c) Large (d) Small
MCQ 8.48
The probability of a success changes from trial to trial in:
(a) Binomial distribution (b) Hypergeometric distribution
(c) Normal distribution (d) Frequency distribution
MCQ 8.49
The mean of the hypergeometric distribution is:
MCQ 8.50
The standard deviation of the hypergeometric distribution is:
MCQ 8.51
In hypergeometric probability distribution, the relation between mean and variance is:
(a) Mean > variance (b) Mean < Variance (c) Mean = Variance (d) Mean = 2Variance
MCQ 8.52
Which of the following is the property of hypergeometric experiment?
(a) p remains constant from trial to trial
(b) Successive trials are independent
(c) Sampling is performed without replacement
(d) n is not fixed
MCQ 8.53
Hypergeometric distribution reduces to binomial distribution when:
(a) N = n (b) n → ∞ (c) N → ∞ (d) N < n
MCQ 8.54
In a hypergeometric distribution N=6, n=4 and k=3, then the mean is equal to:
(a) 2 (b) 4 (c) 6 (d) 24
MCQ 8.55
Given N = 11, n = 5, k = 7; P(x ≥ 1) equals:
(a) 1 (b) 1/66 (c) 65/66 (d) None of the above
MCQ 8.56
Given N =12, n =5, k= 4; P(x ≤ 4) equals:
(a) Less than one (b) Exactly one (c) More than one (d) Between 0.5 and 1
1.(c) 2.(a) 3.(c) 4.(d) 5.(b) 6.(a) 7.(a) 8.(b) 9.(c) 10.(c) 11.(c) 12.(c) 13.(d) 14.(d) 15.(d)
16.(c) 17.(a) 18.(c) 19.(a) 20.(b) 21.(a) 22.(c) 23.(a) 24.(a) 25.(a) 26.(d) 27.(b) 28.(d) 29.(a) 30.(c)
31.(b) 32.(a) 33.(b) 34.(c) 35.(b) 36.(c) 37.(d) 38.(d) 39.(a) 40.(c) 41.(c) 42.(b) 43.(b) 44.(b) 45.(b)
46.(d) 47.(b) 48.(b) 49.(a) 50.(b) 51.(a) 52.(c) 53.(c) 54.(a) 55.(a) 56.(b)
MCQ NORMAL DISTRIBUTION
MCQ 10.1
The range of normal distribution is:
(a) 0 to n (b) 0 to ∞ (c) -1 to +1 (d) -∞ to +∞
MCQ 10.2
In normal distribution:
(a) Mean = Median = Mode (b) Mean < Median < Mode
(c) Mean> Median > Mode (d) Mean ≠ Median ≠ Mode
MCQ 10.3
Which of the following is true for the normal curve:
(a) Symmetrical (b) Unimodal (c) Bell-shaped (d) All of the above
MCQ 10.4
In a normal curve, the ordinate is highest at:
(a) Mean (b) Variance (c) Standard deviation (d) Q1
MCQ 10.5
The parameters of the normal distribution are:
(a) µ and σ² (b) µ and σ (c) np and nq (d) n and p
MCQ 10.6
The shape of the normal curve depends upon the value of:
(a) Standard deviation (b) Q1 (c) Mean deviation (d) Quartile deviation
MCQ 10.7
The normal distribution is a proper probability distribution of a continuous random variable, the total area
under the curve f(x) is:
(a) Equal to one (b) Less than one (c) More than one (d) Between -1 and +1
MCQ 10.8
In a normal probability distribution of a continuous random variable, the value of standard deviation is:
(a) Zero (b) Less than zero (c) Greater than zero (d) None of the above
MCQ 10.9
In a normal curve, the highest point on the curve occurs at the mean, µ, which is also the:
(a) Median and mode (b) Geometric mean and harmonic mean
(c) Lower and upper quartiles (d) Variance and standard deviation
MCQ 10.10
The normal curve is symmetrical and for symmetrical distribution, the values of all odd order moments
about mean will always be:
(a) 1 (b) 0.5 (c) 0.25 (d) 0
MCQ 10.11
If , the points of inflection of normal distribution are:
(a) (b) (c) (d)
MCQ 10.12
In normal probability distribution for a continuous random variable, the value of a mean deviation is
approximately equal to:
(a) 2/3 (b) 2/3 σ (c) 4/5 (d) 4/5 σ
MCQ 10.13
In a normal distribution whose mean is µ and standard deviation σ, the value of quartile deviation is
approximately:
(a) 4/5 (b) 4/5 σ (c) 2/3 σ (d) 2/3
MCQ 10.14
In a normal distribution, the lower and upper quartiles are equidistant from the mean and are at a distance of:
(a) 0.7979 (b) 0.7979 σ (c) 0.6745 (d) 0.6745 σ
MCQ 10.15
The value of e is approximately equal to:
(a) 2.7183 (b) 2.1783 (c) 2.8173 (d) 2.1416
MCQ 10.16
The value of π is approximately equal to:
(a) 3.4116 (b) 3.1416 (c) 3.1614 (d) 3.6416
MCQ 10.17
If , the standard normal variate is distributed as:
(a) (b) (c) (d)
MCQ 10.18
The coefficient of skewness of a normal distribution is:
(a) Positive (b) Negative (c) Zero (d) Three
MCQ 10.19
The total area of the normal probability density function is equal to:
(a) 0 (b) 0.5 (c) 1 (d) 0.25
MCQ 10.20
In a standard normal distribution, the value of mode is:
(a) Equal to zero (b) Less than zero (c) Greater than zero (d) Exactly one
MCQ 10.21
The normal probability density function curve is symmetrical about the mean, µ, i.e. the area to the right of
the mean is the same as the area to the left of the mean. This means that P(X<µ) =P(X>µ) is equal to:
(a) 0 (b) 1 (c) 0.5 (d) 0.25
MCQ 10.22
The skewness and kurtosis of the normal distribution are respectively:
(a) Zero and zero (b) Zero and one (c) One and zero (d) One and one
MCQ 10.23
In a normal curve µ ± 0.6745σ covers:
(a) 50% area (b) 68.27% area (c) 95.45% area (d) 99.73% area
MCQ 10.24
The lower and upper quartiles for a standardized normal variate are respectively:
(a) -0.6745σ and 0.6745σ (b) -0.6745 σ and 0.6745
(c) 0.7979σ and 0.7979σ (d) -0.7979 and 0.7979
MCQ 10.25
The maximum ordinate of a normal curve is at:
(a) X = µ (b) X = µ + σ (c) X = µ - 2σ (d) X = σ²
MCQ 10.26
The value of the standard deviation σ of a normal distribution is always:
(a) Equal to zero (b) Greater than zero (c) Less than zero (d) Equal to 0.5
MCQ 10.27
If X~N(100, 64), then standard deviation σ is:
(a) 100 (b) 64 (c) 8 (d) 100 - 64 = 36
MCQ 10.28
If , the coefficient of variation is equal to:
(a) Zero (b) One (c) Infinity (d) Hundred percent
MCQ 10.29
The points of inflection of the standard normal distribution lie at:
(a) -1 and 0 (b) 0 and 1 (c) -1 and +1 (d) µ and σ
MCQ 10.30
If , then µ4 is equal to:
(a) 0 (b) 1 (c) 3 (d) σ⁴
MCQ 10.31
The value of second moment about the mean in a normal distribution is 5. The fourth moment about
the mean in the distribution is:
(a) 5 (b) 15 (c) 25 (d) 75
MCQ 10.32
If X is a normal random variable having mean µ, then E|X - µ| is equal to:
(a) Variance (b) Standard deviation (c) Quartile deviation (d) Mean deviation
MCQ 10.33
If X is a normal random variable having mean µ, then E(X - µ)² is equal to:
(a) σ² (b) σ (c) 3σ⁴ (d) β1
MCQ 10.34
Which of the following is possible in normal distribution?
(a) σ < 0 (b) σ = 0 (c) σ > 0 (d) σ > n
MCQ 10.35
The range of standard normal distribution is:
(a) 0 to n (b) 0 to ∞ (c) 0 to k (d) -∞ to +∞
MCQ 10.36
In the normal distribution, the value of the maximum ordinate is equal to:
MCQ 10.37
The value of the ordinate at points of inflection of the normal curve is equal to:
MCQ 10.38
If , then β2 is equal to:
(a) 0 (b) 3 (c) 3σ⁴ (d) σ²
MCQ 10.39
Pearson’s constants for a normal distribution with mean µ and variance σ² are:
(a) β1=0, β2=0, γ1=0, γ2=0 (b) β1=0, β2=1, γ1=1, γ2=3
(c) β1=0, β2=3, γ1=0, γ2=0 (d) β1=3, β2=0, γ1=0, γ2=0
MCQ 10.40
The value of maximum ordinate in standard normal distribution is equal to:
MCQ 10.41
A random variable X is normally distributed with µ = 70 and σ² = 25. The third moment about arithmetic
mean is:
(a) Zero (b) Less than zero (c) Greater than zero (d) None of the above
MCQ 10.42
For the standard normal distribution, P(Z > mean) is:
(a) More than 0.5 (b) Less than 0.5 (c) Equal to 0.5 (d) Difficult to tell
MCQ 10.43
Given a standardized normal distribution (with a mean of zero and a standard deviation of one),
P(Z < variance) is equal to:
(a) 0.8413 (b) 0.3413 (c) 0.1587 (d) 0.5000
MCQ 10.44
The area to the left of (µ+σ) for a normal distribution is approximately equal to:
(a) 0.16 (b) 0.34 (c) 0.50 (d) 0.84
MCQ 10.45
The median of a normal distribution corresponds to a value of Z is:
(a) 0 (b) 1 (c) 0.5 (d) -0.5
MCQ 10.46
The mean and standard deviation of the standard normal distribution are respectively:
(a) 0 and 1 (b) 1 and 0 (c) µ and σ² (d) π and e
MCQ 10.47
In a standard normal distribution, the area to the left of Z = 1 is:
(a) 0.6413 (b) 0.7413 (c) 0.8413 (d) 0.3413
MCQ 10.48
The semi-inter quartile range for a standard normal random variable Z is:
(a) 0.6745 (b) 0.6745 σ (c) 0.7979 (d) 0.7979 σ
MCQ 10.49
If , then µ4 is equal to:
(a) 3 (b) 3σ (c) 3σ² (d) 3σ⁴
MCQ 10.50
If , then β2 is equal to:
(a) 0 (b) 3 (c) 3σ⁴ (d) σ⁴/3
MCQ 10.51
P(µ-σ < X < µ+σ) is equal to:
(a) 0.5000 (b) 0.6827 (c) 0.9545 (d) 0.9973
MCQ 10.52
In a normal curve µ ± 2σ covers:
(a) 50% area (b) 68.27% area (c) 95.45% area (d) 99.73% area
MCQ 10.53
If X is N(µ, σ²), the percentage of the area contained within the limits µ ± 3σ is:
(a) 50% (b) 68.27% (c) 95.45% (d) 99.73%
MCQ 10.54
Most of the area under the normal curve with parameters µ and σ lies between:
(a) µ - 0.5σ and µ + 0.5σ (b) µ - σ and µ + σ
(c) µ - 2σ and µ + 2σ (d) µ - 3σ and µ + 3σ
MCQ 10.55
The probability density function of the standard normal distribution is:
MCQ 10.56
The equation of the normal frequency distribution is:
MCQ 10.57
If X is N(µ, σ²) and if Y = a + bX, then mean and variance of Y are respectively:
(a) µ and σ² (b) a + µ and bσ² (c) a + bµ and σ² (d) a + bµ and b²σ²
MCQ 10.58
For a normal distribution with mean µ and standard deviation σ:
(a) Approximately 5% of values are outside the range (µ - 2σ) to (µ + 2σ)
(b) Approximately 5% of values are greater than (µ + 2σ)
(c) Approximately 5% of values are outside the range (µ - σ) to (µ + σ)
(d) Approximately 5% of values are less than (µ - 3σ)
MCQ 10.59
The normal probability distribution with mean np and variance npq may be used to approximate the
binomial distribution if n ≥ 50 and both np and nq are:
(a) Greater than 5 (b) Less than 5 (c) Equal to 5 (d) Difficult to tell
MCQ 10.60
In a normal distribution Q1 = 20 and Q3 = 40, then mean is equal to:
(a) 20 (b) 30 (c) 40 (d) 60
MCQ 10.61
If Z is a standard normal variate, then P(-1.645 ≤ Z ≤ +1.645) is equal to:
(a) 0.90 (b) 0.95 (c) 0.98 (d) 0.99
MCQ 10.62
If Z is a standard normal variate, then P(-2.33 ≤ Z ≤ +2.33) is equal to:
(a) 0.4901 (b) 0.6827 (c) 0.9545 (d) 0.9802
MCQ 10.63
If Z is a standard normal variate, then P(- 2.575 ≤ Z ≤ +2.575) is equal to:
(a) 0.9951 (b) 0.99 (c) 0.4951 (d) 0.4949
MCQ 10.64
If Z is a standard normal variate, then P[|Z| < 1.96] is equal to:
(a) 0.0250 (b) 0.4750 (c) 0.95 (d) 0.9750
MCQ 10.65
For a normal distribution with µ = 10, σ = 2, the probability of a value greater than 10 is:
(a) 0.1915 (b) 0.3085 (c) 0.6915 (d) 0.5000
MCQ 10.66
Given a random variable X which is normally distributed with a mean and variance both equal to 100.
The value of mean deviation is approximately equal to:
(a) 7 (b) 8 (c) 8.5 (d) 9
MCQ 10.67
If X is a normal variate with mean 50 and standard deviation 3. The value of quartile
deviation is approximately equal to:
(a) 1 (b) 1.5 (c) 2 (d) 2.5
MCQ 10.68
In a normal distribution mean is 100 and standard deviation is 10. The values of points of inflection
are:
(a) 100 and 110 (b) 80 and 120 (c) 90 and 110 (d) None of the above
MCQ 10.69
If X is a normal variate with mean 20 and variance 16. The respective values of β1 and β2 are:
(a) 0 and 3 (b) 3 and 1 (c) 0.5 and 1 (d) 3 and 3
MCQ 10.70
If X is N(100; 5), the fourth central moment is:
(a) 65 (b) 75 (c) 85 (d) 100
MCQ 10.71
A normal distribution has the mean µ=200. If 70 percent of the area under the curve lies to the left
of 220, the area to the right of 220 is:
(a) 0.3 (b) 0.5 (c) 0.2 (d) 0.7
MCQ 10.72
Given a normal distribution with µ = 100 and σ² = 100, the area to the left of 100 is:
(a) One (b) Equal to 0.5 (c) Less than 0.5 (d) Greater than 0.5
MCQ 10.73
If a normal distribution with µ = 200 has P(X > 225) = 0.1587, then P(X < 175) is equal to:
(a) 0.3413 (b) 0.8413 (c) 0.1587 (d) 0.5000
MCQ 10.74
A random variable has a normal distribution with the mean µ = 400. If 80 percent of the area under
the curve lies to the left of 500, the area between 400 and 500 is:
(a) 0.5 (b) 0.2 (c) 0.3 (d) Zero
MCQ 10.75
If Y = 5X+ 10 and X is N(10, 25), then mean of Y is:
(a) 50 (b) 60 (c) 70 (d) 135
MCQ 10.76
If X is a normal random variable with mean µ = 50 and standard deviation σ = 7, and if Y = X – 7, then the standard
deviation of Y is:
(a) 7 (b) 14 (c) 0 (d) 49
Introduction
Statistics: the collection of data, presentation of data, analysis of data and
interpretation of data, for example for research projects.
Types of Statistics
Descriptive statistics
If a business analyst is using data gathered on a
group to describe or reach conclusions about that
same group, the statistics are called descriptive
statistics.
Inferential statistics
If a researcher gathers data from a sample and uses the
statistics generated to reach conclusions about the
population from which the sample was taken, the
statistics are inferential statistics.
Population:
The collection of all individuals, items or data under consideration
in a statistical study.
Sample:
That part of the population from which information is collected.
Parameter: A numerical characteristic calculated from the population.
Statistic: A numerical characteristic calculated from a sample.
Variable:
A characteristic that varies with an individual or an object, is called
a variable.
For example, age is a variable as it varies from person to
person. A variable can assume a number of values. The given
set of all possible values from which the variable takes on a
value is called its Domain. If for a given problem, the domain
of a variable contains only one value, then the variable is
referred to as a constant.
Qualitative variable.
If the characteristic is non-numerical such as education, sex,
eye-color, quality, intelligence, poverty, satisfaction, etc. the
variable is referred to as a qualitative variable. A qualitative
characteristic is also called an attribute.
Quantitative variable
A variable is called a quantitative variable when a
characteristic can be expressed numerically such as age,
weight, income or number of children.
Secondary Data
Secondary data are data obtained from a source that originally collected them. This means that the data
have already been collected by other researchers or investigators in the past and are available in
published or unpublished form. Such data are not raw, since statistical operations may already have been
performed on them. Examples include information available on the Government of Pakistan's or the
Department of Finance’s website, or in other repositories, books, journals, etc.
Class Limit
Corresponding to a class interval, the class limits may be defined as the minimum value and the maximum
value the class interval may contain.
The minimum value is known as the lower class limit (LCL) and the maximum value is known as the upper
class limit (UCL).
Class Boundary
Class boundaries may be defined as the actual class limit of a class interval.
For overlapping classification or mutually exclusive classification that excludes the upper class limits like
10–20, 20–30, 30–40 … etc. the class boundaries coincide with the class limits.
This is usually done for a continuous variable. However, for non-overlapping or mutually inclusive
classification that includes both the class limits like 0–9, 10–19, 20–29 … which is usually applicable for a
discrete variable, we have
LCB = LCL − D/2
UCB = UCL + D/2
where LCB and UCB are the lower and upper class boundaries, and D is the difference between the LCL of
the next class interval and the UCL of the given class interval.
For the data presented in the above table, LCB of the first class interval and the corresponding UCB
Apart from the class limit and the class boundary, let us look at the midpoint of a class interval.
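The rule LCB = LCL − D/2, UCB = UCL + D/2 can be sketched in a few lines of Python. This is only an illustration: the function name `class_boundaries` is an assumption made here, the classes 0–9, 10–19, 20–29 are the ones mentioned in the notes, the midpoint is taken as the average of the two boundaries, and at least two classes are assumed so that D can be measured.

```python
def class_boundaries(classes):
    """Return (LCB, UCB, midpoint) for each (LCL, UCL) pair.

    Assumes at least two classes, listed in increasing order.
    """
    result = []
    for i, (lcl, ucl) in enumerate(classes):
        # D = LCL of the next class minus UCL of this class;
        # for the last class, reuse the gap to the previous class.
        if i + 1 < len(classes):
            d = classes[i + 1][0] - ucl
        else:
            d = lcl - classes[i - 1][1]
        lcb = lcl - d / 2
        ucb = ucl + d / 2
        result.append((lcb, ucb, (lcb + ucb) / 2))
    return result


classes = [(0, 9), (10, 19), (20, 29)]
for lcb, ucb, mid in class_boundaries(classes):
    print(f"{lcb} to {ucb}, midpoint {mid}")
```

For the inclusive classes 0–9, 10–19, 20–29 the gap D is 1, so each limit moves by 0.5.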
Example 1
Tally marks are often used to make a frequency distribution table. For example, let’s say you survey a
number of households and find out how many pets they own. The results are 3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0,
1, 3, 1, 2, 1, 1, and 3. Looking at that string of numbers boggles the eye; a frequency distribution table will
make the data easier to understand.
To make the frequency distribution table, first write the categories in one column (number of pets):
Next, tally the numbers in each category (from the results above). For example, the number zero appears
four times in the list, so put four tally marks “||||”:
Finally, count up the tally marks and write the frequency in the final column. The frequency is just the
total. You have four tally marks for “0”, so put 4 in the last column:
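The tallying in Example 1 can also be checked mechanically. The sketch below uses Python's standard `collections.Counter` on the pet counts listed above; the variable names and the printed layout are choices made for this illustration only.

```python
from collections import Counter

# Number of pets per household, as listed in Example 1.
pets = [3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3]

# Counter maps each distinct value to how often it occurs.
freq = Counter(pets)

print("Pets  Tally       Frequency")
for value in sorted(freq):
    tally = "|" * freq[value]
    print(f"{value:<5} {tally:<11} {freq[value]}")
print("Total", sum(freq.values()))
```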
Example 2
The list of IQ scores is: 118, 123, 124, 125, 127, 128, 129, 130, 130, 133, 136, 138, 141, 142, 149, 150,
and 154. The class interval is 8.
Tally the numbers in each category from the above. For example, four numbers lie in the class 118-125, so
put four tally marks “||||”:
IQ TALLY NUMBER
118-125 ||||
126-133 |||| |
134-141 |||
142-149 ||
150-157 ||
Finally, count up the tally marks and write the frequency in the final column.
IQ TALLY NUMBER
118-125 |||| 4
126-133 |||| | 6
134-141 ||| 3
142-149 || 2
150-157 || 2
Presentation of Data
Relative frequency
If the frequency of a class is divided by the sum of frequencies, we get
what is called a relative frequency. If we calculate the relative
frequencies for all the classes, we get the relative frequency
distribution. The total of the relative frequencies is equal to 1.
RELATIVE FREQUENCY TABLE
Weights (kilograms)   Frequency   Relative frequency
55-57                 4           4/40 = 0.10
58-60                 6           6/40 = 0.15
61-63                 14          14/40 = 0.35
64-66                 12          12/40 = 0.30
67-69                 4           4/40 = 0.10
Total                 40          1.00
The relative frequencies are also called proportions and in
discussion on probability we shall call them probabilities of the
classes. The idea of relative frequencies is helpful in understanding
the basic lessons on probability. It is also used in the normal
distribution and other probability distributions where the total area
under the curve is unity.
Percentage relative frequency distribution
If a relative frequency is multiplied by 100, we get
percentage relative frequency. If all the relative frequencies
are converted into percentage relative frequencies, we get
percentage relative frequency distribution or simply
percentage frequency distribution.
PERCENTAGE RELATIVE FREQUENCY TABLE
Weights (kilograms)   Frequency   Percentage relative frequency
55-57                 4           4/40 x 100 = 10
58-60                 6           6/40 x 100 = 15
61-63                 14          14/40 x 100 = 35
64-66                 12          12/40 x 100 = 30
67-69                 4           4/40 x 100 = 10
Total                 40          100
Cumulative frequency distribution
For cumulative frequency distribution, the class limits are
converted into class boundaries. Cumulative frequency of a class
is the total of all frequencies up to that class.
'Less than' cumulative frequencies
The cumulative frequency of the class 57.5-60.5 is 4+6=10 and the
cumulative frequency of the class 60.5-63.5 is 4+6+14=24. This
means that there are 10 observations less than 60.5 and there are 24
observations less than 63.5. These are called 'less than' cumulative
frequencies.
CUMULATIVE FREQUENCY TABLE (Less Than)
Weights (kilograms)   Frequency   Class Boundaries   Weights Less Than   Cumulative frequency
55-57                 4           54.5-57.5          less than 57.5      4
58-60                 6           57.5-60.5          less than 60.5      10
61-63                 14          60.5-63.5          less than 63.5      24
64-66                 12          63.5-66.5          less than 66.5      36
67-69                 4           66.5-69.5          less than 69.5      40
'More than' cumulative frequencies
If we calculate the cumulative frequencies from the bottom, we get what are called 'more than' cumulative
frequencies. Thus there are 4 observations more than 66.5, there are 4+12=16 observations more than 63.5
and there are 4+12+14=30 observations more than 60.5.
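The relative, percentage and cumulative frequencies described above can be reproduced with a short Python sketch. It uses the weight classes and frequencies from the tables in these notes; the variable names are chosen here for illustration.

```python
from itertools import accumulate

classes = ["55-57", "58-60", "61-63", "64-66", "67-69"]
freq = [4, 6, 14, 12, 4]
total = sum(freq)

# Relative frequency = class frequency / total frequency.
relative = [f / total for f in freq]
# Percentage relative frequency = relative frequency x 100.
percent = [100 * r for r in relative]
# 'Less than' cumulative frequencies: running totals from the top.
less_than = list(accumulate(freq))
# 'More than' cumulative frequencies: running totals from the bottom.
more_than = list(accumulate(freq[::-1]))[::-1]

for row in zip(classes, freq, relative, percent, less_than, more_than):
    print(row)
```

Each 'less than' value refers to the upper class boundary of its class, and each 'more than' value refers to the lower class boundary.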
Histogram
Histogram is a graph of the frequency distribution in which classes with class boundaries are taken on X-
axis with a suitable breadth of class and adjacent bars are erected to show the frequencies. The height of the
bars is in proportion to the size of the frequency. For uniform intervals, we take a suitable breadth for
classes.
For unequal intervals we have to adjust the frequency. If the interval becomes double, then frequency is
divided by 2 so that the area of the bar is in proportion to the areas of other bars. Histogram is a very simple
and very important graph of the frequency distribution. This graph makes the base for other graphs. If we
take the frequencies on Y-axis, we get frequency histogram, the total area of which is equal to the total
frequency. If we take relative frequencies on the Y-axis, the total area of the histogram is unity, if we take
the percentage frequencies on Y-axis, we get percentage frequency histogram, and the total area of the
histogram will be 100.
[Figure: Histogram of the weights. X-axis: Weights (class boundaries up to 69.5); Y-axis: Frequency (0 to 16).]
Frequency polygon
Frequency polygon is a graph of the frequency distribution in which the frequencies are plotted against the
midpoints of the classes. The plotted points are joined together to get the frequency polygon.
Midpoints (xi)   Frequency (f)
74.5             9
94.5             10
114.5            17
134.5            10
154.5            5
174.5            4
194.5            5
[Figure: Frequency polygon. X-axis: midpoints (74.5 to 194.5); Y-axis: frequency (0 to 20).]
FREQUENCY CURVE
In a frequency curve the points are not joined together by straight lines. Instead, a smooth curve is drawn
free-hand through the points, and we get the frequency curve as shown in fig.2.10. We can draw the frequency
curve on the frequency polygon or we can draw the curve on a separate sheet of paper.
Midpoints (xi)   Frequency (f)
74.5             9
94.5             10
114.5            17
134.5            10
154.5            5
174.5            4
194.5            5
[Figure: Frequency curve. X-axis: midpoints (74.5 to 194.5); Y-axis: frequency (0 to 20).]
$\bar{x} = \frac{\sum fx}{\sum f} = \frac{7350}{60} = 122.5$
Mode
Mode is an appropriate average in the case of qualitative data. For example, when someone speaks of the
opinion of an average person, they are probably referring to the most
frequently expressed opinion, which is the modal opinion.
Mode in case of Ungrouped Data:
“A value that occurs most frequently in a data set is called the mode.”
xi: 2, 3, 8, 4, 6, 3, 2, 5, 3.
Mode = 3(Answer).
Mode in case of Grouped Data:
“A value which has the largest frequency in a set of data is called the mode.”
$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
l = lower class boundary of the modal class
h = class interval
f_m = frequency of the modal class
f_1 = frequency of the class preceding the modal class
f_2 = frequency of the class following the modal class
Class boundaries   Midpoints xi   Frequency fi   Cumulative frequency
29.5---39.5        34.5           8              8
39.5---49.5        44.5           87             95
49.5---59.5        54.5           190            285
59.5---69.5        64.5           304            589
69.5---79.5        74.5           211            800
79.5---89.5        84.5           85             885
89.5---99.5        94.5           20             905
TOTAL                             905
$\text{Mode} = 59.5 + \frac{304 - 190}{(304 - 190) + (304 - 211)} \times 10 = 59.5 + \frac{114}{207} \times 10 \approx 65.01$
Median
Median: “When the observations are arranged in ascending or descending order, the value that divides the
distribution into two equal parts is called the median.”
Ungrouped data:
$\text{Median} = \left(\frac{n+1}{2}\right)$th term
If the position is fractional, the same rule applies by interpolating between the neighbouring terms. For example, for position 5.5:
Median = 5th term + 0.5 (6th term − 5th term)
Grouped data:
$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - c\right)$
n/2 = position of the median term
l = lower class boundary of the median class
h = class interval
f = frequency of the median class
c = cumulative frequency of the class preceding the median class
$\text{Median} = 59.5 + \frac{10}{304}\left(\frac{905}{2} - 285\right) \approx 65$ (Answer)
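The grouped-data formulas for the mode and the median can be checked with a small Python sketch. It uses the class boundaries 29.5 to 99.5 and the frequencies 8, 87, 190, 304, 211, 85, 20 from the frequency table in these notes; the function names `grouped_mode` and `grouped_median` are assumptions made for this illustration, and a constant class interval h = 10 is assumed.

```python
lower = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]  # lower class boundaries
freq = [8, 87, 190, 304, 211, 85, 20]
h = 10  # class interval, assumed equal for all classes


def grouped_mode(lower, freq, h):
    """Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h."""
    m = freq.index(max(freq))                      # modal class
    fm = freq[m]
    f1 = freq[m - 1] if m > 0 else 0               # preceding class
    f2 = freq[m + 1] if m + 1 < len(freq) else 0   # following class
    return lower[m] + (fm - f1) / ((fm - f1) + (fm - f2)) * h


def grouped_median(lower, freq, h):
    """Median = l + h/f * (n/2 - c)."""
    n = sum(freq)
    c = 0  # cumulative frequency before the current class
    for l, f in zip(lower, freq):
        if c + f >= n / 2:   # first class whose cumulative frequency reaches n/2
            return l + h / f * (n / 2 - c)
        c += f


print(round(grouped_mode(lower, freq, h), 2))
print(round(grouped_median(lower, freq, h), 2))
```

Both values fall in the class 59.5---69.5, which has the largest frequency (304) and also contains the 452.5th observation.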
Quartiles
Q1, Q2, Q3
Divide ranked scores into four equal parts.
Ungrouped Data
$Q_j = \frac{j(n+1)}{4}$th observation, j = 1, 2, 3
Grouped Data
$Q_j = l + \frac{h}{f}\left(\frac{jn}{4} - c\right)$
l = lower boundary of the class containing Q_j (for example Q2 or Q3), i.e. the class corresponding to the
cumulative frequency in which jn/4 (2n/4 or 3n/4) lies
h = class interval size of the class containing Q_j
f = frequency of the class containing Q_j
n = number of values, or the total frequency
c = cumulative frequency of the class preceding the class containing Q_j
Deciles
D1, D2, D3, D4, D5, D6, D7, D8, D9
Divide ranked data into ten equal parts.
Ungrouped Data
$D_j = \frac{j(n+1)}{10}$th observation, j = 1, 2, …, 9
Grouped Data
$D_j = l + \frac{h}{f}\left(\frac{jn}{10} - c\right)$
l = lower boundary of the class containing D_j (for example D2 or D9), i.e. the class corresponding to the
cumulative frequency in which jn/10 (2n/10 or 9n/10) lies
h = class interval size of the class containing D_j
f = frequency of the class containing D_j
n = number of values, or the total frequency
c = cumulative frequency of the class preceding the class containing D_j
Class boundaries Midpoints xi Frequency fi
29.5---39.5 34.5 8
39.5---49.5 44.5 87
49.5---59.5 54.5 190
59.5---69.5 64.5 304
69.5---79.5 74.5 211
79.5---89.5 84.5 85
89.5---99.5 94.5 20
TOTAL 905
Percentiles
P1, P2, P3, …, P99
Divide ranked data into hundred equal parts.
Ungrouped Data
$P_j = \frac{j(n+1)}{100}$th observation, j = 1, 2, …, 99
Grouped Data
$P_j = l + \frac{h}{f}\left(\frac{jn}{100} - c\right)$
l = lower boundary of the class containing P_j (for example P2 or P99), i.e. the class corresponding to the
cumulative frequency in which jn/100 (2n/100 or 99n/100) lies
h = class interval size of the class containing P_j
f = frequency of the class containing P_j
n = number of values, or the total frequency
c = cumulative frequency of the class preceding the class containing P_j
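Quartiles, deciles and percentiles for grouped data all follow the same pattern, l + (h/f)(jn/parts − c), with parts equal to 4, 10 or 100. The sketch below applies that pattern to the frequency table in these notes (class boundaries 29.5 to 99.5, total frequency 905). The function name `grouped_quantile` and its `parts` parameter are assumptions made for this illustration, and a constant class interval of 10 is assumed.

```python
lower = [29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5]  # lower class boundaries
freq = [8, 87, 190, 304, 211, 85, 20]
h = 10  # class interval, assumed equal for all classes


def grouped_quantile(lower, freq, h, j, parts):
    """Return the j-th quantile when the data are split into `parts` parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freq)
    target = j * n / parts   # position jn/parts in the cumulative frequencies
    c = 0                    # cumulative frequency before the current class
    for l, f in zip(lower, freq):
        if c + f >= target:
            return l + h / f * (target - c)
        c += f
    raise ValueError("j must satisfy 0 < j <= parts")


q1 = grouped_quantile(lower, freq, h, 1, 4)
q3 = grouped_quantile(lower, freq, h, 3, 4)
d9 = grouped_quantile(lower, freq, h, 9, 10)
p99 = grouped_quantile(lower, freq, h, 99, 100)
print(round(q1, 2), round(q3, 2), round(d9, 2), round(p99, 2))
```

With j = 2 and parts = 4 the function returns the median, since Q2, D5 and P50 all coincide with it.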
Symmetrical distribution
Both tails extend an equal distance from the centre. The data are distributed in a balanced form.
Mode = Median =Mean
Positive Skew
Its tail is longer towards the right side.
Mode < Median <Mean
Negative Skew
Its tail is longer towards the left side.
Mode > Median >Mean
Measurement of skewness
Karl Pearson’s coefficient of skewness
$sk = \frac{\text{Mean} - \text{Mode}}{S}$
$sk = \frac{3(\text{Mean} - \text{Median})}{S}$
The answer lies between −3 and +3.
If the answer is 0, it means the distribution is symmetrical.
Bowley’s Formula
$sk = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
The answer lies between −1 and +1.
If the answer is 0, it means the distribution is symmetrical.
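The two skewness coefficients can be written as small Python functions. This is a minimal sketch: the function names are chosen here, and the numbers in the usage lines are hypothetical values picked only to show the arithmetic, not data from these notes.

```python
def pearson_skewness(mean, median, sd):
    """Karl Pearson's coefficient: 3 * (mean - median) / S."""
    return 3 * (mean - median) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2 * median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Hypothetical summary values, for illustration only.
print(pearson_skewness(mean=50, median=48, sd=6))   # positive: tail to the right
print(bowley_skewness(q1=20, median=28, q3=40))     # positive: tail to the right
print(bowley_skewness(q1=20, median=30, q3=40))     # zero: symmetrical
```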
Geometric Mean
The geometric mean, G, of a set of n positive values X1, X2… Xn is defined as the positive nth root of
their product.
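G.M. FOR UNGROUPED DATA
$G = \sqrt[n]{X_1 X_2 \cdots X_n}$ (where $X_i > 0$)
Taking logarithms to the base 10, we get
$\log G = \frac{1}{n}\left(\log X_1 + \log X_2 + \cdots + \log X_n\right) = \frac{\sum \log X}{n}$
$G = \operatorname{antilog}\left(\frac{\sum \log X}{n}\right)$
Example:
Find the geometric mean of the numbers:
45, 32, 37, 46, 39, 36, 41, 48, 36.
$G = \sqrt[9]{45 \times 32 \times 37 \times 46 \times 39 \times 36 \times 41 \times 48 \times 36}$
X       log X
45      1.6532
32      1.5052
37      1.5682
46      1.6628
39      1.5911
36      1.5563
41      1.6128
48      1.6812
36      1.5563
Total   14.3870
$\log G = \frac{\sum \log X}{n} = \frac{14.3870}{9} = 1.5986$
Hence G = antilog 1.5986 = 39.68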
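The logarithm method for the geometric mean can be reproduced in Python with the standard `math` module. The sketch below uses the nine values from the example; the variable names are chosen for this illustration.

```python
import math

values = [45, 32, 37, 46, 39, 36, 41, 48, 36]

# log G = (sum of log10 X) / n
log_sum = sum(math.log10(x) for x in values)
log_g = log_sum / len(values)

# G = antilog(log G), i.e. 10 raised to the power log G
g = 10 ** log_g

print(round(log_sum, 4), round(log_g, 4), round(g, 2))
```

Small differences in the last decimal place compared with the hand calculation come from rounding each logarithm to four places in the table.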
For grouped data, $G = \sqrt[n]{X_1^{f_1} X_2^{f_2} \cdots X_k^{f_k}}$, where $n = \sum f$.
Each value of X thus has to be multiplied by itself f times, and the whole procedure becomes quite a
formidable task!
In terms of logarithms, the formula becomes
$\log G = \frac{1}{n}\left(f_1 \log X_1 + f_2 \log X_2 + \cdots + f_k \log X_k\right) = \frac{\sum f \log X}{n}$
Mileage Rating   No. of Cars (f)   Class-mark (midpoint) X   log X    f log X
30.0 - 32.9      2                 31.45                     1.4976   2.9952
33.0 - 35.9      4                 34.45                     1.5372   6.1488
36.0 - 38.9      14                37.45                     1.5735   22.0290
39.0 - 41.9      8                 40.45                     1.6069   12.8552
42.0 - 44.9      2                 43.45                     1.6380   3.2760
Total            30                                          Total    47.3042
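For grouped data the same logarithm method applies with each log X weighted by its frequency. The sketch below uses the class-marks and car counts from the mileage table; the function name `grouped_geometric_mean` is an assumption made for this illustration.

```python
import math

midpoints = [31.45, 34.45, 37.45, 40.45, 43.45]  # class-marks X
cars = [2, 4, 14, 8, 2]                          # frequencies f


def grouped_geometric_mean(midpoints, freq):
    """G = antilog( sum(f * log10 X) / n ), with n = sum(f)."""
    n = sum(freq)
    weighted_logs = sum(f * math.log10(x) for x, f in zip(midpoints, freq))
    return 10 ** (weighted_logs / n)


g = grouped_geometric_mean(midpoints, cars)
print(round(g, 2))
```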
A relative measure of dispersion is a pure number, independent of the units in which the data have been
expressed. It is used to compare the dispersion of one data set with the dispersion of another.
The common relative measures of dispersion are:
Coefficient of Dispersion or Coefficient of Range
Coefficient of Quartile Deviation
Coefficient of Mean Deviation
Coefficient of Standard Deviation or Coefficient of Variation (C.V)
Standard Deviation:
“The positive square root of the variance is called the standard deviation.”
For ungrouped data
$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
X       (x − x̄)   (x − x̄)²
4       0          0
6       +2         4
2       –2         4
0       –4         16
3       –1         1
5       +1         1
8       +4         16
Total              42
Here n = 7 and x̄ = 28/7 = 4, so
$S = \sqrt{\frac{42}{7}} = 2.45$
For Grouped data
$S = \sqrt{\frac{\sum fx^2}{n} - \left(\frac{\sum fx}{n}\right)^2}$
Life (in Hundreds of Hours)   No. of Bulbs f   Midpoint x   fx       fx²
0 – 5                         4                2.5          10.0     25.0
5 – 10                        9                7.5          67.5     506.25
10 – 20                       38               15.0         570.0    8550.0
20 – 40                       33               30.0         990.0    29700.0
40 and over                   16               50.0         800.0    40000.0
Total                         100                           2437.5   78781.25
$S = \sqrt{\frac{78781.25}{100} - \left(\frac{2437.5}{100}\right)^2}$
= 13.9 hundred hours
= 1390 hours
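Both standard deviation calculations can be verified with a short Python sketch. It uses the seven ungrouped values and the bulb-life table from the examples; the function names are assumptions made for this illustration, and the divisor is n, as in the formulas of these notes.

```python
import math


def sd_ungrouped(xs):
    """S = sqrt( sum((x - mean)^2) / n )."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)


def sd_grouped(midpoints, freq):
    """S = sqrt( sum(f x^2)/n - (sum(f x)/n)^2 ), with n = sum(f)."""
    n = sum(freq)
    sum_fx = sum(f * x for x, f in zip(midpoints, freq))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freq))
    return math.sqrt(sum_fx2 / n - (sum_fx / n) ** 2)


print(round(sd_ungrouped([4, 6, 2, 0, 3, 5, 8]), 2))

midpoints = [2.5, 7.5, 15.0, 30.0, 50.0]   # in hundreds of hours
bulbs = [4, 9, 38, 33, 16]
print(round(sd_grouped(midpoints, bulbs), 1))
```

Squaring either result gives the corresponding variance.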
Variance
The square of the standard deviation is called the variance.
Ungrouped data
$S^2 = \frac{\sum (x - \bar{x})^2}{n}$
Grouped data
$S^2 = \frac{\sum fx^2}{n} - \left(\frac{\sum fx}{n}\right)^2$
Coefficient of variation
It is a pure number without units.
It is used to compare the variation in two or more data sets, even when they are given in different units.
The coefficient of variation is obtained by dividing the standard deviation by the mean and expressing the
result as a percentage.
$\text{C.V.} = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100 = \frac{S}{\bar{X}} \times 100$
Less variation = more consistent, and more variation = less consistent.
Correlation
Correlation is a measure of the degree of relatedness of variables. It can help a business researcher
determine, for example, whether the stocks of two airlines rise and fall in any related manner. For a sample
of pairs of data, correlation analysis can yield a numerical value that represents the degree of relatedness of
the two stock prices over time.
$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
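The correlation formula can be evaluated directly from the five sums it contains. As an illustration, the sketch below applies it to the nine (X, Y) pairs that appear in the regression table later in these notes; the function name `pearson_r` is an assumption made here.

```python
import math


def pearson_r(xs, ys):
    """r = (n*Sxy - Sx*Sy) / (sqrt(n*Sxx - Sx^2) * sqrt(n*Syy - Sy^2))."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    return numerator / denominator


x = [5, 6, 8, 10, 12, 13, 15, 16, 17]
y = [16, 19, 23, 28, 36, 41, 44, 45, 50]
r = pearson_r(x, y)
print(round(r, 4))
```

A value close to +1 indicates a strong positive linear relationship between the two variables.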
Rank correlation
Sometimes the actual measurements or counts of individual objects are either not available or an accurate
assessment is not possible. The objects are then arranged in order according to some characteristic of
interest. Such an ordered arrangement is called a ranking, and the order given to an individual or object is
called its rank. The correlation between such sets of rankings is known as rank correlation.
A rank correlation coefficient measures the degree of similarity between two rankings, and can be used to
assess the significance of the relation between them.
It measures the relationship between rankings of different ordinal variables or different rankings of the
same variable, where a "ranking" is the assignment of the ordering labels "first", "second", "third", etc.
to different observations of a particular variable.
Spearman’s Rank Correlation:
$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$
Judge X Judge Y Judge Z dxy=X-Y dxz=X-Z dyz=Y-Z d²xy d²xz d²yz
5 1 6 4 -1 -5 16 1 25
2 7 4 -5 -2 3 25 4 9
6 6 9 0 -3 -3 0 9 9
8 10 8 -2 0 2 4 0 4
1 4 1 -3 0 3 9 0 9
7 5 2 2 5 3 4 25 9
4 3 3 1 1 0 1 1 0
9 8 10 1 -1 -2 1 1 4
3 2 5 1 -2 -3 1 4 9
10 9 7 1 3 2 1 9 4
Total 62 54 82
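With n = 10 and the totals above:
$r_{xy} = 1 - \frac{6(62)}{10(10^2 - 1)} = 1 - \frac{372}{990} \approx 0.624$
$r_{xz} = 1 - \frac{6(54)}{10(10^2 - 1)} = 1 - \frac{324}{990} \approx 0.673$
$r_{yz} = 1 - \frac{6(82)}{10(10^2 - 1)} = 1 - \frac{492}{990} \approx 0.503$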
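Spearman's formula can be applied to each pair of judges with a short Python sketch. The ranks are those given in the table for Judges X, Y and Z; the function name `spearman_rs` is an assumption made for this illustration, and the formula assumes there are no tied ranks.

```python
def spearman_rs(rank_a, rank_b):
    """r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), for untied ranks."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


judge_x = [5, 2, 6, 8, 1, 7, 4, 9, 3, 10]
judge_y = [1, 7, 6, 10, 4, 5, 3, 8, 2, 9]
judge_z = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]

print(round(spearman_rs(judge_x, judge_y), 3))
print(round(spearman_rs(judge_x, judge_z), 3))
print(round(spearman_rs(judge_y, judge_z), 3))
```

The pair with the largest coefficient is the pair of judges whose rankings agree most closely.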
Multiple correlation
An estimate of the combined influence of two or more variables on the observed (dependent) variable.
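$R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{23} r_{13}}{1 - r_{23}^2}}$
$R_{2.31} = \sqrt{\frac{r_{23}^2 + r_{21}^2 - 2 r_{12} r_{23} r_{13}}{1 - r_{31}^2}}$
where the simple correlation coefficients are
$r_{12} = \frac{n\sum X_1 X_2 - (\sum X_1)(\sum X_2)}{\sqrt{n\sum X_1^2 - (\sum X_1)^2}\,\sqrt{n\sum X_2^2 - (\sum X_2)^2}}$
$r_{13} = \frac{n\sum X_1 X_3 - (\sum X_1)(\sum X_3)}{\sqrt{n\sum X_1^2 - (\sum X_1)^2}\,\sqrt{n\sum X_3^2 - (\sum X_3)^2}}$
$r_{23} = \frac{n\sum X_2 X_3 - (\sum X_2)(\sum X_3)}{\sqrt{n\sum X_2^2 - (\sum X_2)^2}\,\sqrt{n\sum X_3^2 - (\sum X_3)^2}}$
The partial correlation coefficients are
$r_{13.2} = \frac{r_{13} - r_{12} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}}$
$r_{23.1} = \frac{r_{23} - r_{12} r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}}$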
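Once the three simple correlation coefficients are known, the multiple and partial coefficients are plain arithmetic. The sketch below shows the calculation for R1.23 and r13.2; the function names and the values r12 = 0.8, r13 = 0.6, r23 = 0.5 are hypothetical and serve only to illustrate the formulas.

```python
import math


def multiple_r_1_23(r12, r13, r23):
    """R_1.23 = sqrt( (r12^2 + r13^2 - 2*r12*r23*r13) / (1 - r23^2) )."""
    return math.sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r23 * r13) / (1 - r23 ** 2))


def partial_r_13_2(r12, r13, r23):
    """r_13.2 = (r13 - r12*r23) / sqrt( (1 - r12^2) * (1 - r23^2) )."""
    return (r13 - r12 * r23) / math.sqrt((1 - r12 ** 2) * (1 - r23 ** 2))


# Hypothetical simple correlations, for illustration only.
r12, r13, r23 = 0.8, 0.6, 0.5
print(round(multiple_r_1_23(r12, r13, r23), 4))
print(round(partial_r_13_2(r12, r13, r23), 4))
```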
Regression
Regression is a statistical method used in finance, investing, and other disciplines that attempts to determine
the strength and character of the relationship between one dependent variable (usually denoted by Y) and a
series of other variables (known as independent variables)
Simple linear regression
A statistical method that allows us to summarize and study relationships between two continuous
(quantitative) variables:
One variable, denoted x, is regarded as the predictor, explanatory, or independent variable.
The other variable, denoted y, is regarded as the response, outcome, or dependent variable.
Because the other terms are used less frequently today, we'll use the "predictor" and "response" terms to
refer to the variables encountered in this course. The other terms are mentioned only to make you aware of
them should you encounter them. Simple linear regression gets its adjective "simple," because it concerns
the study of only one predictor variable. In contrast, multiple linear regression, which we study later in this
course, gets its adjective "multiple," because it concerns the study of two or more predictor variables.
For Population:
\[ Y_i = \alpha + \beta X_i + E_i \]
For Sample:
\[ y_i = a + b x_i + e_i \]
The estimated regression line is generally written as:
\[ \hat{Y}_i = a + b x_i \]
\[ \sum e_i = 0 \]
Using the method of least squares, we obtain the following two equations:
\[ \sum y_i = na + b\sum x_i \]
\[ \sum x_i y_i = a\sum x_i + b\sum x_i^2 \]
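Alternate Method:
Y dependent
X independent
\[ b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} \]
\[ a = \bar{Y} - b\bar{X} \]
X dependent
Y independent
\[ b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2} \]
\[ a = \bar{X} - b\bar{Y} \]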
a = 1.47, b = 2.831
X      Y      XY     X²     Ŷ = 1.47 + 2.831X   Y - Ŷ     (Y - Ŷ)²    Y²
5      16     80     25     15.625              0.375     0.140625    256
6      19     114    36     18.456              0.544     0.295936    361
8      23     184    64     24.118              -1.118    1.249924    529
10     28     280    100    29.78               -1.78     3.1684      784
12     36     432    144    35.442              0.558     0.311364    1296
13     41     533    169    38.273              2.727     7.436529    1681
15     44     660    225    43.935              0.065     0.004225    1936
16     45     720    256    46.766              -1.766    3.118756    2025
17     50     850    289    49.597              0.403     0.162409    2500
Total  102    302    3853   1308   301.992      0.008     15.88817    11368
Standard deviation of regression, or standard error of estimate
\[ S_{y.x} = \sqrt{\frac{\sum (y - \hat{Y})^2}{n - 2}} \]
Alternate Method 2
\[ S_{y.x} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}} \]
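The regression calculations above can be reproduced from the nine (X, Y) pairs in the table. The sketch below is a plain-Python illustration (variable names are my own, not from the notes). It computes b and a with the alternate-method formulas and then the standard error of estimate by both the direct and the alternate formula.

```python
from math import sqrt

# Data from the worked table (X independent, Y dependent).
x = [5, 6, 8, 10, 12, 13, 15, 16, 17]
y = [16, 19, 23, 28, 36, 41, 44, 45, 50]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Slope and intercept of the least-squares line Y-hat = a + bX.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Standard error of estimate: direct formula, then the alternate formula.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s_direct = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
s_alt = sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

print(round(a, 2), round(b, 3), round(s_direct, 3), round(s_alt, 3))
```

With unrounded a and b the two formulas for the standard error agree. If the rounded values a = 1.47 and b = 2.831 are substituted into the alternate formula, the result differs slightly because of rounding.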
[Figure: Discrete Uniform Distribution. Probability of winning (1/1000 for every number) plotted against the lottery number X, from 000 to 999.]
INTERPRETATION
It reflects the fact that winning lottery numbers are selected by a random procedure which makes all numbers equally
likely to be selected. The point to be kept in mind is that, whenever we have a situation where the various outcomes are
equally likely, and of a form such that we have a random variable X with values 0, 1, 2, … or, as in the above example,
000, 001, …, 999, we will be dealing with the discrete uniform distribution.
BINOMIAL DISTRIBUTION
The binomial distribution is a very important discrete probability distribution. It was discovered by James Bernoulli
about the year 1700. We illustrate this distribution with the help of the following example:
EXAMPLE
Suppose that we toss a fair coin 5 times, and we are interested in determining the probability distribution of X, where X
represents the number of heads that we obtain. We note that in tossing a fair coin 5 times:
Every toss results in either a head or a tail,
The probability of heads (denoted by p) is equal to ½ every time (in other words, the probability of heads
remains constant),
Every throw is independent of every other throw, and
The total number of tosses, i.e. 5, is fixed in advance.
The above four points represent the four basic and vitally important PROPERTIES of a binomial experiment. Now, in 5
tosses of the coin, there can be 0, 1, 2, 3, 4 or 5 heads, and the no. of heads is thus a random variable which can take one
of these six values. In order to compute the probabilities of these X-values, the formula is:
Binomial Distribution
\[ P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n \]
where
n = the total no. of trials
p = probability of success in each trial
q = probability of failure in each trial (i.e. q = 1 - p)
x = no. of successes in n trials.
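The binomial distribution has two parameters, n and p. In this example, n = 5 since the coin was thrown 5 times, p = ½
since it is a fair coin, and q = 1 – p = 1 – ½ = ½. Hence
\[ P(X = x) = \binom{5}{x} \left(\frac{1}{2}\right)^x \left(\frac{1}{2}\right)^{5-x} \]
Putting x = 0:
\[ P(X = 0) = \binom{5}{0} \left(\frac{1}{2}\right)^0 \left(\frac{1}{2}\right)^{5-0} = \frac{5!}{0!\,5!} (1) \left(\frac{1}{2}\right)^5 = \frac{1}{32} \]
Putting x = 1:
\[ P(X = 1) = \binom{5}{1} \left(\frac{1}{2}\right)^1 \left(\frac{1}{2}\right)^{5-1} = \frac{5!}{1!\,4!} \left(\frac{1}{2}\right) \left(\frac{1}{2}\right)^4 = 5 \left(\frac{1}{2}\right)^5 = \frac{5}{32} \]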
STA301 – Statistics and Probability
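Similarly, we have:
\[ P(X = 2) = \binom{5}{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^{5-2} = \frac{10}{32} \]
\[ P(X = 3) = \binom{5}{3} \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^{5-3} = \frac{10}{32} \]
\[ P(X = 4) = \binom{5}{4} \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^{5-4} = \frac{5}{32} \]
\[ P(X = 5) = \binom{5}{5} \left(\frac{1}{2}\right)^5 \left(\frac{1}{2}\right)^{5-5} = \frac{1}{32} \]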
Hence, the binomial distribution for this particular example is as follows. Binomial Distribution in the case of tossing a
fair coin five times:
[Figure: Line chart of P(x) against X = 0, 1, …, 5; vertical axis marked from 2/32 to 10/32.]
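The next question is: What about the mean and the standard deviation of this distribution? We can calculate them just as
before, using the formulas
\[ E(X) = \sum x\,P(x) = 2.5 \]
\[ S.D.(X) = \sqrt{\sum x^2 P(x) - \left[\sum x\,P(x)\right]^2} = \sqrt{7.5 - 6.25} \approx 1.12 \]
[Figure: Line chart of P(x) against X = 0, 1, …, 5, with E(X) = 2.5 and S.D.(X) = 1.12 marked.]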
WHAT DOES THIS MEAN?
What this means is that if 5 fair coins are tossed an INFINITE no. of times, sometimes we will get no head out of 5,
sometimes 1 head, … sometimes all 5 heads. But on the AVERAGE we should expect to get 2.5 heads in 5 tosses of the
coin, or a total of 25 heads in 50 tosses of the coin. And 1.12 gives a measure of the possible variability in the various
numbers of heads that can be obtained in 5 tosses. (As you know, in this problem, the number of heads can range from 0
to 5. Had the coin been tossed 10 times, the no. of heads possible would vary from 0 to 10 and the standard deviation
would have been different.)
Coefficient of Variation:
\[ C.V. = \frac{\sigma}{\mu} \times 100 = \frac{1.12}{2.5} \times 100 = 44.8\% \]
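The whole coin-tossing example (probabilities, mean, standard deviation and coefficient of variation) can be verified with a few lines of Python. This is only a sketch using the standard library; `math.comb` requires Python 3.8 or later, and the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 5, 0.5          # five tosses of a fair coin
q = 1 - p

# P(X = x) = C(n, x) * p^x * q^(n - x) for x = 0, 1, ..., n
pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))
variance = sum(x ** 2 * px for x, px in enumerate(pmf)) - mean ** 2
sd = sqrt(variance)
cv = sd / mean * 100   # coefficient of variation, in percent

print([round(px * 32) for px in pmf])   # numerators over 32
print(mean, round(sd, 2), round(cv, 1))
```

The mean and variance agree with the shortcut formulas np and npq for the binomial distribution. The coefficient of variation printed here uses the unrounded standard deviation, so it comes out marginally below the 44.8% obtained from the rounded value 1.12.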
Note that the binomial distribution is not always symmetrical as in the above example. It will be symmetrical only when
p = q = ½ (as in the above example).
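[Figure: Symmetric binomial distribution, P(x) against X = 0, 1, …, 5.]
It is skewed to the right if p < q:
[Figure: Positively skewed binomial distribution, P(x) against X = 0, 1, …, 7.]
It is skewed to the left if p > q:
[Figure: Negatively skewed binomial distribution, P(x) against X = 0, 1, …, 7.]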
But the degree of Skewness (or asymmetry) decreases as n increases. Next, we consider the Fitting of a
Binomial Distribution to Real Data. We illustrate this concept with the help of the following example:
EXAMPLE
The following data has been obtained by tossing a LOADED die 5 times, and noting the number of times that we
obtained a six. Fit a binomial distribution to this data.
No. of sixes (x)   0    1    2    3    4    5    Total
Frequency (f)      12   56   74   39   18   1    200
SOLUTION
The first step is to estimate p by equating the mean of the observed frequency distribution to np.
The rationale of this step is that, as indicated in the last lecture, the mean of a binomial probability distribution is equal
to np, i.e.
μ = np
But, here, we are not dealing with a probability distribution, i.e. the entire population of all possible sets of throws of a
loaded die; we only have a sample of throws at our disposal.
As such, μ is not available to us, and all we can do is to replace it by its estimate X̄.
Hence, our equation becomes X̄ = np.
Now, we have:
\[ \bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{0 + 56 + 148 + 117 + 72 + 5}{200} = \frac{398}{200} = 1.99 \]
Using the relationship x̄ = np, we get 5p = 1.99 or p = 0.398. This value of p seems to indicate clearly that the die is not
fair at all! (Had it been a fair die, the probability of getting a six would have been 1/6, i.e. 0.167; a value of p = 0.398 is
very different from 0.167.) Letting the random variable X represent the number of sixes, the above calculations yield
the fitted binomial distribution as
\[ b(x; 5, 0.398) = \binom{5}{x} (0.398)^x (0.602)^{5-x} \]
Hence the probabilities and expected frequencies are calculated as below:
No. of sixes (x)   Probability f(x)                                                Expected frequency
0                  \binom{5}{0} q^5     = (0.602)^5              = 0.07907         15.8
1                  \binom{5}{1} q^4 p   = 5 (0.602)^4 (0.398)    = 0.26136         52.3
2                  \binom{5}{2} q^3 p^2 = 10 (0.602)^3 (0.398)^2 = 0.34559         69.1
3                  \binom{5}{3} q^2 p^3 = 10 (0.602)^2 (0.398)^3 = 0.22847         45.7
4                  \binom{5}{4} q p^4   = 5 (0.602) (0.398)^4    = 0.07553         15.1
5                  \binom{5}{5} p^5     = (0.398)^5              = 0.00998         2.0
Total                                                            = 1.00000         200.0
In the above table, the expected frequencies are obtained by multiplying each of the probabilities by 200.
In the entire above procedure, we are assuming that the given frequency distribution has the characteristics of
the fitted theoretical binomial distribution. Comparing the observed frequencies with the expected frequencies, we
obtain:
No. of sixes (x)     0      1      2      3      4      5     Total
Observed frequency   12     56     74     39     18     1     200
Expected frequency   15.8   52.3   69.1   45.7   15.1   2.0   200.0
The graphical representation of the observed frequencies as well as the expected frequencies is as follows:
[Figure: Graphical Representation of the Observed and Expected Frequencies. Observed frequency and expected frequency plotted against X = 0, 1, …, 5; frequency axis marked from 15 to 75.]
The above graph quite clearly indicates that there is not much discrepancy between the observed and the expected
frequencies. Hence, we can say that it is a reasonably good fit.
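The fitting procedure described above (estimate p from the sample mean, then multiply each binomial probability by the total frequency) can be sketched as follows. This is an illustration in plain Python, not part of the original handout; the variable names are my own.

```python
from math import comb

# Observed frequencies of 0, 1, ..., 5 sixes in sets of 5 throws.
observed = [12, 56, 74, 39, 18, 1]
n = 5
total = sum(observed)

# Estimate p from the relationship x-bar = n * p.
mean = sum(x * f for x, f in enumerate(observed)) / total
p = mean / n
q = 1 - p

# Fitted probabilities and expected frequencies.
probs = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
expected = [total * pr for pr in probs]

for x, (f, e) in enumerate(zip(observed, expected)):
    print(x, f, round(e, 1))
```

The printed expected frequencies sit close to the observed ones, which is what the graph is meant to show. A formal judgement would still need the chi-square test mentioned in the text.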
There is a procedure known as the Chi-Square Test of Goodness of Fit which enables us to determine in a formal,
mathematical manner whether or not the theoretical distribution fits the observed distribution reasonably well. This test
comes under the realm of Inferential Statistics --- that area which we will deal with during the last 15 lectures of this
course. Let us consider a real-life application of the binomial distribution:
Suppose that the past record indicates that the proportion of defective articles produced by a certain factory is 7%. And
suppose that a law NEWLY instituted in this particular country states that there should not be more than 5% defective.
Suppose that the factory-owner makes the statement that his machinery has been overhauled so that the number of
defectives has DECREASED.
In order to examine this claim, the relevant government department decides to send an inspector to examine a sample of
20 items.
What is the probability that the inspector will find 2 or more defective items in his sample (so that a fine will be
imposed on the factory)?
SOLUTION
The first step is to identify the NATURE of the situation. If we study this problem closely, we realize that we are
dealing with a binomial experiment because all four properties of a binomial experiment are being
fulfilled:
Every item selected will either be defective (i.e. success) or not defective (i.e. failure).
Every item drawn is independent of every other item.
The probability of obtaining a defective item, i.e. 7%, is the same (constant) for all items. (This probability
figure is according to the relative frequency definition of probability.)
The number of items drawn is fixed in advance, i.e. 20.
Hence, we are in a position to apply the binomial formula
\[ P(X = x) = \binom{n}{x} p^x q^{n-x} \]
Now, with n = 20, p = 0.07 and q = 1 - 0.07 = 0.93:
\[ P(X \ge 2) = 1 - P(X < 2) = 1 - \left[ \binom{20}{0} (0.07)^0 (0.93)^{20-0} + \binom{20}{1} (0.07)^1 (0.93)^{20-1} \right] \approx 0.413 \]
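The inspector's probability can be checked numerically. The sketch below evaluates the complement of finding 0 or 1 defectives, using only the standard library; the helper name `binom_pmf` is illustrative.

```python
from math import comb

n, p = 20, 0.07        # sample size and past proportion of defectives
q = 1 - p


def binom_pmf(x):
    """P(X = x) for a binomial distribution with the n and p defined above."""
    return comb(n, x) * p ** x * q ** (n - x)


# P(X >= 2) = 1 - [P(X = 0) + P(X = 1)]
prob_at_least_2 = 1 - (binom_pmf(0) + binom_pmf(1))
print(round(binom_pmf(0), 4), round(binom_pmf(1), 4), round(prob_at_least_2, 4))
```

Working with the complement keeps the calculation to two terms instead of the nineteen that a direct sum over x = 2, …, 20 would need.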
There are many experiments in which the condition of independence is violated and the probability of success does not
remain constant for all trials. Such experiments are called hypergeometric experiments.
In other words, a hypergeometric experiment has the following properties:
The outcomes of each trial may be classified into one of two categories, success and failure.
The probability of success changes on each trial.
The successive trials are not independent.
The experiment is repeated a fixed number of times.
The number of successes, X, in a hypergeometric experiment is called a hypergeometric random variable and its
probability distribution is called the hypergeometric distribution. When the hypergeometric random variable X
assumes a value x, the hypergeometric probability distribution is given by the formula
\[ P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}} \]
which applies when:
a random sample of size n is drawn WITHOUT REPLACEMENT from a finite population of N units;
k of the units are of one kind (classified as success) and the remaining N – k of another kind (classified as
failure).
LECTURE NO. 29
Hypergeometric Distribution (in some detail)
Poisson Distribution
Limiting Approximation to the Binomial
Poisson Process
Continuous Uniform Distribution
In the last lecture, we began the discussion of the HYPERGEOMETRIC PROBABILITY DISTRIBUTION. We now
consider this distribution in some detail. As indicated in the last lecture, there are many experiments in which the
condition of independence is violated and the probability of success does not remain constant for all trials. Such
experiments are called hypergeometric experiments. In other words, a hypergeometric experiment has the following
properties:
The outcomes of each trial may be classified into one of two categories, success and failure.
The probability of success changes on each trial.
The successive trials are not independent.
The experiment is repeated a fixed number of times.
The number of successes, X, in a hypergeometric experiment is called a hypergeometric random variable and its
probability distribution is called the hypergeometric distribution. When the hypergeometric random variable X
assumes a value x, the hypergeometric probability distribution is given by the formula
\[ P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}} \]
where n ≤ N.
EXAMPLE
The names of 5 men and 5 women are written on slips of paper and placed in a hat. Four names are drawn. What is the
probability that 2 are men and 2 are women? Let us regard ‘men’ as success. Then X will denote the number of
men. We have N = 5 + 5 = 10 names to be drawn from. Also, n = 4 (since we are drawing a sample of size 4 out of a
‘population’ of size 10). In addition, k = 5 (since there are 5 men in the population of 10). In this problem, the possible
values of X are 0, 1, 2, 3, 4 (i.e. 0 to n). The hypergeometric distribution is given by
\[ P(X = x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}} \]
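Since N = 10, k = 5 and n = 4, in this problem the hypergeometric distribution is given by
\[ P(X = x) = \frac{\binom{5}{x}\binom{5}{4-x}}{\binom{10}{4}} \]
\[ P(X = 2) = \frac{\binom{5}{2}\binom{5}{4-2}}{\binom{10}{4}} = \frac{\binom{5}{2}\binom{5}{2}}{\binom{10}{4}} = \frac{10 \times 10}{210} = \frac{10}{21} \]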
In other words, the probability is a little less than 50% that two of the four names drawn will be those of MEN. In the
above example, just as we have computed the probability of X = 2, we could also have computed the probabilities of X
= 0, X = 1, X = 3 and X = 4 (i.e. the probabilities of having zero, one, three OR four men among the four names
drawn). The students are encouraged to compute these probabilities on their own, to check that the sum of these
probabilities is 1, and to draw the line chart of this distribution.
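For readers who wish to check their hand calculations, the full distribution for this example can be tabulated with a short script. This is a sketch in plain Python; the dictionary name `pmf` is illustrative.

```python
from math import comb

N, k, n = 10, 5, 4     # population size, successes (men) in population, sample size

# P(X = x) = C(k, x) * C(N - k, n - x) / C(N, n) for x = 0, 1, ..., n
pmf = {x: comb(k, x) * comb(N - k, n - x) / comb(N, n) for x in range(n + 1)}

for x, px in pmf.items():
    print(x, round(px, 4))
print("sum =", round(sum(pmf.values()), 4))
```

Because the hat holds equal numbers of men and women, the distribution is symmetric about X = 2.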
Additionally, the students are encouraged to think about the centre, spread and shape of the distribution. Next, we
consider some important PROPERTIES of the Hypergeometric Distribution:
[Diagram: Population (finite or infinite) against Sampling (with replacement or without replacement).]
The point to be understood is that, whenever we are sampling with replacement, the population remains undisturbed
(because any element that is drawn at any one draw is replaced into the population before the next draw). Hence, we
can say that the various trials (i.e. draws) are independent, and hence we can use the binomial formula. On the other
hand, when we are sampling without replacement from a finite population, the constitution of the population changes at
every draw (because any element that is drawn at any one draw is not replaced into the population before the next
draw). Hence, we cannot say that the various trials are independent, and the formula that is appropriate in this
particular situation is the hypergeometric formula. But, if the population size is much larger than the sample size (so
that we can regard it as an ‘infinite’ population), then we note that, although we are not replacing any element that has
been drawn back into the population, the population remains almost undisturbed. As such, we can assume that the
various trials (i.e. draws) are independent, and, once again, we can apply the binomial formula.
In this regard, the generally accepted rule is that the binomial formula can be applied when we are drawing a sample
from a finite population without replacement and the sample size n is not more than 5 percent of the population size N,
or, to put it in another way, when n ≤ 0.05 N.
When n is greater than 5 percent of N, the hypergeometric formula should be used.
POISSON DISTRIBUTION
The Poisson distribution is named after the French mathematician Siméon Denis Poisson (1781-1840), who published
its derivation in the year 1837. THE POISSON DISTRIBUTION ARISES IN THE FOLLOWING TWO
SITUATIONS:
It is a limiting approximation to the binomial distribution, when p, the probability of success, is very small
but n, the number of trials, is so large that the product np = μ is of a moderate size;
It is a distribution in its own right, obtained by considering a POISSON PROCESS where events occur randomly over a
specified interval of time or space or length.
Such random events might be the number of typing errors per page in a book, the number of traffic accidents in a
particular city in a 24-hour period, etc.
With regard to the first situation, if we assume that n goes to infinity and p approaches zero in such a way that μ = np
remains constant, then the limiting form of the binomial probability distribution is
\[ \lim_{n \to \infty,\; p \to 0} b(x; n, p) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, \ldots \]
where e = 2.71828.
The Poisson distribution has only one parameter, μ > 0.
The parameter μ may be interpreted as the mean of the distribution.
Although the theoretical requirement is that n should tend to infinity and p should tend to zero, in PRACTICE
most statisticians use the Poisson approximation to the binomial when
p is 0.05 or less,
and n is 20 or more,
but in fact, the LARGER n is and the SMALLER p is, the better the approximation will be. We illustrate this particular
application of the Poisson distribution with the help of the following example:
EXAMPLE
Two hundred passengers have made reservations for an airplane flight. If the probability that a passenger who has a
reservation will not show up is 0.01, what is the probability that exactly three will not show up?
SOLUTION
Let us regard a “no show” as success. Then this is essentially a binomial experiment with n = 200 and p = 0.01. Since p
is very small and n is considerably large, we shall apply the Poisson distribution, using
μ = np = (200) (0.01) = 2.
Therefore, if X represents the number of successes (not showing up), we have
\[ P(X = 3) = \frac{e^{-2}\, 2^3}{3!} = \frac{(0.1353)(8)}{3 \times 2 \times 1} = 0.1804 \]
where
\[ e^{-2} = \frac{1}{(2.71828)^2} = 0.1353 \]
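To see how good the approximation is in this example, the Poisson value can be compared with the exact binomial probability. The following sketch uses only the standard library; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, x = 200, 0.01, 3
mu = n * p             # Poisson parameter, mu = np = 2

# Poisson approximation: e^(-mu) * mu^x / x!
poisson = exp(-mu) * mu ** x / factorial(x)

# Exact binomial probability: C(n, x) * p^x * (1 - p)^(n - x)
binomial = comb(n, x) * p ** x * (1 - p) ** (n - x)

print(round(poisson, 4), round(binomial, 4))
```

The two values differ only in the third decimal place, which is in line with the rule of thumb (p at most 0.05 and n at least 20) quoted above.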
POISSON PROCESS
A Poisson process may be defined as a physical process governed at least in part by some random mechanism.
Stated differently, a Poisson process represents a situation where events occur randomly over a specified interval of time
or space or length. Such random events might be the number of taxicab arrivals at an intersection per day; the number
of traffic deaths per month in a city; the number of radioactive particles emitted in a given period; the number of flaws
per unit length of some material; the number of typing errors per page in a book; etc.
The formula valid in the case of a Poisson process is:
\[ P(X = x) = \frac{e^{-\lambda t} (\lambda t)^x}{x!} \]
where
λ = average number of occurrences of the outcome of interest per unit of time,
t = number of time-units under consideration, and
x = number of occurrences of the outcome of interest in t units of time.
We illustrate this concept with the help of the following example:
EXAMPLE
Telephone calls are being placed through a certain exchange at random times on the average of four per minute.
Assuming a Poisson Process, determine the probability that in a 15-second interval, there are 3 or more calls.
SOLUTION
Step-2: Identify λ, the average number of occurrences of the outcome of interest per unit of time.
In this problem we have the information that, on the average, 4 calls are received per minute, hence:
λ = 4
Step-3: Identify t, the number of time-units under consideration. In this problem, we are interested in a 15-second
interval, and since 15 seconds are equal to 15/60 = ¼ minutes, i.e. 1/4 units of time, therefore t = 1/4
Step-4: Compute λt: In this problem,
λ = 4, &
t = 1/4,
Hence:
λt = 4 × ¼ = 1
Step-5: Apply the Poisson formula
\[ P(X = x) = \frac{e^{-\lambda t} (\lambda t)^x}{x!} \]
In this problem, λt = 1, and since we are interested in 3 or more calls in a 15-second interval, therefore
\[ P(X \ge 3) = 1 - P(X < 3) = 1 - \sum_{x=0}^{2} \frac{e^{-1} (1)^x}{x!} \]
\[ = 1 - 0.3679 \sum_{x=0}^{2} \frac{1}{x!} \qquad (\because e^{-1} = 0.3679) \]
= 1 – (0.91975) = 0.08025
Hence the probability is only 8% (i.e. a very low probability) that in a 15-second interval, the telephone exchange
receives 3 or more calls.
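The steps of this solution translate directly into code. The sketch below is illustrative (names such as `lam` are my own, since `lambda` is a reserved word in Python) and uses the exact value of e rather than the rounded 0.3679.

```python
from math import exp, factorial

lam = 4                # average number of calls per minute
t = 15 / 60            # 15 seconds expressed in minutes
mean = lam * t         # expected number of calls in the interval

# P(X < 3) = sum of the Poisson probabilities for x = 0, 1, 2
p_less_than_3 = sum(exp(-mean) * mean ** x / factorial(x) for x in range(3))
p_3_or_more = 1 - p_less_than_3

print(mean, round(p_less_than_3, 4), round(p_3_or_more, 4))
```

The result agrees with the hand calculation to three decimal places; the small difference in the fourth place comes from rounding e to the power -1 in the text.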
Some of the main properties of the Poisson distribution are given below:
If the random variable X has a Poisson distribution with parameter μ, then its mean and variance are given
by E(X) = μ and Var(X) = μ.
(In other words, we can say that the mean of the Poisson distribution is equal to its variance.)
The shape of the Poisson distribution is positively skewed. The distribution tends to be symmetrical as μ
becomes larger and larger.
Comparing the Poisson distribution with the binomial, we note that, whereas the binomial distribution can be
symmetric, positively skewed, or negatively skewed (depending on whether p = 1/2, p < 1/2, or p > 1/2), the Poisson
distribution can never be negatively skewed.
Just as we discussed the fitting of the binomial distribution to real data in the last lecture, the Poisson distribution can
also be fitted to real-life data. The procedure is very similar to the one described in the case of the fitting of the
binomial distribution: The population mean μ is replaced by the sample mean X̄, and the probabilities of the various
values of X are computed using the Poisson formula. The chi-square test of goodness of fit enables us to determine
whether or not it is a good fit i.e. whether or not the discrepancy between the expected frequencies and the observed
frequencies is small. Next, we discuss some important mathematical points regarding Poisson distribution.
1) The Poisson approximation to the binomial formula works well when
n ≥ 20 and p ≤ 0.05.
2) Suppose that the Poisson is used to approximate the binomial which, in turn, is being used to
approximate the hypergeometric. Then the Poisson is being used to approximate the hypergeometric.
Putting the two approximation conditions together, the rule of thumb is that the Poisson distribution can be
used to approximate the hypergeometric distribution when n ≤ 0.05N, n ≥ 20, and p ≤ 0.05.
This brings us to the end of the discussion of some of the most important and well-known univariate discrete probability
distributions. We now begin the discussion of some of the well-known univariate continuous probability distributions.
There are different types of continuous distributions, e.g. the uniform distribution, the normal distribution
and the exponential distribution. Each one has its own shape and its own mathematical properties. In this
course, we will discuss the uniform distribution and the normal distribution.
We begin with the continuous UNIFORM DISTRIBUTION (also known as the RECTANGULAR DISTRIBUTION).
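UNIFORM DISTRIBUTION
\[ f(x) = \frac{1}{b - a}, \quad a \le x \le b \]
The graph of this distribution is as follows:
[Figure: Uniform density f(x), a horizontal line at height 1/(b – a) between X = a and X = b.]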
The above function is a proper probability density function because of the fact that:
i) Since a < b, therefore f(x) > 0
ii) \[ \int_{-\infty}^{\infty} f(x)\,dx = \int_a^b \frac{1}{b - a}\,dx = \left[\frac{x}{b - a}\right]_a^b = \frac{b - a}{b - a} = 1 \]
Since the shape of the distribution is like that of a rectangle, therefore the total area of this distribution can also be
obtained from the simple formula:
Area of rectangle
= (Base) × (Height)
\[ = (b - a) \times \frac{1}{b - a} = 1 \]
[Figure: Rectangle of base (b – a) and height 1/(b – a) under the uniform density f(x).]
The distribution derives its name from the fact that its density is constant or uniform over the interval [a, b] and is 0
elsewhere. It is also called the rectangular distribution because its total probability is confined to a rectangular region
with base equal to (b – a) and height equal to 1/(b – a). The parameters of this distribution are a and b.
Let X have the uniform distribution over [a, b]. Then its mean is
\[ \mu = \frac{a + b}{2} \]
and its variance is
\[ \sigma^2 = \frac{(b - a)^2}{12} \]
The uniform probability distribution provides a model for continuous random variables that are evenly distributed over
a certain interval. That is, a uniform random variable is one that is just as likely to assume a value in one interval as it
is to assume a value in any other interval of equal size. There is no clustering of values around any value. Instead, there
is an even spread over the entire region of possible values. As far as the real-life application of the uniform
distribution is concerned, the point to be noted is that, for continuous random variables there is an infinite number of
values in the sample space, but in some cases, the values may appear to be equally likely.
EXAMPLE-1
If a short exists in a 5 meter stretch of electrical wire, it may have an equal probability of being in any particular 1
centimeter segment along the line.
EXAMPLE-2
If a safety inspector plans to choose a time at random during the 4 afternoon work-hours to pay a surprise visit to a
certain area of a plant, then each 1 minute time-interval in this 4 work-hour period will have an equally likely chance of
being selected for the visit. Also, the uniform distribution arises in the study of rounding off errors, etc.
LECTURE NO. 30
Normal Distribution.
Mathematical Definition
Important Properties
The Standard Normal Distribution
Direct Use of the Area Table
Inverse Use of the Area Table
Normal Approximation to the Binomial Distribution
The normal distribution was discovered in 1733. The normal distribution has a bell-shaped curve of the type shown
below:
[Figure: Bell-shaped normal curve.]
Let us begin its detailed discussion by considering its formal MATHEMATICAL DEFINITION, and its main
PROPERTIES.
NORMAL DISTRIBUTION
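A continuous random variable X is said to be normally distributed with mean μ and standard deviation σ if its probability
density function is given by
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty \]
where π ≈ 3.1416 ≈ 22/7 and e ≈ 2.71828.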
For any particular value of μ and any particular value of σ, giving different values to x, we obtain a set of ordered
pairs (x, f(x)) that yield the bell-shaped curve given above. The formula of the normal distribution defines a FAMILY of
distributions depending on the values of the two parameters μ and σ (as these are the two values that determine the
shape of the distribution).
[Figure: Three normal curves with means μ1 < μ2 < μ3 (σ constant).]
The different values of the standard deviation σ (which is a measure of dispersion) determine the flatness or
peakedness of the normal curve. In other words, a change in the standard deviation σ flattens the curve or compresses it
while leaving its centre in the same position:
[Figure: Three normal curves with standard deviations σ1 < σ2 < σ3 (μ constant).]
Property No. 2
Because of the symmetry of the normal curve, 50% of the area is to the right of a vertical line erected at the mean, and
50% is to the left. (Since the total area under the normal curve from -∞ to +∞ is unity, therefore the area to the left of μ
is 0.5 and the area to the right of μ is also 0.5.)
Property No. 4
The density function attains its maximum value at x = μ and falls off symmetrically on each side of μ. This is why the
mean, median and mode of the normal distribution are all equal to μ.
[Figure: Normal curve with Mean = Median = Mode at its centre.]
Property No. 5
Since the normal distribution is absolutely symmetrical, μ3, the third moment about the mean, is zero.
Property No. 6
Property No. 7
The moment ratios of the normal distribution come out to be 0 and 3 respectively:
Moment Ratios:
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = 3 \]
Property No. 8
No matter what the values of μ and σ are, areas under the normal curve remain in certain fixed proportions within a
specified number of standard deviations on either side of μ.
For the normal distribution:
The interval μ ± 1σ will always contain 68.26% of the total area.
The interval μ ± 2σ will always contain 95.44% of the total area.
The interval μ ± 3σ will always contain 99.73% of the total area.
[Figures: Normal curves showing the areas between μ – 1σ and μ + 1σ, between μ – 2σ and μ + 2σ, and between μ – 3σ and μ + 3σ.]
At this point, the students are reminded of the Empirical Rule that was discussed during the first part of this course,
that on descriptive statistics. You will recall that, in the case of any approximately symmetric hump-shaped frequency
distribution, approximately 68% of the data-values lie between X̄ ± S, approximately 95% between X̄ ± 2S, and
approximately 100% between X̄ ± 3S. You can now recognize the similarity between the empirical rule and the
property given above. (In case a distribution is absolutely normal, the areas in the above-mentioned ranges are 68.26%,
95.44% and 99.73%; in case a distribution is approximately normal, the areas in these ranges will be approximately equal
to these percentages.)
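The three percentages quoted in Property No. 8 can be reproduced without an area table. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); since the areas do not depend on μ and σ, the standard normal distribution is used.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Area between -k and +k standard deviations, for k = 1, 2, 3.
areas = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.4%}")
```

The computed areas match the table-based figures 68.26%, 95.44% and 99.73% to within about 0.01 percentage points; the tiny differences arise because the table entries are rounded to four decimal places.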
Property No. 9
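The normal curve contains points of inflection (where the direction of concavity changes) which are
equidistant from the mean. Their coordinates on the XY-plane are
\[ \left(\mu - \sigma,\; \frac{1}{\sigma\sqrt{2\pi e}}\right) \quad \text{and} \quad \left(\mu + \sigma,\; \frac{1}{\sigma\sqrt{2\pi e}}\right) \]
respectively.
[Figure: Normal curve with the Points of Inflection marked at μ – σ and μ + σ.]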
Next, we consider the concept of the Standard Normal Distribution:
[Figure: Standard normal curve with mean 0 and σ = 1; horizontal axis marked at -1, 0 and 1.]
Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0159 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2518 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4430 0.4441
1.6 0.4452 0.4463 0.4474 0.4485 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4762 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4865 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4980 0.4980 0.4981
2.9 0.4981 0.4982 0.4983 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.49865 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.49903 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
In any problem involving the normal distribution, the generally established procedure is that the normal distribution
under consideration is converted to the standard normal distribution. This process is called standardization.
THE PROCESS OF STANDARDIZATION
The standardization formula for converting N(μ, σ) to N(0, 1) is:
\[ Z = \frac{X - \mu}{\sigma} \]
If X is N(μ, σ), then Z is N(0, 1). In other words, the standardization formula given above converts our normal
distribution to the one whose mean is 0 and whose standard deviation is equal to 1.
[Figure: Standard normal curve with σ = 1; horizontal axis marked at -1, 0 and 1.]
We illustrate this concept with the help of an interesting example:
EXAMPLE
The length of life for an automatic dishwasher is approximately normally distributed with a mean life of 3.5 years and a
standard deviation of 1.0 years. If this type of dishwasher is guaranteed for 12 months, what fraction of the sales will
require replacement?
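SOLUTION
Since 12 months equal one year, we need to compute the fraction or proportion of dishwashers that will cease to
function before a time-span of one year. In other words, we need to find the probability that a dishwasher fails before
one year.
[Figure: Normal curve with mean 3.5; the area to the left of X = 1.0 is required.]
In order to find this area we need to standardize the normal distribution, i.e. to convert N(3.5, 1) to N(0, 1).
The method is
\[ Z = \frac{X - \mu}{\sigma} = \frac{X - 3.5}{1.0} \]
so that X = 1.0 corresponds to Z = (1.0 - 3.5)/1.0 = -2.5.
[Figure: X-axis marked at 1.0 and 3.5, with the corresponding Z-axis marked at -2.5 and 0.]
Now we need to find the area under the normal curve from Z = -∞ to Z = -2.5. Looking at the area table of the standard
normal distribution, we find that the area from 0 to 2.5 is 0.4938.
[Figure: Standard normal curve with area 0.4938 between 0 and 2.5.]
Hence, the area from Z = 2.5 to ∞ is 0.5 - 0.4938 = 0.0062.
[Figure: Standard normal curve with area 0.0062 to the right of 2.5.]
But this means that the area from -∞ to -2.5 is also 0.0062, as shown in the following figure:
[Figure: Standard normal curve with area 0.0062 to the left of -2.5.]
Hence, about 0.62% of the dishwashers sold will require replacement.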
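The table lookup in this solution can be cross-checked in code. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 3.5, 1.0               # mean life and standard deviation, in years
guarantee = 1.0                    # 12 months = 1 year

# Standardize the guarantee period, then find the area to its left.
z = (guarantee - mu) / sigma
fraction_from_z = NormalDist().cdf(z)                   # standard normal, N(0, 1)
fraction_direct = NormalDist(mu, sigma).cdf(guarantee)  # same area without standardizing

print(z, round(fraction_from_z, 4), round(fraction_direct, 4))
```

Both routes give the same area, which illustrates that standardization changes the scale of measurement but not the probability.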
EXAMPLE
The heights of applicants to the police force in a certain country are normally distributed with mean 170 cm and
standard deviation 3.8 cm. If 1000 persons apply for being inducted into the police force, and it has been decided that
not more than 70% of these applicants will be accepted (and the shortest 30% of the applicants are to be rejected), what
is the minimum acceptable height for the police force?
SOLUTION:
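We have:
$\mu = 170$, $\sigma = 3.8$
We need to compute the x-value to the left of which 30% of the area lies.
[Figure: normal curve with mean 170 and standard deviation 3.8, divided into areas of 30%, 20% and 50%]
The standardization formula can be rewritten as
$$X = \mu + \sigma Z$$
The z-value to the left of which 30% of the area lies is obtained as follows.
[Figure: standard normal curve divided into areas of 0.5, 0.2 and 0.3, with 0 and z marked on the horizontal axis]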
By studying the figures inside the body of the area table of the standard normal distribution, we find that
the area between z = 0 and z = 0.52 is 0.1985, and the area between z = 0 and z = 0.53 is 0.2019.
Since 0.1985 is closer to 0.2000 than 0.2019 is, 0.52 is taken as the appropriate z-value.
[Figure: standard normal curve with 0 and z = 0.52 marked]
But we are interested not in the upper 30% but in the lower 30% of the applicants.
Hence, we have:
[Figure: standard normal curve with z = -0.52 and 0 marked]
Since the normal distribution is perfectly symmetrical, the z-value to the left of which 30% of the area lies (on
the left-hand side of the mean) is at exactly the same distance from the mean as the z-value to the right of which
30% of the area lies (on the right-hand side of the mean).
Substituting z = -0.52 in the standardization formula, we obtain:
X = 170 + 3.8 Z
= 170 + 3.8 (-0.52)
= 170 - 1.976
= 168.024 ≈ 168 cm
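As a cross-check on the table lookup and the arithmetic above, the same cut-off can be obtained in Python. This is a sketch under the assumption that the inverse of the standard normal area function (`inv_cdf` in the standard library) is an acceptable substitute for reading the printed table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=170.0, sigma=3.8)

# z-value with 30% of the area to its left under N(0, 1).
z_30 = NormalDist().inv_cdf(0.30)

# Height below which the shortest 30% of applicants fall: X = mu + sigma * Z.
x_min = heights.mean + heights.stdev * z_30

print(f"z = {z_30:.4f}, minimum height = {x_min:.1f} cm")
```

The exact z-value is about -0.524 rather than the table value -0.52, so the computed height differs slightly in the decimals but still rounds to 168 cm.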
Hence, the minimum acceptable height for the police force is 168 cm.
Just as the binomial, Poisson and other discrete distributions can be fitted to real-life data, the normal distribution
can also be FITTED to real data. This can be done by equating $\mu$ to $\bar{X}$, the mean computed from the observed
frequency distribution (based on sample data), and $\sigma$ to $S$, the standard deviation of the observed frequency
distribution. Of course, this should be done only if
we are reasonably sure that the shape of the observed frequency distribution is quite similar to that of the normal
distribution. As in the case of fitting the binomial distribution to real data, the proper statistical procedure for
deciding whether or not our fitted normal distribution is a reasonably good fit is the Chi-square Test of
Goodness of Fit.
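The fitting procedure described above can be sketched as follows. The grouped data are entirely hypothetical and serve only to show the mechanics. The divisor n in the standard deviation is also an assumption, since some texts use n - 1. The sketch stops at the expected frequencies and does not carry out the Chi-square test.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical grouped data: (lower boundary, upper boundary, observed frequency).
classes = [
    (59.5, 62.5, 5),
    (62.5, 65.5, 18),
    (65.5, 68.5, 42),
    (68.5, 71.5, 27),
    (71.5, 74.5, 8),
]

n = sum(f for _, _, f in classes)
mids = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

# Mean and standard deviation of the observed frequency distribution
# (divisor n is an assumption; some texts use n - 1).
x_bar = sum(f * m for f, m in zip(freqs, mids)) / n
s = sqrt(sum(f * (m - x_bar) ** 2 for f, m in zip(freqs, mids)) / n)

# Fitted normal distribution: mu is equated to x_bar and sigma to s.
fitted = NormalDist(x_bar, s)

# Expected frequency of a class = n * (area under the fitted curve over that class).
expected = [n * (fitted.cdf(hi) - fitted.cdf(lo)) for lo, hi, _ in classes]

for (lo, hi, f), e in zip(classes, expected):
    print(f"{lo:5.1f} to {hi:5.1f}: observed {f:3d}, expected {e:6.2f}")
```

The expected frequencies sum to slightly less than n because the two tails beyond the outermost class boundaries are left out of this sketch.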
EXAMPLE:
Suppose that past records indicate that, in a particular province of an under-developed country, the death rate from
malaria is 20%. Find the probability that, in a particular village of that province, the number of deaths is
between 70 and 80 (inclusive) out of a total of 500 malaria patients.
SOLUTION:
It is obviously very cumbersome to apply the binomial formula in order to compute $P(70 \le X \le 80)$.
In this problem,
$np = 500(0.2) = 100 \gg 5$, and $nq = 500(0.8) = 400 \gg 5$,
therefore we can safely apply the normal approximation to the binomial distribution. In order to apply the normal
approximation to the binomial, we need to keep in mind the following two points:
1) The first point is: the mean and variance of the binomial distribution in our problem will be regarded as the
mean and variance of the normal distribution that will be used to approximate the binomial distribution.
In this problem, we have:
$\mu = np = 500(0.2) = 100$
$\sigma^2 = npq = 500(0.2)(0.8) = 80$
Hence
$\sigma = \sqrt{80} \approx 8.94$
2) The second important point is:
We need to apply a correction that is known as the Continuity Correction. The rationale for this correction is as follows:
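The binomial distribution is essentially a discrete distribution, whereas the normal distribution is a continuous
distribution.
[Figure: BINOMIAL DISTRIBUTION, drawn as a set of distinct vertical lines]
In applying the normal approximation to the binomial, we have the following situation:
[Figure: normal curve drawn over the vertical lines of the binomial distribution]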
But, the question arises: “How can a set of distinct vertical lines be replaced by a continuous curve?”
In order to overcome this problem, we replace every integral value x of our binomial random variable by
the interval x - 0.5 to x + 0.5. By doing so, we have the following situation:
The x-value 70 is replaced by the interval 69.5 to 70.5,
the x-value 71 is replaced by the interval 70.5 to 71.5,
the x-value 72 is replaced by the interval 71.5 to 72.5,
..................
the x-value 80 is replaced by the interval 79.5 to 80.5.
Hence:
Applying the continuity correction,
$P(70 \le X \le 80)$
is replaced by
$P(69.5 < X < 80.5)$.
Accordingly, the area that we need to compute is the area under the normal curve between the values 69.5 and 80.5.
It is left to the students to compute this area, and thus determine the required probability. (This computation
involves a few steps.)
By doing so, the students will find that, in that particular village of that province, the probability that the number of
deaths from malaria in a sample of 500 lies between 70 and 80 (inclusive) is approximately 0.0143, i.e. about 1½%.
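For students who wish to check their working, the computation can be sketched in Python. `NormalDist` from the standard library stands in for the area table, and the exact binomial sum is included only for comparison; it is not part of the lecture's method. Values read from a four-figure table may differ slightly in the last decimal place from what the code prints.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.2
q = 1 - p

# Mean and standard deviation of the approximating normal distribution.
mu = n * p               # np = 100
sigma = sqrt(n * p * q)  # sqrt(npq) = sqrt(80), about 8.94

# Continuity correction: P(70 <= X <= 80) is replaced by P(69.5 < X < 80.5).
approx = NormalDist(mu, sigma)
z_low = (69.5 - mu) / sigma
z_high = (80.5 - mu) / sigma
p_normal = approx.cdf(80.5) - approx.cdf(69.5)

# Exact binomial probability, for comparison only.
p_exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(70, 81))

print(f"z from {z_low:.2f} to {z_high:.2f}")
print(f"normal approximation: {p_normal:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```

The two standardized limits are about -3.41 and -2.18, so the required area lies entirely in the left tail of the standard normal curve.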
This brings us to the end of the second part of this course, i.e. Probability Theory.
In the next lecture, we will begin the third and last portion of this course, i.e. Inferential Statistics: that area
of Statistics which enables us to draw conclusions about various phenomena on the basis of data collected on a sample
basis.