Chapter One
Introduction
Structure
1.1 Introduction
1.2 History of Statistics
1.3 Definition of Statistics
1.4 Types of Statistics
1.5 Definition of Business statistics
1.6 Definition of statistics in the plural sense
1.7 Definition of statistics in the singular sense and different stages
of statistical investigation
1.8 Differences between Statistical data and Statistical Method
1.9 Functions of statistics
1.10 Scope/Importance of Statistics
1.10.1 Importance of statistics in Business and management
1.10.2 Importance of Statistics in Economics
1.10.3 Statistics in the economic planning
1.11 Limitations of statistics
1.12 Distrust of statistics
1.1 Introduction
The word statistics is derived from the Latin word ‘status’, the Italian word ‘statista’,
the German word ‘Statistik’ or the French word ‘statistique’, each of which means a political
state. In general usage, statistics means the collection of numerical data for further
use. In other words, statistics means numerical data.
1.2 History of Statistics
Statistical methods date back at least to the 5th century BC. Some scholars pinpoint the
origin of statistics to 1663, with the publication of Natural and Political Observations
upon the Bills of Mortality by John Graunt. Early applications of statistical thinking
revolved around the needs of states to base policy on demographic and economic data,
hence its stat- etymology. The scope of the discipline of statistics broadened in the early
19th century to include the collection and analysis of data in general. Today, statistics is
widely employed in government, business, and natural and social sciences.
Its mathematical foundations were laid in the 17th century with the development of
probability theory by Gerolamo Cardano, Blaise Pascal and Pierre de Fermat.
Mathematical probability theory arose from the study of games of chance, although the
concept of probability was already examined in medieval law and by philosophers such
as Juan Caramuel. The method of least squares was first described by Adrien-Marie
Legendre in 1805.
The modern field of statistics emerged in the late 19th and early 20th century in three
stages. The first wave, at the turn of the century, was led by the work of Francis Galton
and Karl Pearson, who transformed statistics into a rigorous mathematical discipline
used for analysis, not just in science, but in industry and politics as well. Galton's
contributions included introducing the concepts of standard deviation, correlation,
regression analysis and the application of these methods to the study of the variety of
human characteristics – height, weight, eyelash length among others. Pearson
developed the Pearson product-moment correlation coefficient, defined as a product-
moment, the method of moments for the fitting of distributions to samples and the
Pearson distribution, among many other things. Galton and Pearson founded
Biometrika as the first journal of mathematical statistics and biostatistics (then called
biometry), and the latter founded the world's first university statistics department at
University College London.
The second wave of the 1910s and 20s was initiated by William Gosset, and reached its
culmination in the insights of Ronald Fisher, who wrote the textbooks that were to define
the academic discipline in universities around the world. Fisher's most important
publications were his 1918 seminal paper The Correlation between Relatives on the
Supposition of Mendelian Inheritance, which was the first to use the statistical term,
variance, his classic 1925 work Statistical Methods for Research Workers and his 1935
The Design of Experiments, where he developed rigorous design of experiments
models. He originated the concepts of sufficiency, ancillary statistics, Fisher's linear
discriminator and Fisher information. In his 1930 book The Genetical Theory of Natural
Selection he applied statistics to various biological concepts such as Fisher's principle
(about the sex ratio), which A. W. F. Edwards has called "probably the most celebrated
argument in evolutionary biology", and the Fisherian runaway, a concept in sexual
selection about a positive feedback runaway effect found in evolution.
The final wave, which mainly saw the refinement and expansion of earlier
developments, emerged from the collaborative work between Egon Pearson and Jerzy
Neyman in the 1930s. They introduced the concepts of "Type II" error, power of a test
and confidence intervals. Jerzy Neyman in 1934 showed that stratified random sampling
was in general a better method of estimation than purposive (quota) sampling.
Today, statistical methods are applied in all fields that involve decision making, for
making accurate inferences from a collated body of data and for making decisions in the
face of uncertainty based on statistical methodology. The use of modern computers has
expedited large-scale statistical computations, and has also made possible new
methods that are impractical to perform manually. Statistics continues to be an area of
active research, for example on the problem of how to analyze large data.
1.3 Definition of Statistics
There have been many definitions of the term statistics. Some important definitions of
statistics given by renowned statisticians are stated below:
(i) According to R. A. Fisher, the science of statistics is essentially a branch of
Applied Mathematics and may be regarded as mathematics applied to
observational data.
(ii) According to Wallis and Roberts, statistics may be regarded as a body of
methods for making wise decisions in the face of uncertainty.
(iii) According to Neter & Wasserman, statistics refers to the body of techniques or
methodology which has been developed for the collection, presentation and
analysis of quantitative data and for the use of such data in decision making.
(iv) According to Bowley, statistics are numerical statements of facts in any
department of enquiry placed in relation to each other.
(v) According to Boddington, statistics is the science of estimates and probabilities.
1.4 Types of Statistics
There are two major divisions of statistics: descriptive statistics and inferential
statistics.
Descriptive Statistics: Descriptive statistics deals with collecting,
summarizing, and simplifying data, which are otherwise quite unwieldy and voluminous.
It seeks to achieve this in such a manner that meaningful conclusions can be readily drawn
from the data. Descriptive statistics may thus be seen as comprising methods of
bringing out and highlighting the latent characteristics present in a set of numerical data.
It not only facilitates an understanding of the data and systematic reporting thereof,
but also makes them amenable to further discussion, analysis, and
interpretation.
The first step in any scientific inquiry is to collect data relevant to the problem in hand.
When the inquiry relates to physical and/or biological sciences, data collection is
normally an integral part of the experiment itself. In fact, the very manner in which an
experiment is designed, determines the kind of data it would require and/or generate.
The problem of identifying the nature and the kind of the relevant data is thus
automatically resolved as soon as the design of the experiment is finalized. This is
typically the case in the physical sciences. In the case of social sciences, where the
required data are often collected through a questionnaire from a number of carefully
selected respondents, the problem is not so simply resolved. For one thing, designing the
questionnaire itself is a critical initial problem. For another, the number of respondents
to be accessed for data collection and the criteria for selecting them have their own
implications and importance for the quality of results obtained. Further, once the data
have been collected, they are assembled, organized, and presented in the form of
appropriate tables to make them readable. Wherever needed, figures, diagrams, charts,
and graphs are also used for better presentation of the data. A useful tabular and
graphic presentation of data will require that the raw data be properly classified in
accordance with the objectives of investigation and the relational analysis to be carried
out.
A well thought-out and sharp data classification facilitates easy description of the hidden
data characteristics by means of a variety of summary measures. These include
measures of central tendency, dispersion, skewness and kurtosis, which constitute the
essential scope of descriptive statistics. These form a large part of the subject matter of
any basic textbook on the subject, and thus they are being discussed in that order here
as well.
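The summary measures named above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the helper name describe and the daily_sales figures are hypothetical, and skewness and kurtosis are computed here as moment ratios with divide-by-n moments, which is one common convention among several.

```python
import statistics


def describe(values):
    """Return common summary measures for a list of numbers (hypothetical helper)."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments using the divide-by-n (population) convention.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,                     # central tendency
        "median": statistics.median(values),
        "std_dev": m2 ** 0.5,             # dispersion
        "skewness": m3 / m2 ** 1.5,       # asymmetry of the distribution
        "kurtosis": m4 / m2 ** 2,         # peakedness / tail weight
    }


# Hypothetical data: daily sales (units) of a small shop, with one unusual day.
daily_sales = [12, 15, 11, 14, 13, 40, 12, 13]
summary = describe(daily_sales)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this made-up data set the single large value pulls the mean above the median and gives a positive skewness, which is the kind of hidden characteristic such measures are meant to bring out.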
Inferential Statistics: Inferential statistics, also known as inductive statistics, goes
beyond describing a given problem situation by means of collecting, summarizing, and
meaningfully presenting the related data. Instead, it consists of methods that are used
for drawing inferences, or making broad generalizations, about a totality of observations
on the basis of knowledge about a part of that totality. The totality of observations about
which an inference may be drawn, or a generalization made, is called a population or a
universe. The part of totality, which is observed for data collection and analysis to gain
knowledge about the population, is called a sample.
The desired information about a given population of interest may also be collected
by observing all the units comprising the population. This total coverage is called
a census. Getting the desired value for the population through a census is not always
feasible and practical for various reasons. Apart from time and money considerations
making the census operations prohibitive, observing each individual unit of the
population with reference to any data characteristic may at times involve even
destructive testing. In such cases, obviously, the only recourse available is to employ
the partial or incomplete information gathered through a sample for the purpose. This is
precisely what inferential statistics does. Thus, obtaining a particular value from the
sample information and using it for drawing an inference about the entire population
underlies the subject matter of inferential statistics. Consider a situation in which one is
required to know the average body weight of all the college students in a given
cosmopolitan city during a certain year. A quick and easy way to do this is to record the
weight of only 500 students, from out of a total strength of, say, 10000, or an unknown
total strength, take the average, and use this average based on incomplete weight data
to represent the average body weight of all the college students. In a different situation,
one may have to repeat this exercise for some future year and use the quick estimate of
average body weight for a comparison. This may be needed, for example, to decide
whether the weight of the college students has undergone a significant change over the
years compared.
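The body-weight example can be sketched in Python as follows. This is a minimal sketch, not real data: the population of 10,000 weights is simulated from an assumed normal distribution (mean 60 kg, standard deviation 8 kg), so the population mean is known here only because the data are artificial.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: body weights (kg) of 10,000 college students.
population = [random.gauss(60, 8) for _ in range(10_000)]

# Simple random sample of 500 students, drawn without replacement.
sample = random.sample(population, 500)

sample_mean = statistics.fmean(sample)
# Known only because the population is simulated; in practice it is unknown.
population_mean = statistics.fmean(population)

# Standard error of the sample mean (finite-population correction ignored).
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"sample mean     : {sample_mean:.2f} kg")
print(f"population mean : {population_mean:.2f} kg")
print(f"standard error  : {standard_error:.2f} kg")
```

The standard error gives a rough idea of how far the sample average is likely to be from the true average, which is what a comparison between two years would have to take into account.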
Inferential statistics helps to evaluate the risks involved in reaching inferences or
generalizations about an unknown population on the basis of sample information. For
example, an inspection of a sample of five battery cells drawn from a given lot may
reveal that all the five cells are in perfectly good condition. This information may be
used to conclude that the entire lot is good enough to buy.
Since this inference is based on the examination of a sample of a limited number of cells,
it is quite possible that not all the cells in the lot are in order. It is also possible that
all the items included in the sample are unsatisfactory. This may be used to
conclude that the entire lot is of unsatisfactory quality, whereas the fact may indeed be
otherwise. It may, thus, be noticed that there is always a risk of an inference about a
population being incorrect when based on the knowledge of a limited sample.
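The risk in the battery-cell example can be put into numbers with a small Python sketch. The lot size of 100 and the numbers of defective cells are assumptions chosen for illustration, and prob_all_good is a hypothetical helper; it computes the probability that a sample drawn without replacement contains no defective cell.

```python
from math import comb


def prob_all_good(lot_size, defective, sample_size):
    """Probability that a random sample drawn without replacement has no defective item."""
    good = lot_size - defective
    # Ways to choose the sample from good items only, over all possible samples.
    return comb(good, sample_size) / comb(lot_size, sample_size)


# Hypothetical lot of 100 cells; the number of defective cells is assumed.
for defective in (0, 5, 10, 20, 50):
    p = prob_all_good(100, defective, 5)
    print(f"{defective:>2} defective cells -> P(all 5 sampled cells good) = {p:.3f}")
```

Under these assumptions, even a lot with 20 defective cells in 100 yields five good cells in the sample roughly one time in three, so a clean sample alone does not guarantee a good lot.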
The rescue in such situations lies in evaluating such risks. For this, statistics provides
the necessary methods. These centre on quantifying, in probabilistic terms, the chances
of decisions taken on the basis of sample information being incorrect. This requires an
understanding of the what, why, and how of probability and probability distributions to
equip ourselves with methods of drawing statistical inferences and estimating the
degree of reliability of these inferences.
1.5 Definition of Business statistics
Business Statistics is the science of good decision making in the face of uncertainty and
is used in many disciplines such as financial analysis, econometrics, auditing,
production and operations including services improvement, and marketing research.
1.6 Definition of Statistics in the plural sense
In the plural sense, statistics are defined as statistical data. Statistical data are
numerical data with the following characteristics:
1. Statistics are aggregates of facts: Single and isolated figures are not statistics
because they are unrelated and cannot be compared. For example, a solitary
road accident cannot reveal anything unless such information is collected for a
period and for different roads.
2. Statistics are affected to a marked extent by multiplicity of causes: Facts and
figures are affected to a large extent by a number of forces operating on them. For
example, condition of a crop is affected by a number of factors like soil condition,
use of fertilizers, rainfall, methods of cultivation, etc.
3. Statistics are numerically expressed: All statistics are expressed in numbers.
Qualitative statements do not constitute a statistical statement. Qualitative
characteristics or attributes such as intelligence, beauty, etc., cannot be included
in statistics unless they are quantified by assigning certain score as a quantitative
measure of assessment.
4. Statistics should be enumerated or estimated: Data may be obtained by
counting or measurement or it may be estimated statistically when enumeration is
not feasible or involves excessive and high cost. For example, the general quality
of a product is estimated by experimental test on small samples drawn from a
population.
5. Statistics should be collected with reasonable accuracy: Data should be collected
with a reasonable standard of accuracy.
6. Statistics are collected in a systematic manner: Data collected in a haphazard
manner are very likely to lead to wrong conclusions.
7. Statistics are collected for a pre-determined purpose: The purpose should be
well defined and specific. For example, data on the physical personality will be
irrelevant for considering ability of a person for an intellectual work. But, it will be
relevant for selection into military service.
8. Statistics should be placed in relation to each other: The data should be
comparable. For example, the statistics on yield of crop and condition of soil are
related but these figures cannot have any relation with the statistics on the health
of the people.
In the absence of the above characteristics the numerical data cannot be called
statistics. Therefore we can say: “All statistics are numerical statements of facts but all
numerical statements of facts are not statistics.”
1.7 Definition of statistics in the singular sense and different stages of statistical
investigation
In the singular sense, statistics is defined as statistical methods. In other words,
statistics is the science of collection, presentation, analysis and interpretation of
numerical data. There are five stages of a statistical investigation:
1. Collection of data: There are two methods of data collection – census and
sample method.
2. Organization of data: The data collected in raw form is organised. It involves
three steps: editing, classification and tabulation.
3. Presentation of data: The collected and organized data are presented in the form of
diagrams and graphs.
4. Analysis of data: Commonly used methods of analysis are measures of central
tendency, measures of variation, correlation, index numbers etc.
5. Interpretation of data: This means drawing conclusions from the data. It is
a difficult task and requires a high degree of skill and experience.
1.8 Differences between Statistical data and Statistical method
Some differences between Statistical data and Statistical method are given below:
Statistical data | Statistical method
1. It is quantitative. | 1. It is an operational technique.
2. It is often in the raw state. | 2. It helps in processing the raw data.
3. It is descriptive in nature. | 3. It is basically a tool of analysis.
4. It provides material for processing unprocessed data. | 4. The processing is done by the scientific methods of analysis and interpretation.
5. It would not make much sense without the application of the tools of analysis. | 5. Tools of analysis will be idle without facts available for making use of such tools.
6. The choice of tools will depend on the nature of data. | 6. The nature of the data to be collected will depend on the tools brought to be used for processing.
1.9 Functions of statistics
The following are the important functions of statistics:
1. Definiteness: It presents the facts in a precise and definite form. Statements of
facts conveyed in numerical terms are more convincing than vague and general
statements.
2. Condensation: It condenses a mass of data into a few significant figures. By
simplifying a mass of figures it helps in better understanding of the data.
3. Comparison: Data are compared for better understanding. The comparison may
be over time or with other similar data.
4. Formulating and testing of hypotheses: Statistical methods are helpful in
formulating and testing hypotheses and in developing new theories.
5. Prediction: Based on past data and its trend we are able to predict the future. The
predictions help in meeting the requirements of the future in a better way.
6. Formulation of policies: By providing the basic material, statistics helps in the
formulation of appropriate policies.
1.10 Scope/Importance of Statistics
Apart from the methods comprising the scope of descriptive and inferential branches of
statistics, statistics also consists of methods of dealing with a few other issues of
specific nature. Since these methods are essentially descriptive in nature, they have
been discussed here as part of the descriptive statistics.
1.10.1 Importance of statistics in Business and management
The following are some major activities of a typical, large and progressive organization
which would indicate how statistics helps in the efficient discharge of various activities:
1. Marketing: Statistical analyses are frequently used in providing information for
marketing decisions. In the field of marketing, it is necessary to find out what can
be sold and then to evolve a suitable strategy so that goods reach the ultimate
consumer.
2. Production: In the field of production, statistical data and statistical methods play
a very important role. The decision about what to produce, how much to produce,
when to produce, for whom to produce is based largely on facts analyzed
statistically.
3. Finance: The Financial Managers in discharging their finance function efficiently
depend heavily on statistical analysis of facts and figures. Financial forecast,
breakeven analysis and investment decisions under uncertainty are but part of
their activities.
4. Banking: Banking institutions have found it increasingly necessary to establish
departments within their organizations for the purpose of gathering and analyzing
information, not only regarding their own operations, but on general economic
conditions and on every line of business in which they might be directly or
indirectly interested.
5. Investment: Statistics greatly assists investors in making clear and sound
judgments in their investment decisions, helping them select securities which are safe
and which have the best prospects of yielding a good income.
6. Purchase: The purchase department in discharging its functions makes use of
statistical data to frame suitable purchase policies such as from where to buy,
how much to buy, at what time to buy and at what price to buy.
7. Accounting: Statistical methods are also employed in accounting. In particular,
the auditing functions make frequent application of statistical sampling and
estimation procedures, and cost accounting uses regression analysis.
8. Control: The management control process combines statistical and accounting
methods in making the overall budget for the coming year including sales,
material, labor and other costs, and net profits and capital requirements.
9. Credit: The credit department performs statistical analysis to determine how
much credit to extend to various customers.
10. Personnel: The personnel department frames personnel policies based on facts.
It makes statistical studies of wage rates, incentive plans, cost of living, labor
turnover rates, employment trends, accident rates, employee grievances,
performance appraisal, training programs etc.
11. Research and development: Many big organizations have research and
development departments which are primarily concerned with finding out how
existing products can be improved, what new product lines can be added and how
optimal use can be made of research.
1.10.2 Importance of Statistics in Economics
Statistics help in the field of economics in the following ways:
1. Allocation of resources: As the resources are limited and wants unlimited we
have to decide what goods shall be produced and in what quantity. For deciding
this we require information related to the demand for different goods. Statistics
provide this information.
2. Choice of technology: There is usually more than one way of producing a good.
For choosing the appropriate technology we require information about the
relative cost and output levels of the alternatives. Statistics provide this information.
3. Distribution of output: Under this we have to decide how the goods are going to
be distributed to different income groups. We desire that the goods should be
made available to maximum number of persons. Statistics help by providing
information relating to the food habits, nutritive capacities and the availability of
the products.
4. Formulation of economic policies: Statistics also helps in solving the economic
problems like poverty, unemployment and price rise by providing necessary data
for the formulation of economic policies.
1.10.3 Statistics in the economic planning
Economic planning may be defined as the scheme of coordinated action to achieve
certain economic goals such as the rapid economic growth, removal of poverty and
unemployment etc. Statistics help in economic planning in the following ways:
1. It helps in setting up of targets by providing data relating to extent of a problem
and resources available for it.
2. With the help of statistics resources are allocated to different sectors for their
development.
3. It helps in formation of the ‘blueprint’ or the detailed plan for the implementation
of the planning strategy.
4. It helps in identifying factors that are retarding the growth of the economy.
5. It helps in the evaluation of planning by collecting information about the performance
of the plan and showing how far the different objectives have been fulfilled.
6. It makes recommendations for future plans, so that the mistakes of the
present may not be repeated in future.
1.11 Limitations of statistics
The following are the important limitations of statistics:
1. Statistics does not deal with individuals. Since statistics are aggregates of facts,
the study of an individual lies outside its scope.
2. Statistics deals with only quantitative characteristics. The qualitative
characteristics such as honesty, efficiency, intelligence cannot be studied
directly.
3. Statistical results are true only on an average. The conclusions obtained
statistically are not universally true; they are true only under certain conditions.
4. Statistics may be misused. As the statistical facts are convincing some people
misuse them for their benefit. The statistics may be manipulated in the following
ways:
a) Incomplete information may be provided e.g. for determining profit
information relating to difference in investment is withheld.
b) Changing the definition: Sometimes data is manipulated by changing the
definition e.g. definition of poverty or literacy may be changed to show better
performance.
c) Showing one aspect of data only e.g. we may show better performance on
the literacy front if we are looking at the percentage changes. If on the other
hand we are also looking at the number of illiterates we may find their number
increasing because of the increase in population.
d) Sometimes data is manipulated to justify foregone conclusions.
e) The data may be collected in a haphazard way. The sample may not be
representative of the universe. Wrong conclusions may be drawn because of
faulty reasoning.
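The literacy example in point (c) can be checked with simple arithmetic. The sketch below uses made-up figures for two census years (they are assumptions, not real data) and a hypothetical helper named illiterates; it shows that the literacy rate can rise while the number of illiterates also rises.

```python
def illiterates(population, literacy_rate):
    """Number of illiterate people implied by a population and a literacy rate."""
    return round(population * (1 - literacy_rate))


# Hypothetical figures for two census years (not real data).
earlier = {"population": 100_000_000, "literacy_rate": 0.40}
later = {"population": 150_000_000, "literacy_rate": 0.50}

before = illiterates(**earlier)
after = illiterates(**later)

print(f"literacy rate: {earlier['literacy_rate']:.0%} -> {later['literacy_rate']:.0%}")
print(f"illiterates  : {before:,} -> {after:,}")
```

With these assumed figures the literacy rate improves by ten percentage points, yet the number of illiterates grows by 15 million because the population has grown, so reporting only one of the two figures gives a one-sided picture.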
1.12 Distrust of statistics
By distrust of statistics we mean lack of confidence in statistics. The distrust arises
because of misuse of statistics. As the statistical facts are convincing some people
misuse them for their benefit. The statistics may be manipulated in the following ways:
a) Incomplete information may be provided e.g. for determining profit information
relating to difference in investment is withheld.
b) Changing the definition. Sometimes data is manipulated by changing the definition
e.g. definition of poverty or literacy may be changed to show better performance.
c) Showing one aspect of data only e.g. we may show better performance on the
literacy front if we are looking at the percentage changes. If on the other hand we
are also looking at the number of illiterates we may find their number increasing
because of the increase in population.
d) Sometimes data is manipulated to justify foregone conclusions.
e) The data may be collected in a haphazard way. The sample may not be
representative of the universe. Wrong conclusions may be drawn because of
faulty reasoning.
Exercise-1
Q.1: (a) Define Statistics and Business Statistics.
(b) Discuss different types of Statistics.
Q.2: (a) Define Statistics in the plural sense.
(b) Define Statistics in the singular sense. What are the different stages of
statistical investigation?
(c) Differentiate between Statistical data and Statistical method.
Q.3: What are the different functions of Statistics?
Q.4: What is the importance of Statistics in Business and management?
Q.5: What is the importance of Statistics in Economics?
Q.6: What is economic planning? Discuss the role of Statistics in the economic
planning.
Q.7: What are the limitations of Statistics?
Q.8: What do you understand by distrust of Statistics? Why does this distrust arise?
Q.9: Define statistics. State its limitations in decision making of a business concern.
(NU 2010)
Q.10: What is business statistics? Explain some important functions of statistics. (NU
2011)
Q.11: Define business statistics. Discuss its applications and limitations in the
management of business enterprise. (NU 2012)
Q.12: What is statistics? How do you think the knowledge of statistics is essential in
management decision? Illustrate your answer with examples. (NU 2013)
Q.13: What is statistics? Discuss the importance of statistics in business decision
making. (NU 2014)
Q.14: Define Statistics and statistical methods. Explain the uses of statistical methods
in modern business organizations. (NU 2015)
Q.15: Critically examine the following statements:
(a) “Statistics can prove anything.”
(b) “Statistics can prove nothing.”
(c) “There are three degrees of lies – lies, damned lies and statistics.”
(d) “Figures would not lie but liars use figures to lie.”
Q.16: “Statistical methods are most dangerous tools in the hands of the inexpert.”
Discuss and explain the limitations of statistics.
Q.17: “Statistics are like clay of which you can make a God or Devil, as you please.”
Comment and give your opinion.