Statistical Data and Data Summary
The term statistics is generally used to mean numerical facts and figures such as agricultural
production during a year, the rate of inflation and so on. However, as a subject of study, statistics refers to
the body of principles and procedures developed for the collection, classification, summarization and
interpretation of numerical data and for the use of such data. Statistics is a set of decision-making
techniques which aid businessmen in drawing inferences from the available data.
MEANING OF STATISTICS
The term statistics refers to a science in which we deal with the techniques or methods for
collecting, classifying, presenting, analysing and interpreting data.
Definition:
In simple words statistics is the study and manipulation of given data. It deals with the
analysis and computation of given numerical data.
Statistics is a branch of mathematics that deals with the collection, review, and analysis of
data. It draws conclusions from data with the use of quantified models.
Statistical analysis is a process of collecting and evaluating data and summarizing it into
mathematical form.
Statistics can be defined as the study of the collection, analysis, interpretation, presentation,
and organization of data. In simple words, it is a mathematical tool that is used to collect and
summarize data.
Uncertainty and fluctuation in different fields and parameters can be determined only
through statistical analysis. These uncertainties are measured by probability, which plays a
very important role in statistics.
FEATURES OF STATISTICS:-
1. Collection of Data
2. Organisation of Data
3. Presentation of Data
4. Analysis of Data
5. Interpretation of Data
1. Collection of Data:- Collection of relevant data concerning a problem is the
first step in the statistical method. Depending upon the problem under study, it is
decided how, when, where and what kind of data are to be collected.
2. Organisation of Data:- The second step is to organise the collected data. With
a view to rendering the collected data more comparable and simple, it is
classified on the basis of time, place, quality, etc.
5. Interpretation of Data:- Conclusions are drawn after analysing the data. Two
or more kinds of data are compared and conclusions drawn.
Scope of Statistics:
Statistics can be used in many major fields such as psychology, geology, sociology, weather
forecasting, probability, and much more. The main purpose of statistics is to learn by analysis of data.
Because it focuses on applications, it is distinctively considered a mathematical science.
1. NATURE OF STATISTICS
Statistics is both a science and an art. As a science, statistics studies numerical data in a systematic manner, and as
an art, it makes use of data to solve the problems of real life.
2. SUBJECT MATTER OF STATISTICS
04 Homogeneity of Data
To compare data, it is essential that whatever statistics are collected
are uniform in quality. Data of different qualities and kinds cannot be compared.
05 Results are true only on an average
Laws of statistics are true only on an average. They express tendencies. They are not
valid always and under all conditions.
The finance minister makes use of data relating to revenue and expenditure while preparing the budget.
Each government formulates its policies concerning family planning, establishment of new
industries, etc. on the basis of data. The government assesses and measures the efficiency
of its different departments such as health, education, etc. Statistics prove helpful in taking decisions
with regard to the defence of the country, armed forces, etc.
A successful businessman estimates demand for and supply of the commodity on the basis of
relevant data. A businessman needs to know the nature and location of demand for the
goods he deals in, the likely future policy of the government, and the possibility
of changes in price.
3. Importance in political fields
Politicians play an important role in designing the economic, social and educational policies of
the country. It is essential for politicians to be aware of statistical data. The
party in power can give wide publicity to the achievements of its government on the
basis of statistical data.
Insurance companies determine the rate of insurance premium on the basis of statistics
relating to average expectancy of life in the country. Expectancy of life is calculated on the
basis of Life Tables. These tables depend on the theory of probability.
5. Quality Testing:
Statistical samples are used to test the quality of all the products a company produces.
6. Astronomy:
Statistical methods help scientists to measure the size, distance, etc. of the objects in
the universe.
7. Banking:
Banks hold deposit accounts for customers' money. At the same time, banks
have loan accounts to lend money to customers in order to earn
profit from it. For this purpose, a statistical approach is used to compare deposits with
the loans requested.
8. Science:
Statistical methods are used in all fields of science.
9. Weather Forecasting: Statistical concepts are used to compare the previous weather with
the current weather so as to predict the upcoming weather.
10. Mathematics:
Statistical methods like dispersion and probability are used to get more exact information.
11. Business:
Various statistical tools are used to make quick decisions regarding the quality of the product,
preferences of the customers, the target of the market etc.
12. Economics:
Economics depends heavily on statistics because statistical methods are used to calculate
various aspects of a country's economy such as employment and inflation. Exports and imports can
also be analysed through statistics.
13. Medical:
Using statistics, the effectiveness of any drug can be analysed. A drug can be prescribed only
after analysing it through statistics.
IMPORTANT QUESTIONS
i) Primary data
ii) Secondary data.
PRIMARY DATA:
The data which are collected directly from the units or individual respondents for the purpose
of a certain study or information are known as primary data. For instance, an enquiry is made
of each taxpayer in a city to obtain their opinion about the tax collecting machinery. The
data obtained in a study by the investigator are termed primary data. If an experiment is
conducted to know the effect of certain fertilizer doses on the yield or the effect of a drug on
patients, the observations taken on each plot or patient constitute the primary data.
Primary data are the information collected by a researcher or investigator for the purpose of
the enquiry for the first time. The following are the methods by which primary data
can be collected.
i) Direct personal investigation
ii) Indirect oral investigation
iii) Questionnaires and schedules
Direct personal investigation: In this method the investigator meets the person directly and
collects the data personally.
As the name says, the investigator himself goes to the field, meets the respondents and gets the
required information. Here investigator personally interviews the respondent either directly or through
phone or through electronic media. This method is suitable when the scope of investigation is small
and greater accuracy is needed.
In the present age of communication explosion, telephones and mobile phones are extensively used to
collect data from the respondents. This saves the cost and time of collecting the data with a good
amount of accuracy.
Suitability
This method is suitable particularly when:
a) the field of investigation is limited;
b) a greater degree of originality of the data is required;
c) information is to be kept secret;
d) the investigation needs a lot of expertise, care and devotion.
Merits
Originality:- Data have a high degree of originality according to this method.
Accuracy:- Data are fairly accurate when personally collected.
Reliable:- Because the information is collected by the investigator himself, the reliability of the data is not doubted.
Uniformity:- There is a fair degree of uniformity in the data collected by the investigator himself from the
informants. Comparison becomes easy because of uniformity of data.
Other Information:- In direct contact with the informants, the investigator may obtain any other related
information as well.
Flexible:- This method is fairly flexible because the investigator can always make necessary adjustments in his set
of questions.
Demerits
Not proper for wide areas:- Direct personal investigation becomes very difficult when the area of the study is
very wide.
Personal Bias:- This method is highly prone to the personal bias of the investigator. As a result, the data may
lose their credibility.
Costly:- This method is very expensive in terms of the time, money and effort involved.
Wrong Conclusions:- In this method, the area of investigation is generally small. The results are, therefore, less
representative. This may lead to wrong conclusions.
Indirect oral investigation: The indirect method is used in cases where it is delicate
or difficult to get the information from the respondents due to unwillingness or indifference. The
information about the respondent is collected by interviewing a third party who knows the
respondent well. Instances of this type of data collection include information on addiction,
marriage proposals, economic status, witnesses in court, criminal proceedings, etc. The shortcoming
of this method is the genuineness and accuracy of the information, as it depends completely on the
third party.
In this method the investigator appoints local agents or correspondents in different places. They
collect the information on behalf of the investigator in their locality and transmit the data to the
investigator or headquarters. This method is adopted by newspapers, government agencies and
trading concerns. This method is less accurate but quick and more expensive.
For example, if a case of murder is to be investigated, it would be quite impossible to know the facts
by directly contacting the persons who are involved in it. In such a case information is to be obtained
from third persons such as friends, neighbours, etc.
Suitability
This method is suitable particularly when:
a) the field of investigation is large;
b) it is not possible to have direct contact with the concerned informants;
c) the concerned informants are not capable of giving information because of their ignorance.
Enquiry committees and commissions appointed by the Government generally adopt this method.
Merits
Wider Area:- This method can be applied even when the field of investigation is very wide.
Less costly:- This is a relatively less costly method.
Expert Opinion:- Using this method an investigator can seek opinion of the experts and thereby make his
information more reliable.
Free from Bias:- This method is relatively free from the personal bias of the investigator.
Simple:- This is relatively a simple method of data collection.
Demerits
Less Accurate:- The data collected by this method are relatively less accurate. This is because the information is
obtained from persons other than the concerned informants.
Wrong Conclusions:- This method may lead to doubtful conclusions due to ignorance and carelessness of the
witness.
Questionnaires and Schedules: A questionnaire contains a sequence of questions relevant
to the study arranged in a logical order. Preparing a questionnaire is a very interesting and challenging
job and requires good experience and skill. Questionnaires include open-ended questions and closed-ended
questions. Open-ended questions allow the respondent considerable freedom in answering; however, such questions are
answered in detail. Closed-ended questions have to be answered by the respondent by choosing an
answer from the set of answers given under a question, just by ticking.
Before starting the investigation, a question sheet is prepared which is called a schedule. The schedule
contains all the questions which would extract complete information from a respondent. The order
of the questions, the language of the questions and the arrangement of the parts of the schedule are not
changed. However, the investigator can explain the questions if the respondent faces any difficulty. The schedule
contains direct questions as well as questions in tabular form.
Suitability
a) the area of the study is very wide;
b) the informants are educated.
Merits
Economical:- This method is economical in terms of the time, money and effort involved.
Originality:- This method is original and, therefore, fairly reliable. This is because the information is
supplied by the concerned persons themselves.
Wider Area:- This method can cover wider areas.
Demerits
Lack of Interest:- Generally, the informants do not take interest in questionnaires and fail to return the
questionnaires. Those who return them often send incomplete answers.
Lack of Flexibility:- This method lacks flexibility in the sense that when questions are not properly
answered, they cannot be changed to obtain the required information.
Limited Use:- This method has limited use in that questionnaires are answered only by the educated
informants.
Biased:- If the informants are biased, the information will also be biased.
Less Accuracy:- The conclusions based on such investigations have only limited accuracy. This is
because some questions may be difficult and accurate answers may not be possible.
Disadvantages of primary data:
• There are numerous hassles involved in the collection of primary data like taking
a decision such as how, when, what and why to collect.
• The cost involved in the collection of data is very high.
• The collection of primary data is more time consuming.
SECONDARY DATA:
Data collected by certain people or agencies and obtained through various published or unpublished sources
are known as secondary data. When information contained in such records is used again,
processed and statistically analysed to extract information for some other purpose, it is termed
secondary data. Such data are cheaper and more quickly obtainable than primary data and
may also be available when primary data cannot be obtained at all.
Q.3] Write a brief note on the methods of collecting primary data, with their merits and demerits.
Classification and Tabulation
INTRODUCTION:
After collecting the desired data, the first step to be taken is to classify and tabulate the data. In order to
make the data simple and easily understandable, they are simplified in such a way that irrelevant data are
removed and their significant features stand out prominently. The procedure adopted for this
purpose is known as the method of classification and tabulation. Classification and tabulation provide a
clear picture of the collected data, and on that basis the further processing is decided. The
techniques of classification and tabulation help to arrange the mass of collected data
in a logical and summarized manner. However, it is still a difficult and cumbersome task for the common man and
the researcher to interpret the data. Too many figures are often confusing and may fail to convey the message
effectively to those for whom it is meant.
To overcome this inconvenience, the most appealing way in which statistical results may be presented is
through diagrams and graphs. A diagram is a visual form for presentation of statistical data, highlighting
their basic facts and relationships. If we draw diagrams on the basis of the data collected they will be
understood and appreciated by all. Every day we can find the presentation of stock market, cricket score
etc. in newspaper, television and magazines in the form of diagrams and graphs.
In this chapter we will discuss classification, tabulation and some of the major types of diagrams, graphs
and maps frequently used in presenting statistical data.
CLASSIFICATION OF DATA:
Classification is the grouping of related facts into different classes. Facts in one
class differ from those in another class with respect to some characteristic,
called the basis of classification.
Classification of statistical data is comparable to the sorting operation. The
process of classification gives prominence to the important information gathered
while dropping unnecessary details, facilitates comparison and enables a
statistical treatment of the material collected.
Geographical classification
• In geographical classification, data are classified on the basis of geographical or locational differences between the
various items. For example, when we present the production of sugarcane, wheat, rice, etc. for various States, this
would be called geographical classification. Geographical classifications are usually listed in alphabetical order
for easy reference. Items may also be listed by size to emphasize the important areas, as in ranking the states
by population.
Chronological classification
• When data are observed over a period of time, the classification is known as chronological classification.
Time series are usually listed in chronological order, normally starting with the earliest time period.
Qualitative classification
[Diagram: qualitative classification of a population]
The process of preparing this type of distribution is very simple. We have just to count
the number of times a particular value is repeated, which is called the frequency of that
class. In order to facilitate counting, prepare a column of tallies. In another column, place
all possible values of the variable from the lowest to the highest.
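The counting procedure just described can be sketched in a few lines of Python. This is only an illustration: the observations are hypothetical (they do not come from these notes), and the helper names `tally_marks` and `frequency_table` are made up for the example.

```python
from collections import Counter

# Hypothetical observations: number of children in 20 families (illustrative only).
observations = [2, 1, 3, 2, 0, 1, 2, 4, 2, 1, 3, 2, 1, 0, 2, 3, 1, 2, 2, 1]

def tally_marks(count):
    """Render a count as tally marks in groups of five, e.g. 7 -> '||||/ ||'."""
    groups, remainder = divmod(count, 5)
    parts = ["||||/"] * groups
    if remainder:
        parts.append("|" * remainder)
    return " ".join(parts)

def frequency_table(values):
    """Return (value, frequency) rows for every value from lowest to highest."""
    counts = Counter(values)
    return [(v, counts.get(v, 0)) for v in range(min(values), max(values) + 1)]

table = frequency_table(observations)
print(f"{'Value':>5}  {'Tally':<12} {'Frequency':>9}")
for value, freq in table:
    print(f"{value:>5}  {tally_marks(freq):<12} {freq:>9}")
```

The value column runs from the lowest to the highest observation, and the frequencies always add up to the number of observations, which is a quick check that nothing was missed while tallying.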
In order to make the series more compact so that its characteristics can be easily
studied, data may be classified according to class intervals.
There are two methods of classifying the data according to class intervals.
(1)Exclusive Method
(2)Inclusive Method
Exclusive Method
When the class intervals are so fixed that the upper limit of one class is the lower limit of the next class,
it is known as the Exclusive method of classification. It is clear that the Exclusive method ensures
continuity of data inasmuch as the upper limit of one class is the lower limit of the next class.
Incomes (in Rs.)    No. of Employees
4000-4500           10
4500-5000           140
5000-5500           60
5500-6000           110
6000-6500           80
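A minimal Python sketch of the exclusive method, assuming equal class widths and assuming the usual convention that a value equal to an upper limit is counted in the next class. The incomes are hypothetical, and the function name `exclusive_frequencies` is invented for this example.

```python
def exclusive_frequencies(values, lower, width, n_classes):
    """Count values into classes [lower, lower+width), [lower+width, ...), ...

    A value equal to a class's upper limit is counted in the NEXT class.
    Values outside the overall range are ignored.
    """
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts

# Hypothetical incomes (in Rs.); 4500 and 5000 sit exactly on class boundaries.
incomes = [4000, 4250, 4500, 4999, 5000, 5499.5, 6499]
classes = [(4000 + 500 * i, 4500 + 500 * i) for i in range(5)]
counts = exclusive_frequencies(incomes, lower=4000, width=500, n_classes=5)
for (low, high), c in zip(classes, counts):
    print(f"{low}-{high}: {c}")
```

Because the classes share their limits, every value in the overall range belongs to exactly one class, including fractional values such as 5499.5.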
Inclusive Method
Under the inclusive method of classification the upper limit of one class is included in that class itself.
Incomes (in Rs.)    No. of Employees
4000-4999           10
5000-5999           140
6000-6999           60
7000-7999           110
8000-8999           80
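The inclusive method can be sketched in the same way. The class limits below are the ones in the table; the incomes and the function name `inclusive_frequencies` are hypothetical and serve only as an illustration.

```python
def inclusive_frequencies(values, classes):
    """Count each value into the class whose lower AND upper limits include it."""
    counts = [0] * len(classes)
    for v in values:
        for i, (lower, upper) in enumerate(classes):
            if lower <= v <= upper:   # the upper limit belongs to this class
                counts[i] += 1
                break
    return counts

classes = [(4000, 4999), (5000, 5999), (6000, 6999), (7000, 7999), (8000, 8999)]
incomes = [4000, 4999, 5000, 5999, 6000, 7500, 8999]  # hypothetical whole-rupee incomes
counts = inclusive_frequencies(incomes, classes)
for (lower, upper), c in zip(classes, counts):
    print(f"{lower}-{upper}: {c}")
```

Note that a fractional value such as 4999.5 would fall in the gap between two classes and be left uncounted by this sketch, which is why the inclusive method fits whole-number data best.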
Tabulation of Data
One of the simplest and most revealing devices for summarising data is the
statistical table. A table is a systematic arrangement of statistical data in columns
and rows. Rows are horizontal arrangements, whereas columns are vertical ones.
The purpose of a table is to simplify the presentation and to facilitate comparisons.
• Example: - Suppose we conduct a survey of 100 people asking about their favorite fruit
• What is your favorite fruit?
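A survey like this can be summarised in a table of one variable (fruit) with its frequencies. The sketch below shows one possible way to build it in Python. The 100 responses are hypothetical counts chosen for illustration, not results of an actual survey, and `one_way_table` is a made-up helper name.

```python
from collections import Counter

# Hypothetical responses from 100 people (illustrative counts, not survey results).
responses = ["Mango"] * 40 + ["Apple"] * 25 + ["Banana"] * 20 + ["Orange"] * 15

def one_way_table(answers):
    """Return rows of (category, frequency, percentage), most frequent first."""
    counts = Counter(answers)
    total = len(answers)
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]

rows = one_way_table(responses)
print(f"{'Fruit':<10}{'Frequency':>10}{'Percent':>10}")
for fruit, n, pct in rows:
    print(f"{fruit:<10}{n:>10}{pct:>9.1f}%")
print(f"{'Total':<10}{len(responses):>10}{100.0:>9.1f}%")
```

Each row is one category, and the total row confirms that every response has been counted once.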
Example: - To analyze how two categorical variables interact: branch of engineering and
preferred internship domain
• Two Way Tabulation: - Importance
This two-way tabulation is useful for career services teams to plan internship opportunities, for faculty to
understand student interests, and for students to benchmark their choices.
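A two-way table of branch against preferred internship domain could be built as in the following Python sketch. The student records are hypothetical and the helper name `two_way_table` is invented here, so treat it as an illustration under those assumptions rather than as the data behind the example.

```python
from collections import Counter

# Hypothetical (branch, preferred internship domain) pairs; not real survey data.
students = [
    ("Computer", "Software"), ("Computer", "Software"), ("Computer", "Data Science"),
    ("Mechanical", "Manufacturing"), ("Mechanical", "Software"),
    ("Civil", "Construction"), ("Civil", "Construction"), ("Civil", "Data Science"),
]

def two_way_table(pairs):
    """Cross-tabulate pairs into (row_labels, column_labels, cell_counts)."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return rows, cols, counts

rows, cols, counts = two_way_table(students)
w = 15  # column width for printing
print("Branch".ljust(w) + "".join(c.rjust(w) for c in cols) + "Total".rjust(w))
for r in rows:
    cells = [counts[(r, c)] for c in cols]
    print(r.ljust(w) + "".join(str(n).rjust(w) for n in cells) + str(sum(cells)).rjust(w))
col_totals = [sum(counts[(r, c)] for r in rows) for c in cols]
print("Total".ljust(w) + "".join(str(n).rjust(w) for n in col_totals) + str(len(students)).rjust(w))
```

Rows hold one variable and columns the other; the row and column totals give back the two one-variable tables, and both sets of totals add up to the number of students.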
Graphical presentations are very simple for even a common person to understand. They are a popular method of
presentation of data. With the help of graphs, two or more sets of data can be easily compared and analysed.
The trend of the data also can be seen from the graph.
A graph is drawn in a plane with two reference lines called the X-axis (horizontal) and the Y-axis (vertical). The
axes are perpendicular to each other and their point of intersection is called Origin. Every point in the plane is
identified by two coordinates (x, y). The first coordinate (x) represents the value of the variable on the X-axis
and the second coordinate (y) represents the value of the variable on the Y-axis.
A proper scale of measurement should be chosen to accommodate the complete data on the graph. If needed,
the origin can be shifted from (0, 0) to any other required value. Such a process is called shifting of origin.
Bar diagrams are the most common type of diagram. They are called one-dimensional diagrams because only the length of
the bar matters and not the width. For a large number of observations, lines may be drawn instead of bars to save
space.
Example 2: The following data shows the production of rice for the period 2010 to 2018. Represent the data by a
subdivided bar diagram.
Multiple Bar Diagram: Whenever a comparison between two or more related variables is to be made, a multiple
bar diagram should be preferred. In multiple bar diagrams two or more groups of interrelated data are
presented. The technique of drawing such diagrams is the same as that of the simple bar diagram. The only
difference is that, since more than one component is represented in each group, different shades, colors,
dots or crossings are used to distinguish between the bars of the same group.
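The idea of giving each series its own fill pattern within a group can be sketched as a text diagram in Python. The bars are drawn horizontally here for simplicity, the production figures are hypothetical, and the function name `multiple_bar_lines` is made up for the illustration.

```python
# Hypothetical production figures (in tonnes) for two crops over three years.
data = {
    "2016": {"Rice": 30, "Wheat": 20},
    "2017": {"Rice": 35, "Wheat": 25},
    "2018": {"Rice": 40, "Wheat": 30},
}
fills = {"Rice": "#", "Wheat": "="}   # one fill pattern per series

def multiple_bar_lines(groups, patterns, unit=5):
    """Return text lines with one bar per series inside each group.

    Each fill character stands for `unit` units of the variable.
    """
    lines = []
    for group, series in groups.items():
        lines.append(group)
        for name, value in series.items():
            bar = patterns[name] * (value // unit)
            lines.append(f"  {name:<6}{bar} {value}")
    return lines

lines = multiple_bar_lines(data, fills)
print("\n".join(lines))
```

Bars of the same year sit next to each other, so the two crops can be compared within a year, while the repeated fill pattern lets one crop be followed across years.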