Unit I Basic Statistics

The document provides an overview of basic statistics, including definitions, types of data (qualitative and quantitative), and scales of measurement (nominal, ordinal, interval, and ratio). It emphasizes the importance of statistical methods for data collection, organization, presentation, analysis, and interpretation, as well as the graphical presentation of data through various types of diagrams. Examples and exercises are included to illustrate the application of these concepts in real-world scenarios.

Uploaded by

mihaelk99hl
Copyright
© All Rights Reserved
FSBBA III (SEM V)

Skill Enhancement Course


Data Analysis Techniques
Unit I BASIC STATISTICS
Introduction:
- The views commonly held about statistics are numerous, but often incomplete. It has different meanings to
different people depending largely on its use. For example, (i) for a cricket fan, statistics refers to numerical
information or data relating to the runs scored by a cricketer; (ii) for an environmentalist, statistics refers to
information on the quantity of pollution released into the atmosphere by all types of vehicles in different cities;
(iii) for the census department, statistics consists of information about the birth rate per thousand and the sex
ratio in different states; (iv) for a share broker, statistics is the information on changes in share prices over a
period of time; and so on.
- Croxton and Cowden defined statistics in the singular sense, that is, as statistical method. According to them,
statistics is the science of collection, presentation, analysis, and interpretation of numerical data.
- This definition points out four stages of statistical investigation, to which one more stage, ‘organization of
data’, rightly deserves to be added. Accordingly, statistics is defined as the science of collecting, organizing,
presenting, analyzing, and interpreting numerical data for making better decisions.

Collection → Organization → Presentation → Analysis → Interpretation

- Statistical methods, broadly, fall into the following two categories:


(i) Descriptive statistics (ii) Inferential statistics
- Descriptive statistics includes statistical methods involving the collection, presentation, and characterization of
a set of data in order to describe the various features of that set of data.
- Inferential statistics includes statistical methods which facilitate estimating the characteristics of a population or
making decisions concerning a population on the basis of sample results.
Types of data:
- The collected data are of two types (i) Qualitative data (ii) Quantitative data
- Qualitative Data:
Data classified according to qualitative phenomena that are not capable of quantitative measurement, such as
honesty, beauty, employment, intelligence, occupation, sex, literacy, etc., are termed qualitative data. The
qualitative phenomena under study are known as attributes.
For example,
(i) An attribute that divides the population into two classes, such as presence and absence, male and female,
honest and dishonest, employed and unemployed, or beautiful and not beautiful, is called a dichotomous attribute.

(ii) An attribute that divides the population into more than two classes is called a manifold attribute. For the
attribute “intelligence”, the classes may be, say, genius, very intelligent, average intelligent, below average, and dull.

(iii) A population may also be classified by several attributes at once. Classify the population by sex into two
classes, males and females; classify each of these by smoking into smokers and non-smokers; then classify each of
the resulting four classes by a third attribute, religion, into two classes, Hindu and non-Hindu. Such a
classification is also called manifold.

- Quantitative data:
Data classified on the basis of phenomena that are capable of quantitative measurement, such as age, height,
weight, prices, production, income, expenditure, sales, profits, etc., are called quantitative data. The quantitative
phenomenon under study is known as a variable.
Variables are of two kinds: (i) Continuous variables (ii) Discrete (discontinuous) variables.
(i) Variables that can take all possible values (integral as well as fractional) in a given specified range
are termed continuous variables.
For example, the age of students in a school (Nursery to Higher Secondary) is a continuous variable because age
can take all possible values (it can be measured to any fraction of time: years, months, days, minutes,
seconds, etc.) in a certain range, say, from 3 years to 20 years.
More precisely, a variable is said to be continuous if it is capable of passing from any given value to the next
value by infinitely small gradations.
(ii) On the other hand, variables that cannot take all possible values within a given specified range are
termed discrete (discontinuous) variables.
For example, family size (members in a family), the population of a city, the number of accidents on the road,
and the number of typing mistakes per page are discrete variables.

Scales of measurements:
Data are categorized using different scales of measurement. Each scale has specific properties that determine
which statistical analyses can be applied. Any data set can be classified as belonging to one of the following
four scales of measurement:
1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale

1. Nominal Scale: 1st Level of Measurement


- Nominal Scale, also called the categorical variable scale, is defined as a scale used for labeling variables into
distinct classifications and does not involve a quantitative value or order.
- This scale is the simplest of the four variable measurement scales.
- Arithmetic on these variables is meaningless, as the codes assigned to the options are labels with no numerical value.
- The sequence in which subgroups are listed makes no difference, as there is no relationship among
subgroups.
- A nominal scale with only two categories (e.g. male/female) is called “dichotomous.”
- The gathered data are analyzed using percentages or the mode.
- Nominal scale is often used in research surveys and questionnaires where only variable labels hold
significance. For instance,
i) A customer survey asking, “Which brand of smartphone do you prefer?”
Options: 1- Apple, 2- Samsung, 3- One Plus
ii) What is your gender?
Options: 1- Male, 2- Female
iii) What is your political preference?
Options: 1- Independent, 2- Democrat, 3- Republican
iv) Where do you live?
Options: 1- Suburbs, 2- City, 3- Town
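
The remark that nominal data are analyzed through percentages and the mode can be illustrated with a short sketch. The coding 1 = Apple, 2 = Samsung, 3 = One Plus follows example (i); the ten responses are made-up illustration data, not survey results.

```python
from collections import Counter

# Codes follow example (i); the responses are hypothetical illustration data.
labels = {1: "Apple", 2: "Samsung", 3: "One Plus"}
responses = [1, 2, 2, 3, 2, 1, 2, 3, 1, 2]

counts = Counter(responses)
n = len(responses)

# Percentage share of each category
percentages = {labels[code]: 100 * counts[code] / n for code in labels}

# Mode: the most frequently chosen category
mode_code = counts.most_common(1)[0][0]
mode_label = labels[mode_code]

print(percentages)
print("Mode:", mode_label)
```

Averaging the codes (for instance, obtaining 1.9) would say nothing about brand preference, which is why only counts, percentages, and the mode are used here.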

2. Ordinal Scale: 2nd Level of Measurement


- Ordinal Scale is defined as a categorical variable measurement scale used to simply depict the order of
variables and not the difference between each of the categorical variables.
- These scales are generally used to depict non-mathematical ideas such as frequency, satisfaction, happiness,
a degree of pain etc.
- It is quite straightforward to remember the implementation of this scale as ‘Ordinal’ sounds similar to
‘Order’, which is exactly the purpose of this scale.
- Ordinal Scale maintains descriptive qualities along with an order but has no origin of scale, and thus the
distance between variables can’t be calculated.
- Descriptive qualities indicate tagging properties similar to the nominal scale, in addition to which the ordinal
scale also has a relative position of variables.
- The origin of this scale is absent, so there is no fixed start or “true zero”.
- The gathered data are analyzed using percentages or the median.
- These scales are generally used in market research to gather and evaluate relative feedback about product
satisfaction, changing perceptions with product upgrades, etc. For example, an ordinal scale question such as:
How satisfied are you with our services?
Very Unsatisfied – 1 : Unsatisfied – 2 : Neutral – 3 : Satisfied – 4 : Very Satisfied – 5
Here, the order of variables is of prime importance and so is the labeling. Very unsatisfied will always be
worse than unsatisfied and satisfied will be worse than very satisfied.
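
As a minimal sketch of how the median is used on ordinal data, the code below takes the five-point satisfaction coding from the question above and a set of hypothetical responses. `statistics.median_low` is used so that the result is always one of the actual categories, even when the number of responses is even; that choice is an assumption of this sketch, not something the notes prescribe.

```python
import statistics

# Coding from the satisfaction question; the responses are hypothetical.
labels = {
    1: "Very Unsatisfied",
    2: "Unsatisfied",
    3: "Neutral",
    4: "Satisfied",
    5: "Very Satisfied",
}
responses = [4, 5, 3, 4, 2, 5, 4, 1, 3, 4]

# The median depends only on the order of the categories, not on their spacing
median_code = statistics.median_low(responses)
median_label = labels[median_code]

# Percentage of respondents who are at least "Satisfied"
share_satisfied = 100 * sum(r >= 4 for r in responses) / len(responses)

print("Median category:", median_label)
print("At least satisfied:", share_satisfied, "%")
```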

3. Interval Scale: 3rd Level of Measurement


- Interval Scale is defined as a numerical scale where the order of the variables is known as well as the
difference between these variables.
- Variables that have familiar, constant, and computable differences are classified using the Interval scale.
- Mean, median or mode can be used to calculate the central tendency in this scale.
- The only drawback of this scale is that there is no pre-decided starting point or true zero value.
- Interval scale contains all the properties of ordinal scale, in addition to which it offers a calculation of the
difference between variables.
- The main characteristic of this scale is the equidistant difference between objects. For instance,
- 80 degrees is always higher than 50 degrees and the difference between these two temperatures is the same
as the difference between 70 degrees and 40 degrees.
- Also, the value of 0 is arbitrary and negative values of temperature do exist, which makes
Celsius/Fahrenheit temperature a classic example of interval scale.
- Due to the absence of an absolute zero, one cannot say how many times hotter or colder one temperature is than
another. For example, you cannot say that 80 degrees is twice as hot as 40 degrees or that -20 degrees is half as
cold as -40 degrees.
- All the techniques applicable to nominal and ordinal data analysis are applicable to interval data as well.
Apart from those techniques, analysis methods such as descriptive statistics, correlation, and regression
analysis are used extensively for analyzing interval data.
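
The difference between "differences are meaningful" and "ratios are not" can be checked numerically. The sketch below assumes the temperatures in the example are in degrees Celsius (the notes do not say which unit) and converts them to Fahrenheit with the standard formula; `c_to_f` is a helper name chosen for this illustration.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Equal differences in Celsius remain equal after converting to Fahrenheit
diff_c = (80 - 50, 70 - 40)
diff_f = (c_to_f(80) - c_to_f(50), c_to_f(70) - c_to_f(40))

# A ratio, however, depends on where the arbitrary zero sits
ratio_c = 80 / 40
ratio_f = c_to_f(80) / c_to_f(40)

print(diff_c, diff_f)
print(ratio_c, round(ratio_f, 2))
```

The two differences agree with each other in either unit, but the ratio is 2 in Celsius and about 1.69 in Fahrenheit, so "twice as hot" is not a property of the temperatures themselves.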

4. Ratio Scale: 4th Level of Measurement
- Ratio Scale is defined as a variable measurement scale that not only produces the order of variables but also
makes the difference between variables known along with information on the value of true zero.
- It assumes that the variables have an option for zero, that the difference between adjacent values is the
same, and that there is a specific order between the options.
- Ratio scale accommodates the characteristics of the three other variable measurement scales, i.e. labeling the
variables, the significance of the order of variables and a calculable difference between variables (which are
usually equidistant).
- Because of the existence of a true zero value, the ratio scale doesn’t have negative values. To decide when to
use a ratio scale, the researcher must observe whether the variables have all the characteristics of an interval
scale along with the presence of the absolute zero value.
- Ratio scale provides the most detailed information. Statisticians can calculate the central tendency using
the mean, median, and mode, and measures such as the geometric mean, the harmonic mean, and the coefficient of
variation can also be used on this scale.
- Best examples of ratio scales are weight and height.
The following questions fall under the Ratio Scale category:
i) What is your daughter’s current height?
Less than 5 feet
5 feet 1 inch – 5 feet 5 inches
5 feet 6 inches – 6 feet
More than 6 feet
ii) What is your weight in kilograms?
Less than 50 kilograms
51 – 70 kilograms
71 – 90 kilograms
91 – 110 kilograms
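
The extra measures available on a ratio scale can be sketched with Python's standard `statistics` module. The four weights below are hypothetical, and the coefficient of variation is computed here from the sample standard deviation, which is one common convention rather than the only one.

```python
import statistics

# Hypothetical weights in kilograms (illustration only)
weights = [50, 60, 70, 80]

mean = statistics.mean(weights)
geometric = statistics.geometric_mean(weights)
harmonic = statistics.harmonic_mean(weights)

# Coefficient of variation: sample standard deviation relative to the mean, in percent
cv = 100 * statistics.stdev(weights) / mean

# Ratios are meaningful because 0 kg really means "no weight"
ratio = weights[3] / weights[0]

print(mean, round(geometric, 2), round(harmonic, 2), round(cv, 2), ratio)
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean; the printed figures can be used to check this ordering.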

Summary of Levels of Measurement


Offers:                                          Nominal  Ordinal  Interval  Ratio
The sequence of variables is established         –        Yes      Yes       Yes
Mode                                             Yes      Yes      Yes       Yes
Median                                           –        Yes      Yes       Yes
Mean                                             –        –        Yes       Yes
Difference between variables can be evaluated    –        –        Yes       Yes
Addition and Subtraction of variables            –        –        Yes       Yes
Multiplication and Division of variables         –        –        –         Yes
Absolute zero                                    –        –        –         Yes

Graphical presentation of data:
– One of the important functions of statistics is to present complex and unorganized (raw) data in such a
manner that they would easily be understandable.
– According to King, ‘One of the chief aims of statistical science is to render the meaning of masses of figures
clear and comprehensible at a glance.’
– This is often best accomplished by presenting the data in a pictorial (or graphical) form.
Types of diagrams:
– There are a variety of diagrams used to represent statistical data. Different types of diagrams, used to
describe sets of data, are divided into the following categories:
(1) Dimensional diagrams
– One-dimensional diagrams such as histograms, frequency polygons, frequency curves,
cumulative frequency distributions (ogives), and pie charts.
– Two-dimensional diagrams such as rectangles, squares, or circles.
– Three-dimensional diagrams such as cylinders and cubes.
(2) Pictograms or Ideographs
(3) Cartographs or Statistical maps
One-dimensional diagrams
(A) Histograms (Bar Diagrams): These diagrams are used to graph both ungrouped and grouped data for one
variable or attribute.
Listed below are the various types of histograms:
(i) Simple bar charts (ii) Grouped (or multiple) bar charts (iii) Deviation bar charts
(iv) Subdivided bar charts (v) Paired bar charts (vi) Sliding bar charts
(vii) Relative frequency bar charts (viii) Percentage bar charts

Ex-1 The data on the production of oil seeds in a particular year are presented in the table below.
Oil seed                  Groundnut  Rapeseed  Coconut  Cotton  Soya bean
Yield (million tonnes)    5.8        3.3       1.18     2.2     1
Present the above data using a bar diagram and a pie chart.
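
Before a pie chart can be drawn for Ex-1, each yield has to be turned into a share of the total and then into a sector angle out of 360 degrees. A minimal sketch of that arithmetic, using the yields from the table (the variable names are chosen for this illustration):

```python
# Yields in million tonnes, taken from the Ex-1 table
yields = {
    "Groundnut": 5.8,
    "Rapeseed": 3.3,
    "Coconut": 1.18,
    "Cotton": 2.2,
    "Soya bean": 1.0,
}

total = sum(yields.values())

# For each crop: (percentage of total, sector angle in degrees)
sectors = {crop: (100 * y / total, 360 * y / total) for crop, y in yields.items()}

for crop, (pct, angle) in sectors.items():
    print(f"{crop:<10} {pct:5.1f}%  {angle:6.1f} degrees")
```

The bar diagram needs no such conversion: the bar heights are the yields themselves.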

Ex-2 The following data represent the estimated gross area under different cereal crops during a particular year.
Crop                        Paddy  Wheat  Jowar  Bajra  Ragi  Barley  Small millets  Maize
Gross area (‘000 hectares)  34321  18287  22381  15869  2656  4422    6258           6749
Draw a suitable chart to represent the data.

Ex-3 The data on fund flow (Rs in crore) of an International Airport Authority during financial years 2020–21 to
2022–23 are given below:
                      2020-21  2021-22  2022-23
Non-traffic revenue   40.2     50.25    70.26
Traffic revenue       70.25    80.37    100.23
Expenditure           70.2     79.97    89.93
Profit before tax     40.25    50.65    80.56
Present the above data using a multiple bar diagram to compare the three years.
EX-4 The following data represent the gross income, expenditure (in Rs lakh), and net profit (in Rs lakh) during the
years 1999 to 2002.
1999-2000 2000-2001 2001-2002
Gross Income 570 592 652
Gross expenditure 510 560 610
Net Income 60 32632 22
Construct a diagram or chart you prefer to use here.

Ex-5 The following data represent the outlays (Rs crore) by heads of development. Represent the data by a suitable
diagram. (Subdivided bar, percentage bar and multiple bar)
Heads of Development Centre States
Agriculture 4765 7039
Irrigation and Flood control 6635 11395
Energy 9995 8295
Industry and Minerals 12770 2985
Transport and Communication 12200 5120
Social Service 8216 1420
Total 54581 36252
Ex-6 The following data indicate the rupee sales (in 1000’s) of three products according to region. Represent the data
by a suitable diagram. (Subdivided bar, percentage subdivided bar and multiple bar)
Sales (in ‘000 Rs.) by region
Product   North  South  East
A         70     75     90
B         90     60     100
C         50     60     40
Ex-7 Represent the data by a percentage subdivided bar diagram.
Items of Expenditure   Family A (Income Rs. 500)   Family B (Income Rs. 300)
Food                   150                         150
Clothing               125                         60
Education              25                          50
Miscellaneous          190                         70
Saving or Deficit      +10                         -30
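
A percentage subdivided bar diagram for Ex-7 starts from each item expressed as a percentage of the family's income, so that both bars have the same total height of 100. The sketch below does only that conversion; `as_percentages` is a helper name introduced for this illustration.

```python
items = ["Food", "Clothing", "Education", "Miscellaneous", "Saving or Deficit"]
family_a = [150, 125, 25, 190, 10]    # income Rs. 500
family_b = [150, 60, 50, 70, -30]     # income Rs. 300; -30 is a deficit

def as_percentages(amounts):
    """Express each amount as a percentage of the total (the family's income)."""
    total = sum(amounts)
    return [100 * a / total for a in amounts]

pct_a = as_percentages(family_a)
pct_b = as_percentages(family_b)

for item, a, b in zip(items, pct_a, pct_b):
    print(f"{item:<18} {a:6.1f} {b:6.1f}")
```

Family B's deficit appears as a negative percentage; in a hand-drawn diagram such a segment is often shown below the base line.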

Ex-8 Draw a bar chart for the following data showing the percentage of total population in villages and towns.
(Subdivided bar, percentage subdivided bar and multiple bar)
                              Percentage of total population in
                              Village   Town
Infants and young children    13.7      12.9
Boys and girls                25.1      23.2
Young men and women           32.3      36.5
Middle-aged men and women     20.4      20.1
Elderly persons               8.5       7.3
Ex-9 Draw the line graph for the following data.
Year                        2009  2010  2011  2012  2013  2014  2015  2016
Yield (in million tonnes)   12.8  13.9  12.8  13.9  13.4  6.5   2.9   14.8

Ex-10 The following table gives the cost of production (in arbitrary units) of a factory:
2009-10 2010-11 2011-12 2012-13 2013-14 2014-15 2015-16 2016-17 2017-18
Material 37 25 35 35 38 22 17 26 20
Labour 10 8 11 11 12 7 5 8 9
Overhead 13 10 16 17 20 12 9 12 15
Total 60 43 63 63 70 41 31 46 44
Represent the above data by a band graph.

Ex-11 An advertising company kept an account of response letters received each day over a period of 50 days. The
observations were:
0 2 1 1 1 2 0 0 1 0 1 0 0 1 0 1 1 0
2 0 0 2 0 1 0 1 0 1 0 3 1 0 1 0 1 0
2 5 1 2 0 0 0 0 5 0 1 1 2 0
Construct a frequency table and draw a line chart (or diagram) to present the data.
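
The frequency table asked for in Ex-11 can be sketched by counting how often each number of letters occurs. The observations are copied from the exercise; `collections.Counter` does the tallying.

```python
from collections import Counter

# Response letters received on each of the 50 days (from Ex-11)
letters = [
    0, 2, 1, 1, 1, 2, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0,
    2, 0, 0, 2, 0, 1, 0, 1, 0, 1, 0, 3, 1, 0, 1, 0, 1, 0,
    2, 5, 1, 2, 0, 0, 0, 0, 5, 0, 1, 1, 2, 0,
]

freq = Counter(letters)

# One row per possible value, including values that never occur
print("Letters  Days")
for x in range(min(letters), max(letters) + 1):
    print(f"{x:>7}  {freq[x]:>4}")
```

In the line chart, each value on the horizontal axis gets a vertical line whose height is its frequency.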
EX-12 A portfolio manager keeps a close watch on price-earnings ratios (defined as current market price divided by
earnings for the most recent four quarters) of 200 common stocks.
(a) Construct a frequency distribution table.
(b) Present the resulting frequency distribution as a histogram or frequency polygon and comment on the
pattern.
(c) Construct a cumulative frequency distribution and ogive
11.1 12.6 26.7 5.2 8.3 5.5 6.8 7.6
7.3 18.1 14.6 10.9 7.2 9.5 9.2 11.8
12 16.9 10.1 14.6 5.2 7.5 11.1 19.9
14.9 7.4 6 39.9 29.3 35.1 6.8 39
6.1 6.2 26.8 33.7 9.6 16.6 10.9 11.2
22.6 46 7.3 29.7 10.3 6.4 9.6 7.6
10.3 5 14.4 11.6 8.3 7.9 17.8 7.5
7.8 7.3 8 20.2 5.6 8.3 7.7 10.7
8.6 14.5 6 5.4 12.6 14.8 9.2 14.1
15.7 10.4 7 11 6.3 8.4 7.6 16.9
7.9 8.3 13.1 9.8 8.2 18 26.6 7.8
4.1 10.6 15.3 7.2 35.5 6.1 10.2 6.1
7.8 8.1 30 15 6.1 15.4 10.1 9.6
6.8 4.4 6.8 9.1 16.3 5.4 5.9 6.5
7.9 44.9 13.8 12.3 10.9 9.3 11.9 10
7.6 17.9 7.1 8.4 35.5 7.4 7.7 8.3
15.8 8.3 23.1 8.4 12.4 7.8 8.2 9.8
13.7 15.8 4.7 7.9 26.4 6.2 11.4 13.2
8.6 11.7 8.6 13.7 9.3 16.6 8.7 39.7
14 9.1 7.1 10.9 23.4 13.3 10.9 24
11.9 8.7 15.6 27.7 10.4 16.9 6.9 5.5
22.8 8.5 22.2 5.8 14.7 8 7.5 10.5
4.4 7.1 63.8 12.5 13.3 10.5 5.5 16
53.1 7.4 24.1 15.3 29.1 11 9.9 36.3
9.6 6.6 5.1 7.8 8.4 38.3 20.4 9.1
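
Parts (a) and (c) of Ex-12 come down to sorting values into class intervals and accumulating the counts. The sketch below shows the mechanics on the first row of the data only; the class width of 10 starting at 0 is an assumption made for illustration, and `grouped_frequency` is a hypothetical helper name. A different width may suit the full set of 200 ratios better.

```python
def grouped_frequency(values, width=10, start=0):
    """Return rows of (lower, upper, frequency, cumulative frequency).

    Each class includes its lower limit and excludes its upper limit.
    """
    n_classes = int((max(values) - start) // width) + 1
    freq = [0] * n_classes
    for v in values:
        freq[int((v - start) // width)] += 1

    rows, running = [], 0
    for i, f in enumerate(freq):
        running += f
        rows.append((start + i * width, start + (i + 1) * width, f, running))
    return rows

# First row of the Ex-12 data, used here only as a small demonstration
first_row = [11.1, 12.6, 26.7, 5.2, 8.3, 5.5, 6.8, 7.6]

table = grouped_frequency(first_row)
for lower, upper, f, cum in table:
    print(f"{lower:>3} to under {upper:<3} {f:>3} {cum:>3}")
```

The frequency column gives the heights for the histogram or frequency polygon, and the cumulative column plotted against the upper class limits gives the ogive.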
