Basic Statistics Overview for Students

The document is a course outline for Basic Statistics at Haramaya University, detailing key concepts such as definitions of statistics, types of variables, and measurement scales. It covers descriptive and inferential statistics, their applications in business, and methods for organizing and presenting data. The document is structured into chapters, providing foundational knowledge for students in the Department of Accounting and Finance.

Basic Statistics Email: k7.kebede@gmail.com 2024

HARAMAYA UNIVERSITY
COLLEGE OF COMPUTING AND INFORMATICS
DEPARTMENT OF STATISTICS
_____________________________________________
Basic Statistics- Stat 2131
For Department of Accounting and Finance

Set by:
Kindu Kebede Gebre (Assistant Professor)

©November, 2024

CHAPTER ONE

1. Introduction to Statistics
1.1. Definition of Statistics

Definition of some basic Terms

Before getting into the subject matter in detail, let us define some of the terms used extensively
in the field of statistics.

Data: figures or facts from which conclusions can be drawn. Data are the numerical results of
any scientific measurement; any value that is expressed in numbers is called data.
Population: the totality of all elements under study.
Sample: a portion of the population taken so that some generalization about the
population can be made. It is a subset of the population that is assumed to be
representative of the population.

Statistics can be defined in two senses: plural (as Statistical Data) and singular (as Statistical
Methods).

Plural sense: Statistics are collections of facts (figures). This meaning of the word is widely used
when reference is made to facts and figures on sales, employment or unemployment, accidents,
weather, deaths, education, etc. E.g.: Sales Statistics, Labor Statistics, Employment Statistics, etc.
In this sense the word Statistics serves simply as data. But not all data are statistics.

For numerical data to be identified as statistics, they must possess certain
identifiable characteristics, as follows:

1. Statistics are aggregates of facts: a single or isolated fact or figure is not statistics.
Example 1: "I earn birr 30000 per year" is not a statistical statement.
Example 2: "The average salary of a professor at our university is birr 30000 per year" is a
statistical statement, because the average is computed from many related figures, namely the
yearly salaries of many professors.
2. Statistics are numerical expressions: all statistics are stated in numerical figures only.

Example 1: "Ethiopia is a developing country" is not a statistical statement.

Example 2: Comparing the CGPA of Accounting and Finance students in the statistics and probability
course with that of Statistics students is a statistical statement.

3. Statistics must be placed in relation to each other: a comparison must relate to the same subject,
i.e. oranges cannot be compared with apples.

Singular sense: Statistics is the science that deals with the methods of data collection,
organization, presentation, analysis and interpretation. It refers to the subject area that is
concerned with extracting relevant information from available data with the aim of making sound
decisions. According to this meaning, statistics is concerned with the development and
application of methods and techniques for collecting, organizing, presenting, analyzing and
interpreting statistical data.

1.2. Classification of Statistics

Based on the scope of the decision, statistics can be classified into two: Descriptive and
Inferential Statistics.

Descriptive Statistics refers to the procedures used to organize and summarize masses of data.
It is concerned with describing or summarizing the most important features of the data. It deals
only with the characteristics of the collected data, without attempting to infer (conclude)
anything that goes beyond the data themselves.

The methodology of descriptive statistics includes the methods of organizing (classification,
tabulation, frequency distributions) and presenting (graphical and diagrammatic presentation)
data, and the calculation of certain indicators of data such as Measures of Central Tendency and
Measures of Dispersion (Variation).

Inferential Statistics includes the methods used to find out something about a population based
on a sample. It is concerned with drawing statistically valid conclusions about the
characteristics of the population based on information obtained from a sample. In this form of
statistical analysis, inferential statistics is linked with probability theory in order to generalize the
results of the sample to the population. Performing hypothesis tests, determining relationships
between variables and making predictions are also part of inferential statistics.

Example: Classify the following statements as Descriptive or Inferential Statistics.

a. The average income of staff in the commercial bank of America this year is $5000 per year.
b. There is a strong association between income and expenditure level.
c. Of the students enrolled in Haramaya University this year, 74% are male and 26% are
female.
d. The price of wheat will increase by 5% in the coming year.
e. The chance of winning the Ethiopian National Lottery on any day is 1 out of 167000.

Uses of Statistics

• To reduce and summarize masses of data and to present facts in numerical and
definite form. Statistics condenses a large mass of data into a few presentable,
understandable and precise numerical figures. Raw data, as usually available, are
voluminous and haphazard, and it is generally not possible to draw conclusions from
them as collected. Hence it is necessary and desirable to express these data in a few
numerical values.
• To facilitate comparison. Statistical devices such as averages, percentages, ratios, etc.
are used for this purpose.
• To determine functional relationships between two or more phenomena.
Statistical techniques such as correlation analysis assist in establishing the degree of
association between two or more variables.
• To formulate and test hypotheses. For instance, hypotheses such as whether a new
medicine is effective in curing a disease, or whether there is an association between
variables, can be tested using statistical tools.
• To forecast. Statistical methods help in studying past data and predicting future
trends.
1.3. Types of Variables and Measurement Scales
1.3.1. Variable
A variable is a characteristic or attribute that can assume different values.
For example: income, family size, gender, etc.
Based on the values that they assume, variables can be classified as follows:
1. Qualitative variables are variables that do not assume numeric values.
For example: gender, marital status, religion, etc.
2. Quantitative variables are variables that assume numeric values; they are numeric in
nature.
For example: expenditure, family size, etc.
Quantitative variables are further classified into two: discrete and continuous variables.
• A discrete variable takes whole-number values and consists of distinct, recognizable
individual elements that can be counted. It assumes a finite or countable
number of possible values. These values are obtained by counting (0, 1, 2, ...).
For example: family size, number of children in a family, number of cars at a traffic
light.
• A continuous variable can take any value, including decimals. Such a variable can
theoretically assume an infinite number of possible values. These values are obtained by
measuring.
For example: height, weight, net income, and age.
Generally, the values of a variable are obtained by counting for discrete variables, by
measuring for continuous variables, or by categorizing for qualitative variables.
Ex: Classify each of the following as qualitative or quantitative, and if it is quantitative, classify
it as discrete or continuous.
a. Sales of automobiles in a dealer's showroom.
b. The number of customers who come in each day.
c. Classification of wealth index based on income status (very poor, poor, rich, very rich).
d. Weight of newly born babies.
d. Weight of newly born babies.

1.3.2. Scales of Measurement

Consider the following two cases.
• Mr A wears shirt number 5 when he plays football.
• Mr B wears shirt number 6 when he plays football.
Who plays better? What is the average shirt number?
• Mr A scored 5 in a Stat quiz.
• Mr B scored 6 in a Stat quiz.
Who did better? What is the average score?

Based on the numbers on the shirts, it is not possible to judge whether Mr B plays better. But
using the quiz scores, it is possible to judge that Mr B did better in the quiz. Likewise, the
average shirt number is meaningless because the numbers on the shirts are simply codes,
whereas the average quiz score can be computed and interpreted.

Therefore, the scale of measurement of a variable

• Shows the information contained in the values of the variable.
• Shows which mathematical operations and which statistical analyses are permissible
on the values of the variable.
1. Nominal variables: qualitative variables that show the category of individuals. They
reflect classification into categories (names of groups) where there is no particular order or
qualitative difference between the labels. Numbers may be assigned to the categories simply for coding
purposes, but it is not possible to compare individuals based on the numbers assigned to them. The
only mathematical operation permissible on these variables is counting.
These variables
• Have mutually exclusive (non-overlapping) and exhaustive categories.
• Have no ranking or order between (among) the values of the variable.
Examples: Gender, Religion, ID No, Ethnicity, Color

2. Ordinal variables: qualitative variables whose values can be ordered and ranked.
Ranking and counting are the only mathematical operations that can be done on the values of the
variable. There is no precise difference between the values (categories) of the variable.
Examples: Academic qualifications (B.Sc., M.Sc., Ph.D.), Grade Scores (A, B, C, D, F), Wealth
index (very poor, poor, rich, very rich)
3. Interval variables: quantitative variables for which a value of zero does not show absence
of the characteristic, i.e. there is no true zero. Zero is simply a low value on the scale, not
"nothing". There is a precise difference between the units of measurement (levels).
Example: temperature. 0 °C does not mean that there is no temperature, only that it is cold.
4. Ratio variables: quantitative variables for which a value of zero shows absence of the
characteristic, i.e. there is a true zero.
Examples: Income, Amount of yield, Expenditure, Consumption.
All mathematical operations are permissible on the values of these variables.


1.4. Statistics in Business Decisions

In a business setting, statistics is important for the following reasons:

Reason 1: Understand Consumer Behavior Using Descriptive Statistics

Descriptive statistics are used to describe datasets. Businesses in almost every field
use descriptive statistics to gain a better understanding of how their consumers
behave. For example, a grocery store might calculate the following descriptive
statistics:

• The mean number of customers who come in each day.
• The median sales order per customer.
• The standard deviation of the age of the customers who come in the store.
• The sum of the sales made each month.

On the other hand, a bank might calculate the following descriptive statistics:

• The percentage of customers who default on their loan.
• The mean number of new customers who join the bank each day.
• The sum of the total deposits made by all customers each month.


Using these metrics, the bank can get an idea of how their customers behave and
how they handle their money. Not all businesses build statistical models or perform
complex calculations, but just about every business uses descriptive statistics to
gain a better understanding of their customers.
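
As a rough illustration of the grocery-store statistics listed above, the following Python sketch uses the standard `statistics` module. The customer counts and order values are hypothetical numbers made up for this example; they do not come from any real store.

```python
import statistics

# Hypothetical daily customer counts for one week (illustrative values only).
daily_customers = [120, 135, 128, 150, 142, 160, 155]

# Hypothetical order values in birr for seven customers (illustrative only).
order_values = [250.0, 90.0, 410.0, 120.0, 75.0, 300.0, 180.0]

mean_customers = statistics.mean(daily_customers)   # arithmetic mean
median_order = statistics.median(order_values)      # middle value after sorting
std_customers = statistics.stdev(daily_customers)   # sample standard deviation
total_sales = sum(order_values)                     # sum of all orders

print(f"Mean customers per day : {mean_customers:.2f}")
print(f"Median order value     : {median_order:.2f}")
print(f"Std. dev. of customers : {std_customers:.2f}")
print(f"Total sales            : {total_sales:.2f}")
```

Note that `statistics.stdev` computes the sample standard deviation (dividing by n-1); `statistics.pstdev` would be the choice if the data were treated as the whole population.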

Reason 2: Spot Trends Using Data Visualization

Another common way that statistics is used in business is through data
visualizations such as line charts, histograms, boxplots, pie charts and other charts.
These types of charts are often used to help a business spot trends. For example, a
small business might create a combo chart to visualize the number of
new clients and total sales they make each month.

[Figure: combo chart of new clients and total sales per month]

Using this simple chart, the business can quickly see that both their sales and
number of new clients tend to increase the most in the final quarter of the year.
This allows the business to be prepared with more staff, later hours, more
inventory, etc. during this time of year.

Reason 3: Understand the Relationship between Variables Using Regression Models

Another way that statistics is used in business settings is in the form of linear
regression models. These are models that allow a business to understand the
relationship between one or more predictor variables and a response variable. For
example, a grocery store might track their total amount spent on TV advertising,
their total amount spent on online advertising, and their total revenue. They might
then build the following multiple linear regression model:

Sales = 840.35 + 2.55(TV advertising) + 4.87(online advertising)

Here's how to interpret the regression coefficients in this model:

• For each additional dollar spent on TV advertising, the total revenue
increases by $2.55 on average (assuming online advertising is held constant).
• For each additional dollar spent on online advertising, the total revenue
increases by $4.87 on average (assuming TV advertising is held constant).

Using this model, the grocery store can quickly see that their money is better spent
on online advertising as opposed to TV advertising.

Note: In this example, we only used two predictor variables (TV advertising and
online advertising), but in practice businesses often build regression models with
far more predictor variables.
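
To make the interpretation of the coefficients concrete, here is a minimal Python sketch that evaluates the regression equation quoted above. The function name `predict_sales` and the spending amounts are assumptions chosen for illustration; only the intercept and the two coefficients come from the text.

```python
def predict_sales(tv_spend: float, online_spend: float) -> float:
    """Predicted sales from the regression equation quoted in the text."""
    intercept = 840.35     # predicted sales when both spends are zero
    tv_coef = 2.55         # change in sales per extra dollar of TV advertising
    online_coef = 4.87     # change in sales per extra dollar of online advertising
    return intercept + tv_coef * tv_spend + online_coef * online_spend


# Hypothetical spending levels (in dollars), chosen only for illustration.
base = predict_sales(100, 200)
more_tv = predict_sales(101, 200)       # one extra dollar on TV, online held constant
more_online = predict_sales(100, 201)   # one extra dollar online, TV held constant

print(round(more_tv - base, 2))
print(round(more_online - base, 2))
```

Holding one predictor fixed and changing the other by one dollar reproduces the corresponding coefficient, which is exactly what "held constant" means in the interpretation.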

Reason 4: Segment Consumers into Groups Using Cluster Analysis

Another way that statistics is used in business settings is in the form of cluster
analysis. This is a machine learning technique that allows a business to group
together similar people based on different attributes. Retail companies often use
clustering to identify groups of households that are similar to each other.

For example, a retail company may collect the following information on
households:

• Household income
• Household size
• Head of household occupation
• Distance from nearest urban area

They can then feed these variables into a clustering algorithm to perhaps identify
the following clusters:

• Cluster 1: Small family, high spenders
• Cluster 2: Large family, high spenders
• Cluster 3: Small family, low spenders
• Cluster 4: Large family, low spenders

The company can then send personalized advertisements or sales letters to each
household based on how likely they are to respond to specific types of
advertisements.


CHAPTER TWO

2. Visual Description of Data

2.1. Methods of Data Organization and Presentation

2.1.1. Methods of Data Organization

In order to describe situations, draw conclusions or make inferences about the population, or even to
describe the sample, the collected data must be organized in some meaningful way. The most
convenient way of organizing data is to construct a frequency distribution (FD). A frequency
distribution is the organization of raw data in table form, using classes and frequencies.
Definition of some terms
Class: is a description of a group of similar numbers in a data set.
Frequency: is the number of times a variable value is repeated.
Class frequency: the number of observations belonging to a certain class.
There are three types of frequency distributions; categorical, ungrouped (discrete or frequency
array) and grouped (continuous) frequency distributions.
Categorical FD:-a FD in which the data is qualitative i.e. either nominal or ordinal. Each
category of the variable represents a single class and the number of times each category repeats
represents the frequency of that class (category).
Example:-The blood type of 25 students is given below
A B B AB O A
O O B AB B A B
B B O A O AB
A O O O AB O

Class (Blood type)    Frequency (Number of students)
A                     5
B                     7
AB                    4
O                     9
Total                 25
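
The tally above can be reproduced with a short Python sketch. It uses `collections.Counter` from the standard library on the 25 blood types listed in the example; the variable names are chosen for illustration.

```python
from collections import Counter

# The 25 blood types from the example, row by row.
blood_types = (
    "A B B AB O A "
    "O O B AB B A B "
    "B B O A O AB "
    "A O O O AB O"
).split()

# Each category is one class; its count is the class frequency.
frequency = Counter(blood_types)

for blood_type in ["A", "B", "AB", "O"]:
    print(f"{blood_type:<6}{frequency[blood_type]}")
print(f"{'Total':<6}{sum(frequency.values())}")
```

The same approach works for an ungrouped frequency distribution of numerical data, since each distinct value is again treated as its own class.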


Exercise:-Construct FD for the following letter grade of 25 students

A B C C C
C B B A D
A C C A B
F C C A B

Ungrouped FD (Frequency Array):- A FD of numerical (quantitative) data in which each value
of a variable represents a single class (i.e. the values of the variable are not grouped) and the
number of times each value repeats represents the frequency of that class.
Example:-Number of children for 21 families.
2 3 5 4 3 3 2
3 1 0 4 3 2 2
1 1 1 4 2 2 2

Class (Number of children)    Frequency (Number of families)
0                             1
1                             4
2                             7
3                             5
4                             3
5                             1
Total                         21
Grouped (Continuous) FD: - A FD of numerical data in which several values of a variable are
grouped into one class. The number of observations belonging to the class is the frequency of the
class.


Example:-Consider age group and number of persons

Class Limits      Class Boundaries    Frequency
(Age in years)    (Age in years)      (Number of persons)
1-25              0.5-25.5            20
26-50             25.5-50.5           15
51-75             50.5-75.5           25
76-100            75.5-100.5          10
Total                                 70

Class Limits:-The lowest and highest values that can be included in a class are called Class
Limits. The lowest value is called the Lower Class Limit (LCL) and the highest value is called the
Upper Class Limit (UCL).
Class limits for the first class: 1-25
Lower class limit 1 and upper class limit 25

Class Boundaries:-are the class limits adjusted so that there is no gap between the UCL of one
class and the LCL of the next class. The lowest value is called the Lower Class Boundary (LCB)
and the highest value is called the Upper Class Boundary (UCB).

Class boundaries for the first class: 0.5-25.5

Lower class boundary 0.5 and upper class boundary 25.5

Class Width (Class Size):-the difference between the UCB and LCB of a class. It is also the
difference between the lower limits of two consecutive classes, or the difference between the
upper limits of two consecutive classes.

$W = UCB - LCB$ or $W = LCL_i - LCL_{i-1}$ or $W = UCL_i - UCL_{i-1}$

For the above example, W = 25.5 - 0.5 = 25 or W = 26 - 1 = 25 or W = 50 - 25 = 25

Class Mark (Class Midpoint):-is the halfway point between the class limits or the class boundaries.

$CM = \frac{LCL + UCL}{2}$ or $CM = \frac{LCB + UCB}{2}$

Class Limits    Class Boundaries    Class Mark    Frequency
1-25            0.5-25.5            13            20
26-50           25.5-50.5           38            15
51-75           50.5-75.5           63            25
76-100          75.5-100.5          88            10
Total                                             70

Note that $W = CM_i - CM_{i-1}$
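
The class boundaries, class widths and class marks in the age example can be checked with a few lines of Python. This is a minimal sketch assuming the unit of measurement is U = 1 (ages recorded in whole years), so each boundary lies half a unit outside the class limit.

```python
# Class limits from the age example.
class_limits = [(1, 25), (26, 50), (51, 75), (76, 100)]
unit = 1  # assumed unit of measurement U (whole years)

rows = []
for lcl, ucl in class_limits:
    lcb = lcl - unit / 2           # lower class boundary
    ucb = ucl + unit / 2           # upper class boundary
    width = ucb - lcb              # W = UCB - LCB
    class_mark = (lcl + ucl) / 2   # CM = (LCL + UCL) / 2
    rows.append((lcb, ucb, width, class_mark))

for (lcl, ucl), (lcb, ucb, width, class_mark) in zip(class_limits, rows):
    print(f"{lcl}-{ucl}: boundaries {lcb}-{ucb}, width {width}, class mark {class_mark}")
```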

Relative frequency: - is the ratio of class frequency to the total frequency (total number of
observations).
Percentage frequency: - Relative frequency ×100

Class Limits    Class Boundaries    Class Mark    Frequency    Relative frequency    Percentage frequency
1-25            0.5-25.5            13            20           20/70                 28.57
26-50           25.5-50.5           38            15           15/70                 21.43
51-75           50.5-75.5           63            25           25/70                 35.71
76-100          75.5-100.5          88            10           10/70                 14.29
Total                                             70           70/70=1               100
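
The relative and percentage frequencies for the age example can be computed as follows. This is a small sketch using the class frequencies from the table; the dictionary keys are just labels for the classes.

```python
# Class frequencies from the age example.
frequencies = {"1-25": 20, "26-50": 15, "51-75": 25, "76-100": 10}

total = sum(frequencies.values())  # total number of observations

# Relative frequency = class frequency / total frequency.
relative = {label: f / total for label, f in frequencies.items()}

# Percentage frequency = relative frequency x 100.
percentage = {label: 100 * r for label, r in relative.items()}

for label in frequencies:
    print(f"{label:<8}{relative[label]:.4f}  {percentage[label]:.2f}%")
```

The relative frequencies always add up to 1 and the percentage frequencies to 100, which is a quick check on the arithmetic.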

Cumulative frequency: is the sum of frequencies (total number of observations) below or above
a certain value.
Less than Cumulative Frequency: is the total number of values of a variable below a certain
UCB.
More than Cumulative Frequency: is the total number of values of a variable above a certain
LCB.

Class Limits    Class Boundaries    Class Mark    Frequency    Less than Cum. Freq.    More than Cum. Freq.
1-25            0.5-25.5            13            20           20                      10+25+15+20=70
26-50           25.5-50.5           38            15           20+15=35                10+25+15=50
51-75           50.5-75.5           63            25           20+15+25=60             10+25=35
76-100          75.5-100.5          88            10           20+15+25+10=70          10
Total                                             70
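
Both cumulative columns are running totals, one taken from the top of the table and one from the bottom. A minimal Python sketch using `itertools.accumulate` on the frequencies of the age example:

```python
from itertools import accumulate

# Class frequencies in class order (1-25, 26-50, 51-75, 76-100).
frequencies = [20, 15, 25, 10]

# Less-than cumulative frequency: running total from the first class downward.
less_than = list(accumulate(frequencies))

# More-than cumulative frequency: running total from the last class upward.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)
print(more_than)
```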

Construction of Grouped Frequency Distribution

1. Arrange the data in an array form (increasing or decreasing order).

2. Find the Unit of Measurement (U). U is the smallest difference between any two distinct
values of the data.
3. Find the Range (R). R is the maximum numerical difference in the data set, i.e. the
difference between the largest and the smallest values of the variable.
4. Determine the number of classes (K) using Sturges' rule: $K = 1 + 3.322 \log_{10} N$, where N is
the total number of observations.
5. Specify the class width (W): $W = \frac{R}{K}$
6. Put the smallest value of the data set as the LCL of the first class. To obtain the LCL of
the second class, add the class width W to the LCL of the first class. Continue adding until
you get K classes.
Let X be the smallest observation.
$LCL_1 = X$
$LCL_i = LCL_{i-1} + W$ for i = 2, 3, ..., K.
7. Obtain the UCLs of the FD by adding W - U to the corresponding LCLs:
$UCL_i = LCL_i + (W - U)$ for i = 1, 2, ..., K.
8. Generate the class boundaries:
$LCB_i = LCL_i - \frac{1}{2}U$ and $UCB_i = UCL_i + \frac{1}{2}U$ for i = 1, 2, ..., K.


Example 1: Marks of 50 students (out of 40)

16 21 26 24 11 17 25 26 13 27 24 26 3 27 23 24 15 22 22 12 22 29 18 22 28 25 7
17 22 28 19 23 23 22 3 19 13 31 23 28 24 9 20 33 30 23 20 8 21 24

Construct grouped frequency distribution.

Solution:

1. The array form of the data (increasing order)

3 3 7 8 9 11 12 13 13 15 16 17 17 18 19 19 20 20 21 21 22 22 22 22 22 22
23 23 23 23 23 24 24 24 24 24 25 25 26 26 26 27 27 28 28 28 29 30 31 33
2. U = 9 - 8 = 1
3. R = L - S = 33 - 3 = 30
4. K = 1 + 3.322 log N = 1 + 3.322 log 50 = 6.64 ≈ 7
5. W = R/K = 30/6.64 = 4.5 ≈ 5
6. W - U = 5 - 1 = 4

Class     Class         Class    Frequency    Relative     Percentage    LCF    MCF
Limits    Boundaries    Mark                  Frequency    Frequency
3-7       2.5-7.5       5        3            0.06         6             3      50
8-12      7.5-12.5      10       4            0.08         8             7      47
13-17     12.5-17.5     15       6            0.12         12            13     43
18-22     17.5-22.5     20       13           0.26         26            26     37
23-27     22.5-27.5     25       17           0.34         34            43     24
28-32     27.5-32.5     30       6            0.12         12            49     7
33-37     32.5-37.5     35       1            0.02         2             50     1
Total                            50           1            100
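
The construction steps can be sketched in Python and checked against the table above. The function name `grouped_frequency_distribution` is hypothetical. The sketch assumes that both K and W are rounded up to the next whole number, which matches the worked example (6.64 becomes 7 and 4.5 becomes 5), and that the unit of measurement is a whole number.

```python
import math

# Marks of the 50 students from Example 1.
marks = [
    16, 21, 26, 24, 11, 17, 25, 26, 13, 27, 24, 26, 3, 27, 23, 24, 15, 22,
    22, 12, 22, 29, 18, 22, 28, 25, 7, 17, 22, 28, 19, 23, 23, 22, 3, 19,
    13, 31, 23, 28, 24, 9, 20, 33, 30, 23, 20, 8, 21, 24,
]


def grouped_frequency_distribution(data, unit=1):
    """Return rows of (LCL, UCL, LCB, UCB, frequency) for a grouped FD."""
    smallest, largest = min(data), max(data)
    data_range = largest - smallest                  # R = L - S
    k_exact = 1 + 3.322 * math.log10(len(data))      # Sturges' rule
    n_classes = math.ceil(k_exact)                   # K (rounded up: assumption)
    width = math.ceil(data_range / k_exact)          # W (rounded up: assumption)
    table = []
    for i in range(n_classes):
        lcl = smallest + i * width                   # LCL_i = LCL_1 + (i-1) W
        ucl = lcl + (width - unit)                   # UCL_i = LCL_i + (W - U)
        freq = sum(lcl <= x <= ucl for x in data)    # observations in the class
        table.append((lcl, ucl, lcl - unit / 2, ucl + unit / 2, freq))
    return table


table = grouped_frequency_distribution(marks)
for lcl, ucl, lcb, ucb, freq in table:
    print(f"{lcl}-{ucl}  {lcb}-{ucb}  {freq}")
```

Different rounding choices for K and W would give a different but equally valid table, which is one reason why frequency distributions of the same data can differ.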

Exercise: In a survey the age of 44 women at marriage was reported as follows. Construct the
appropriate FD for this data.
24 25 27 26 22 23 24 25 24 23 26 28 24 25 23 24 25 25 25 22 27 28
27 24 25 24 25 28 26 25 24 28 24 25 25 24 25 24 26 27 27 25 28 26

Properties of Classes (Class Boundaries)

Classes should be:

• Complete and non-overlapping
Complete: the classes should include every value in the data set. Non-overlapping: no value
should belong to two classes.
• Clear and properly set
W and K should be calculated properly, and W should be the same for all classes.
• Standardized
The classes should follow a logical and chronological (increasing) order.
• The number of classes should be between 5 and 20, i.e. 5≤K≤20. K depends on N: the
larger the N, the larger the K. But we need to condense the data set into easily
manageable classes with minimum loss of information.
• Continuous
Even if there are no values in a class, the class must be included in the frequency
distribution.
Advantages and disadvantages of frequency distributions
a. Advantages
• It condenses a large mass of data into a comparatively small table.
• It attracts the attention of even a layperson and gives an insight into the nature of the
distribution.
• It helps in further statistical analysis of the data, such as central tendency, scatter,
symmetry, etc.
b. Disadvantages
• In grouped frequency distributions, the identity of the observations is lost. We know
only the number of observations in a class and do not know what the values are.
• Because the selection of the class width and the lower class limit of the first class is to a
certain extent arbitrary, different frequency distributions may be constructed for the same
data and hence may give contradictory impressions.

2.1.2. Data Presentation

2.1.2.1. Graphical Presentation of Data

1. Histogram: A graph in which the classes are marked on the X axis (horizontal axis) and the
frequencies are marked along the Y axis (vertical axis).
• The height of each bar represents the class frequency and the width of the bar represents
the class width.
• The bars are drawn adjacent to each other.

Example: Construct a histogram to the following grouped data.

Class boundaries Frequency

99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1
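
As a rough text-only stand-in for the histogram, the sketch below prints one row of asterisks per class, with the bar length equal to the class frequency. This is an illustration only: a proper histogram is drawn with vertical adjacent bars on class boundaries, and the helper name `text_histogram` is hypothetical.

```python
# Class boundaries and frequencies from the example above.
classes = [
    ((99.5, 104.5), 2),
    ((104.5, 109.5), 8),
    ((109.5, 114.5), 18),
    ((114.5, 119.5), 13),
    ((119.5, 124.5), 7),
    ((124.5, 129.5), 1),
    ((129.5, 134.5), 1),
]


def text_histogram(classes):
    """Return one text line per class; bar length equals the class frequency."""
    lines = []
    for (lower, upper), freq in classes:
        bar = "*" * freq
        lines.append(f"{lower:>5.1f}-{upper:<5.1f} | {bar} ({freq})")
    return lines


lines = text_histogram(classes)
for line in lines:
    print(line)
```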


2. The Stem-and-Leaf Display and the Dot plot

a) The Stem-and-Leaf Display
A stem and leaf display is a graphical method of displaying data that presents the values in a
compact and convenient form. It is particularly useful when the data are not too numerous. In
this section, we explain how to construct and interpret this kind of graph.

As usual, an example will get us started. Consider Table 1, which shows the
number of touchdown passes (TD passes) thrown by each of the 31 teams in the National
Football League in the 2000 season.

Table 1: Number of touchdown passes

A stem and leaf display of the data is shown in Figure 1. The left portion of Figure 1
contains the stems. They are the numbers 3, 2, 1, and 0, arranged as a column to the left
of the bars. Think of these numbers as 10's digits. A stem of 3, for example, can be used
to represent the 10's digit in any of the numbers from 30 to 39. The numbers to the right
of the bar are leaves, and they represent the 1's digits. Every leaf in the graph therefore
stands for the result of adding the leaf to 10 times its stem.

Figure 1: Stem and leaf display of the number of touchdown passes
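
The rule "value = 10 × stem + leaf" can be sketched in Python as follows. The sample values are hypothetical (they are not the touchdown data of Table 1), and the helper `stem_and_leaf` is an illustrative name; it assumes non-negative whole numbers.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines 'stem|leaves', largest stem first."""
    stems = defaultdict(list)
    for value in sorted(values):
        # The 10's digit(s) form the stem, the 1's digit is the leaf.
        stems[value // 10].append(value % 10)
    lines = []
    for stem in range(max(stems), min(stems) - 1, -1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem}|{leaves}")
    return lines


# Hypothetical data, chosen only to illustrate the layout.
sample = [37, 33, 32, 29, 28, 21, 21, 18, 14, 12, 9, 6]
display = stem_and_leaf(sample)
for line in display:
    print(line)
```

Reading the first line back, the stem 3 with leaves 2, 3 and 7 stands for the values 32, 33 and 37.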

b). Dot Plot

A dot plot, also known as a strip plot or dot chart, is a simple form of data visualization
that consists of data points plotted as dots on a graph with an x- and y-axis. These types
of charts are used to graphically depict certain data trends or groupings. A dot plot is
similar to a histogram in that it displays the number of data points that fall into each
category or value on the axis, thus showing the distribution of a set of data.

A dot plot represents data in the form of dots or small circles. It is similar to a
simplified histogram or a bar graph, as the height of the column formed by the dots represents the
numerical value of each variable. Dot plots are used to represent small amounts of data. For
example, a dot plot can be used to display the vaccination report of newborns in an area, with one
dot for each vaccinated baby in each colony.

Now let's see the number of newborn babies who got a vaccine in each colony. Colony A has a
total of 7 dots, which means that seven babies have been vaccinated. Similarly, colony B has
three babies, colony C has five babies, and colony D has one baby who has been vaccinated.

3. Frequency Polygon: A graph that consists of line segments connecting the points where
the class marks and the frequencies intersect.
• It can be constructed from a histogram by joining the midpoints of the tops of the bars.

Example: Construct a frequency polygon for the following grouped frequency distribution.

4. Cumulative Frequency (Ogive) Curve: a graph of the cumulative frequencies plotted against
the class boundaries, with the plotted points joined, often by a smooth freehand curve.
Example: Construct an ogive curve for the following grouped frequency distribution.
Class boundaries Frequency
99.5–104.5 2
104.5–109.5 8
109.5–114.5 18
114.5–119.5 13
119.5–124.5 7
124.5–129.5 1
129.5–134.5 1
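
The points needed to draw a less-than ogive for this distribution can be computed with the sketch below. It assumes the usual convention of plotting each less-than cumulative frequency against the upper class boundary and starting the curve at zero on the first lower class boundary.

```python
from itertools import accumulate

# Class boundaries and frequencies from the example above.
class_boundaries = [
    (99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
    (119.5, 124.5), (124.5, 129.5), (129.5, 134.5),
]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Less-than cumulative frequencies, one per class.
cumulative = list(accumulate(frequencies))

# Ogive points: (upper class boundary, cumulative frequency),
# preceded by a starting point of zero at the first lower boundary.
ogive_points = [(class_boundaries[0][0], 0)] + [
    (upper, cum) for (_, upper), cum in zip(class_boundaries, cumulative)
]

for x, y in ogive_points:
    print(x, y)
```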


5. Line Graph and Pictograms

a. Line Graph

A line graph, also known as a line plot or a line chart, is a graph that uses lines to connect
individual data points. A line graph displays quantitative values over a specified time interval.
In finance, line graphs are commonly used to depict the historical price action of an asset or
security.

Line graphs use data point "markers," which are connected by straight lines to aid
visualization. While line graphs are used across many different fields for different
purposes, they are especially helpful when it is necessary to create a graphical
depiction of changes in values over time.

Types of Line Graphs

There are three main types of line graphs. Although each type is fundamentally rooted in the
same principles, each has its own unique situation where it is best to implement and use.

Simple Line Graph

A simple line graph is the most basic type of line graph. In this graph, only one dependent
variable is tracked, so there is only a single line connecting all data points on the graph. All
points on the graph relate to the same item, and the only purpose of the graph is to track the
changes of that variable over time. This graph cannot be used to compare the variable to another
variable because only one variable is charted.

In the example below, the x-axis is time and the y-axis is the year-over-year change in price for
all consumer goods in the United States. This graph of the Consumer Price Index shows the
annual rate of inflation and, since it is analyzing just one set of data (all items), there is only one
line.

Multiple Line Graph

In a multiple line graph, more than one dependent variable is charted on the graph and
compared over a single independent variable (often time). Different dependent variables are
often given different colored lines to distinguish between the data sets. Each line connects only
the points in its own data set; a line never joins points that belong to different dependent variables.

For example, the line graph below shows the Consumer Price Index again. However, this graph
shows the change in price for three different categories: medical care
(red), commodities (green), and shelter (blue). In this graph, we can see the growth in price for
commodities was higher than the other two categories in July 2022. However, shelter or medical
expenses were typically the groups that experienced higher inflation over the past decade.


b. Pictograms

A pictogram is one of the simplest and most popular forms of data visualization out there.
Besides making your data look nice, pictograms can make your data more memorable. Visually
stacking icons to represent simple data can improve a reader‟s recall of that data and even their
level of engagement with that data. Pictograms can also be a fun addition to any info-graphic.
Pictograms are types of charts and graphs that use icons and images to represent data.

Also known as “pictographs”, “icon charts”, “picture charts”, and “pictorial unit charts”,
pictograms use a series of repeated icons to visualize simple data. The icons are arranged in a
single line or a grid, with each icon representing a certain number of units (usually 1, 10, or 100).
A feature of many great info-graphics, they're often used to make otherwise boring facts or data
points more compelling, as seen in the statistical info-graphic below.
When to use a pictogram
Pictograms can come in handy quite often when visualizing data in info-
graphics, reports, presentations, and even resumes!

You can use a pictogram whenever you want to make simple data more visually interesting,
more memorable, or more engaging.
Whether you want to show the magnitude of an important stat or visualize a fraction or
percentage, you can use pictograms to add visual impact to simple data.
Use a pictogram to show ratings or changes
We know that pictograms are great for showing simple proportions or percentages.

6. The Scatter Diagram

A Scatter Diagram is also called a Scatter Plot or an x-y graph. This type of chart is designed
to express the relationship between two variables. Each observation is plotted as a point using
its values along the x- and y-axes. The y-axis displays the dependent variable of your data, while
the x-axis shows the independent variable.

Types of Scatter Diagrams

Scatter Diagrams are classified based on their slope and degree of correlation. The following are
the various types of Scatter Diagrams:

Strong Correlation / Positive Correlation

In this type of Scatter Diagram, the dependent variable is displayed on the y-axis, the independent
variable on the x-axis, and each observation is marked as a dot. When you carefully examine this
type of Scatter Diagram, you will see that the dots follow a linear pattern, so a straight line can
be drawn through them. Below is an example of a Scatter Diagram with a strong correlation.

The dots' straight-line alignment shows a strong relationship between your variables. Experts
term it a Scatter Diagram with a high degree of correlation.

Moderate Correlation / Negative Correlation

Experts term this type a Scatter Plot with a low degree of correlation. The data points are
somewhat non-linear, and it can be challenging to fit a straight line to them, although the dots
are usually still close to each other. A Scatter Diagram with moderate correlation will appear as
shown below.

Scatter Diagram with No Correlation

This type of Scatter Diagram shows no alignment or correlation. In most instances, the data
points scatter all over the diagram, which makes it difficult to draw a straight line through
them. It becomes impossible to establish a relationship between your variables. A Scatter Diagram
with no correlation appears as shown below.

7. Tabulation and Contingency Tables

A contingency table displays frequencies for combinations of two categorical variables. Analysts
also refer to contingency tables as cross-tabulation and two-way tables. Contingency tables
classify outcomes for one variable in rows and the other in columns. The values at the row and
column intersections are frequencies for each unique combination of the two variables. Use
contingency tables to understand the relationship between categorical variables. For example, is
there a relationship between gender (male/female) and type of computer (Mac/PC)?

The contingency table example below displays computer sales at our fictional store. Specifically,
it describes sales frequencies by the customer's gender and the type of computer purchased. It is
a two-way table (2 X 2).

In this contingency table, columns represent computer types and rows represent genders. Cell
values are frequencies for each combination of gender and computer type. Totals are in the
margins. Notice the grand total in the bottom-right margin. At a glance, it's easy to see how two-way
tables both organize your data and paint a picture of the results. You can easily see the
frequencies for all possible subset combinations along with totals for males, females, PCs, and
Macs. For example, 66 males bought PCs while 87 females bought Macs. Furthermore, there are
117 females, 106 males, 96 PC sales, 127 Mac sales, and a grand total of 223 observations in the
study.
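
The table itself is not reproduced here, but its cells can be recovered from the figures quoted in the text. The following Python sketch is only an illustration: the two cells not stated in the text (males who bought Macs, females who bought PCs) are inferred from the stated row totals, and the variable names are our own.

```python
# Cell counts by (gender, computer type).
# Stated in the text: 66 males bought PCs, 87 females bought Macs,
# 106 males and 117 females in total. The other two cells are inferred.
counts = {
    ("Male", "PC"): 66,
    ("Male", "Mac"): 106 - 66,      # inferred from the male total
    ("Female", "PC"): 117 - 87,     # inferred from the female total
    ("Female", "Mac"): 87,
}

genders = ["Male", "Female"]
computers = ["PC", "Mac"]

# Marginal totals: sum over rows, over columns, and over everything.
row_totals = {g: sum(counts[(g, c)] for c in computers) for g in genders}
col_totals = {c: sum(counts[(g, c)] for g in genders) for c in computers}
grand_total = sum(counts.values())

# Print the two-way table with its margins.
print(f"{'':8}{'PC':>6}{'Mac':>6}{'Total':>7}")
for g in genders:
    cells = "".join(f"{counts[(g, c)]:>6}" for c in computers)
    print(f"{g:8}{cells}{row_totals[g]:>7}")
totals = "".join(f"{col_totals[c]:>6}" for c in computers)
print(f"{'Total':8}{totals}{grand_total:>7}")
```

The column totals produced this way agree with the 96 PC sales, 127 Mac sales and 223 observations quoted above, which is a useful consistency check on any contingency table.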

2.2.2.2. Diagrammatical Presentation of Data

Bar Diagram: It is the simplest and most commonly used diagrammatic representation of a
frequency distribution. It is appropriate for presenting qualitative data (nominal/ordinal).
It uses a series of separated and equally spaced bars in which the width of the bars is constant
and the height of each bar corresponds to the frequency of the category. The bars are separated by
a constant distance.
a. Simple Bar Diagram: is a diagram in which categories of a variable are marked on the X
axis and the frequencies of the categories are marked on the Y axis. It is applicable for
discrete variables, that is, for data given according to some period, places and timings.
These periods and timings are represented on the base line (X-axis) at regular intervals and
the corresponding frequencies are represented on the Y-axis.
• The width of the rectangle represents nothing (it is meaningless), but it should be equal for
all rectangles.
• Each rectangle is separated by an equal space.
• It can also represent some magnitude (on the Y axis) over time, space, groups, etc. (on the X
axis).

Example1:

Marital Status    Number of individuals

Single            100
Married           70
Divorced          30
Total             200

[Figure: simple bar diagram of frequency (Y-axis) by marital status (X-axis: Single, Married, Divorced)]

b. Component Bar Diagram: is used when there is a desire to show how a total or aggregate is
divided into its component parts. Each bar represents the total value of a variable, with each total
broken into its component parts, and different colors are used for identification. In this type
of diagram, a bar is subdivided into parts in proportion to the size of each subdivision.
These subdivided rectangles are shaded differently by lines, dots and colors so that the
components are easy to compare. Sometimes the volumes of different attributes may
be greatly different. To make meaningful comparisons, the components of the attributes
are reduced to percentages. In that case each attribute will have 100 as its maximum volume.
This sort of component bar diagram is known as a percentage bar diagram.

Example:

Marital Status Male Female Total

Single 90 10 100
Married 30 40 70
Divorced 1 29 30

c. Multiple Bars Diagram: is used to display data on more than one variable. In a multiple
bar diagram, two or more sets of inter-related data are represented side by side.

Example:

Year Coffee Butter Sugar Total

1997 120 127 75
1998 25 98 87
1999 100 120 75
2000 198 98 60

d. Deviation Bar Diagram: is used when the data contain both positive and negative values, such as
data on net profit, net expense, percent change, etc.

Example:

Commodity Net profit

Soap 80
Sugar -95
Coffee 125

8. Pie chart: A pie chart is popularly used in practice to show the percentage breakdown of data. A
pie chart is a circle representing a set of data, divided into sectors proportional to
the number of items in the categories. In other words, it is a circle representing the total, cut into
slices in proportion to the size of the parts that make up the total. It gives the proportional
sizes of different data groups as slices of a pie or a circle.
Example:

Marital Status    Number of individuals    Percentage    Degree

Single            100                      50            180
Married           70                       35            126
Divorced          30                       15            54
Total             200                      100           360
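
The Percentage and Degree columns follow directly from each category's share of the total: the share times 100 gives the percentage and the share times 360 gives the sector angle. A short Python sketch of this calculation (the variable names are illustrative only):

```python
# Frequencies from the marital status example above.
counts = {"Single": 100, "Married": 70, "Divorced": 30}
total = sum(counts.values())

# For each category: (percentage of total, sector angle in degrees).
sectors = {
    status: (100 * n / total, 360 * n / total)
    for status, n in counts.items()
}

for status, (percent, degrees) in sectors.items():
    print(f"{status:10}{counts[status]:>5}{percent:>8.1f}%{degrees:>8.1f} degrees")
```

The percentages always add up to 100 and the angles to 360, which is a quick check on the arithmetic.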

CHAPTER THREE

3. Statistical Description of Data

3.1. Statistical Description: Measures of Central Tendency

Usually the collected data are not suitable for drawing conclusions about the mass from which they
have been taken. Even though the data will be somewhat summarized after they are depicted using
frequency distributions and presented using graphs and diagrams, we still cannot make any
inferences about the data since we have many groups. Hence, organizing data into a FD is not
sufficient; there is a need for further condensation. In particular, when we want to compare two or
more distributions, we may reduce each entire distribution to one number that represents it.
A single value which can be considered as typical or representative of a
set of observations, and around which the observations can be considered as centered, is called an
'Average' (or average value or center of location). Since such typical values tend to lie centrally
within a set of observations when arranged according to magnitude, averages are called
Measures of Central Tendency.

3.2. Objectives of Measures of Central Tendency

1. To condense a mass of data into one single value. That is, to get a single value which is the best
representative of the data (that describes the characteristics of the entire data). Measures of
central tendency, by condensing masses of data into one single value, enable us to get an idea of
the entire data. Thus one value can represent thousands of observations or even more.
2. To facilitate comparison. Statistical devices like averages, percentages and ratios are used for this
purpose. Measures of central tendency, by condensing masses of data into one single value,
facilitate comparison. For example, to compare two classes A and B, instead of comparing
each student's result, which is infeasible, we can compare the average marks of the two classes.

There are many types of measures of central tendency, each possessing particular properties and
each being typical in some unique way. The most frequently encountered ones are:
• Computed averages: Mean (Arithmetic Mean, Geometric Mean and Harmonic Mean)
• Positional averages: Median and Quantiles (Quartiles, Deciles, Percentiles)
• Mode

3.3. Properties of Good Measures of Central Tendency

A measure of central tendency is good or satisfactory if it possesses the following characteristics.

1. It should be calculated based on all observations.
2. It should not be affected by extreme values. It should be as close to the maximum number of
observed values as possible.
3. It should be defined rigidly which means it should have a definite value (it should be
unique).
4. It should always exist.
5. It should be easy to understand and calculate. It should not be subject to complicated and tedious
calculations, though the advent of electronic calculators and computers has made such calculations possible.
6. It should be capable of further algebraic treatment. By algebraic treatment, we mean that the
measures should be used further in the formulation of other formulae or it should be used for
further statistical analysis.

Summation Notation

The sum X1+X2+…+Xn is denoted using the Greek letter ∑ (sigma) as

$$\sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n$$

and it is called the Summation Notation.

Properties of the summation notation:

• $\sum_{i=1}^{n} (X_i + Y_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i$
• $\sum_{i=1}^{n} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \dots + X_n Y_n$
• $\sum_{i=1}^{n} (X_i + c) = \sum_{i=1}^{n} X_i + nc$, where c is a constant.
• $\sum_{i=1}^{n} C X_i = C \sum_{i=1}^{n} X_i$, where C is a constant.
• $\sum_{i=1}^{n} a = na$, where a is a constant.

From now onwards we will use ∑X in place of $\sum_{i=1}^{n} X_i$ just for simplicity.
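
The properties of the summation notation can be checked numerically. The sketch below uses small made-up lists X and Y and a made-up constant c purely for illustration; any numbers would do.

```python
# Hypothetical values chosen only to illustrate the properties.
X = [2, 5, 7]
Y = [1, 4, 6]
c = 3
n = len(X)

# Sum of (Xi + Yi) equals sum of Xi plus sum of Yi.
sum_of_pairs = sum(x + y for x, y in zip(X, Y))
# Sum of (Xi + c) equals sum of Xi plus n*c.
sum_shifted = sum(x + c for x in X)
# Sum of c*Xi equals c times sum of Xi.
sum_scaled = sum(c * x for x in X)
# Summing a constant n times gives n*c.
sum_constant = sum(c for _ in range(n))
# Sum of products is X1*Y1 + ... + Xn*Yn.
sum_products = sum(x * y for x, y in zip(X, Y))

print(sum_of_pairs == sum(X) + sum(Y))
print(sum_shifted == sum(X) + n * c)
print(sum_scaled == c * sum(X))
print(sum_constant == n * c)
print(sum_products, sum(X) * sum(Y))
```

The last line is a reminder that the sum of products is in general not the same as the product of the sums.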

3.1.1. Mean

i. Arithmetic Mean

Simple Arithmetic Mean: is the sum of all observations divided by the total number of observations.
For a sample of n observations X1, X2, …, Xn the sample mean is denoted by $\bar{X}$ (X-bar) and
calculated as follows.

$$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$$

For a frequency array (ungrouped FD),

$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{f_1 X_1 + f_2 X_2 + \dots + f_K X_K}{f_1 + f_2 + \dots + f_K}$$

For grouped FD, X represents class mark.

Example 1: The high temperatures for a 7-day week during December in Haramaya University
were 29, 31, 28, 32, 29, 27, and 55. Find the mean high temperature for the
week.

Solution: $\bar{X} = \frac{\sum X}{n} = \frac{29 + 31 + 28 + 32 + 29 + 27 + 55}{7} = \frac{231}{7} = 33$.

The mean, or average, high temperature for the week was 33.

Example 2: The amounts of drops of water in drip irrigation were registered from 43 sample drip
holes in one day and the data are as follows:

Class Interval    Frequency

3-7               3
8-12              4
13-17             6
18-22             13
23-27             17

Solution:

Class interval    Frequency (f)    Class mark (M)

3-7               3                5
8-12              4                10
13-17             6                15
18-22             13               20
23-27             17               25

The given table is a grouped FD, so we can apply

$$\bar{X} = \frac{\sum fM}{\sum f} = \frac{3(5) + 4(10) + 6(15) + 13(20) + 17(25)}{43} = \frac{830}{43} = 19.30$$

The average amount of drops of water per drip hole is 19.30.
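
The grouped-data calculation above can be reproduced step by step. The following Python sketch computes the class marks as the midpoints of the class limits and then applies the formula for the mean of a grouped FD; the variable names are illustrative.

```python
# (lower limit, upper limit, frequency) for each class of the drip irrigation data.
classes = [(3, 7, 3), (8, 12, 4), (13, 17, 6), (18, 22, 13), (23, 27, 17)]

# Class mark = midpoint of the class limits.
class_marks = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

sum_f = sum(frequencies)                                        # total frequency
sum_fm = sum(f * m for f, m in zip(frequencies, class_marks))   # sum of f*M
grouped_mean = sum_fm / sum_f

print(class_marks)
print(sum_f, sum_fm, round(grouped_mean, 2))
```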

Properties of Arithmetic Mean

• The algebraic sum of the deviations of each value from the arithmetic mean is zero. That is,
$\sum (X - \bar{X}) = 0$.
• The sum of the squares of the deviations from the mean is less than the sum of the squares of
the deviations about any other value. That is, $\sum (X - \bar{X})^2 \le \sum (X - A)^2$, $A \ne \bar{X}$.
• If a constant C is added to or subtracted from each value in a distribution, then the new mean
will be $\bar{X}_{new} = \bar{X}_{old} \pm C$ respectively.
• If each value of a distribution is multiplied by a constant C, the new mean will be the original
mean multiplied by C.
• Combined Mean: If there are p different groups (having the same unit of measurement) with
means $\bar{X}_1, \bar{X}_2, \dots, \bar{X}_p$ and numbers of observations $n_1, n_2, \dots, n_p$ respectively, then the mean of all
the groups, i.e. the combined mean $\bar{X}_C$, is given by

$$\bar{X}_C = \frac{\sum n\bar{X}}{\sum n} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2 + \dots + n_p \bar{X}_p}{n_1 + n_2 + \dots + n_p}$$

Weighted Arithmetic Mean:

While calculating the simple arithmetic mean we gave equal importance to all values. But
there are cases where the relative importance is not the same for all items. When this is the case, it is
necessary to assign them weights (i.e. relative importance) and then calculate a weighted
arithmetic mean. Let X1, X2, …, Xn be the values and W1, W2, …, Wn be the corresponding weights;
then the weighted arithmetic mean, denoted by $\bar{X}_W$, is given by

$$\bar{X}_W = \frac{\sum WX}{\sum W} = \frac{W_1 X_1 + W_2 X_2 + \dots + W_n X_n}{W_1 + W_2 + \dots + W_n}$$

Example: If a final examination in a course is weighted three times as much as a quiz and a
student has a final examination grade of 85 and quiz grades of 70 and 90, find the mean grade of the
student.

Solution: Let X1 = 1st quiz = 70, X2 = 2nd quiz = 90 and X3 = final = 85, with the corresponding weights
W1 = 1, W2 = 1 and W3 = 3.

$$\bar{X}_W = \frac{\sum WX}{\sum W} = \frac{1(70) + 1(90) + 3(85)}{1 + 1 + 3} = \frac{415}{5} = 83$$

So the average grade of the student is 83.
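
As a quick check of the weighted mean in the grade example, here is a small Python sketch. The function name weighted_mean is our own choice, not part of any library.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum of W*X divided by sum of W."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

grades = [70, 90, 85]    # two quizzes and the final examination
weights = [1, 1, 3]      # the final counts three times as much as a quiz

weighted = weighted_mean(grades, weights)
simple = sum(grades) / len(grades)   # every grade treated as equally important

print(weighted)            # weighted mean of the grades
print(round(simple, 2))    # simple mean, for comparison
```

The weighted mean is pulled toward the final examination grade because that grade carries the largest weight.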

Arithmetic mean fulfills almost all characteristics of good measures of central tendency with the
exception that it is highly affected by extreme values. And it cannot be calculated for a FD with
open-ended classes (a FD with no lower class boundary of the first class or with no upper class
boundary of the last class or with both).

ii. Geometric Mean

Geometric mean is the nth root of the product of the n values.

$$GM = \sqrt[n]{\prod X} = \sqrt[n]{X_1 X_2 \cdots X_n}$$

But this formula is used if n is small. If n is large, it is difficult to calculate the nth root. Thus, to
facilitate the computation, we make use of logarithms.

$$GM = \text{Antilog}\left(\frac{1}{n} \sum \log X\right)$$

For ungrouped FD, $GM = \text{Antilog}\left(\frac{1}{\sum f} \sum f \log X\right)$.
For grouped FD, X represents class mark.
If the variable values are measured as ratios, proportions or percentages and some values are
larger in magnitude and others are small, then the geometric mean is a better representative of
the data than the simple average. In a "geometric series", the most meaningful average is the
geometric mean. The arithmetic mean is very biased toward the large numbers in the series.
The geometric mean is important in determining the average rate of growth, percentages, ratios
and proportions.
The disadvantage of GM is that it cannot be calculated if one or more observations are zero or
negative. It is also affected by extreme values but not to the extent of AM.
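
The logarithm method can be sketched in Python as follows. The geometric series 1, 3, 9, 27, 81 is a made-up illustration (it is not taken from the exercises), and geometric_mean is our own helper name.

```python
import math

def geometric_mean(values):
    """Geometric mean via logarithms: antilog of the mean of the logs."""
    if any(v <= 0 for v in values):
        raise ValueError("GM is not defined for zero or negative observations")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log   # antilog (base 10)

# Hypothetical geometric series with common ratio 3.
series = [1, 3, 9, 27, 81]

gm = geometric_mean(series)
am = sum(series) / len(series)

print(round(gm, 4))   # geometric mean
print(am)             # arithmetic mean, pulled up by the large values
```

For this series the geometric mean coincides with the middle term, while the arithmetic mean is much larger, which illustrates why the GM is preferred for geometric series.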
Exercise:
1. Find the geometric mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great difference
between the GM of A and that of B?
2. The price of a commodity increased by 5% from 1989 to 1990, 8% from 1990 to 1991 and by
77% from 1991 to 1992. Find the average price increase.
3. A machine depreciated by 10% each in the first two years and by 40% in the third year. Find
out the average rate of depreciation.
4. Decadal percentage growth of population in country A is given below. Find the average rate of
growth.

Year          1921    1931     1941     1951     1961     1971     1981

% Increase    8.25    19.08    32.09    41.49    25.89    37.91    46.02

iii. Harmonic Mean

Harmonic Mean is another specialized average which is useful in averaging variables expressed
as rate per unit of time, such as speed or number of units produced per day. It is the reciprocal of
the arithmetic mean of the reciprocals of the numbers.

$$HM = \frac{n}{\sum \frac{1}{X}} = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \dots + \frac{1}{X_n}}$$

For ungrouped FD,

$$HM = \frac{\sum f}{\sum \frac{f}{X}} = \frac{f_1 + f_2 + \dots + f_K}{\frac{f_1}{X_1} + \frac{f_2}{X_2} + \dots + \frac{f_K}{X_K}}$$

For grouped FD, X represents class mark.

Weighted harmonic mean:

$$HM = \frac{\sum W}{\sum \frac{W}{X}} = \frac{W_1 + W_2 + \dots + W_n}{\frac{W_1}{X_1} + \frac{W_2}{X_2} + \dots + \frac{W_n}{X_n}}$$

Harmonic mean is little affected by large extreme values. But it cannot be calculated when one or more
observations are zero.

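Relationships between AM, GM and HM

• For n positive observations, AM ≥ GM ≥ HM.
• For two positive observations, GM² = AM × HM, that is, $GM = \sqrt{AM \times HM}$.

Example 1: Find the H.M of 2, 4 and 8.

Solution: $\bar{X}_{HM} = \frac{n}{\sum \frac{1}{X}} = \frac{3}{\frac{1}{2} + \frac{1}{4} + \frac{1}{8}} = \frac{3}{0.875} = 3.43$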

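Example 2: In a small company two typists are employed; typist A types one page in 10 minutes
and typist B types one page in 20 minutes.

a) Both are asked to type 10 pages. What is the average time taken for typing one page?
b) Both are asked to type for one hour. What is the average time taken by them for
typing one page?

Solution: a) Together they type 20 pages in 10(10) + 10(20) = 300 minutes, so the average time per
page is $\frac{10(10) + 10(20)}{10 + 10} = \frac{300}{20} = 15$ minutes.

b) In one hour A types 6 pages and B types 3 pages, so the average time per page is the harmonic
mean: $\bar{X}_{HM} = \frac{2}{\frac{1}{10} + \frac{1}{20}} = 13.33$ minutes, i.e. 13 min. & 20 sec.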
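
The two typist questions differ in what is held fixed: in (a) the number of pages is the same for both typists, in (b) the typing time is the same. The Python sketch below works both answers out from total minutes divided by total pages and compares (b) with the harmonic mean formula. The helper name harmonic_mean is our own.

```python
def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of reciprocals."""
    return len(values) / sum(1 / v for v in values)

minutes_per_page = [10, 20]   # typist A and typist B

# (a) Each typist types the same number of pages.
pages_each = 10
total_minutes_a = sum(pages_each * m for m in minutes_per_page)
total_pages_a = pages_each * len(minutes_per_page)
average_a = total_minutes_a / total_pages_a

# (b) Each typist types for the same amount of time.
minutes_each = 60
pages_typed = [minutes_each / m for m in minutes_per_page]
average_b = (minutes_each * len(minutes_per_page)) / sum(pages_typed)

print(average_a)                                     # minutes per page in case (a)
print(round(average_b, 2))                           # minutes per page in case (b)
print(round(harmonic_mean(minutes_per_page), 2))     # same value as case (b)
print(round(harmonic_mean([2, 4, 8]), 2))            # Example 1
```

When the time is fixed, the average time per page equals the harmonic mean of the individual times; when the number of pages is fixed, it equals their arithmetic mean.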

Exercise:
1. Find the harmonic mean of A) 1, 2, 3, 4, 5. B) 1, 2, 3, 4, 100. Is there a great difference
between the HM of A and that of B?
2. A driver traveled 400 km per day for three days at a speed of 60, 50 and 40 kilometers per
hour. Find the average speed of the driver.
3. A student reads the first 100 pages of a book at a rate of 5 pages per hour, the next 100 pages
at a rate of 8 pages per hour. What is the student's average reading speed?
4. Suppose a train moves 100 km with a speed of 40 km per hour, then 150 km with a speed of
50 km per hour and the next 135 km with a speed of 45 km per hour. Calculate the average
speed of the train.
5. In a factory a mechanic takes 15 days to fabricate a machine, the second mechanic takes 18
days, the third takes 30 days and the fourth takes 90 days. Find the average number of days
taken by the workers to fabricate the machine.
6. Suppose a train moves 5 hours at a speed of 40 km per hour, then 3 hours at a speed of 50 km
per hour and the next 5 hours with a speed of 45 km per hour. Calculate the average speed of
the train.

3.1.2. Median

Median is the half-way point in a data set. It divides a data set into two equal parts such that half
of the numbers have values less than the median and half have values greater than the
median. Graphically, the median is the intersection of the less than and more than cumulative
frequency curves.

The median of a set of n observations X1, X2, …, Xn arranged in ascending order of magnitude is
the middle value if n is odd or the arithmetic mean of the two middle values if n is even. That is,

if n is odd, $\tilde{X} = \left(\frac{n+1}{2}\right)^{th}$ value, and if n is even, $\tilde{X} = \frac{\left(\frac{n}{2}\right)^{th} \text{value} + \left(\frac{n}{2} + 1\right)^{th} \text{value}}{2}$.

Median for continuous grouped data: for grouped frequency distributions the median is given by the
formula

$$\tilde{X} = L_{\tilde{X}} + \left(\frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}}\right) w$$

Where n = ∑f = sum of frequencies,

$L_{\tilde{X}}$ is the LCB of the median class,

$F_{\tilde{X}-1}$ is the less than cumulative frequency just before the median class,

$f_{\tilde{X}}$ is the frequency of the median class, and

w is the class width.

First obtain the less than cumulative frequencies. From the cumulative frequencies select the
smallest one which is greater than or equal to n/2. The median class is the class corresponding to
this cumulative frequency.
Median is not influenced by extreme values. It can be calculated for a FD with open-ended classes;
it can even be located if the data are incomplete.

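Examples:
Find the median of the following data sets.
180, 201, 220, 191, 219, 209 and 220.
Solution: Arranged in ascending order the data are 180, 191, 201, 209, 219, 220, 220. Since n = 7 is
odd, the median is the 4th value = 209.
62, 63, 64, 65, 66, 66, 68 and 78.
Solution: Since n = 8 is even, the median is (4th value + 5th value)/2 = (65+66)/2 = 65.5.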

Find the median weight of the 40 male college students at a state university and interpret the
result.

Weight     Frequency    LCF

118-126    3            3
127-135    5            8
136-144    9            17
145-153    12           29
154-162    5            34
163-171    4            38
172-180    2            40
Total      40

Solution: The median class is the class having the less than cumulative frequency containing the
value n/2 = 40/2 = 20. This implies that 145-153 is the median class.
$L_{\tilde{X}}$ = 144.5, n = 40, $F_{\tilde{X}-1}$ = 17, $f_{\tilde{X}}$ = 12 and w = 9.

$$\tilde{X} = L_{\tilde{X}} + \left(\frac{\frac{n}{2} - F_{\tilde{X}-1}}{f_{\tilde{X}}}\right) w = 144.5 + \left(\frac{20 - 17}{12}\right) 9 = 144.5 + 2.25 = 146.75 \approx 146.8$$

Interpretation: about half of the 40 students weigh less than 146.8 and about half weigh more.
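
The search for the median class and the interpolation step can be sketched in Python. This is a minimal illustration that assumes at least two classes of equal width given by their class limits; the function name grouped_median is our own.

```python
def grouped_median(classes):
    """Median of a grouped FD given as (lower limit, upper limit, frequency).

    Assumes equal class widths and at least two classes (an assumption of
    this sketch). Class boundaries are found by halving the gap between
    consecutive class limits.
    """
    n = sum(f for _, _, f in classes)
    half = n / 2
    gap = classes[1][0] - classes[0][1]            # gap between class limits
    width = classes[0][1] - classes[0][0] + gap    # class width w
    cumulative = 0                                 # less than cumulative frequency
    for lower, _, f in classes:
        if cumulative + f >= half:                 # first class whose LCF reaches n/2
            lcb = lower - gap / 2                  # lower class boundary
            return lcb + (half - cumulative) / f * width
        cumulative += f

weights = [
    (118, 126, 3), (127, 135, 5), (136, 144, 9), (145, 153, 12),
    (154, 162, 5), (163, 171, 4), (172, 180, 2),
]

median_weight = grouped_median(weights)
print(median_weight)
```

The result is 146.75 before rounding, which the worked solution reports as 146.8.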

3.1.3. Mode

The mode denoted by X̂ , is the most frequently occurring value in a set of observations or it is
the value with the highest frequency. A data set may have one mode (uni-modal), two modes (bi-
modal), more than two modes (multi-modal) or no mode at all (i.e. when all observations are
equally frequent).
Ungrouped (individual series): Arrange the data in ascending order and take the value
appearing most frequently (the most frequent value).

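Grouped (continuous) series: In a frequency distribution, the mode is located in the class with the
highest frequency, and that class is the modal class.
Then the formula for the mode is

$$\hat{X} = L_{\hat{X}} + \left(\frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})}\right) w$$

where $L_{\hat{X}}$ is the LCB of the modal class, $f_{\hat{X}}$ is the frequency of the modal class,
$f_{\hat{X}-1}$ and $f_{\hat{X}+1}$ are the frequencies of the classes just before and just after the
modal class, and w is the class width.

Mode is not affected by extreme values and can be calculated for open-ended classes. But it
often does not exist and its value may not be unique.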
Example 1: A study of the relationship between age and various visual functions (such as acuity and
depth perception) reported the following observations on the area of scleral lamina (mm²) from human
optic nerve heads (Experimental Eye Research, 1988): 2.75, 2.62, 2.74, 3.85, 2.34, 2.74, 3.93, 4.21,
3.88, 4.33, 3.46, 4.52, 2.43, 3.65, 2.78, 3.56, 3.01. Find the mean, median, mode, Q1, D5 and P75.
Solution: Check the answer (mean=3.341, median=3.46, mode=2.74, Q1=2.74, D5=3.46 &
P75=3.93)

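Example 2: Find the mode of the weights of the 40 male college students and interpret the result.
Solution: The highest frequency appears at class interval 145-153, so
$L_{\hat{X}}$ = 144.5, $f_{\hat{X}}$ = 12, $f_{\hat{X}-1}$ = 9, $f_{\hat{X}+1}$ = 5 and w = 9.

$$\hat{X} = L_{\hat{X}} + \left(\frac{f_{\hat{X}} - f_{\hat{X}-1}}{(f_{\hat{X}} - f_{\hat{X}-1}) + (f_{\hat{X}} - f_{\hat{X}+1})}\right) w = 144.5 + \left(\frac{12 - 9}{(12 - 9) + (12 - 5)}\right) 9 = 144.5 + 2.7 = 147.2$$

Interpretation: The most frequent weight of the 40 male college students is about 147.2.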
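
The grouped mode can be computed in the same way as the grouped median. The sketch below again assumes at least two classes of equal width and a single modal class, and treats a missing neighbouring class as having frequency zero; grouped_mode is our own function name.

```python
def grouped_mode(classes):
    """Mode of a grouped FD given as (lower limit, upper limit, frequency).

    Assumes equal class widths, at least two classes and one modal class
    (assumptions of this sketch).
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                    # position of the modal class
    gap = classes[1][0] - classes[0][1]            # gap between class limits
    lcb = classes[i][0] - gap / 2                  # lower class boundary
    width = classes[i][1] - classes[i][0] + gap    # class width w
    f_before = freqs[i - 1] if i > 0 else 0
    f_after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = freqs[i] - f_before                       # excess over the class before
    d2 = freqs[i] - f_after                        # excess over the class after
    return lcb + d1 / (d1 + d2) * width

weights = [
    (118, 126, 3), (127, 135, 5), (136, 144, 9), (145, 153, 12),
    (154, 162, 5), (163, 171, 4), (172, 180, 2),
]

modal_weight = grouped_mode(weights)
print(round(modal_weight, 1))
```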

3.2. Statistical Description: Measures of Dispersion

In the previous section, we concentrated on a central value (measures of central tendency), which
gives an idea of the whole mass, that is, a complete set of values. However, the information so
obtained is neither exhaustive nor comprehensive, as the mean does not lead us to know whether
the observations are close to each other or far apart. Median is a positional average and has
nothing to do with the variability of the observations in a data set. Mode is the most frequently
occurring value, independent of the other values in the set. This leads us to conclude that a measure of
central tendency is not enough to have a clear idea about the data unless all observations are the
same. Moreover, two or more data sets may have the same mean and/or median but they may be
quite different. So measures of central tendency (MCT) alone do not provide enough information
about the nature of the data.

To illustrate this let us consider the following four data sets: the price of a certain commodity in
four cities in five different months.

City    January    February    March    April    May

A       30         30          30       30       30
B       28         29          31       30       32
C       15         5           55       45       30
D       3          5           37       30       75

Now if we calculate the mean and median for each city, we will come up with the value
30. This value implies that the price of the commodity in the four cities A, B, C and D is, on
average, the same.
But by inspection, it is apparent that the price of the commodity in the cities differs remarkably
from one another. For city A it is right, and for city B it is more or less acceptable, but for cities C and D it is
not realistic to say the price of the commodity is 30. This means that by looking only at the
average we cannot talk about the data set confidently. So, along with the average values
(measures of central tendency), we have to study the dispersion of the data.
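
The point of the four-city example can be checked with a few lines of Python using the standard statistics module. The range (largest value minus smallest value) is used here only as a quick first indication of spread.

```python
from statistics import mean, median

# Monthly prices (January to May) from the table above.
prices = {
    "A": [30, 30, 30, 30, 30],
    "B": [28, 29, 31, 30, 32],
    "C": [15, 5, 55, 45, 30],
    "D": [3, 5, 37, 30, 75],
}

summary = {}
for city, values in prices.items():
    summary[city] = {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
    }
    print(city, summary[city])
```

All four cities share the same mean and median, yet the spread of prices grows from city A to city D, which is exactly what a measure of central tendency alone cannot show.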

Dispersion or variation may be defined as the extent to which the values are scattered around a
measure of central tendency. Thus a measure of dispersion tells us the extent to which the values of
a variable vary about the measure of central tendency.

3.2.1. Objectives of Measures of Variation

1. To have an idea about the reliability of the measure of central tendency. If the degree
of dispersion is large, an average is less reliable. If the value of the dispersion is small, it
indicates that a central value is a good representative of all the values in the data set.

2. To compare two or more sets of data with regard to their variability. Two or more
data sets can be compared by calculating the same measure of dispersion having the same
unit of measurement. A set with a smaller value possesses less variability or is more uniform
(or more consistent).
3. To provide information about the structure of the data. A value of a measure of
dispersion gives an idea about the spread of the observations. Further, one can surmise
about the limits of the expansion of the values in the data set.
4. To pave way to the use of other statistical measures. Measures of dispersion,
especially variance and standard deviation, lead to many statistical techniques like
correlation, regression, analysis of variance.

3.2.2. Types of Measures of Variation

Absolute measures of variation: A measure of variation is said to be in absolute form when it
shows the actual amount of variation of an item from a measure of central tendency and is
expressed in the concrete units in which the data have been expressed.
Relative measure of variation: It is the quotient obtained by dividing the absolute measure by the
quantity with respect to which the absolute deviation has been computed. A relative measure of variation
is a pure number and is used for making comparisons between different distributions.

Absolute Measures                  Relative Measures

Range                              Coefficient of Range
Quartile Deviation                 Coefficient of Quartile Deviation
Mean Deviation                     Coefficient of Mean Deviation
Variance and Standard Deviation    Coefficient of Variation
                                   Standard Scores

1. Range

It is the simplest and crudest measure of dispersion. Range is defined as the difference between
the largest and the smallest values in the data.

Ungrouped Data: R = L - S, where L is the largest value and S is the smallest value.

Grouped Data: R = UCLlast - LCLfirst

Coefficient of Range (CR)

For raw data: $CR = \frac{L - S}{L + S}$

For grouped data: $CR = \frac{UCL_{last} - LCL_{first}}{UCL_{last} + LCL_{first}}$

Range hardly satisfies any property of a good measure of dispersion, as it is based on the two
extreme values only, ignoring the others. It is not amenable to further algebraic treatment.

2. Quartile Deviation

• Sometimes known as the Semi-interquartile Range (SIR)

• Interquartile Range = Q3 - Q1

$$QD = \frac{Q_3 - Q_1}{2}$$

$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

QD involves only the middle 50% of the observations by excluding the observations below the
lower quartile and the observations above the upper quartile. Note that QD does not take into
account all the individual values occurring between Q1 and Q3. It means that no idea about the
variation of even the middle 50% of values is available from this measure. Anyhow, it provides some idea
if the values are uniformly distributed between Q1 and Q3. It can be calculated for open-ended
classes.

3. Mean Deviation

It is the arithmetic mean of the absolute values of the deviations from some measure of central
tendency, usually the mean or the median of a distribution. Hence we have the mean deviation
about the mean, $MD(\bar{X})$, and the mean deviation about the median, $MD(\tilde{X})$.

Ungrouped Data: $MD(\bar{X}) = \frac{\sum |X - \bar{X}|}{n}$ and $MD(\tilde{X}) = \frac{\sum |X - \tilde{X}|}{n}$

Grouped Data: $MD(\bar{X}) = \frac{\sum f|X - \bar{X}|}{\sum f}$ and $MD(\tilde{X}) = \frac{\sum f|X - \tilde{X}|}{\sum f}$

Coefficient of Mean Deviation

$CMD(\bar{X}) = \frac{MD(\bar{X})}{\bar{X}}$ and $CMD(\tilde{X}) = \frac{MD(\tilde{X})}{\tilde{X}}$

MD is less affected by extreme values than the variance and standard deviation. Its main drawback
is that the algebraic negative signs of the deviations are ignored. MD is minimum when the
deviations are taken from the median.
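
To see the two mean deviations side by side, the sketch below reuses the seven high temperatures from Example 1 of the arithmetic mean section (29, 31, 28, 32, 29, 27 and 55). The helper name mean_deviation is our own.

```python
from statistics import mean, median

def mean_deviation(values, center):
    """Average absolute deviation of the values from a chosen center."""
    return sum(abs(x - center) for x in values) / len(values)

temps = [29, 31, 28, 32, 29, 27, 55]

center_mean = mean(temps)       # 33
center_median = median(temps)   # 29

md_mean = mean_deviation(temps, center_mean)
md_median = mean_deviation(temps, center_median)

# Coefficients of mean deviation (relative measures).
cmd_mean = md_mean / center_mean
cmd_median = md_median / center_median

print(round(md_mean, 2), round(md_median, 2))
print(round(cmd_mean, 3), round(cmd_median, 3))
```

The mean deviation about the median comes out smaller than the one about the mean, in line with the statement that MD is minimum when deviations are taken from the median.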

4. Variance and Standard Deviation

The Variance and Standard Deviation are the most widely used measures of
dispersion, and both measure the average dispersion of the observations around the mean.
For a population containing N elements, the population variance ($\sigma^2$) is calculated by using the
formula

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

for ungrouped data and

$$\sigma^2 = \frac{\sum f(X - \mu)^2}{\sum f}$$

for grouped data, where $\mu$ is the population mean.

For a sample of n elements, the sample variance ($S^2$) is calculated by using the formula

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

for ungrouped data and

$$S^2 = \frac{\sum f(X - \bar{X})^2}{\sum f - 1}$$

for grouped data.

The first main demerit of variance is that its unit is the square of the unit of measurement of the
variable values. For example, the sample variance of 2m, 6m and 4m is 4m². Interpreting this as
"on average, each value differs from the mean by 4m²" is wrong for two reasons: first, the unit of
measurement of the variance is not the same as that of the data set; second, the
variation of the data is exaggerated from two to four, since the deviations are squared.
Thus the other disadvantage of variance is that the variation of the data is exaggerated because the
deviation (difference) of each value from the mean is squared. It also gives more weight to
extreme values as compared to those which are near to the mean value.
Standard Deviation: Standard deviation is the positive square root of variance.

Population Standard Deviation: $\sigma = \sqrt{\sigma^2}$

Sample Standard Deviation: $S = \sqrt{S^2}$

Standard deviation is considered to be the best measure of dispersion because the unit of
measurement is the same as the data set and the exaggeration made by variance will be
eliminated by taking the square root of it.
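
The small example of 2m, 6m and 4m can be worked through in Python directly from the definitions. The sketch computes the sample variance (divisor n - 1) and also the population variance (divisor N), the latter under the assumption that the three values are treated as a whole population.

```python
import math

values = [2, 6, 4]   # lengths in metres
n = len(values)
mean = sum(values) / n

# Squared deviations of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]

sample_variance = sum(squared_deviations) / (n - 1)   # in square metres
sample_sd = math.sqrt(sample_variance)                # back in metres

# If the three values were the whole population, divide by N instead.
population_variance = sum(squared_deviations) / n
population_sd = math.sqrt(population_variance)

print(mean, squared_deviations)
print(sample_variance, sample_sd)
print(round(population_variance, 3), round(population_sd, 3))
```

Taking the square root brings the measure back to the unit of the data, which is why the standard deviation is easier to interpret than the variance.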

If the standard deviation of the data is small, the values are concentrated near the mean, and if it is
large, the values are scattered away from the mean.
Interpretation of the Standard Deviation
If the data are a sample and the distribution is normal (bell-shaped) or
approximately normal, then the following conclusions can be reached:

• Approximately 68% of the scores in the sample fall within one standard deviation of the mean, i.e.
$\bar{X} \pm S$ will include approximately 68% of the data.
• Approximately 95% of the scores in the sample fall within two standard deviations of the mean,
i.e. $\bar{X} \pm 2S$ will include approximately 95% of the data.
• Approximately 99.7% of the scores in the sample fall within three standard deviations of the mean,
i.e. $\bar{X} \pm 3S$ will include approximately 99.7% of the data.

Even if standard deviation is better than variance, there is however one difficulty with it. If there
are two or more distributions of different variables (having different units of measurement), their
variability cannot be compared by comparing the values of the standard deviation.

Examples:

1) Compute the variance ($S^2$) and standard deviation ($S$) for the following data: 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and 21.

$$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n - 1} = \frac{2926 - (176)^2 / 11}{10} = 11$$

So, $S = \sqrt{S^2} = \sqrt{11} \approx 3.317$
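The arithmetic in Example 1 can be checked with a short script. This is only a sketch of the shortcut formula above; the variable names are illustrative and not part of the course material.

```python
import math
import statistics

data = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]

n = len(data)
sum_x = sum(data)                   # sum of the observations
sum_x2 = sum(x * x for x in data)   # sum of the squared observations

# Shortcut formula: S^2 = (sum x^2 - (sum x)^2 / n) / (n - 1)
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
std_dev = math.sqrt(variance)

print(sum_x, sum_x2, variance, round(std_dev, 3))
# Cross-check with the standard library's sample variance
print(statistics.variance(data))
```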
2) Compute the variance and standard deviation for the data given below.

| Observation ($X_i$) | 32 | 36 | 40 | 44 | 48 | Total |
|---|---|---|---|---|---|---|
| Frequency ($f_i$) | 2 | 5 | 8 | 4 | 1 | 20 |
| $f_i X_i$ | 64 | 180 | 320 | 176 | 48 | 788 |
| $f_i X_i^2$ | 2048 | 6480 | 12800 | 7744 | 2304 | 31376 |

$$S^2 = \frac{\sum f_i x_i^2 - \left(\sum f_i x_i\right)^2 / \sum f_i}{\sum f_i - 1} = \frac{31376 - (788)^2 / 20}{19} = 17.31$$

So, $S = \sqrt{S^2} = \sqrt{17.31} = 4.16$

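3) Calculate the variance and standard deviation for the following grouped frequency distribution.

| Class intervals | Frequency ($f_i$) | $m_i$ | $f_i m_i$ | $f_i m_i^2$ |
|---|---|---|---|---|
| 1-3 | 1 | 2 | 2 | 4 |
| 3-5 | 9 | 4 | 36 | 144 |
| 5-7 | 25 | 6 | 150 | 900 |
| 7-9 | 35 | 8 | 280 | 2240 |
| 9-11 | 17 | 10 | 170 | 1700 |
| 11-13 | 10 | 12 | 120 | 1440 |
| 13-15 | 3 | 14 | 42 | 588 |
| Total | 100 | | 800 | 7016 |

$$S^2 = \frac{\sum f_i m_i^2 - \left(\sum f_i m_i\right)^2 / \sum f_i}{\sum f_i - 1} = \frac{7016 - (800)^2 / 100}{99} = 6.22$$

So, $S = \sqrt{6.22} = 2.49$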
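Examples 2 and 3 use the same frequency-weighted shortcut formula, so one small helper covers both. The function name `freq_variance` is hypothetical; for grouped data the class midpoints stand in for the observations, as in the worked solution.

```python
import math

def freq_variance(values, freqs):
    """Sample variance from a frequency table, using the shortcut formula."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

# Example 2: ungrouped frequency distribution
var2 = freq_variance([32, 36, 40, 44, 48], [2, 5, 8, 4, 1])

# Example 3: grouped data, with class midpoints m_i used as the values
var3 = freq_variance([2, 4, 6, 8, 10, 12, 14], [1, 9, 25, 35, 17, 10, 3])

print(round(var2, 2), round(math.sqrt(var2), 2))
print(round(var3, 2), round(math.sqrt(var3), 2))
```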
Properties of Variance and Standard Deviation

1. The variance and standard deviation are always non-negative.

2. If every value is multiplied by a constant C, the new variance is $S^2_{new} = C^2 S^2_{old}$ and the new standard deviation is $S_{new} = |C| S_{old}$.

3. When a constant C is added to (or subtracted from) every value, the standard deviation and variance remain the same.
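Properties 2 and 3 can be illustrated numerically. The data set and the constant below are made up purely for illustration; a negative constant is chosen to show that the standard deviation scales by the absolute value of C.

```python
import statistics

data = [11, 12, 13, 14, 15]   # hypothetical sample
c = -3                        # hypothetical constant

scaled = [c * x for x in data]    # every value multiplied by c
shifted = [x + c for x in data]   # c added to every value

var_old = statistics.variance(data)
sd_old = statistics.stdev(data)

# Property 2: variance scales by c^2, standard deviation by |c|
print(statistics.variance(scaled), c ** 2 * var_old)
print(statistics.stdev(scaled), abs(c) * sd_old)

# Property 3: adding a constant leaves the variance unchanged
print(statistics.variance(shifted), var_old)
```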


5. Coefficient of Variation

All absolute measures of dispersion have units. If two or more distributions differ in their units of measurement, their variability cannot be compared by any of the absolute measures given before. Also, the size of these measures of dispersion depends on the size of the values: if the values are larger, the absolute measures will also be larger. Hence, in situations where two or more data sets have different units of measurement, or their means differ sufficiently in size, absolute measures fail to be appropriate.
The coefficient of variation (CV) is a relative measure of dispersion. It is the ratio of the standard deviation to the mean, expressed as a percentage.

$$CV = \frac{\sigma}{\mu} \times 100\%, \text{ for population}$$

$$CV = \frac{S}{\bar{X}} \times 100\%, \text{ for sample}$$
It is used for comparing the variability of two or more distributions. The distribution having less
CV is said to be less variable or more consistent or more uniform.
Since absolute measures depend on the units of measurement of the data, they fail to be
appropriate for comparing two or more groups if
1. The groups have different units of measurement.
2. The size of the data between the groups is not the same.
When either of these two conditions holds, we have to use relative measures of variation. CV is a unitless measure of variation that also takes into account the size of the means of the distributions.
EX: Given Data Set A: 2 Meters, 4 Meters, 6 Meters
Data Set B: 1000 Liters, 800 Liters, 900Liters
Compare the variability of the two data sets using standard deviation and coefficient of variation.
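One way to work this exercise is sketched below, treating both data sets as samples (an assumption, since the text does not say). The helper name `coefficient_of_variation` is illustrative.

```python
import statistics

def coefficient_of_variation(sample):
    """Sample CV in percent: (S / mean) * 100."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100

set_a = [2, 4, 6]            # meters
set_b = [1000, 800, 900]     # liters

sd_a, sd_b = statistics.stdev(set_a), statistics.stdev(set_b)
cv_a, cv_b = coefficient_of_variation(set_a), coefficient_of_variation(set_b)

print(sd_a, sd_b)                      # standard deviations, in different units
print(round(cv_a, 2), round(cv_b, 2))  # unitless percentages
```

The standard deviations (in meters and in liters) cannot be compared directly, while the CVs suggest that Data Set A is relatively more variable than Data Set B.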
6. Standard Score(Z-score)
It is used to determine how many standard deviations a given value lies above or below the mean, according to whether the z-score is positive or negative.

$$Z = \frac{X - \mu}{\sigma}, \text{ for population}$$

$$Z = \frac{X - \bar{X}}{S}, \text{ for sample}$$


Example: Suppose Ablakat scored 90 on a basic statistics test in which the mean and standard deviation of the class were 70 and 10 respectively. In the second test, Meklit scored 60, where the mean and standard deviation of the class were 56 and 4 respectively. Who is better off relative to her class?

Solution:
Ablakat: $Z = \frac{90 - 70}{10} = 2.0$; Meklit: $Z = \frac{60 - 56}{4} = 1.0$
The score of Ablakat (90) is 2 standard deviations above the mean of her class, whereas the score of Meklit (60) is 1 standard deviation above the mean of her class. This implies that Ablakat's score is the better relative score.
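The same comparison, written as a tiny helper (the name `z_score` is illustrative):

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

z_ablakat = z_score(90, 70, 10)
z_meklit = z_score(60, 56, 4)

print(z_ablakat, z_meklit)
print("Ablakat" if z_ablakat > z_meklit else "Meklit", "has the better relative score")
```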

3.2.3. Statistical Measures of Association

A measure of association, in statistics, is any of various factors or coefficients used to quantify a relationship between two or more variables. Measures of association are used in various fields of research but are especially common in the areas of epidemiology and psychology, where they frequently are used to quantify relationships between exposures and diseases or behaviors. A measure of association may be determined by any of several different analyses, including correlation analysis and regression analysis.

Although the terms correlation and association are often used interchangeably, correlation in a stricter sense refers to linear correlation, and association refers to any relationship between variables. The method used to determine the strength of an association depends on the characteristics of the data for each variable. Data may be measured on an interval/ratio scale, an ordinal/rank scale, or a nominal/categorical scale. These three characteristics can be thought of as continuous, integer, and qualitative categories, respectively.

Statistical Methods for Exploring Relationship among Variables

In this lesson we will deal with bivariate data, i.e. data involving two variables.


1. Simple Linear Regression

Regression may be defined as the estimation of the unknown value of one variable from the known values of one or more variables. The variable whose values are to be estimated is known as the dependent or explained variable, while the variables which are used in determining the value of the dependent variable are called independent or predictor variables.

The regression study that involves only two variables is called simple regression and the regression
analysis that studies more than two variables is called multiple regression. If the relationship
between the two variables can be described by a straight line then the regression is known as linear
regression otherwise it is called non-linear.

The regression analysis involving only two variables and having a linear relationship is called
Simple Linear Regression. This linear relationship between the two variables is represented by a
straight line.

Regression Line (Line of Regression): is the line that gives the best estimate of one variable for
any given value of another variable. The regression line which is used to estimate the values of Y for
any given value of X is called regression line of Y on X.

Regression Equation: is a mathematical equation that defines the relationship between two
variables.

Regression of Y on X

Model: $Y = \alpha + \beta X + \varepsilon$

Where Y is the dependent variable
X is the independent variable
α is the intercept
β is the slope
ε is the error term

Its parameters are interpreted as follows:

• α is the value of the dependent variable when the value of the independent variable is zero.
• β is the change in the value of the dependent variable when the value of the independent variable increases by 1 unit. There is a direct linear relationship between the two variables if β is positive, an inverse (indirect) linear relationship if β is negative, and no linear relationship if β is zero.


a) Method of Estimation

The objective in the above model is to estimate the regression parameters (α and β) using the sample data. The most common and widely used method of estimation is called Ordinary Least Squares (OLS), which minimizes the sum of squared errors.

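The estimated regression model is, therefore,

$$\hat{Y} = \hat{\alpha} + \hat{\beta} X$$

where $\hat{Y}$ is the estimated value of the dependent variable,
$X$ is the actual value of the independent variable,
$\hat{\alpha}$ is the estimated intercept, and
$\hat{\beta}$ is the estimated slope.

The estimates of the parameters can be obtained as:

$$\hat{\beta} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}, \quad \text{and} \quad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$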
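A minimal sketch of these two estimators in code is given below. The function name `ols_fit` is hypothetical, and the data are the supply and sales figures from the worked example later in this chapter.

```python
def ols_fit(xs, ys):
    """Return (alpha_hat, beta_hat) for the least-squares line Y = alpha + beta * X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    beta_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    alpha_hat = sum_y / n - beta_hat * sum_x / n   # Y-bar minus beta-hat times X-bar
    return alpha_hat, beta_hat

supply = [60, 62, 65, 70, 73, 75, 71]
sales = [10, 11, 13, 15, 16, 19, 14]

alpha_hat, beta_hat = ols_fit(supply, sales)
print(round(alpha_hat, 2), round(beta_hat, 2))
```

Computed without intermediate rounding, the intercept comes out near -20.69; the hand solution reports -20.68 because it rounds the slope to 0.51 before computing the intercept.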

2. Correlation

Most variables in economics and business are related to one another, for example price and supply, income and expenditure, or advertising expenditure and sales. To know the degree and direction of such a relationship between variables, correlation analysis is important. Correlation is a statistical tool designed to measure the degree of relationship (degree of association) between variables. If changes in one variable are accompanied by changes in the other variable, then the variables are correlated. Correlation that involves only two variables is called simple correlation.

Covariance: is a measure of the joint variation between two variables, i.e. it measures the way in
which the values of the two variables vary together. If the covariance is zero, there is no linear
relationship between the two variables.

If it is negative, there is an indirect linear relationship between them. If the covariance is positive,
there is a direct linear relationship between the variables. The sample covariance between two
variables is defined as:


$$S_{xy} = \frac{1}{n - 1}\left[\sum XY - \frac{\sum X \sum Y}{n}\right]$$


Pearson’s Coefficient of Correlation (r)

The coefficient of correlation is a measure of the degree or strength of the linear association between
two variables. It is defined as a ratio of the covariance between the two variables and the product of
the standard deviations of the two variables. The sample correlation coefficient is denoted by r and
the population correlation coefficient is denoted by ρ.

$$r = \frac{S_{xy}}{S_x S_y} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - \left(\sum X\right)^2}\,\sqrt{n\sum Y^2 - \left(\sum Y\right)^2}}$$

The value of r is always in between -1 and 1.
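The definition of r as covariance divided by the product of the standard deviations can be sketched as follows. The helper names are illustrative, and the data are the supply and sales figures used in the worked example later in this chapter.

```python
import math

def covariance(xs, ys):
    """Sample covariance: (sum XY - sum X * sum Y / n) / (n - 1)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)

def pearson_r(xs, ys):
    """r = S_xy / (S_x * S_y); the variance of X is covariance(X, X)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

supply = [60, 62, 65, 70, 73, 75, 71]
sales = [10, 11, 13, 15, 16, 19, 14]

r = pearson_r(supply, sales)
print(round(r, 4), round(r ** 2, 3))
```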

Interpretation of r: The value of the correlation coefficient can be positive, zero or negative, depending on the sign of the covariance between the two variables. However, it always lies between -1 and +1; that is, -1 ≤ r ≤ 1.

• If the value of r is -1 or +1, there is a perfect negative or perfect positive linear relationship between the variables, respectively.
• If the value of r is close to -1 or +1, there is a strong negative or strong positive linear relationship between the variables, respectively.
• If r is approximately -0.5 or 0.5, there is a moderate negative or moderate positive linear relationship between the variables, respectively.
• If the value of r is near zero, there is little or no linear relationship between the two variables.

Coefficient of determination ($r^2$)

So far, we were concerned with the problem of estimating the parameters of the regression model and the correlation coefficient between two variables. We now consider the goodness of fit of the estimated model to a set of data; that is, we shall find out how "well" the estimated model fits the data.

The coefficient of determination tells how well the estimated model fits the data. For simple linear regression (the two-variable case), it is defined as the square of the sample correlation coefficient and is denoted by $r^2$. Hence $r^2$ measures the proportion or percentage of the variation in the dependent variable explained by the independent variable. $r^2$ is a non-negative quantity that lies between 0 and 1, i.e., $0 \le r^2 \le 1$. A value close to 1 indicates a good fit, and a value close to 0 indicates that the model explains almost none of the variation in the dependent variable.


Examples:

a. Given the following data on supply (X) and sales (Y) of a certain commodity

| Supply (X) | 60 | 62 | 65 | 70 | 73 | 75 | 71 |
|---|---|---|---|---|---|---|---|
| Sales (Y) | 10 | 11 | 13 | 15 | 16 | 19 | 14 |

a) Estimate the regression equation of sales on supply and interpret the coefficients.
b) Calculate the correlation coefficient between supply and sales, and interpret it.
c) Find the coefficient of determination and interpret it.
d) Predict the amount of sales of the commodity if the supply amount is 80.

b. The following summary results are obtained from price and demand of a
commodity

∑price = 30, ∑demand = 40, ∑(price)(demand) = 214, ∑(price)² = 220, ∑(demand)² = 340, n = 5

a) Identify the dependent and independent variable.

b) Estimate the regression equation.
c) Interpret the estimated coefficients.
d) Calculate the correlation coefficient between price and demand, and interpret it.
e) Find the coefficient of determination and interpret it.

c. Given $n = 25$, $\bar{X} = 3.95$, $\bar{Y} = 2.03$, $S_x^2 = 85.35$, $S_y^2 = 98.75$, $S_{xy} = 90$

a) Fit the regression equation Y on X.

b) Interpret the estimated coefficients.
c) Calculate the correlation coefficient and interpret it.
d) Find the coefficient of determination and interpret it.

Solution: 1

$n = 7$, $\sum X = 476$, $\sum Y = 98$, $\sum XY = 6764$, $\sum X^2 = 32564$, and $\sum Y^2 = 1428$

a) $\hat{Y} = \hat{\alpha} + \hat{\beta}X$

$$\hat{\beta} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} = \frac{7(6764) - (476)(98)}{7(32564) - (476)^2} = 0.51$$

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 14 - 0.51(68) = -20.68$$

$$\hat{Y} = -20.68 + 0.51X$$

b) $$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - \left(\sum X\right)^2}\,\sqrt{n\sum Y^2 - \left(\sum Y\right)^2}} = 0.9545$$

c) Coefficient of determination $= r^2 = 0.911$

d) $\hat{Y} = -20.68 + 0.51 \times 80 = 20.12$
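Example b gives only summary totals, so the same formulas can be applied directly to the sums. The sketch below assumes price is the independent variable (X) and demand the dependent variable (Y), which is the usual reading but is not stated in the exercise.

```python
import math

# Summary results from example b (price = X, demand = Y is an assumption)
n = 5
sum_x, sum_y = 30, 40
sum_xy, sum_x2, sum_y2 = 214, 220, 340

sxy = n * sum_xy - sum_x * sum_y      # numerator shared by beta-hat and r
sxx = n * sum_x2 - sum_x ** 2
syy = n * sum_y2 - sum_y ** 2

beta_hat = sxy / sxx
alpha_hat = sum_y / n - beta_hat * sum_x / n
r = sxy / math.sqrt(sxx * syy)

print(alpha_hat, beta_hat)            # intercept and slope
print(round(r, 4), round(r ** 2, 3))  # correlation and coefficient of determination
```

Under this assumption the fitted line has a negative slope, so demand falls as price rises, and the correlation is strongly negative.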

3. Logistic Regression

Logistic regression analysis studies the association between a categorical dependent variable and a set of independent (explanatory) variables. The name logistic regression is used when the dependent variable has only two values, such as 0 and 1 or Yes and No. The name multinomial logistic regression is usually reserved for the case when the dependent variable has three or more unique values, such as Married, Single, Divorced, or Widowed. Although the type of data used for the dependent variable is different from that of multiple regression, the practical use of the procedure is similar.

When we want to look at a relationship between a categorical dependent variable and a set of explanatory variables (one or more), we can use the logistic regression framework. Multiple linear regression may be used to investigate the relationship between explanatory variables and a continuous dependent variable, such as income, blood pressure or examination score. However, socio-economic variables are very often categorical, rather than interval scale. In many cases research focuses on models where the dependent variable is categorical. For example, the dependent variable might be 'unemployed' or 'not', and we could be interested in how this variable is related to sex, age, ethnic group, etc. In this case we could not carry out a multiple linear regression as many of the assumptions of this technique will not be met, as will be explained theoretically below. Instead we would carry out a logistic regression.

A categorical explanatory variable with two categories can be included in the model as a single binary (0/1) indicator. A categorical explanatory variable with more than two categories is included through a set of indicator (dummy) variables, one for each category except a reference category. For example, suppose that one of the explanatory variables is marital status with three categories: "Single", "Married", "Separated".


4. Chi-square test for independence

The chi-square distribution takes only non-negative values and is skewed to the right. We use the chi-square distribution when we analyse categorical data. The chi-square test can be used to test the association between two variables and to test goodness of fit.

Test of association

Example: A researcher wishes to determine whether there is a relationship between the gender of an
individual and the amount of alcohol consumed. A random sample of 68 people was selected and the
following data were obtained.


CHAPTER FOUR

4. Probability and Probability Distribution

Introduction

As a general concept, probability is the measure of a chance that something will occur. It is a
numerical measure with a value between 0 (0%) and 1 (100%) where the probability of 0 indicates
that the given event cannot occur and a probability of 1(100%) assures certainty of such an
occurrence.

Introduction to Set

Set: a collection of elements or objects of interest.

• Empty set (denoted by ∅ or {}): a set containing no element.
• Universal set (denoted by S): a set containing all possible elements.
• Complement (Not): the complement of a set A is A′, a set containing all elements of S that are not in A.
• Intersection (And) (A∩B): a set containing all elements that are in both A and B.
• Union (Or) (A∪B): a set containing all elements in A or B or both.
• Mutually exclusive or disjoint sets: sets having no element in common, i.e. whose intersection is the empty set: A∩B = ∅.
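These set operations map directly onto Python's built-in `set` type. The particular sets below are made up for illustration, with S playing the role of the universal set.

```python
S = {1, 2, 3, 4, 5, 6}   # universal set (hypothetical example)
A = {2, 4, 6}
B = {1, 2, 3}
C = {1, 3, 5}

complement_A = S - A       # elements of S that are not in A
intersection = A & B       # elements in both A and B
union = A | B              # elements in A or B or both
disjoint = A.isdisjoint(C) # True when the intersection is the empty set

print(complement_A, intersection, union, disjoint)
```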

Basic Concepts and Terms

1. Experiment: an activity or a trial that leads to well-defined results called outcomes, where it is uncertain which result will occur.
2. Outcome: a particular result of an experiment.


3. Sample space: the set of all possible outcomes of the experiment. Each possible outcome is called a sample point. The sample space is denoted by S.
Examples: Define the sample space for the following probability experiments.
• Tossing a coin: S = {H, T}
• Tossing two coins: S = {HH, HT, TH, TT}
• Rolling a die: S = {1, 2, 3, 4, 5, 6}
4. Event: An event is a subset of the sample space. In other words, an event is a set containing sample points of the sample space under consideration.
Example: If we roll a fair die, then the experiment is rolling the die.
The sample space S for this experiment is
S = {1, 2, 3, 4, 5, 6}
If we are interested in the outcomes that are even numbers, then the event of interest is E = {2, 4, 6}.
Elementary or simple event: An event having only one sample point is an elementary or simple event.
Mutually exclusive events: Two events E1 and E2 are said to be mutually exclusive if there is no sample point common to both E1 and E2. That means E1 ∩ E2 = ∅. Mutually exclusive events are events which cannot happen at the same time.
Example: Consider the experiment of tossing two coins. Let E1 be the event that no heads are shown, E2 the event that one head is shown and E3 the event that two heads are shown. Are E1, E2 and E3 mutually exclusive?
Solution
S = {HH, HT, TH, TT}
E1 = {TT}
E2 = {HT, TH}
E3 = {HH}
E1 ∩ E2 = E2 ∩ E3 = E1 ∩ E3 = ∅
Thus, E1 and E2, E2 and E3, and E1 and E3 are mutually exclusive events.
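The two-coin example can be reproduced by enumerating the sample space and checking that the events share no sample points. This is a small sketch; the variable names follow the example.

```python
from itertools import product

# Sample space for tossing two coins: HH, HT, TH, TT
S = {"".join(outcome) for outcome in product("HT", repeat=2)}

E1 = {o for o in S if o.count("H") == 0}   # no heads
E2 = {o for o in S if o.count("H") == 1}   # exactly one head
E3 = {o for o in S if o.count("H") == 2}   # two heads

# Pairwise mutually exclusive means every pairwise intersection is empty
mutually_exclusive = (
    E1.isdisjoint(E2) and E2.isdisjoint(E3) and E1.isdisjoint(E3)
)
print(sorted(S), E1, sorted(E2), E3, mutually_exclusive)
```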
Independent events: Two events E1 and E2 are said to be independent if the occurrence of E1 has no effect on the occurrence of E2. That means knowing that event E1 has occurred gives no information about the occurrence of event E2. If two events are not independent, they are said to be dependent.


Equally likely outcomes: If each outcome in the sample space of an experiment has the same chance of occurring, then we say that the outcomes are equally likely. Example: In throwing a fair die, all possible outcomes are equally likely. That means the elements of the sample space have the same chance to occur.

4.1. PROBABILITY DISTRIBUTIONS

4.2. Random Variable

Random Variable: a variable whose values are determined by chance or with some probability. It is denoted by a capital letter. The set consisting of all possible values of a random variable is called its range space (Rx).
Discrete random variable: a random variable X whose number of possible values (that is, Rx) is finite or countably infinite.
Continuous random variable: a random variable that assumes an uncountably infinite number of possible values.

4.3. Probability Distribution

Probability Distribution is a listing of all possible values of a random variable together with their
corresponding probabilities. Based on the type of a random variable, a probability distribution can be
discrete or continuous.

4.3.1. Discrete Probability Distribution

With each possible value $x_i$ of a discrete random variable, a number $p(x_i) = P(X = x_i)$, called the probability of $x_i$, is associated. The numbers $p(x_i)$, $i = 1, 2, \ldots$ must satisfy the following conditions:

$$0 \le p(x_i) \le 1$$

$$\sum_i P(X = x_i) = 1$$

The function p defined above is called the probability mass function (pmf) of the random variable X. The collection of pairs $(x_i, p(x_i))$, $i = 1, 2, \ldots$ is called the probability distribution of X.

Examples:


1. Construct a probability distribution for the number of heads observed in tossing a coin two
times.
2. Construct a probability distribution for the number of heads observed in tossing a coin three
times.
3. Construct a probability distribution for the number of girls if a family plans to have four
children.

Solutions:

1. S = {HH, HT, TH, TT}

Let X be the number of heads observed in tossing a coin two times. Rx = {0, 1, 2}

| x | 0 | 1 | 2 | Total |
|---|---|---|---|---|
| P(X = x) | 1/4 | 2/4 | 1/4 | 1 |

2. S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Let X be the number of heads observed in tossing a coin three times. Rx = {0, 1, 2, 3}

| x | 0 | 1 | 2 | 3 | Total |
|---|---|---|---|---|---|
| P(X = x) | 1/8 | 3/8 | 3/8 | 1/8 | 1 |
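These tables can be generated by enumerating the equally likely outcomes, which also gives a way to approach the four-children exercise. The sketch below assumes each child is equally likely to be a girl or a boy, independently of the others; the function name `count_distribution` is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def count_distribution(n_trials, symbols="HT", target="H"):
    """Probability distribution of the number of `target` results in n_trials
    independent trials, assuming all sequences are equally likely."""
    outcomes = list(product(symbols, repeat=n_trials))
    counts = Counter(o.count(target) for o in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

heads_2 = count_distribution(2)
heads_3 = count_distribution(3)
girls_4 = count_distribution(4, symbols="GB", target="G")  # G = girl, B = boy

for name, dist in [("2 tosses", heads_2), ("3 tosses", heads_3), ("4 children", girls_4)]:
    print(name, {x: str(p) for x, p in dist.items()}, "total =", sum(dist.values()))
```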

4.3.2. Continuous Probability Distribution

A continuous probability distribution is represented by a probability density function (pdf) having the following characteristics. Suppose X is continuous on an interval [a, b].
i. $f(x) \ge 0$ for all $x \in (a, b)$
ii. $\int_a^b f(x)\,dx = 1$
iii. $P(c \le X \le d) = \int_c^d f(x)\,dx$ for any $a \le c \le d \le b$

Examples:
1. Show that each of the following functions is a pdf.
a. $f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
b. $f(x) = \begin{cases} e^{-x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
2. Find the value of b for the following function to be a pdf.
$f(x) = \begin{cases} b x^2, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$
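A worked sketch of these exercises follows. It assumes that the function in 1b is $e^{-x}$ for $x \ge 0$ (the signs are not legible in the source) and that the function in 2 is $bx^2$ on $[0, 1]$. In each case the function is non-negative, so only the total area needs checking.

```latex
\begin{align*}
\text{1a: } & \int_0^1 1\,dx = 1 \\
\text{1b: } & \int_0^\infty e^{-x}\,dx = \left[-e^{-x}\right]_0^\infty = 0 - (-1) = 1 \\
\text{2: }  & \int_0^1 b x^2\,dx = \frac{b}{3} = 1 \;\Rightarrow\; b = 3
\end{align*}
```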

4.4. Expectations of a Random Variable

The mean of a random variable X is known as the expected value of X, denoted by E(X). It is defined as:

$$\mu = E(X) = \begin{cases} \sum x P(x), & \text{if X is a discrete r.v.} \\ \int x f(x)\,dx, & \text{if X is a continuous r.v.} \end{cases}$$

The variance of the random variable X is the expected value of the square of the deviation of X from its mean.

$$\sigma^2 = E(X - \mu)^2 = \begin{cases} \sum (x - \mu)^2 P(x), & \text{if X is a discrete r.v.} \\ \int (x - \mu)^2 f(x)\,dx, & \text{if X is a continuous r.v.} \end{cases}$$

$$\sigma^2 = E(X - \mu)^2 = E(X - E(X))^2 = E(X^2) - (E(X))^2$$
Examples:
1. Find the mean number of heads observed in tossing a coin three times.
2. Find the average number of girls if a family plans to have four children.
3. Find the mean of the following probability distribution.
a. $f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$

Solution:

1. S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}

Let X be the number of heads observed in tossing a coin three times. Rx = {0, 1, 2, 3}

| x | 0 | 1 | 2 | 3 | Total |
|---|---|---|---|---|---|
| P(X = x) | 1/8 | 3/8 | 3/8 | 1/8 | 1 |

$$\mu = E(X) = \sum x\,p(x) = 0 \times 1/8 + 1 \times 3/8 + 2 \times 3/8 + 3 \times 1/8 = 1.5$$

4.5. Common Discrete Distributions

4.5.1. The Binomial Distribution

The binomial distribution is one of the simplest and most frequently used discrete probability distributions and is very useful in many practical situations involving either/or types of events.

Properties of Binomial Experiment

1. Each trial has only two mutually exclusive outcomes or outcomes that can be reduced to two.
One of the outcomes is labeled as Success and the other as Failure.
2. The outcome of each trial is independent.
3. The probability of Success remains the same from trial to trial.
4. The experiment (trial) is performed for fixed number of times, say n.

Let X be the number of successes. Then X follows a binomial distribution with parameters n (the number of trials performed) and p (the probability of success), written X ~ Bin(n, p). The probability of getting exactly x successes in n trials is given by:

$$P(X = x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

Where p is the probability of success
q = 1 - p is the probability of failure
n is the number of trials
x is the number of successes.
This is called the Binomial Distribution. The mean of a binomial distribution is E(X) = np and the variance is V(X) = npq.

Examples:
1. Suppose a coin is tossed 10 times. What is the probability of getting
a) Exactly 3 heads


b) No head
c) At most 3 heads
d) At least 3 heads
e) More than 3 heads
Find the average and variance of the number of heads.
2. The probability of a man kicking into the goal is 2/3. If a person kicks 5 times, what is the
probability of scoring
a) At least one goal.
b) At most 3 goals.
Find the average, variance and standard deviation of the number of goals.

Solution:

Let X be the number of heads observed in tossing a fair coin 10 times, Rx = {0, 1, 2, …, 10}

$p = P(\text{Success}) = P(\text{Head}) = 1/2$, $q = 1 - p = 1/2$

$X \sim Bin(n = 10, p = 0.50)$

$$P(X = x) = \binom{n}{x} p^x q^{n-x} = \binom{10}{x} 0.5^x\, 0.5^{10-x} = \binom{10}{x} 0.5^{10}, \quad x = 0, 1, 2, \ldots, 10$$

a) $P(X = 3) = \binom{10}{3}\left(\frac{1}{2}\right)^{10}$
b) $P(X = 0) = \binom{10}{0}\left(\frac{1}{2}\right)^{10}$
c) $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$
d) $P(X \ge 3) = P(X = 3) + P(X = 4) + \ldots + P(X = 10) = 1 - P(X < 3)$
e) $P(X > 3) = P(X = 4) + P(X = 5) + \ldots + P(X = 10) = 1 - P(X \le 3)$
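The expressions above can be evaluated numerically with the standard library. The sketch below uses a hypothetical helper `binom_pmf` that implements the binomial formula; the mean and variance use E(X) = np and V(X) = npq.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.5

p_exactly_3 = binom_pmf(3, n, p)
p_no_head = binom_pmf(0, n, p)
p_at_most_3 = sum(binom_pmf(x, n, p) for x in range(4))       # x = 0, 1, 2, 3
p_at_least_3 = 1 - sum(binom_pmf(x, n, p) for x in range(3))  # 1 - P(X < 3)
p_more_than_3 = 1 - p_at_most_3                               # 1 - P(X <= 3)

mean, variance = n * p, n * p * (1 - p)

print(p_exactly_3, p_no_head, p_at_most_3, p_at_least_3, p_more_than_3)
print(mean, variance)
```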
4.5.1.1. Application of Binomial Distribution


When is binomial probability useful?

Evaluating the binomial distribution can be essential in many practical applications. For instance, statistical analysis in computer programming, data science and business analytics may all use the binomial distribution to evaluate various outcomes. Because the binomial distribution models two distinct outcomes, it is also useful in financial analysis and forecasting. Consider several more instances when it is useful to apply binomial probability:

• Calculating the probability of obtaining a certain number of successful outcomes versus unsuccessful outcomes
• Measuring the likelihood of obtaining one result over another after a certain number of trials
• Evaluating the probability of positive versus negative results when performing regression analysis

4.5.2. The Poisson distribution

The Poisson distribution is a discrete probability distribution. It differs from the binomial distribution in the sense that it is not possible to count the number of failures even though the number of successes is known.
Properties of Poisson distribution:
1. The probability of success, p, is very small.
2. The experiment is performed indefinitely (n is very large).
3. The average number of events per unit of time (λ) is known.

Thus, the random variable X (number of successes) has a Poisson distribution with parameter λ, written X ~ Poisson(λ), and the probability of getting x successes is given by

$$P(X = x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$

where λ is the average number of events per unit of time.
If X is a Poisson random variable, then E(X) = λ and V(X) = λ.

Examples:
1. On average a typist commits 3 errors per page. Find the probability that she will make
a) No mistake.
b) More than one mistake.


2. Customers arrive at a photocopying machine at an average rate of two every 10 minutes. What is the probability that there will be
a) No arrivals during any period of ten minutes.
b) Exactly one arrival during this time period.
c) More than two arrivals during this time period.

Solution:

Let X be the number of errors committed per page, λ = 3

$$X \sim Poisson(3), \quad P(X = x) = \frac{3^x e^{-3}}{x!}$$

a) $P(X = 0) = \frac{3^0 e^{-3}}{0!} = e^{-3} \approx 0.0498$
b) $P(X > 1) = P(X = 2) + P(X = 3) + \ldots = 1 - P(X \le 1) = 1 - 4e^{-3} \approx 0.8009$
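Both Poisson examples can be evaluated with a few lines of code. The helper `poisson_pmf` is a hypothetical name for the formula above; the second example (two arrivals per ten minutes) is not solved in the text, so its values here are only a sketch of how one might proceed.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

# Example 1: typing errors, lambda = 3 per page
p_no_mistake = poisson_pmf(0, 3)
p_more_than_one = 1 - poisson_pmf(0, 3) - poisson_pmf(1, 3)

# Example 2: arrivals, lambda = 2 per ten minutes
p_no_arrival = poisson_pmf(0, 2)
p_one_arrival = poisson_pmf(1, 2)
p_more_than_two = 1 - sum(poisson_pmf(x, 2) for x in range(3))

print(round(p_no_mistake, 4), round(p_more_than_one, 4))
print(round(p_no_arrival, 4), round(p_one_arrival, 4), round(p_more_than_two, 4))
```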
4.5.2.1. Application of Poisson distribution

The Poisson distribution can be applied to several business operations that are common for companies to engage in. Analyzing operations with the Poisson distribution can provide company management with insights into levels of operational efficiency and suggest ways to increase efficiency and improve operations. Here are some of the ways that a company might use analysis with the Poisson distribution.

• Check for adequate customer service staffing. Calculate the average number of customer service calls per hour that require more than 10 minutes of handling. Then use the Poisson distribution to find the probable maximum number of calls per hour that might come in requiring more than ten minutes of handling. Assuming that this maximum number of 10+ minute calls occurs, evaluate whether customer service staffing is adequate to handle all the calls without making customers wait on hold.
• Use the Poisson formula to evaluate whether it is financially viable to keep a store open 24 hours a day. Calculate the average number of sales made by the store during the overnight shift (the period from midnight to 8 A.M.). Using the distribution formula, calculate the probable lowest number of sales that might be made during the overnight shift. Finally, determine whether that lowest probable sales figure represents sufficient revenue to cover all the costs (wages and salaries, electricity, etc.) of keeping the store open during that time period, while also providing a reasonable profit.
• Review and evaluate business insurance coverage. Determine the average number of losses or claims that occur each year and that are covered by the company's business insurance. Then do a Poisson probability calculation to determine the maximum and minimum numbers of claims that might reasonably be filed during any one year. Review the cost of your insurance and the coverage it provides. Consider whether perhaps you're overpaying, that is, paying for a coverage level that you probably don't need, given the probable maximum number of claims. Alternatively, you may find that you're underinsured: if what the Poisson distribution shows as the probable highest number of claims actually occurred one year, your insurance coverage would be inadequate to cover the losses.

4.5.3. Hyper geometric distribution

The hyper-geometric distribution is a discrete probability distribution that gives the probability of x successes (draws of an object that has some specified feature) in n draws, without replacement, from a finite population of size N that contains exactly m objects having that feature, where each draw either succeeds or fails. The hyper-geometric distribution arises when one samples from a finite population without replacement, thus making the trials dependent on each other. There are five characteristics of a hyper-geometric experiment.

1. You take samples from two groups.
2. You are concerned with a group of interest, called the first group.
3. You sample without replacement from the combined groups. For example, you want to choose a softball team from a combined group of 11 men and 13 women. The team consists of ten players.
4. Each pick is not independent, since sampling is without replacement. In the softball example, the probability of picking a woman first is 13/24. The probability of picking a man second is 11/23 if a woman was picked first. It is 10/23 if a man was picked first. The probability of the second pick depends on what happened in the first pick.
5. You are not dealing with Bernoulli Trials.

If a random variable X follows a hyper-geometric distribution, then the probability of choosing x objects with a certain feature can be found by the following formula.

$$P(X = x) = \frac{\binom{m}{x}\binom{N - m}{n - x}}{\binom{N}{n}}$$

Where:

N: population size
m: number of objects in population with a certain feature
n: sample size
x: number of objects in sample with a certain feature

Example1

There are 4 Queens in a standard deck of 52 cards. Suppose we randomly pick a card from a deck,
then, without replacement, randomly pick another card from the deck. What is the probability that
both cards are Queens? To answer this, we can use the hyper-geometric distribution with the
following parameters.

Solution

N: population size = 52 cards
M: number of objects in population with a certain feature = 4 queens
n: sample size = 2 draws
x: number of objects in sample with a certain feature = 2 queens

Plugging these numbers into the formula, we find the probability to be:

$$P(X = 2) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}} = \frac{\binom{4}{2}\binom{48}{0}}{\binom{52}{2}} = \frac{6 \times 1}{1326} \approx 0.00452$$
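
The same calculation can be reproduced in a few lines of Python. This is a sketch using only the standard library; the function name `hypergeom_pmf` is an illustrative choice, not part of the course material.

```python
from math import comb

def hypergeom_pmf(x, N, M, n):
    """P(X = x): x marked objects in n draws without replacement
    from a population of N objects, M of which are marked."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Example 1: two Queens in two cards drawn from a 52-card deck.
p_two_queens = hypergeom_pmf(x=2, N=52, M=4, n=2)
print(round(p_two_queens, 5))  # 0.00452
```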


Example 2

An urn contains 3 red balls and 5 green balls. You randomly choose 4 balls without replacement. What is the probability
that you choose exactly 2 red balls?

To answer this, we can use the hyper-geometric distribution with the following parameters:

• N: population size = 8 balls
• M: number of objects in population with a certain feature = 3 red balls
• n: sample size = 4 draws
• x: number of objects in sample with a certain feature = 2 red balls

Plugging these numbers into the formula, we find the probability to be:

$$P(X = 2) = \frac{\binom{3}{2}\binom{5}{2}}{\binom{8}{4}} = \frac{3 \times 10}{70} \approx 0.42857$$
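
As a cross-check on the urn example, the sketch below tabulates the whole distribution of the number of red balls, not only the case x = 2. It assumes draws without replacement and uses only the Python standard library; the dictionary name `pmf` is illustrative.

```python
from math import comb

N, M, n = 8, 3, 4   # urn example: 8 balls, 3 of them red, 4 drawn

# Probability of each possible number of red balls in the sample.
pmf = {x: comb(M, x) * comb(N - M, n - x) / comb(N, n)
       for x in range(0, min(M, n) + 1)}

for x, p in pmf.items():
    print(x, round(p, 5))

print(round(pmf[2], 5))  # 0.42857
```

The probabilities over all possible values of x add up to 1, as they must for any probability distribution.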
Application of Hyper-geometric Distribution

The hyper-geometric test uses the hyper-geometric distribution to measure the statistical
significance of having drawn a sample consisting of a specific number of successes, x (out of n total
draws), from a population of size N containing M successes.
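
One common way to carry out such a test is to compute the upper-tail probability, that is, the chance of drawing at least as many successes as were observed. The sketch below assumes this one-sided form; the function name `hypergeom_upper_tail` is hypothetical, and the numbers reuse the urn from Example 2.

```python
from math import comb

def hypergeom_upper_tail(x, N, M, n):
    """P(X >= x) for a hyper-geometric random variable."""
    upper = min(M, n)  # largest possible number of successes in the sample
    favourable = sum(comb(M, k) * comb(N - M, n - k)
                     for k in range(x, upper + 1))
    return favourable / comb(N, n)

# Urn from Example 2: chance of drawing at least 2 red balls in 4 draws.
p_value = hypergeom_upper_tail(x=2, N=8, M=3, n=4)
print(round(p_value, 4))  # 0.5
```

A small tail probability would suggest that the observed number of successes is unlikely under purely random sampling.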

4.6. Common Continuous Distributions

4.6.1. Uniform Distribution

The uniform distribution is a symmetric probability distribution where all outcomes have an equal
likelihood of occurring. In the continuous case, the probability density is constant over the whole range of
values, so intervals of equal width are equally probable. This distribution is also known as the rectangular
distribution because of its shape in probability distribution plots.

For a continuous uniform distribution on the interval from a to b, every value between a and b is
equally likely to occur.

Features of the Uniform Distribution

The uniform distribution gets its name from the fact that the probabilities for all outcomes are the
same. Unlike a normal distribution with a hump in the middle or a chi-square distribution, a uniform
distribution has no mode. Instead, every outcome is equally likely to occur. Unlike a chi-square
distribution, there is no skewness to a uniform distribution. As a result, the mean and
median coincide. Since every outcome in a uniform distribution occurs with the same relative
frequency, the resulting shape of the distribution is that of a rectangle.


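If a random variable X follows a uniform distribution on the interval from a to b, then the probability that X takes on a value
between x1 and x2 (where a ≤ x1 ≤ x2 ≤ b) can be found by the following formula:

$$P(x_1 < X < x_2) = \frac{x_2 - x_1}{b - a}$$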
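
For a uniform distribution on [a, b], the probability of a sub-interval is its length divided by b - a. The Python sketch below illustrates this; the function name `uniform_prob` and the bus-arrival numbers are hypothetical and serve only as an illustration.

```python
def uniform_prob(x1, x2, a, b):
    """P(x1 < X < x2) for X uniformly distributed on [a, b]."""
    if not a < b:
        raise ValueError("require a < b")
    # Clip the query interval to the support [a, b].
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0.0) / (b - a)

# Hypothetical example: a bus arrives at a uniformly random time
# between 0 and 20 minutes; chance it arrives between minute 5 and 8.
p = uniform_prob(5, 8, a=0, b=20)
print(p)  # 0.15
```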

Use of Uniform Distribution

Analysts can use the uniform distribution to approximate new processes when there is insufficient
data to estimate the actual distribution of outcomes. In other cases, analysts use this distribution
because it's a close approximation and the formula is simple.


4.6.2. Normal Distribution

The most widely used continuous probability distribution is the normal distribution. This distribution
plays a very important role in statistical theory and practice, particularly in the area of statistical
inference and statistical quality control. Its importance is due to the fact that, in practice,
experimental results very often seem to follow the normal distribution, or bell-shaped curve.
A random variable X is said to have a normal distribution if its probability density function is given
by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$$

where $\mu = E(X)$ and $\sigma^2 = \mathrm{Variance}(X)$;
$\mu$ and $\sigma^2$ are the parameters of the normal distribution.

Properties of Normal Distribution:

1. It is bell shaped, symmetrical about its mean, and mesokurtic. The maximum ordinate
is at $x = \mu$ and is given by
$$f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$$
2. It is asymptotic to the horizontal axis, i.e., it extends indefinitely in either direction from the mean.
3. It is a continuous distribution.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation defines a different
normal distribution. Thus, the normal distribution is completely described by two parameters:
mean and standard deviation.
5. It is unimodal, i.e., values mound up only in the center of the curve.
6. Mean = Median = Mode
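
Two of the properties above, the height of the peak and the symmetry about the mean, can be checked numerically. The following is a minimal Python sketch; the function name `normal_pdf` and the values of `mu` and `sigma` are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 80.0, 4.8   # illustrative parameter values

# The maximum ordinate occurs at x = mu and equals 1 / (sigma * sqrt(2*pi)).
peak = normal_pdf(mu, mu, sigma)
print(math.isclose(peak, 1 / (sigma * math.sqrt(2 * math.pi))))  # True

# Symmetry about the mean: equal heights at mu - d and mu + d.
left, right = normal_pdf(mu - 3, mu, sigma), normal_pdf(mu + 3, mu, sigma)
print(math.isclose(left, right))  # True
```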
Note: To facilitate the use of the normal distribution, the following distribution, known as the standard
normal distribution, was derived by using the transformation

$$Z = \frac{X - \mu}{\sigma} \quad\Rightarrow\quad f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}$$
Properties of the Standard Normal Distribution:

Same as a normal distribution, but also...

• Mean is zero
• Variance is one
• Standard Deviation is one
• The total area under the (standard) normal curve is 1. Hence, the areas to the right and to the left of
the center value (µ=0) of the standard normal distribution are each 0.5 (as it is symmetric about 0).
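
The transformation to Z and the areas under the standard normal curve can be sketched in Python with the error function from the standard library. The helper names `standardize` and `phi` are illustrative, and the numbers passed to `standardize` are arbitrary.

```python
import math

def standardize(x, mu, sigma):
    """Convert a value x of X ~ Normal(mu, sigma) to a z-score."""
    return (x - mu) / sigma

def phi(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(phi(0.0))                       # 0.5, half the area lies left of 0
print(standardize(87.2, 80, 4.8))     # z-score of 87.2 when mu=80, sigma=4.8
```

Standard normal tables usually list P(0 < Z < z), which corresponds to `phi(z) - 0.5` for positive z.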

Examples:

1. Find the area under the standard normal distribution which lies
a) Between $Z = 0$ and $Z = 0.96$

Solution:

$$\text{Area} = P(0 < Z < 0.96) = 0.3315$$

b) Between $Z = -1.45$ and $Z = 0$

Solution:

$$\begin{aligned}
\text{Area} &= P(-1.45 < Z < 0) \\
&= P(0 < Z < 1.45) \\
&= 0.4265
\end{aligned}$$


c) To the right of $Z = -0.35$

Solution:

$$\begin{aligned}
\text{Area} &= P(Z > -0.35) \\
&= P(-0.35 < Z < 0) + P(Z > 0) \\
&= P(0 < Z < 0.35) + P(Z > 0) \\
&= 0.1368 + 0.50 = 0.6368
\end{aligned}$$

d) To the left of $Z = -0.35$

Solution:

$$\begin{aligned}
\text{Area} &= P(Z < -0.35) \\
&= 1 - P(Z > -0.35) \\
&= 1 - 0.6368 = 0.3632
\end{aligned}$$

e) Between $Z = -0.67$ and $Z = 0.75$

Solution:

$$\begin{aligned}
\text{Area} &= P(-0.67 < Z < 0.75) \\
&= P(-0.67 < Z < 0) + P(0 < Z < 0.75) \\
&= P(0 < Z < 0.67) + P(0 < Z < 0.75) \\
&= 0.2486 + 0.2734 = 0.5220
\end{aligned}$$

f) Between $Z = 0.25$ and $Z = 1.25$

Solution:

$$\begin{aligned}
\text{Area} &= P(0.25 < Z < 1.25) \\
&= P(0 < Z < 1.25) - P(0 < Z < 0.25) \\
&= 0.3944 - 0.0987 = 0.2957
\end{aligned}$$
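
The six areas in this example can also be computed in software instead of being read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); because the table values are rounded to four decimals before being added or subtracted, the software results may differ from them in the last digit.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

areas = {
    "a": Z.cdf(0.96) - Z.cdf(0),        # between 0 and 0.96
    "b": Z.cdf(0) - Z.cdf(-1.45),       # between -1.45 and 0
    "c": 1 - Z.cdf(-0.35),              # to the right of -0.35
    "d": Z.cdf(-0.35),                  # to the left of -0.35
    "e": Z.cdf(0.75) - Z.cdf(-0.67),    # between -0.67 and 0.75
    "f": Z.cdf(1.25) - Z.cdf(0.25),     # between 0.25 and 1.25
}

for part, value in areas.items():
    print(part, round(value, 4))
```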

2. Find the value of Z if

a) The normal curve area between 0 and z (positive) is 0.4726

Solution

$P(0 < Z < z) = 0.4726$, and from the table

$$P(0 < Z < 1.92) = 0.4726 \;\Rightarrow\; z = 1.92 \quad \text{(by uniqueness of the area)}$$

b) The area to the left of z is 0.9868

Solution

$$\begin{aligned}
P(Z < z) &= 0.9868 \\
&= P(Z < 0) + P(0 < Z < z) \\
&= 0.50 + P(0 < Z < z) \\
\Rightarrow P(0 < Z < z) &= 0.9868 - 0.50 = 0.4868
\end{aligned}$$

and from the table

$$P(0 < Z < 2.22) = 0.4868 \;\Rightarrow\; z = 2.22$$
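
Finding z from a given area is the inverse of a table lookup. A minimal Python sketch, assuming `statistics.NormalDist.inv_cdf` from the standard library (Python 3.8 or later), is shown below; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# a) Area between 0 and z is 0.4726, so the area to the left of z
#    is 0.5 + 0.4726.
z_a = Z.inv_cdf(0.5 + 0.4726)

# b) Area to the left of z is 0.9868.
z_b = Z.inv_cdf(0.9868)

print(round(z_a, 2), round(z_b, 2))  # 1.92 2.22
```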

3. A random variable X has a normal distribution with mean 80 and standard deviation 4.8. What is
the probability that it will take a value

a) Less than 87.2

b) Greater than 76.4
c) Between 81.2 and 86.0


Solution

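X is normal with mean $\mu = 80$ and standard deviation $\sigma = 4.8$.

a)
$$\begin{aligned}
P(X < 87.2) &= P\left(\frac{X-\mu}{\sigma} < \frac{87.2-\mu}{\sigma}\right) \\
&= P\left(Z < \frac{87.2-80}{4.8}\right) \\
&= P(Z < 1.5) \\
&= P(Z < 0) + P(0 < Z < 1.5) \\
&= 0.50 + 0.4332 = 0.9332
\end{aligned}$$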

b)
$$\begin{aligned}
P(X > 76.4) &= P\left(\frac{X-\mu}{\sigma} > \frac{76.4-\mu}{\sigma}\right) \\
&= P\left(Z > \frac{76.4-80}{4.8}\right) \\
&= P(Z > -0.75) \\
&= P(Z > 0) + P(0 < Z < 0.75) \\
&= 0.50 + 0.2734 = 0.7734
\end{aligned}$$

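c)
$$\begin{aligned}
P(81.2 < X < 86.0) &= P\left(\frac{81.2-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{86.0-\mu}{\sigma}\right) \\
&= P\left(\frac{81.2-80}{4.8} < Z < \frac{86.0-80}{4.8}\right) \\
&= P(0.25 < Z < 1.25) \\
&= P(0 < Z < 1.25) - P(0 < Z < 0.25) \\
&= 0.3944 - 0.0987 = 0.2957
\end{aligned}$$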
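
The three probabilities for X with mean 80 and standard deviation 4.8 can be verified without standardizing by hand. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative, and the last decimal may differ slightly from the table-based answers because the table values are rounded.

```python
from statistics import NormalDist

X = NormalDist(mu=80, sigma=4.8)

p_less = X.cdf(87.2)                     # a) P(X < 87.2)
p_greater = 1 - X.cdf(76.4)              # b) P(X > 76.4)
p_between = X.cdf(86.0) - X.cdf(81.2)    # c) P(81.2 < X < 86.0)

print(round(p_less, 4), round(p_greater, 4), round(p_between, 4))
```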

4.6.2.1 Application of Normal Distribution

Companies use different statistical methodologies and calculations to help them make strategic
decisions to optimize operations and return on investment. One method of analysis employs normal
distribution charts or graphs to determine how different values in a given dataset relate to the data's
average. If you're considering a career in accounting, finance, business or analysis, understanding
how the normal distribution works is an essential skill. This section reviews which
industries use it and how it can help improve a business's decision making.


This type of distribution can help finance professionals, such as market researchers and stock market
traders, determine whether the price of an asset is fair. A price above the average indicates an
overvaluation of an asset in comparison with similar commodities or resources. When a price falls
below the average, the asset has been under-priced. Determining whether a company has
overvalued, underpriced or fairly priced an asset can help other companies and traders make effective
decisions.

Many industries and companies incorporate this type of distribution analysis into their business
decision-making processes. It can provide valuable insights into customer behaviours, market trends
and purchasing patterns. Among the industries to use this type of distribution analysis are:

• sales and marketing
• accounting
• finance and stock market analysis
• politics
• logistics
• manufacturing

4.6.3. Exponential Probability Distributions

The exponential distribution is a probability distribution that is used to model the time we must
wait until a certain event occurs.

This distribution can be used to answer questions like:

• How long does a shop owner need to wait until a customer enters his shop?
• How long will a laptop continue to work before it breaks down?
• How long will a car battery continue to work before it dies?
• How long do we need to wait until the next volcanic eruption in a certain region?

In each scenario, we're interested in calculating how long we'll have to wait until a certain event
occurs. Thus, each scenario could be modeled using an exponential distribution.

If a random variable X follows an exponential distribution with rate parameter λ > 0, then the probability density
function of X can be written as:

$$f(x; \lambda) = \lambda e^{-\lambda x}, \quad x \ge 0$$
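
The exponential density with rate λ is f(x) = λe^(-λx) for x ≥ 0 and zero otherwise. The Python sketch below evaluates this density and checks numerically that the total area under it is close to 1; the function name `exp_pdf`, the rate 0.5 and the integration settings are illustrative assumptions.

```python
import math

def exp_pdf(x, lam):
    """Exponential density: lam * exp(-lam * x) for x >= 0, else 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

lam = 0.5  # illustrative rate: on average one event every 2 time units

# Midpoint Riemann sum of the density on [0, 60]; the area beyond 60
# is negligible for this rate.
n_steps = 60_000
step = 60 / n_steps
area = sum(exp_pdf((i + 0.5) * step, lam) * step for i in range(n_steps))

print(exp_pdf(0, lam))     # 0.5, the density is highest at x = 0
print(round(area, 4))      # 1.0
```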


Properties of the Exponential Distribution

The exponential distribution has the following properties:

• Mean: 1/λ
• Variance: 1/λ²

Example 1

Suppose the mean time between eruptions for a certain geyser is 40 minutes. We
would calculate the rate as λ = 1/μ = 1/40 = 0.025.

We could then calculate the following properties for this distribution:

• Mean waiting time for next eruption: 1/λ = 1/0.025 = 40 minutes
• Variance in waiting times for next eruption: 1/λ² = 1/0.025² = 1600
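
The mean and variance for the geyser example can be compared with a quick simulation. The sketch below uses `random.expovariate` from the Python standard library; the seed and the sample size are arbitrary choices made only for this illustration.

```python
import random

mean_wait = 40.0            # minutes between eruptions (from the example)
lam = 1 / mean_wait         # rate per minute

theory_mean = 1 / lam       # theoretical mean, 1/lambda
theory_var = 1 / lam ** 2   # theoretical variance, 1/lambda^2

# Simulate waiting times and compute the sample mean and variance.
rng = random.Random(0)
waits = [rng.expovariate(lam) for _ in range(100_000)]
sample_mean = sum(waits) / len(waits)
sample_var = sum((w - sample_mean) ** 2 for w in waits) / len(waits)

print(theory_mean, theory_var)
print(round(sample_mean, 1), round(sample_var))
```

With a large sample, the simulated mean and variance should land near 40 and 1600 respectively.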

Example 2

A new customer enters a shop every two minutes, on average. After a customer arrives, find the
probability that a new customer arrives in less than one minute.

Solution: The average time between customers is two minutes. Thus, the rate can be calculated as:

• λ = 1/μ
• λ = 1/2
• λ = 0.5

We can plug λ = 0.5 and x = 1 into the cumulative distribution function:

• P(X ≤ x) = 1 - e^(-λx)
• P(X ≤ 1) = 1 - e^(-0.5(1))
• P(X ≤ 1) = 0.3935

The probability that we'll have to wait less than one minute for the next customer to arrive is 0.3935.
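
The shop example can be reproduced in Python with the cumulative distribution function P(X ≤ x) = 1 - e^(-λx). This is a minimal sketch; the variable names are illustrative.

```python
import math

mean_gap = 2.0          # average minutes between customers
lam = 1 / mean_gap      # rate: 0.5 customers per minute

# Probability that the next customer arrives within one minute.
p_within_one = 1 - math.exp(-lam * 1.0)
print(round(p_within_one, 4))  # 0.3935
```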

4.6.4. Application of Exponential Probability Distributions

Why was the exponential distribution introduced?

To predict the amount of waiting time until the next event (i.e., success, failure, arrival, etc.).
For example, we want to predict the following:

• The amount of time until the customer finishes browsing and actually purchases something in
your store (success).
• The amount of time until the hardware on AWS EC2 fails (failure).
• The amount of time you need to wait until the bus arrives (arrival).

Exponential distributions are commonly used in calculations of product reliability, or the length of
time a product lasts.
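
In reliability calculations, the quantity of interest is usually the probability that a product lasts longer than some time t, which for an exponential lifetime is P(X > t) = e^(-λt). The sketch below illustrates this; the mean lifetime of 5 years and the function name `survival_prob` are hypothetical.

```python
import math

def survival_prob(t, lam):
    """P(X > t) for an exponential lifetime with rate lam."""
    return math.exp(-lam * t) if t >= 0 else 1.0

mean_life = 5.0         # hypothetical mean product lifetime in years
lam = 1 / mean_life     # failure rate per year

# Probability that the product is still working after 3 years.
p_survive_3 = survival_prob(3.0, lam)
print(round(p_survive_3, 4))  # 0.5488
```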

There are many applications of exponential functions in business and economics. Below are
examples where an exponential function is used as a model:

• If a population's growth rate is proportional to the number in the population, then we say that the
population grows exponentially.
• If the rate of decay of a substance is proportional to the amount of substance, then the
substance will follow an exponential decay model.
• The compound interest formula is an exponential function of time.
