Statistical and Mathematical
Methods for Data Analysis
Dr. Syed Faisal Bukhari
Associate Professor
Department of Data Science
Faculty of Computing and Information Technology
University of the Punjab
Textbooks
Probability & Statistics for Engineers & Scientists,
Ninth Edition, Ronald E. Walpole, Raymond H. Myers
Elementary Statistics: Picturing the World, 6th
Edition, Ron Larson and Betsy Farber
Elementary Statistics, 13th Edition, Mario F. Triola
Dr. Faisal Bukhari, Department of Data Science, PU, Lahore
Reference books
Probability and Statistical Inference, Ninth Edition,
Robert V. Hogg, Elliot A. Tanis, Dale L. Zimmerman
Probability Demystified, Allan G. Bluman
Schaum's Outline of Probability, Second Edition,
Seymour Lipschutz, Marc Lipson
Python for Probability, Statistics, and Machine Learning, José
Unpingco
Practical Statistics for Data Scientists: 50 Essential Concepts,
Peter Bruce and Andrew Bruce
Think Stats: Probability and Statistics for Programmers, Allen
Downey
References
Readings for these lecture notes:
Probability & Statistics for Engineers &
Scientists, Ninth Edition, Ronald E. Walpole,
Raymond H. Myers
Probability Demystified, Allan G. Bluman
[Link]
mbers
These notes contain material from the above
three resources.
Distribution of points
Midterm = 35 points
Final term = 40 points
Sessional points = 25 points
I. Quizzes = 2 × 6 = 12 points
II. Journal/conference paper presentation = 8 points
III. Mini project (its report should be in an IEEE
journal/conference paper format) = 5 points
Or
The weightage of the project will be increased up to
15 points
“There is only one thing that makes a dream
impossible to achieve: the fear of failure.”
― Paulo Coelho, The Alchemist
Subjective Probability
A third type of probability is called subjective
probability. Subjective probability is based upon an
educated guess, estimate, opinion, or inexact
information.
Sample Spaces
There are two specific devices that will be used to
find sample spaces for probability experiments. They
are tree diagrams and tables.
A tree diagram consists of branches corresponding
to the outcomes of two or more probability
experiments that are done in sequence.
Sample Spaces
In order to construct a tree diagram, use branches
corresponding to the outcomes of the first
experiment. These branches will emanate from a
single point.
Then from each branch of the first experiment
draw branches that represent the outcomes of the
second experiment.
You can continue the process for further
experiments of the sequence if necessary.
Tree Diagram [1]
Example: A coin is tossed and a die is rolled. Draw a
tree diagram and find the sample space.
Solution:
Since there are two outcomes (heads and tails for
the coin), draw two branches from a single point and
label one H for head and the other one T for tail.
From each one of these outcomes, draw and label
six branches representing the outcomes 1, 2, 3, 4, 5,
and 6 for the die.
Trace through each branch to find the outcomes of
the experiment.
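Tracing every branch of the tree amounts to pairing each coin outcome with each die outcome. A minimal Python sketch of that idea, assuming the labels H/T and 1 to 6 used above (the variable names are illustrative, not from the notes):

```python
from itertools import product

coin = ["H", "T"]          # first-stage branches
die = [1, 2, 3, 4, 5, 6]   # second-stage branches drawn from each coin outcome

# Each path through the tree is one (coin, die) pair.
sample_space = [f"{c}{d}" for c, d in product(coin, die)]

print(sample_space)
print(len(sample_space))
```

The list has 2 × 6 = 12 entries, one per leaf of the tree.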
Tree Diagram [2]
Example: A coin is tossed and a die is rolled. Draw a
tree diagram and find the sample space.
Tree Diagram [3]
Example: A coin is tossed and a die is rolled. Find the
probability of getting
a. A head on the coin and a 3 on the die.
b. A head on the coin.
c. A 4 on the die.
Solution:
a. P(H3) = 1/12 ≈ 0.0833 (or 8.33%)
b. P(head on the coin) = 6/12 = 1/2 = 0.5 (or 50%)
c. P(4 on the die) = 2/12 = 1/6 ≈ 0.1667 (or 16.67%)
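These answers follow from counting favourable outcomes among the 12 equally likely ones. A short Python sketch of that count, assuming a fair coin and a fair die (the helper `prob` is a hypothetical name introduced here for illustration):

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (coin, die) outcomes.
sample_space = list(product("HT", range(1, 7)))

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = [o for o in sample_space if event(o)]
    return Fraction(len(favourable), len(sample_space))

p_h3 = prob(lambda o: o == ("H", 3))   # head on the coin and 3 on the die
p_head = prob(lambda o: o[0] == "H")   # head on the coin
p_four = prob(lambda o: o[1] == 4)     # 4 on the die

print(p_h3, p_head, p_four)
```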
Tree Diagram [4]
Example: Three coins are tossed. Draw a tree
diagram and find the sample space.
Solution:
S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
Tree Diagram [5]
Example: Three coins are tossed. Find the probability of
getting
a. Two heads and a tail in any order.
b. Three heads.
c. No heads.
d. At least two tails.
e. At most two tails.
Solution:
a. P(2 heads and a tail) = 3/8 = 0.375 (or 37.5%)
b. P(HHH) = 1/8 = 0.125 (or 12.5%)
c. P(TTT) = 1/8 = 0.125 (or 12.5%)
d. P(at least two tails) = 4/8 = 1/2 = 0.5 (or 50%)
e. P(at most two tails) = 7/8 = 0.875 (or 87.5%)
⇒ "At most two tails" means any outcome except three tails (TTT).
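The five answers can be checked by enumerating the eight outcomes and counting tails or heads in each. A minimal Python sketch, assuming three fair coins (the dictionary keys and the helper `prob` are illustrative names only):

```python
from fractions import Fraction
from itertools import product

# The 8 equally likely outcomes of three coin tosses, e.g. "HHT".
outcomes = ["".join(p) for p in product("HT", repeat=3)]

def prob(event):
    """Fraction of outcomes for which the event holds."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

results = {
    "two heads and a tail": prob(lambda o: o.count("H") == 2),
    "three heads": prob(lambda o: o == "HHH"),
    "no heads": prob(lambda o: o.count("H") == 0),
    "at least two tails": prob(lambda o: o.count("T") >= 2),
    "at most two tails": prob(lambda o: o.count("T") <= 2),
}

for name, p in results.items():
    print(f"{name}: {p} = {float(p)}")
```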
Tree Diagram [6]
Example: A box contains a red ball (R), a blue ball
(B), and a yellow ball (Y). Two balls are selected at
random in succession. Draw a tree diagram and find
the sample space if the first ball is replaced before
the second ball is selected.
Solution:
S = {RR, RB, RY, BR, BB, BY, YR, YB, YY}
Tree Diagram [7]
Example: A box contains a red ball (R), a blue ball
(B), and a yellow ball (Y). Two balls are selected at
random in succession. Draw a tree diagram and find
the sample space if the first ball is not replaced
before the second ball is selected.
Solution:
S = {RB, RY, BR, BY, YR, YB}
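The two ball-drawing examples differ only in whether the second draw may repeat the first. One way to sketch this in Python, assuming the labels R, B, and Y from the examples (variable names are illustrative), is to use ordered pairs with repetition for the replaced case and ordered pairs without repetition otherwise:

```python
from itertools import permutations, product

balls = ["R", "B", "Y"]

# With replacement: the second ball can match the first (3 x 3 outcomes).
with_replacement = ["".join(p) for p in product(balls, repeat=2)]

# Without replacement: the second ball must differ from the first (3 x 2 outcomes).
without_replacement = ["".join(p) for p in permutations(balls, 2)]

print(with_replacement)
print(without_replacement)
```

In tree-diagram terms, each first-stage branch has three second-stage branches with replacement and only two without.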
Tree Diagram [9]
Example: An experiment consists of flipping a coin
and then flipping it a second time if a head occurs. If
a tail occurs on the first flip, then a die is tossed
once. To list the elements of the sample space
providing the most information, construct the tree
diagram.
Solution
S = {HH, HT, T1, T2, T3, T4, T5, T6}.
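In the coin-then-coin-or-die experiment, the second-stage branches depend on the first-stage outcome. A minimal Python sketch of such a tree, where `second_stage` is a hypothetical helper introduced only for illustration:

```python
def second_stage(first):
    """Branches available after the first flip: a second flip after a head,
    a die roll after a tail."""
    return ["H", "T"] if first == "H" else [1, 2, 3, 4, 5, 6]

# Walk every path: first flip, then whichever second stage applies.
sample_space = [
    f"{first}{second}"
    for first in ["H", "T"]
    for second in second_stage(first)
]

print(sample_space)
```

Assuming a fair coin and a fair die, these eight outcomes are not equally likely: P(HH) = 1/2 × 1/2 = 1/4, while P(T1) = 1/2 × 1/6 = 1/12.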
Tree Diagram [10]
Example: Suppose that three items are selected at
random from a manufacturing process. Each item is
inspected and classified as defective, D, or nondefective,
N. To list the elements of the sample space providing
the most information, construct the tree diagram.
Solution
S = {DDD, DDN, DND, DNN, NDD, NDN, NND, NNN}.
Tables [1]
Another way to find a sample space is to use a table.
Example: Find the sample space for selecting a card from
a standard deck of 52 cards.
There are four suits—hearts and diamonds, which are
red, and spades and clubs, which are black. Each suit
consists of 13 cards—ace through king. Face cards are
kings, queens, and jacks.
Figure: A standard deck of 52 cards, grouped by suit (hearts, diamonds, spades, clubs).
Tables [2]
Example: A single card is drawn at random from a
standard deck of cards. Find the probability that it is
a. The 4 of diamonds.
b. A queen.
Solution:
a. P(The 4 of diamonds) = 1/52 ≈ 0.0192 (or 1.92%)
b. P(A queen) = 4/52 = 1/13 ≈ 0.0769 (or 7.69%)
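The card table can be modelled as every (rank, suit) pair, after which both probabilities are simple counts. A short Python sketch, assuming a well-shuffled standard deck (the rank and suit labels are illustrative choices):

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "spades", "clubs"]

# The sample space: one entry per card, 13 ranks x 4 suits = 52 cards.
deck = list(product(ranks, suits))

p_four_of_diamonds = Fraction(deck.count(("4", "diamonds")), len(deck))
p_queen = Fraction(sum(1 for rank, _ in deck if rank == "Q"), len(deck))

print(len(deck))
print(p_four_of_diamonds, round(float(p_four_of_diamonds), 4))
print(p_queen, round(float(p_queen), 4))
```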
Tables [3]
A table can be used for the sample space when two
dice are rolled.
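A sketch of such a table in Python, assuming rows index the first die and columns the second (this layout is an assumption; the notes do not fix one):

```python
faces = range(1, 7)

# table[i][j] is the outcome (first die = i + 1, second die = j + 1).
table = [[(first, second) for second in faces] for first in faces]

for row in table:
    print(" ".join(f"({a},{b})" for a, b in row))
```

The table has 6 × 6 = 36 cells, one for each outcome in the sample space.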
Tables [4]
Example: When two dice are rolled, find the
probability of getting a sum of nine.
Solution:
Let A be the event of getting a "sum of 9".
P(A) = 4/36 = 1/9 ≈ 0.1111 (or 11.11%)
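The count of four favourable outcomes can be confirmed by filtering the 36 ordered pairs. A minimal Python sketch, assuming two fair dice (variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered rolls of two dice.
rolls = list(product(range(1, 7), repeat=2))

# Event A: the two faces add up to 9.
sum_of_nine = [r for r in rolls if sum(r) == 9]
p_nine = Fraction(len(sum_of_nine), len(rolls))

print(sum_of_nine)
print(p_nine, round(float(p_nine), 4))
```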