1 Probabilities - Lecture Slides

The document provides an overview of probability theory, including definitions, set theory, and calculations involving probabilities. It covers various concepts such as independent and dependent events, the sum and multiplication rules, and practical examples to illustrate these principles. Additionally, it discusses the application of probabilities in IT and media, along with a questionnaire for further engagement.

Uploaded by

Polaris Star

Data & A.I.

Probability theory
Wim De Keyser
Rony Baekeland
Tom Magerman
Quote

“If there is a 50-50 chance that something can go wrong,
then 9 times out of 10 it will.”

Paul Harvey (1918-2009)


Agenda
1. What is a probability?
2. Set theory and probabilities
3. Cross tables and probabilities
4. Calculating with probabilities
5. Probabilities and IT
6. Probability in the media
7. Questionnaire
What is a probability?
What is a probability?
A probability experiment: an experiment that produces different outcomes
despite the same initial situation
e.g.: roll a die, determine the spin of an electron, test how long
the power supply of a server works, ...
• However, there are regularities in the results if you take many
measurements, for example:
– certain values are more common than others
– all values occur more or less equally often
– the average of the values is around a certain value
– ...
What is a probability?
• Different interpretations of the term ‘probability’
are possible:
– result from delimited probability experiment
Draw a ball from a bag containing 20 red and 30 blue balls. What
is the probability that it is a red ball?
– generalisation from probability experiment to population
Of the 20 iPads randomly tested, 2 were broken. How many
returned iPads can we expect in the shops?
– probability of an individual measurement
A patient wants to undergo an operation. What is the chance that
this operation will be successful?
Set theory and probabilities
Probabilities and Set theory
• Laplace
– Set (collection) of all possible outcomes: U
– Event = set (collection) of desired outcomes: G (a subset of U)
– Probability of event: P(G) = #G / #U (valid when all outcomes in U are equally likely)
Probabilities and Set theory
• Example: roll 1 die

Event           Possible outcomes   #G
1               (1)                 1
2               (2)                 1
3               (3)                 1
4               (4)                 1
5               (5)                 1
6               (6)                 1
#U (= TOTAL)                        6

P(G=3 OR G=4) = #(G=3 OR G=4) / #U = 2/6 = 1/3


Probabilities and Set theory
• Example 1: roll 2 dice, event = sum of the eyes

Event           Possible outcomes                            #G
2               (1,1)                                        1
3               (1,2); (2,1)                                 2
4               (1,3); (2,2); (3,1)                          3
5               (1,4); (2,3); (3,2); (4,1)                   4
6               (1,5); (2,4); (3,3); (4,2); (5,1)            5
7               (1,6); (2,5); (3,4); (4,3); (5,2); (6,1)     6
8               (2,6); (3,5); (4,4); (5,3); (6,2)            5
9               (3,6); (4,5); (5,4); (6,3)                   4
10              (4,6); (5,5); (6,4)                          3
11              (5,6); (6,5)                                 2
12              (6,6)                                        1
#U (= TOTAL)                                                 36

19-9-2023
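The counts in the two-dice table can be reproduced by enumerating all 36 outcomes and applying Laplace's rule P(G) = #G / #U. The following is a minimal sketch; the names `outcomes`, `counts` and `laplace` are illustrative and do not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# U: all 36 equally likely outcomes (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))

# #G for every event "the sum of the eyes equals s"
counts = Counter(a + b for a, b in outcomes)

def laplace(event_sum):
    """P(G) = #G / #U for the event 'the two dice sum to event_sum'."""
    return Fraction(counts[event_sum], len(outcomes))

for s in range(2, 13):
    print(s, counts[s], laplace(s))
```

Using `Fraction` keeps the probabilities exact, so they can be compared directly with the #G column.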
Probabilities and Set theory
• Example 2:
U = { students ACS1 of the year 2021-2022 },
#U = 95
G = { students with ‘pass’ of the year 2021-2022 },
#G = 25
– What is the probability that a randomly selected
student from U will pass?
– What is the probability that a randomly selected
student will pass this year?
– What is the probability that you will pass?
Probabilities and Set theory
• Other ‘Laplace’ examples:

But… does each field have an equal chance of being 'visited' during the
game?
Extra
Two player game:
• Each player takes one of the four dice shown below.
• Both roll their dice. Whoever rolls the highest number gets a point.
• Whoever gets 10 points first wins.

Which die would you take?


Extra
Simulate the game in Python:
• Create the data frame ‘dice’ with a column for each die (use the color as the column name)
containing the six values of that die
• Write the function rollDice that, for a given color of die, chooses one of the
sides at random
• Write the function diceBattle that receives two colors as parameters together with
the target (= the score to be achieved; 10 on the previous slide, but it can also
be another value) and returns the color that won the 'battle'. Also print each
step of the game (which values were thrown, what the intermediate score is).
• Write the function diceChallenge that receives two colors as parameters
along with the target and the number of rounds. Print how many of the rounds
were won by which color
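One possible shape for these three functions is sketched below. It rests on several assumptions: the picture of the four dice is not available here, so the colors and face values in `DICE` are hypothetical placeholders (Efron's non-transitive dice); a plain dict stands in for the ‘dice’ data frame so the sketch runs without pandas; and the `rng` and `verbose` parameters are additions that make the simulation repeatable and quiet when needed.

```python
import random

# HYPOTHETICAL dice: colors and face values are placeholders,
# not the dice shown on the slide.
DICE = {
    "blue":   [4, 4, 4, 4, 0, 0],
    "green":  [3, 3, 3, 3, 3, 3],
    "red":    [6, 6, 2, 2, 2, 2],
    "yellow": [5, 5, 5, 1, 1, 1],
}

def rollDice(color, rng=random):
    """Return one randomly chosen side of the die with the given color."""
    return rng.choice(DICE[color])

def diceBattle(color1, color2, target=10, rng=random, verbose=True):
    """Roll until one color reaches `target` points; return the winning color."""
    if color1 == color2:
        raise ValueError("choose two different dice")
    score = {color1: 0, color2: 0}
    while max(score.values()) < target:
        v1, v2 = rollDice(color1, rng), rollDice(color2, rng)
        if v1 > v2:
            score[color1] += 1
        elif v2 > v1:
            score[color2] += 1
        # on a tie nobody gets a point
        if verbose:
            print(f"{color1}={v1} {color2}={v2} score={score}")
    return max(score, key=score.get)

def diceChallenge(color1, color2, target=10, rounds=1000, rng=random):
    """Play `rounds` battles and report how many each color won."""
    wins = {color1: 0, color2: 0}
    for _ in range(rounds):
        wins[diceBattle(color1, color2, target, rng, verbose=False)] += 1
    print(wins)
    return wins
```

Calling `diceChallenge("blue", "green", rng=random.Random(1))` gives a repeatable run; swapping in the real dice only requires changing `DICE`.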
Cross tables and probabilities
Cross tables and probabilities
• You can sometimes also easily read off probabilities from
cross tables:

                    White label   Private label   Total
Bad cooling fan     1498          1513            3011
Good cooling fan    504           6485            6989
Total               2002          7998            10000

• What is the probability that a white label PC has a bad cooling fan?
U = { PCs with a white label }, G = { bad cooling fan }
• This probability is: 1498/2002 ≈ 0.75
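Reading a probability off a cross table amounts to restricting U to one column and counting within it. A small sketch with the counts of the cooling-fan table; the dict layout and the function name `conditional` are assumptions made for illustration.

```python
# Counts copied from the cross table: (fan quality, label) -> number of PCs
counts = {
    ("bad", "white"): 1498, ("bad", "private"): 1513,
    ("good", "white"): 504, ("good", "private"): 6485,
}

def conditional(fan, label):
    """P(fan | label): restrict U to one label column, then count."""
    column_total = sum(n for (f, l), n in counts.items() if l == label)
    return counts[(fan, label)] / column_total

print(round(conditional("bad", "white"), 2))    # 1498 / 2002
print(round(conditional("bad", "private"), 2))  # 1513 / 7998
```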
Calculating with probabilities
• General
• The ‘sum’ rule
• The ‘multiplication’ rule
• Dependent and independent events
• Law of total probability
• Bayes’ Theorem
Calculating with probabilities
• the probability of an opposite event Ḡ (everything in U outside G):
P(Ḡ) = 1 - P(G)
is sometimes easier to determine/calculate

Remarks:
- the opposite event is also called the complementary
event
- G and Ḡ are mutually exclusive events
Calculating with probabilities
• Example: probability of rolling at least 4 eyes with 2
dice:
– U = {...}
– P(G) = probability to roll 4, 5,..,12 eyes
– P(G) = 1 - P(Ḡ)
– P(Ḡ) = probability to roll 2 or 3 eyes
– Ḡ = {...}
– P(Ḡ) is ...
– so: 1-P(Ḡ) = ...

Event           Possible outcomes                            #G
2               (1,1)                                        1
3               (1,2); (2,1)                                 2
4               (1,3); (2,2); (3,1)                          3
5               (1,4); (2,3); (3,2); (4,1)                   4
6               (1,5); (2,4); (3,3); (4,2); (5,1)            5
7               (1,6); (2,5); (3,4); (4,3); (5,2); (6,1)     6
8               (2,6); (3,5); (4,4); (5,3); (6,2)            5
9               (3,6); (4,5); (5,4); (6,3)                   4
10              (4,6); (5,5); (6,4)                          3
11              (5,6); (6,5)                                 2
12              (6,6)                                        1
#U (= TOTAL)                                                 36
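The complement route for "at least 4 eyes" can be checked by enumeration. This is a sketch, with `U`, `G_bar` and the other variable names chosen here for illustration; it also counts G directly to confirm that both routes agree.

```python
from fractions import Fraction
from itertools import product

U = list(product(range(1, 7), repeat=2))   # all 36 outcomes of two dice
G_bar = [o for o in U if sum(o) < 4]       # complement: 2 or 3 eyes
p_G_bar = Fraction(len(G_bar), len(U))
p_G = 1 - p_G_bar                          # at least 4 eyes, via the complement

# Direct count of G as a cross-check
p_G_direct = Fraction(sum(1 for o in U if sum(o) >= 4), len(U))

print(G_bar)
print(p_G_bar, p_G, p_G == p_G_direct)
```

Counting the three outcomes of the complement is less work than counting the 33 outcomes of G, which is the point of the rule.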
The “sum rule”
• An event G can consist of sub-events Gi (i=1,..n)
• Suppose the sub-events Gi (i=1,..n) do not overlap: they
are mutually exclusive events (they cannot occur
together, or in other words Gi ∩ Gj = ∅ for all i=1..n, j=1..n
where i ≠ j)
• We wonder what the probability is of one of these events
occurring:
P(G) = P(G1 OR G2 OR ... OR Gn)
= P(G1 ∪ G2 ∪ ... ∪ Gn)
= P(G1) + P(G2) + ... + P(Gn)
The “sum rule”
• Example: choose one card from a deck of cards.
What is the probability that the card is an ace or
a 4?
– Probability of an ace = 4/52
– Probability of a 4 = 4/52
– Probability of an ace OR a 4 = 4/52 + 4/52 = 8/52 = 0.154 = 15.4%
Exclusive events!
The “sum rule”
• Suppose the sub-events overlap (are not exclusive).
Suppose there are only 2 sub-events G1 and G2
– P(G) = P(G1 ∪ G2) = P(G1) + P(G2) - P(G1 ∩ G2)

• Example: deck of cards
(Venn diagram: 3 aces that are not hearts, 1 ace of hearts, 12 other hearts, 36 remaining cards)
– choose 1 card
– What is the probability that this card is an ace or a heart?
G = Gace ∪ Ghearts but the intersection is not empty (Gace ∩ Ghearts ≠ ∅)
#Gace = 4, #Ghearts = 13, #(Gace ∩ Ghearts) = 1
– Probability of G is now (4+13-1) / 52 = 16/52 = 30.8%
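Both forms of the sum rule can be verified with Python sets, where `|` is the union and `&` the intersection. The deck construction and the helper `p` below are illustrative assumptions, not part of the slides.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))          # U: 52 cards

aces = {c for c in deck if c[0] == "A"}
fours = {c for c in deck if c[0] == "4"}
hearts = {c for c in deck if c[1] == "hearts"}

def p(event):
    """Laplace probability of an event (a subset of the deck)."""
    return Fraction(len(event), len(deck))

# Exclusive events: the intersection is empty, so the probabilities simply add up
print(p(aces | fours), p(aces) + p(fours))

# Overlapping events: subtract the intersection, which would be counted twice
print(p(aces | hearts), p(aces) + p(hearts) - p(aces & hearts))
```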


The “multiplication rule”
Consider an event consisting of sub-events.
What is the probability that all sub-events will happen at the
same time?
• If all sub-events are “independent” of one another,
then:
P(G) = P(G1 AND G2 AND ... AND Gn)
= P(G1 ∩ G2 ∩ ... ∩ Gn)
= P(G1) x P(G2) x ... x P(Gn)
The “multiplication rule”
• Example: Counter-Strike
– 3 players
• player1 hits 1 shot out of 5
• player2 hits 1 shot out of 4
• player3 hits 1 shot out of 3
– 1 terrorist tries to get through a tunnel
– the players can each shoot 1 time

What is the probability of the terrorist getting through the tunnel alive?
The “multiplication rule”
What is the probability of the terrorist getting
through the tunnel alive?
This is the probability that player1 misses (G1) AND player2
misses (G2) AND player3 misses (G3)
P(G) = P(G1 AND G2 AND G3)
= P(G1 ∩ G2 ∩ G3)
= P(G1) x P(G2) x P(G3)
= 4/5 x 3/4 x 2/3 = 24/60
= 0.4 = 40%

→ The terrorist has a 40% chance of getting through the tunnel alive
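The same calculation as a short sketch, assuming (as the slide does) that the three shots are independent. The variable names are illustrative.

```python
from fractions import Fraction
from math import prod

hit = [Fraction(1, 5), Fraction(1, 4), Fraction(1, 3)]   # P(player i hits)
miss = [1 - h for h in hit]                              # complement rule per player

p_survive = prod(miss)               # all three miss (multiplication rule)
p_at_least_one_hit = 1 - p_survive   # complement of "everybody misses"

print(p_survive, float(p_survive))
print(p_at_least_one_hit)
```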
The “multiplication rule”
Example: bag with 20 blue and 30 red marbles. What is the
probability of drawing 3 red marbles in a row?

P(first=red AND second=red AND third=red)
= P(first=red) x P(second=red) x P(third=red)
= 30/50 x 30/50 x 30/50
= 3/5 x 3/5 x 3/5
= 27/125 = 0.216
Dependent events
• The sub-events G1, G2,..Gn may also be “dependent”.
– In that case: P(G1 ∩ G2 ∩ ... ∩ Gn)
≠ P(G1) x P(G2) x ... x P(Gn)
– Example: group of students
• The probability that a student is a girl is 0.48
• The probability of a student wearing glasses is 0.2
• What is the probability that a random student is
a girl with glasses?
Problem: perhaps all spectacle wearers are boys.
In that case P(G) = 0 while P(Ggirl) x P(Gglasses) = 0.096
There is a dependency between being a boy and wearing spectacles
Dependent events
• To calculate the probability that a student is a spectacle-wearing
girl, we use the following formula:
P(Ggirl ∩ Gglasses) = P(Gglasses | Ggirl) x P(Ggirl)

where P(Gglasses | Ggirl) is the probability that the student
wears glasses, given that the student is a girl (= a conditional
probability).
So if there are no girls with glasses, then this is 0.
Suppose that 10% of the girls wear glasses,
then P(Ggirl ∩ Gglasses) = 0.1 x 0.48 = 0.048
Independent events
P(A ∩ B) = P(A) x P(B)
Dependent events
P(A ∩ B) = P(A | B) x P(B)
Dependent events
• You can invert the previous formula P(A ∩ B) = P(A | B) x P(B)
to define P(A|B):

\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} \]

• Note:

\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \cap A)}{P(B)} = \frac{P(B \mid A) \times P(A)}{P(B)} \]
Dependency
• When are two events independent?
Answer: when P(A | B) = P(A)
So: P(A) x P(B) = P(A | B) x P(B) = P(A and B) = P(A ∩ B)

• What can you conclude with regard to dependence/independence
when P(A ∩ B) = 0, knowing that P(A) ≠ 0 and P(B) ≠ 0?
P(A ∩ B) = 0 ⇒ P(A | B) x P(B) = 0, but P(B) ≠ 0 ⇒ P(A | B) = 0,
but P(A) ≠ 0 ⇒ P(A | B) ≠ P(A), so A and B are dependent
Dependency
• Let us take, again, the example of randomly taking a card from a
deck of cards.
• Are the events 'the card is an ace' and 'the card is a heart'
independent?
P(an ace) x P(a heart) ? P(an ace and a heart)
– P(an ace) = 4 / 52
– P(a heart) = 13 / 52
– P(an ace and a heart) = 1 / 52
– P(an ace) x P(a heart) = 4 / 52 x 13 / 52 = 1 / 52

– Conclusion ?
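The independence test P(A) x P(B) = P(A and B) can be carried out by counting cards. In this sketch the deck encoding (rank 1 for the ace, "H" for hearts) and the helper `p` are assumptions made for illustration.

```python
from fractions import Fraction

# Standard 52-card deck as (rank, suit); rank 1 is the ace, "H" is hearts
deck = [(rank, suit) for rank in range(1, 14) for suit in "HDCS"]

def p(predicate):
    """Laplace probability of the cards for which predicate(card) is true."""
    return Fraction(sum(1 for card in deck if predicate(card)), len(deck))

p_ace = p(lambda c: c[0] == 1)
p_heart = p(lambda c: c[1] == "H")
p_both = p(lambda c: c[0] == 1 and c[1] == "H")

independent = (p_ace * p_heart == p_both)
print(p_ace, p_heart, p_both, independent)
```

Because the product equals the probability of the intersection, the check reports the two events as independent: knowing the suit says nothing about the rank.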
The “multiplication rule”
Example: bag with 20 blue and 30 red marbles. What is the
probability of drawing 3 red marbles in a row?

P(first=red AND second=red AND third=red)
= P(first=red) x P(second=red) x P(third=red)
= 30/50 x 30/50 x 30/50
= 3/5 x 3/5 x 3/5 = 27/125 = 0.216

(if the events are independent, i.e. if the marbles are put
back in the bag after each draw, so that P(second=red | first=red) =
P(second=red))
The “multiplication rule”
If the marbles are not put back in the bag after each draw:
P(second=red | first=red) ≠ P(second=red)
so the events are dependent
(it is less likely to draw a red marble on the second draw if the first
draw was a red marble that is not put back in the bag)

P(first=red AND second=red AND third=red)
= P(first=red) x P(second=red | first=red) x P(third=red | first=red AND second=red)
= 30/50 x 29/49 x 28/48
= 0.600 x 0.592 x 0.583
= 0.207
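The difference between drawing with and without replacement fits in one small function. This is a sketch; the function name `p_all_red` and its parameters are illustrative.

```python
from fractions import Fraction

def p_all_red(red, blue, draws, replace):
    """Probability that `draws` consecutive marbles are all red."""
    p = Fraction(1)
    total = red + blue
    for _ in range(draws):
        p *= Fraction(red, total)   # P(next = red | all previous draws were red)
        if not replace:             # one red marble has left the bag
            red -= 1
            total -= 1
    return p

with_replacement = p_all_red(30, 20, 3, replace=True)
without_replacement = p_all_red(30, 20, 3, replace=False)
print(with_replacement, float(with_replacement))
print(without_replacement, round(float(without_replacement), 3))
```

With replacement every factor is 30/50 (independent draws); without replacement each factor is a conditional probability that depends on the earlier draws.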
Law of total probability
• Suppose that G1, G2, ..., Gn do not overlap and together cover B, then
the following formula can be formulated:

\[ P(B) = \sum_{i=1}^{n} P(B \text{ and } G_i) = \sum_{i=1}^{n} P(B \mid G_i) \times P(G_i) \]

• Example:
– 3 politicians (V1, V2, V3) have a 30%, 20% and 50% chance
of being elected, respectively
– the chances of the politicians reducing
taxes are 50%, 40% and 30% respectively
– what is the probability that the taxes will be lowered?

\[ P(B_{reduction}) = P(B_{reduction} \mid V_1) \times P(V_1) + P(B_{reduction} \mid V_2) \times P(V_2) + P(B_{reduction} \mid V_3) \times P(V_3) \]
= 0.5 x 0.3 + 0.4 x 0.2 + 0.3 x 0.5 = 0.38
Definition
• If we slice the set G into non-overlapping subsets G1, G2, G3, …, Gn,
then we call G1, G2, G3, …, Gn a partition of G
Bayes’ theorem (aka Bayes' law aka Bayes' rule)
• If {G1, G2, ..., Gn} is a partitioning of G,
then according to Bayes’ theorem:

\[ P(G_k \mid B) = \frac{P(B \mid G_k) \times P(G_k)}{P(B)} \]

because P(B ∩ Gk) = P(B | Gk) x P(Gk)
and P(Gk ∩ B) = P(Gk | B) x P(B)

• Suppose that after the elections, taxes are reduced.
What is the probability that this is because politician 3
was elected?

\[ P(V_3 \mid B_{reduction}) = \frac{P(B_{reduction} \mid V_3) \times P(V_3)}{P(B_{reduction})} = \frac{0.3 \times 0.5}{0.38} = 0.395 \]
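The politicians example combines the law of total probability (the denominator) with Bayes' theorem (the posterior). A minimal sketch; the dict names `prior` and `likelihood` and the two function names are assumptions for illustration.

```python
prior = {"V1": 0.3, "V2": 0.2, "V3": 0.5}        # P(Vi): chance of being elected
likelihood = {"V1": 0.5, "V2": 0.4, "V3": 0.3}   # P(reduction | Vi)

def total_probability(prior, likelihood):
    """P(B) = sum over i of P(B | Gi) x P(Gi)."""
    return sum(likelihood[k] * prior[k] for k in prior)

def posterior(k, prior, likelihood):
    """Bayes: P(Gk | B) = P(B | Gk) x P(Gk) / P(B)."""
    return likelihood[k] * prior[k] / total_probability(prior, likelihood)

print(round(total_probability(prior, likelihood), 2))
for k in prior:
    print(k, round(posterior(k, prior, likelihood), 3))
```

The three posteriors add up to 1, which is a quick sanity check on any Bayes calculation over a partition.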
Bayes’ theorem - exercise
Bob commutes to work every day by train, bus or car
• Probability of being too late:
– train (10%), bus (20%), car (40%)
• One day Bob is too late. What are the chances that he
came by car?
Suppose all vehicles are used with equal probability

\[ P(\text{car} \mid \text{too late}) = \frac{P(\text{too late} \mid \text{car}) \times P(\text{car})}{P(\text{too late})} \]

\[ P(\text{too late}) = P(\text{too late} \mid \text{car}) \times P(\text{car}) + P(\text{too late} \mid \text{bus}) \times P(\text{bus}) + P(\text{too late} \mid \text{train}) \times P(\text{train}) \]
= 0.4 x 1/3 + 0.2 x 1/3 + 0.1 x 1/3
Bayes’ theorem - exercise
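• Suppose: probabilities of chosen vehicle:
car (10%), train (80%), bus (10%)

\[ P(\text{car} \mid \text{too late}) = \frac{P(\text{too late} \mid \text{car}) \times P(\text{car})}{P(\text{too late})} \]

\[ P(\text{too late}) = P(\text{too late} \mid \text{car}) \times P(\text{car}) + P(\text{too late} \mid \text{bus}) \times P(\text{bus}) + P(\text{too late} \mid \text{train}) \times P(\text{train}) \]
= 0.4 x 0.1 + 0.2 x 0.1 + 0.1 x 0.8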
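Both variants of the exercise differ only in the prior P(vehicle), so one function can cover them. The sketch below works the numbers out with exact fractions; the names `p_late`, `equal`, `skewed` and `p_vehicle_given_late` are illustrative.

```python
from fractions import Fraction as F

# P(too late | vehicle), from the exercise
p_late = {"train": F(1, 10), "bus": F(2, 10), "car": F(4, 10)}

def p_vehicle_given_late(vehicle, p_vehicle):
    """Bayes: P(vehicle | too late) for a given prior P(vehicle)."""
    total = sum(p_late[v] * p_vehicle[v] for v in p_vehicle)   # P(too late)
    return p_late[vehicle] * p_vehicle[vehicle] / total

equal = {v: F(1, 3) for v in p_late}                            # first variant
skewed = {"car": F(1, 10), "train": F(8, 10), "bus": F(1, 10)}  # second variant

print(p_vehicle_given_late("car", equal))
print(p_vehicle_given_late("car", skewed))
```

With equal priors the car explains most late arrivals; once the train is used 80% of the time, the same observation points much less strongly at the car.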
Example - Monty Hall problem
During a game programme, the winning candidate may
choose from three closed rooms A, B and C. One of
these three rooms contains a car, the other two rooms
are empty. The candidate chooses one of the three
rooms, whereupon the game moderator opens one of the
other two rooms and shows it to be empty. The
candidate can now stick to his original choice or choose
the other room that is still closed. What should the candidate
do to maximise his chance of winning the car?
Suppose the candidate chooses room A and the game
moderator opens the empty room B. (The reasoning is
analogous when the candidate chooses another room
and the game moderator shows another empty room.)
Example - Monty Hall problem
Consider the following events:
– A car (room A contains the car) → P(A car) = 1/3
– B car (room B contains the car) → P(B car) = 1/3
– C car (room C contains the car) → P(C car) = 1/3

The candidate chooses room A
The game moderator opens the empty room B (event: B shown)
Example - Monty Hall problem
Consider the event B shown (the game moderator opens the
empty room B, knowing that the candidate has chosen room A):

• The following conditional probabilities can be determined:
– P(B shown | A car) = 0.5
– P(B shown | B car) = 0
– P(B shown | C car) = 1

• P(B shown) = P(B shown | A car) x P(A car)
+ P(B shown | B car) x P(B car)
+ P(B shown | C car) x P(C car)
= 0.5 x 1/3 + 0 x 1/3 + 1 x 1/3
= 0.5
Example - Monty Hall problem

• Apply Bayes’ theorem:

P(A car | B shown) = P(B shown | A car) x P(A car) / P(B shown)
= (0.5 x 1/3 ) / 0.5
= 1/3

P(C car | B shown) = P(B shown | C car) x P(C car) / P(B shown)
= (1 x 1/3 ) / 0.5
= 2/3
→ The candidate has the best chance of winning the car by
changing his choice and opting for room C!
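The 1/3 versus 2/3 result can also be checked empirically with a simulation. This is a sketch under the rules stated in the example (the moderator always opens an empty room that the candidate did not choose, picking at random when two are possible); the function names and the fixed seed are assumptions.

```python
import random

def play(switch, rng):
    """Play one game; return True if the candidate wins the car."""
    rooms = ["A", "B", "C"]
    car = rng.choice(rooms)
    choice = rng.choice(rooms)
    # The moderator opens an empty room that is not the candidate's choice
    opened = rng.choice([r for r in rooms if r != choice and r != car])
    if switch:
        choice = next(r for r in rooms if r not in (choice, opened))
    return choice == car

def win_rate(switch, games=100_000, seed=0):
    """Fraction of games won when always switching or always staying."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(games)) / games

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

The simulated win rates should land close to the Bayes results, with small deviations that shrink as `games` grows.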
Summary

• P(Ā) = 1 − P(A)
• P(A AND B) = P(A | B) x P(B) or P(B | A) x P(A)
• P(A OR B) = P(A) + P(B) − P(A AND B)
• P(A | B) = P(B | A) x P(A) / P(B)
• P(A) = P(G1) x P(A | G1) + P(G2) x P(A | G2) + ... + P(Gn) x P(A | Gn)
Summary

Two events are independent if

• P(A | B) = P(A)
• P(A AND B) = P(A) x P(B)

Because P(A AND B) = P(A | B) x P(B) or P(B | A) x P(A)
Probabilities & IT
• Spam filter
• Availability
Spam filter
Spam is the brand name of a certain type of canned meat (marketed by Hormel since 1937). It was a regular item on the
American soldier's menu during WWII.
Monty Python used spam in a 1970 sketch (see [Link]) to denounce the then-current ban on
surreptitious advertising on television. It became the symbol of unwanted advertising and later of unwanted e-mails. The 1st
spam mail was sent in 1978 via ARPAnet (see [Link])

A spam filter wants to determine whether an email is
spam (= "bad quality") or ham (= "good quality").
if P(spam)/P(ham) is larger than a certain threshold value → spam

• spam???
• Application of Bayes’ Theorem!
Spam filter
• Given: email with words w1, w2, ..., wn
• Question: Probability of spam and of ham
– P(SPAM | w1 and w2 and ... and wn)
– P(HAM | w1 and w2 and ... and wn)
• we know

\[ P(SPAM \mid w_1, w_2, \dots, w_n) = \frac{P(w_1, w_2, \dots, w_n \mid SPAM) \times P(SPAM)}{P(w_1, w_2, \dots, w_n)} \]

\[ P(HAM \mid w_1, w_2, \dots, w_n) = \frac{P(w_1, w_2, \dots, w_n \mid HAM) \times P(HAM)}{P(w_1, w_2, \dots, w_n)} \]
Spam filter
• In order to know whether a mail is spam, we
calculate:

\[ \frac{P(SPAM \mid w_1, w_2, \dots, w_n)}{P(HAM \mid w_1, w_2, \dots, w_n)} = \frac{P(w_1, w_2, \dots, w_n \mid SPAM) \times P(SPAM)}{P(w_1, w_2, \dots, w_n \mid HAM) \times P(HAM)} \]

• we "learned"
– \( P(w_1, w_2, \dots, w_n \mid SPAM) \approx \prod_{i=1}^{n} P(w_i \mid SPAM) \)
– \( P(w_1, w_2, \dots, w_n \mid HAM) \) (analogously)
– \( P(SPAM) \)
– \( P(HAM) = 1 - P(SPAM) \)

For your information: In March 2021, 45.1% of total email traffic was spam.
([Link]pam-email-traffic-share/)
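A toy version of this ratio test is sketched below. The per-word probabilities are hypothetical numbers invented for illustration (a real filter would estimate them from a labelled mail corpus), the threshold of 1.0 is an assumption, and only the prior P(SPAM) = 0.451 is taken from the slide. Words missing from the tables raise a `KeyError` in this sketch; a practical filter would need smoothing for unseen words.

```python
from math import prod

# HYPOTHETICAL "learned" values P(word | SPAM) and P(word | HAM)
p_word_spam = {"free": 0.30, "winner": 0.20, "meeting": 0.01}
p_word_ham = {"free": 0.03, "winner": 0.01, "meeting": 0.10}

p_spam = 0.451        # prior: share of spam in total email traffic (slide)
p_ham = 1 - p_spam

def spam_ratio(words):
    """P(SPAM | words) / P(HAM | words), treating words as independent given the class."""
    like_spam = prod(p_word_spam[w] for w in words)
    like_ham = prod(p_word_ham[w] for w in words)
    return (like_spam * p_spam) / (like_ham * p_ham)

def is_spam(words, threshold=1.0):
    """Classify as spam when the ratio exceeds the (assumed) threshold."""
    return spam_ratio(words) > threshold

print(is_spam(["free", "winner"]), spam_ratio(["free", "winner"]))
print(is_spam(["meeting"]), spam_ratio(["meeting"]))
```

The denominator P(w1, ..., wn) never has to be computed because it cancels in the ratio.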
Availability
• What is the availability of an IT element (e.g. server, router,
...)?

– Reliability: the probability that the element will not fail

– Maintainability: the probability that the element is
successfully restored after a failure

– Availability at a given time: the probability that, at that
time, the component has not failed and is not being restored after a
failure.
Availability
Defined in Service Level Agreements (SLAs) between the
"supplier" of the IT component/IT service and the "customer".
Availability
Problem for IT department: "customer" demands SLA on IT
service (e.g. total availability of an application) while IT
suppliers guarantee the availability of individual components.
How do you determine the availability?
Availability
The availability of an IT service depends on the availability of
each of its components.
Example: the availability of a computer system consisting of:
Web Server + Application Server + Database Server

A chain is only as strong as its weakest link

But an IT infrastructure is weaker than its weakest link!

Availability service = 0.999 x 0.9999 x 0.99999 = 0.9989

Note: 99.89 % ≈ 10 hours downtime / year
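The chain calculation is the multiplication rule applied to components that must all be up at the same time. A sketch, assuming the components fail independently and a year of 8760 hours; the function names are illustrative.

```python
from math import prod

HOURS_PER_YEAR = 365 * 24   # 8760, ignoring leap years

def serial_availability(components):
    """Availability of a service that needs every component to be up."""
    return prod(components)

def downtime_hours_per_year(availability):
    """Expected hours per year during which the service is unavailable."""
    return (1 - availability) * HOURS_PER_YEAR

components = [0.999, 0.9999, 0.99999]   # web, application and database server
service = serial_availability(components)
print(round(service, 4), round(downtime_hours_per_year(service), 1))
```

The product is always below the smallest factor, which is why the service is weaker than its weakest link.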
Infrastructure Availability pyramid
→ Determine the availability for each layer
Real world example, no exam material.
Screenshots are only available in Dutch
Probabilities in the media
Probability in the media

The story of the CIA officer who tried to measure what we mean when we
say how likely something is to happen.
This is a chart showing an experiment with 23 Nato officers asking what they
understood by different terms expressing probability such as “almost
certainly”, “we doubt” and “almost no chance”.
The visualisation, which surfaced on Reddit this week, was inspired by the
work of pioneering intelligence analyst Sherman Kent.
The officers were asked to assign a percentage of probability to what they
understood by the different phrases. The dots on the chart are their answers,
while the grey shaded areas are roughly the ranges that Kent said should
cover the probability of each description.
As you can see there is considerable discrepancy between how the different
officers answered.

[Link]
Probability in the media
[Link]
equation-formula-alien-life-calculation-
2018-7?international=true&r=US

• Nearly half of Americans believe aliens have
visited Earth, according to a poll.
• The Drake equation explores the chances that
detectable alien civilizations exist using seven
variables.
• While some predictions that use the equation are
optimistic, a comprehensive new study suggests a
strong likelihood that we’re alone in the Milky
Way galaxy.
• There’s also a roughly 38% chance that humans are
completely alone in the visible universe.

Astrophysicist Frank Drake drew up the famous formula on a chalkboard in 1961. This was at the dawn of a worldwide search for extraterrestrial
intelligence (SETI), and his thinking continues to influence the use of astronomical observatories to this day.
The equation is more of an argument wrapped in seven variables. Multiplied together, these variables yield a calculation of the possibility that
humanity might someday hear from an intelligent civilization.
Probability in the media – Bayes’ Theorem

…Bayes’s theorem is written, in mathematical notation, as P(A|B) = (P(B|A)P(A))/P(B). It looks complicated.


But you don’t need to worry about what all those symbols mean: it’s fairly easy to understand when you think of an example.
Imagine you undergo a test for a rare disease. The test is amazingly accurate: if you have the disease, it will correctly say so 99% of the time; if
you don’t have the disease, it will correctly say so 99% of the time. But the disease in question is very rare; just one person in every 10,000 has it.
This is known as your “prior probability”: the background rate in the population.
So now imagine you test 1 million people. There are 100 people who have the disease: your test correctly identifies 99 of them. And there are
999,900 people who don’t: your test correctly identifies 989,901 of them.
But that means that your test, despite giving the right answer in 99% of cases, has told 9,999 people that they have the disease, when in fact
they don’t. So if you get a positive result, in this case, your chance of actually having the disease is 99 in 10,098, or just under 1%. If you took this
test entirely at face value, then you’d be scaring a lot of people, and sending them for intrusive, potentially dangerous medical procedures, on
the back of a misdiagnosis.
Without knowing the prior probability, you don’t know how likely it is that a result is false or true. If the disease was not so rare – if, say, 1% of
people had it – your results would be totally different. Then you’d have 9,900 false positives, but also 9,990 true positives. So if you had a
positive result, it would be more than 50% likely to be true.
This is not a hypothetical problem. One review of the literature found that 60% of women who have annual mammograms for 10 years have at
least one false positive; another study found that 70% of prostate cancer screening positives were false. An antenatal screening procedure for
foetal chromosomal disorders which claimed “detection rates of up to 99% and false positive rates as low as 0.1%” would have actually returned
false positives between 45% and 94% of the time, because the diseases are so rare, according to one paper….

[Link]
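The arithmetic in the quoted passage can be retraced with exact fractions. This is a sketch: the function name `screen` and its parameters are illustrative, and "99% accurate" is read as 99% sensitivity and 99% specificity, as the passage describes.

```python
from fractions import Fraction

def screen(prevalence, sensitivity, specificity, population=1_000_000):
    """Return (true positives, false positives, P(disease | positive test))."""
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick * sensitivity
    false_pos = healthy * (1 - specificity)
    return true_pos, false_pos, true_pos / (true_pos + false_pos)

accuracy = Fraction(99, 100)
rare = screen(Fraction(1, 10_000), accuracy, accuracy)   # 1 person in 10,000
common = screen(Fraction(1, 100), accuracy, accuracy)    # 1 person in 100

print(rare, float(rare[2]))
print(common, float(common[2]))
```

For the rare disease this reproduces the 99 true positives against 9,999 false positives, just under 1%. For a prevalence of 1%, the same assumptions give 9,900 true positives and 9,900 false positives, so a positive result is exactly 50% likely to be true, which differs slightly from the figures in the quoted text.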
Questionnaire
Questionnaire

• Download the file 'Questionnaire [Link]' (see Canvas)
• Put the file in your Python workspace
• Load the data in the data frame studenq
>>> import pandas as pd
>>> studenq = pd.read_csv('Questionnaire [Link]',
delimiter=';', decimal='.')
Questionnaire
Assume we surveyed ALL students of ACS with the questionnaire.

Q.1.a. What is the probability that a randomly selected student
had 2 hours or less of mathematics in the final year of secondary
school?

Q.1.b. What is the probability that a randomly selected student
had 3 hours or more of mathematics in the final year of
secondary school?
Questionnaire
Assume we surveyed ALL students of ACS with the questionnaire.

Q.2 What is the probability that a randomly selected student is a
non-smoker or eats more than 3 pieces of fruit?
Questionnaire
Assume we surveyed ALL students of ACS with the questionnaire.

Q.3 What is the probability that a randomly selected smoking
student eats 1 piece of fruit or less?
Questionnaire
Assume we surveyed ALL students of ACS with the questionnaire.
Q.4 Are the events 'having a driver's licence' and 'writing with the
right hand' independent?
