Module 4 - Problems
Outline
• Bayesian Networks
• Bandit Problems
• Markov Models
• MDPs
• Dempster-Shafer Theory
• Decision Networks
• Bayesian Decision Theory
• MDPs
  • Value Iteration
  • Policy Iteration
Bayesian Networks
You have a new burglar alarm at home. It's fairly reliable at detecting burglaries, but
it also sometimes goes off because of a minor earthquake. You have two neighbors,
John and Mary, who promise to call you when they hear the alarm. John always
calls when he hears the alarm, but he sometimes confuses the phone ringing with
the alarm and calls by mistake. Mary likes loud music and sometimes misses the
alarm.
The variables are:
• B (Burglary): True or False
• E (Earthquake): True or False
• A (Alarm): True or False
• J (John Calls): True or False
• M (Mary Calls): True or False
Bayesian Networks
The causal relationships are:
• Burglary and Earthquake can cause the Alarm to go off.
• The Alarm going off can cause John to call.
• The Alarm going off can cause Mary to call.
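The problem gives the structure but not the conditional probability tables. Below is a minimal Python sketch with illustrative CPT values (assumed here, not part of the problem) showing how a full joint probability factors along the network:

```python
# Burglary network sketch. The CPT values below are illustrative assumptions;
# the problem statement only specifies the network structure.
P_B = 0.001                       # P(Burglary = True)
P_E = 0.002                       # P(Earthquake = True)
P_A = {(True, True): 0.95,        # P(Alarm = True | B, E)
       (True, False): 0.94,
       (False, True): 0.29,
       (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}   # P(JohnCalls = True | Alarm)
P_M = {True: 0.70, False: 0.01}   # P(MaryCalls = True | Alarm)

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m), factored along the network's edges."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# e.g. both neighbors call, the alarm sounded, but no burglary or earthquake:
print(joint(False, False, True, True, True))   # ~0.00063 with these assumed CPTs
```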
Markov Models
The matrix for two-day transitions, P^(2), is P × P.

P = | 0.7  0.2  0.1 |
    | 0.4  0.4  0.2 |
    | 0.3  0.4  0.1 |

P^(2) = P × P, and its (1,1) entry is
P^(2)_{1,1} = (0.7 × 0.7) + (0.2 × 0.4) + (0.1 × 0.3) = 0.49 + 0.08 + 0.03 = 0.60
This new matrix tells us, for example, that if today is sunny, there is a 60%
chance it will be sunny in two days.
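A quick NumPy check of the same computation (a minimal sketch using the matrix above):

```python
import numpy as np

# One-step weather transition matrix from the example
# (rows: today's weather, columns: tomorrow's weather).
P = np.array([[0.7, 0.2, 0.1],
              [0.4, 0.4, 0.2],
              [0.3, 0.4, 0.1]])

P2 = P @ P          # two-day transition matrix P^(2)
print(P2[0, 0])     # 0.60 = P(sunny in two days | sunny today)
```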
Markov Models
Consider the Markov chain with three states, S = {1, 2, 3}, that has the
following transition matrix:

P = | 1/2  1/4  1/4 |
    | 1/3   0   2/3 |
    |  0   1/2  1/2 |
Draw the state transition diagram for this chain.
If we know P(X1 = 1) = P(X1 = 2) = 1/4, find P(X1 = 3, X2 = 2, X3 = 1).
Markov Models
Draw the state transition diagram for this chain.
Markov Models
If we know P(X1 = 1) = P(X1 = 2) = 1/4, find P(X1 = 3, X2 = 2, X3 = 1).
For a Markov chain the joint probability factors as
P(X1 = a, X2 = b, X3 = c) = P(X1 = a) · P_{a,b} · P_{b,c}
We are given P(X1 = 1) = P(X1 = 2) = 1/4.
Hence P(X1 = 3) = 1 − 1/4 − 1/4 = 1/2.
From the matrix: P_{3,2} = 1/2 and P_{2,1} = 1/3.
Therefore P(X1 = 3, X2 = 2, X3 = 1) = (1/2) · (1/2) · (1/3) = 1/12.
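The same answer, checked with a short NumPy sketch:

```python
import numpy as np

# Transition matrix of the chain (rows/columns correspond to states 1, 2, 3).
P = np.array([[1/2, 1/4, 1/4],
              [1/3, 0.0, 2/3],
              [0.0, 1/2, 1/2]])

# Initial distribution: P(X1=1) = P(X1=2) = 1/4, so P(X1=3) = 1/2.
pi = np.array([1/4, 1/4, 1/2])

# P(X1=3, X2=2, X3=1) = P(X1=3) * P(3->2) * P(2->1)   (0-based indices below)
p = pi[2] * P[2, 1] * P[1, 0]
print(p)            # 0.0833... = 1/12
```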
MDP
Robot in a Grid World
States (S): The robot can be in three states: S = {s1 = Start, s2 = Middle, s3 = Goal}.
Actions (A): From each state, the robot can choose A = {Left, Right}.
Transition probabilities (P):
If the robot is in Start (s1) and takes:
Right → goes to Middle (s2) with probability 1.
Left → stays in Start (s1) with probability 1.
If the robot is in Middle (s2) and takes:
Right → goes to Goal (s3) with probability 0.8, else falls back to Start (s1) with probability 0.2.
Left → goes back to Start (s1) with probability 1.
If in Goal (s3): absorbing state (stays there regardless of action).
Rewards (R): Moving to Goal (s3) gives reward +10. All other moves give reward 0.
MDP
Robot in a Grid World
State   Action   Next State   Probability   Reward
s1      Right    s2           1.0           0
s1      Left     s1           1.0           0
s2      Right    s3           0.8           +10
s2      Right    s1           0.2           0
s2      Left     s1           1.0           0
s3      any      s3           1.0           0
MDP
Robot in a Grid World
• Suppose the robot always chooses Right (policy π).
• So, the expected reward starting from s2 is:
• V(s2) = 0.8 × 10 + 0.2 × V(s1)
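A small sketch of evaluating the always-Right policy by iterating these equations (assuming an undiscounted return and a zero-reward absorbing Goal, since the problem gives no discount factor):

```python
# Transitions under the always-Right policy: state -> [(prob, next_state, reward)]
model = {
    "s1": [(1.0, "s2", 0)],
    "s2": [(0.8, "s3", 10), (0.2, "s1", 0)],
    "s3": [(1.0, "s3", 0)],                     # absorbing Goal state
}

V = {s: 0.0 for s in model}
for _ in range(100):                            # iterate V(s) = sum_s' p * (r + V(s'))
    V = {s: sum(p * (r + V[nxt]) for p, nxt, r in model[s]) for s in model}

print(V["s2"])   # 0.8*10 + 0.2*V(s1); converges to ~10.0 under these assumptions
print(V["s1"])   # equals V(s2) under always-Right, so also ~10.0
```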
Dempster-Shafer Theory
Two tests give mass functions over the frame Θ = {H1, H2, H3}:
Test A:
• m_A({H1, H2}) = 0.7
• m_A({H3}) = 0.1
• m_A(Θ) = 0.2
Test B:
• m_B({H2}) = 0.6
• m_B({H1, H2}) = 0.2
• m_B(Θ) = 0.2
Dempster-Shafer Theory
Multiply all pairs
• We take one set from A and one set from B, multiply their masses,
and record the intersection.
Example:
• A gives {H1, H2}, B gives {H2}.
Intersection = {H2}.
Multiply masses: 0.7 × 0.6 = 0.42.
• Do this for all pairs:
Dempster-Shafer Theory
Multiply all pairs: do this for every combination of one set from A and one set from B:
From A From B Intersection Mass product
{H1,H2} (0.7) {H2} (0.6) {H2} 0.42
{H1,H2} (0.7) {H1,H2} (0.2) {H1,H2} 0.14
{H1,H2} (0.7) Θ (0.2) {H1,H2} 0.14
{H3} (0.1) {H2} (0.6) ∅ (conflict) 0.06
{H3} (0.1) {H1,H2} (0.2) ∅ (conflict) 0.02
{H3} (0.1) Θ (0.2) {H3} 0.02
Θ (0.2) {H2} (0.6) {H2} 0.12
Θ (0.2) {H1,H2} (0.2) {H1,H2} 0.04
Θ (0.2) Θ (0.2) Θ 0.04
Dempster-Shafer Theory
Add up by set
• For {H2}: 0.42 + 0.12 = 0.54
• For {H1, H2}: 0.14 + 0.14 + 0.04 = 0.32
• For {H3}: 0.02
• For Θ: 0.04
• Conflict K = 0.06 + 0.02 = 0.08
Dempster-Shafer Theory
Normalize
• We must remove the conflict by dividing everything by 1 − 𝐾.
• 1 − 𝐾 = 1 − 0.08 = 0.92
• So:
• m({H2}) = 0.54 / 0.92 ≈ 0.587
• m({H1, H2}) = 0.32 / 0.92 ≈ 0.348
• m({H3}) = 0.02 / 0.92 ≈ 0.022
• m(Θ) = 0.04 / 0.92 ≈ 0.043
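The entire combination can be reproduced with a short Python sketch of Dempster's rule applied to the two tests:

```python
from itertools import product

THETA = frozenset({"H1", "H2", "H3"})
m_A = {frozenset({"H1", "H2"}): 0.7, frozenset({"H3"}): 0.1, THETA: 0.2}
m_B = {frozenset({"H2"}): 0.6, frozenset({"H1", "H2"}): 0.2, THETA: 0.2}

combined, K = {}, 0.0
for (X, mx), (Y, my) in product(m_A.items(), m_B.items()):
    inter = X & Y
    if inter:                                    # mass goes to the intersection
        combined[inter] = combined.get(inter, 0.0) + mx * my
    else:                                        # empty intersection: conflict
        K += mx * my

combined = {S: v / (1 - K) for S, v in combined.items()}   # normalize by 1 - K
print(K)                                         # 0.08
print(combined[frozenset({"H2"})])               # ~0.587
print(combined[frozenset({"H1", "H2"})])         # ~0.348
```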
Dempster-Shafer Theory
Final combined masses:
Covid (H2) → 58.7%
Flu or Covid (H1, H2) → 34.8%
Other (H3) → 2.2%
Uncertainty (Θ) → 4.3%
Decision Networks
[Decision network diagram with nodes: Umbrella (decision), Weather, Forecast]
Decision Networks
• MEU: choose the action which maximizes the expected utility given the evidence
• New node types: chance nodes (just like BNs), a decision node (Umbrella), and a utility node U that depends on the action and the weather.

W     P(W)
sun   0.7
rain  0.3

A      W     U(A, W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

Umbrella = leave: EU(leave) = 0.7 × 100 + 0.3 × 0 = 70
Umbrella = take:  EU(take) = 0.7 × 20 + 0.3 × 70 = 35
Optimal decision = leave
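A brief sketch of the same EU/MEU calculation in Python:

```python
# Prior over Weather and the utility table U(A, W) from the slide.
P_W = {"sun": 0.7, "rain": 0.3}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def EU(action, p_weather):
    """Expected utility of an action under a distribution over Weather."""
    return sum(p * U[(action, w)] for w, p in p_weather.items())

print(EU("leave", P_W))                                  # 70
print(EU("take", P_W))                                   # 35
print(max(("leave", "take"), key=lambda a: EU(a, P_W)))  # MEU(∅) action: leave
```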
Decision Networks: Notation
• EU(leave) = expected utility of taking action leave.
In the parentheses, we write an action.
Calculating EU requires taking an expectation over chance-node outcomes.
• MEU(∅) = maximum expected utility, given no information.
In the parentheses, we write the evidence (which nodes we know).
Calculating MEU requires taking a maximum over several expectations (one EU per action).
Optimal decision = leave
Decisions as Outcome Trees
[Outcome tree diagram: root {} with the decision Umbrella, branches over Weather | {} leading to utility U]

Decision Networks
Evidence: Forecast = bad

W     P(W | F = bad)
sun   0.34
rain  0.66

Umbrella = leave: EU(leave | bad) = 0.34 × 100 + 0.66 × 0 = 34
Umbrella = take:  EU(take | bad) = 0.34 × 20 + 0.66 × 70 = 53
Optimal decision = take
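And the same sketch with the weather distribution conditioned on the bad forecast:

```python
# Utility table as before, Weather now conditioned on Forecast = bad.
P_W_bad = {"sun": 0.34, "rain": 0.66}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def EU(action, p_weather):
    return sum(p * U[(action, w)] for w, p in p_weather.items())

print(EU("leave", P_W_bad))   # 34
print(EU("take", P_W_bad))    # 53 -> MEU(F=bad) = 53, optimal decision = take
```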
Decision Networks: Notation
• EU(leave | bad) = expected utility of choosing leave, given you know the forecast is bad.
Left side of the conditioning bar: the action being taken.
Right side of the conditioning bar: the random variable(s) whose value we know (the evidence).
• MEU(F = bad) = maximum expected utility, given you know the forecast is bad.
In the parentheses, we write the evidence (which nodes we know).
[Outcome trees for Umbrella = leave and Umbrella = take, branching on Weather | {b} into utility U]