Uncertainty
(Based on slides by UW-AI faculty)
Knowledge Representation
KR Language           Ontological Commitment              Epistemological Commitment
Propositional Logic   facts                               true / false / unknown
First Order Logic     facts, objects, relations           true / false / unknown
Temporal Logic        facts, objects, relations, times    true / false / unknown
Probability Theory    facts                               degree of belief
Fuzzy Logic           facts, degree of truth              known interval values
Probabilistic Relational Models
– combine probability and first order logic
Need for Reasoning w/ Uncertainty
• The world is full of uncertainty
– chance nodes / sensor noise / actuator error / partial info …
– Logic is brittle
• can't encode exceptions to rules
• can't encode statistical properties in a domain
– Computers need to be able to handle uncertainty
• Probability: new foundation for AI (& CS!)
• Massive amounts of data around today
– Statistics and CS are both about data
– Statistics lets us summarize and understand it
– Statistics is the basis for most learning
• Statistics lets data do our work for us
Logic vs. Probability
Logic                              Probability
Symbol: Q, R, …                    Random variable: Q, …
Boolean values: T, F               Domain: you specify, e.g. {heads, tails} or [1, 6]
State of the world:                Atomic event: complete specification of a world,
  assignment to Q, R, …, Z           i.e. an assignment to Q, …, Z
                                     • Mutually exclusive
                                     • Exhaustive
                                   Prior (aka unconditional) probability: P(Q)
                                   Joint distribution: probability of every atomic event
Probability Basics
• Begin with a set S: the sample space
– e.g., 6 possible rolls of a die.
• x ∈ S is a sample point / possible world / atomic event
• A probability space or probability model is a sample
space with an assignment P(x) for every x ∈ S s.t.
0 ≤ P(x) ≤ 1 and Σx∈S P(x) = 1
• An event A is any subset of S
– e.g. A= ‘die roll < 4’
• A random variable is a function from sample points
to some range, e.g., the reals or Booleans
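These definitions translate almost directly into code. A minimal sketch in Python (the names S, P, prob are illustrative, not from the slides):

```python
from fractions import Fraction

# Sample space S: the six possible rolls of a die.
S = {1, 2, 3, 4, 5, 6}

# Probability model: an assignment P(x) for every sample point x.
P = {x: Fraction(1, 6) for x in S}
assert all(0 <= p <= 1 for p in P.values()) and sum(P.values()) == 1

def prob(event):
    """P(A) for an event A, i.e. any subset of S."""
    return sum(P[x] for x in event)

A = {x for x in S if x < 4}   # the event 'die roll < 4'
print(prob(A))                # 1/2
```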
Types of Probability Spaces
Axioms of Probability Theory
• All probabilities between 0 and 1
– 0 ≤ P(A) ≤ 1
– P(true) = 1
– P(false) = 0.
• The probability of a disjunction is:
P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
[Venn diagram: events A and B overlapping in A ∧ B inside the sample space]
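The disjunction rule can be checked numerically on the die sketch from Probability Basics (B here is an illustrative event):

```python
# A = 'roll < 4' from before; B = 'roll is even'.
B = {2, 4, 6}
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)   # both sides are 5/6
```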
Prior Probability
• The joint distribution can answer any question
Conditional probability
• Conditional or posterior probabilities
e.g., P(cavity | toothache) = 0.8
i.e., given that a toothache is all I know, there is an 80% chance of a cavity
• Notation for conditional distributions:
P(Cavity | Toothache) = a 2-element vector of 2-element vectors
• If we know more, e.g., cavity is also given, then we have
P(cavity | toothache, cavity) = 1
• New evidence may be irrelevant, allowing simplification:
P(cavity | toothache, sunny) = P(cavity | toothache) = 0.8
• This kind of inference, sanctioned by domain knowledge, is crucial
Conditional Probability
• P(A | B) is the probability of A given B
• Assumes that B is the only info known.
• Defined by:
P(A | B) = P(A ∧ B) / P(B)
[Venn diagram: A, the overlap A ∧ B, and B inside the sample space]
Chain Rule/Product Rule
• P(X1, …, Xn) = P(Xn | X1, …, Xn−1) P(Xn−1 | X1, …, Xn−2) … P(X1)
              = ∏i P(Xi | X1, …, Xi−1)
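A quick numeric check of the chain rule for two Boolean variables; the probabilities below are made-up values for illustration:

```python
# P(X1) and P(X2 | X1), with invented numbers.
p_x1 = {True: 0.3, False: 0.7}
p_x2_given_x1 = {True:  {True: 0.9, False: 0.1},
                 False: {True: 0.2, False: 0.8}}

# Chain rule: P(X1, X2) = P(X2 | X1) P(X1).
p_x1x2 = {(x1, x2): p_x2_given_x1[x1][x2] * p_x1[x1]
          for x1 in (True, False) for x2 in (True, False)}
assert abs(sum(p_x1x2.values()) - 1.0) < 1e-12   # a valid joint distribution
```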
Dilemma at the Dentist’s
What is the probability of a cavity given a toothache?
What is the probability of a cavity given the probe catches?
Inference by Enumeration
P(toothache) = 0.108 + 0.012 + 0.016 + 0.064
             = 0.20, or 20%
Inference by Enumeration
P(toothache ∨ cavity) = 0.20 + 0.072 + 0.008
                      = 0.28
Inference by Enumeration
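A sketch of inference by enumeration over the dentist domain. Six of the joint entries appear above; the remaining two (0.144 and 0.576) are assumed here so the table sums to 1:

```python
# Full joint over (Toothache, Catch, Cavity); keys are truth-value triples.
joint = {
    (True,  True,  True):  0.108, (True,  False, True):  0.012,
    (True,  True,  False): 0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, False, True):  0.008,
    (False, True,  False): 0.144, (False, False, False): 0.576,  # assumed entries
}

def p(pred):
    """P of an event, by summing the atomic events that satisfy pred."""
    return sum(pr for w, pr in joint.items() if pred(*w))

print(p(lambda t, c, cav: t))          # P(toothache) = 0.20
print(p(lambda t, c, cav: t or cav))   # P(toothache ∨ cavity) = 0.28
```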
Complexity of Enumeration
• Worst case time: O(d^n)
– where d = max arity
– and n = number of random variables
• Space complexity also O(d^n)
– size of joint distribution
• Prohibitive!
Independence
• A and B are independent iff:
P(A | B) = P(A)  and  P(B | A) = P(B)
(these two constraints are logically equivalent)
• Therefore, if A and B are independent:
P(A | B) = P(A ∧ B) / P(B) = P(A)
⟹ P(A ∧ B) = P(A) P(B)
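The product test is easy to run on the die sketch above; these two illustrative events happen to be independent:

```python
# A = 'roll < 4' from before; D = 'roll divisible by 3'.
D = {3, 6}
assert prob(A & D) == prob(A) * prob(D)   # 1/6 == 1/2 * 1/3, so A, D independent
```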
Independence
Complete independence is powerful but rare.
Conditional Independence
Instead of 7 entries, only need 5:
with Toothache and Catch independent given Cavity, the joint factors as
P(Toothache, Catch, Cavity) = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity),
i.e. 2 + 2 + 1 = 5 numbers instead of 2³ − 1 = 7.
Conditional Independence II
P(catch | toothache, cavity) = P(catch | cavity)
P(catch | toothache, ¬cavity) = P(catch | ¬cavity)
Why only 5 entries in the table?
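Both statements can be checked against the dentist joint from the enumeration sketch (cond is an illustrative helper built on the p function defined there):

```python
def cond(pred, given):
    """P(pred | given), computed by enumeration from the joint."""
    return p(lambda *w: pred(*w) and given(*w)) / p(given)

catch = lambda t, c, cav: c
print(cond(catch, lambda t, c, cav: t and cav))       # ≈ 0.9
print(cond(catch, lambda t, c, cav: cav))             # ≈ 0.9  (same: cond. independent)
print(cond(catch, lambda t, c, cav: t and not cav))   # ≈ 0.2
print(cond(catch, lambda t, c, cav: not cav))         # ≈ 0.2  (same again)
```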
Power of Cond. Independence
• Often, using conditional independence
reduces the storage complexity of the joint
distribution from exponential to linear!!
• Conditional independence is the most basic &
robust form of knowledge about uncertain
environments.
Bayes Rule ("Bayes rules!")
P(x, y) = P(x | y) P(y) = P(y | x) P(x)
⟹ P(x | y) = P(y | x) P(x) / P(y)
   posterior = likelihood × prior / evidence
Computing Diagnostic Prob. from Causal Prob.
E.g., let M be meningitis, S be stiff neck:
P(M) = 0.0001,
P(S) = 0.1,
P(S | M) = 0.8
P(M | S) = P(S | M) P(M) / P(S) = 0.8 × 0.0001 / 0.1 = 0.0008
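Plugging the numbers in, as a one-line check:

```python
p_m, p_s, p_s_given_m = 0.0001, 0.1, 0.8
print(p_s_given_m * p_m / p_s)   # 0.0008 — a stiff neck multiplies P(meningitis) by 8
```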
Other forms of Bayes Rule
P(x | y) = P(y | x) P(x) / P(y)                    likelihood × prior / evidence
P(x | y) = P(y | x) P(x) / Σx P(y | x) P(x)        (evidence computed by summing out x)
P(x ∧ y) = P(y | x) P(x)                           posterior ∝ likelihood × prior
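The summed-out form is just normalization. A sketch on the meningitis example; P(s | ¬m) is not given above, so it is derived here from P(S) = 0.1 via total probability:

```python
prior = {True: 0.0001, False: 0.9999}                       # P(M)
lik   = {True: 0.8, False: (0.1 - 0.8 * 0.0001) / 0.9999}   # P(s | M); ¬m case derived
unnorm = {m: lik[m] * prior[m] for m in prior}              # P(s | m) P(m)
z = sum(unnorm.values())                                    # evidence P(s) ≈ 0.1
print(unnorm[True] / z)                                     # 0.0008, as before
```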
Conditional Bayes Rule
P(x | y, z) = P(y | x, z) P(x | z) / P(y | z)
P(x | y, z) = P(y | x, z) P(x | z) / Σx P(y | x, z) P(x | z)
P(x ∧ y | z) = P(y | x, z) P(x | z)
Bayes’ Rule & Cond. Independence
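Bayes' rule plus conditional independence gives the naive-Bayes form
P(Cavity | toothache ∧ catch) ∝ P(toothache | Cavity) P(catch | Cavity) P(Cavity).
A sketch with CPT values read off the dentist joint used earlier:

```python
p_cav   = {True: 0.2, False: 0.8}   # P(Cavity)
p_t_cav = {True: 0.6, False: 0.1}   # P(toothache | Cavity), from the joint above
p_c_cav = {True: 0.9, False: 0.2}   # P(catch | Cavity), from the joint above

unnorm = {cv: p_t_cav[cv] * p_c_cav[cv] * p_cav[cv] for cv in (True, False)}
z = sum(unnorm.values())
print(unnorm[True] / z)   # ≈ 0.871 — agrees with direct enumeration on the full joint
```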