Conditional Independence
Definition
Conditional independence is a fundamental concept in probability
theory and graphical models. It refers to a situation where two
random variables are independent of each other given a third
variable.
• Formal Definition:
• Let A, B, and C be three random variables.
• We say A is conditionally independent of B given C, denoted A ⫫ B | C,
• if: P(A, B | C) = P(A | C) · P(B | C)
• This means: once we know C, knowing B doesn't give any more information about A,
and vice versa.
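To make the definition concrete, here is a minimal Python sketch that checks P(A, B | C) = P(A | C) · P(B | C) for every value of C. The joint distribution below is a made-up toy example (an assumption for illustration, not taken from the text):

```python
from itertools import product

# Toy joint distribution P(A, B, C) over binary variables, constructed so that
# A and B are conditionally independent given C. (Values are illustrative assumptions.)
p_c = {0: 0.6, 1: 0.4}
p_a_given_c = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}   # P(A | C)
p_b_given_c = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.9, 1: 0.1}}   # P(B | C)

joint = {(a, b, c): p_c[c] * p_a_given_c[c][a] * p_b_given_c[c][b]
         for a, b, c in product([0, 1], repeat=3)}

def cond(joint, c):
    """Return P(A, B | C=c) as a dict keyed by (a, b)."""
    p_c_val = sum(p for (a, b, cc), p in joint.items() if cc == c)
    return {(a, b): joint[(a, b, c)] / p_c_val for a, b in product([0, 1], repeat=2)}

for c in [0, 1]:
    pab = cond(joint, c)
    p_a = {a: sum(pab[(a, b)] for b in [0, 1]) for a in [0, 1]}
    p_b = {b: sum(pab[(a, b)] for a in [0, 1]) for b in [0, 1]}
    ok = all(abs(pab[(a, b)] - p_a[a] * p_b[b]) < 1e-12
             for a, b in product([0, 1], repeat=2))
    print(f"C={c}: P(A,B|C) factorizes into P(A|C)P(B|C)? {ok}")
```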
Why Conditional Independence Matters
• It simplifies the representation of joint probability distributions
• It reduces computation in inference algorithms
• It captures causal and statistical relationships
• It's foundational in Bayesian Networks and Markov
Random Fields.
Conditional Independence in Graphical Models
Model Type | Structure | Example | Conditional Independence Representation
Bayesian Networks | Directed Acyclic Graphs (DAGs) | Causal relationships | Uses d-separation
Markov Random Fields (MRFs) | Undirected graphs | Spatial data, image segmentation | Uses graph separation
D-Separation (in Bayesian Networks)
• D-Separation is a rule used to decide whether two
nodes are conditionally independent given some
evidence
• Key Structures in Bayesian Networks:
• Chain (A → B → C): A ⫫ C | B
• Fork (A ← B → C): A ⫫ C | B
• Collider (A → C ← B): A ⫫ B marginally, but A and B become dependent once C (or a descendant of C) is observed
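A quick way to see the collider behaviour is to simulate it. This sketch (variable names and noise level are illustrative assumptions) samples A and B independently, builds C from both, and shows that the empirical correlation between A and B is near zero overall but clearly non-zero once we condition on an observed value of C:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# A and B are marginally independent coin flips.
a = rng.integers(0, 2, n)
b = rng.integers(0, 2, n)
# Collider: C depends on both A and B (here, a noisy OR of the two).
noise = (rng.random(n) < 0.1).astype(int)
c = (a | b) ^ noise

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

print("corr(A, B) unconditionally:", round(corr(a, b), 3))              # ~0: independent
mask = c == 1
print("corr(A, B) given C = 1:   ", round(corr(a[mask], b[mask]), 3))   # clearly non-zero
```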
Conditional Independence in Markov Random Fields (MRFs)
In MRFs, conditional independence is determined by
graph separation:
• If two nodes A and B are not connected directly, and
all paths between them are blocked by a set of nodes C, then: A ⫫ B | C
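Graph separation can be checked directly: remove the conditioning set C from the undirected graph and test whether A and B end up disconnected. Here is a rough sketch using NetworkX; the graph and node names below are made up for illustration:

```python
import networkx as nx

def separated(G, a, b, C):
    """True if every path between a and b passes through a node in C,
    i.e. a and b are separated by C in the undirected graph G."""
    H = G.copy()
    H.remove_nodes_from(C)
    return not nx.has_path(H, a, b)

# Illustrative MRF structure (assumed, not from the slides).
G = nx.Graph([("A", "X"), ("X", "B"), ("A", "Y"), ("Y", "B")])

print(separated(G, "A", "B", {"X"}))        # False: the path A-Y-B is still open
print(separated(G, "A", "B", {"X", "Y"}))   # True: all paths are blocked
```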
Real-World Scenario - Autonomous Vehicle Decision System
• An autonomous vehicle must decide whether to brake
in a given situation. Let’s define the following variables:
• W = Weather (e.g., Rainy, Clear)
• T = Traffic (e.g., Heavy, Light)
• O = Obstacle Detected (e.g., Person, Object on the road)
• S = Sensor Reading (e.g., LiDAR or camera detection)
• B = Brake Decision (Yes/No)
Dependencies Between Variables
• We define the following structure:
• W → T (Weather affects traffic)
• W → S (Weather affects sensor accuracy)
• O → S (Obstacle influences sensor reading)
• T → O (Traffic increases chances of obstacles)
• S, O → B (Sensor reading and real obstacle status influence
the braking decision)
Conditional Independences
• T ⫫ S | W, O
• Once you know the weather and whether there is an obstacle, traffic and sensor
readings are conditionally independent (weather alone is not enough here, since
the path T → O → S stays open until O is also observed)
• B ⫫ T | S, O
• Once you know what the sensors are saying and whether there's an obstacle, traffic
conditions no longer influence the brake decision
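These claims can be checked mechanically. Below is a minimal sketch using NetworkX with the edges exactly as defined above; it tests d-separation via the classic ancestral moral graph construction (restrict to ancestors, "marry" parents, drop directions, remove the conditioning set, test connectivity):

```python
import networkx as nx
from itertools import combinations

edges = [("W", "T"), ("W", "S"), ("O", "S"), ("T", "O"), ("S", "B"), ("O", "B")]
dag = nx.DiGraph(edges)

def d_separated(dag, x, y, given):
    """True if x and y are d-separated by `given` (ancestral moral graph method)."""
    nodes = {x, y} | set(given)
    ancestors = set(nodes)
    for n in nodes:
        ancestors |= nx.ancestors(dag, n)
    sub = dag.subgraph(ancestors).copy()
    moral = sub.to_undirected()
    # Moralize: connect ("marry") every pair of parents of each node.
    for n in sub.nodes:
        for p1, p2 in combinations(sub.predecessors(n), 2):
            moral.add_edge(p1, p2)
    moral.remove_nodes_from(given)
    return not nx.has_path(moral, x, y)

print("B ⫫ T | S, O:", d_separated(dag, "B", "T", {"S", "O"}))   # expected: True
print("S ⫫ T | O, W:", d_separated(dag, "S", "T", {"O", "W"}))   # expected: True
print("T ⫫ S | W   :", d_separated(dag, "T", "S", {"W"}))        # expected: False
```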
DAG (figure: the network structure defined above)
Conditional Probability Table
W (No parents — prior distribution)
Weather | P(Weather)
Clear | 0.7
Rainy | 0.3
Traffic Depends on Weather
Weather | Traffic = Light | Traffic = Heavy
Clear | 0.3 | 0.7
Rainy | 0.7 | 0.3
Obstacle Depends on Traffic
Traffic | Obstacle = No | Obstacle = Yes
Light | 0.9 | 0.1
Heavy | 0.5 | 0.5
Sensor Reading Depends on Weather and Obstacle
Weather | Obstacle | Sensor Reading = Low | Sensor Reading = High
Clear | No | 0.9 | 0.1
Clear | Yes | 0.2 | 0.8
Rainy | No | 0.6 | 0.4
Rainy | Yes | 0.1 | 0.9
Braking Decision Depends on Sensor Reading and Obstacle
Sensor Reading | Obstacle | Braking Decision = No | Braking Decision = Yes
Low | No | 0.95 | 0.05
Low | Yes | 0.4 | 0.6
High | No | 0.5 | 0.5
High | Yes | 0.1 | 0.9
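Using these CPTs, a short sketch can enumerate the full joint distribution and confirm numerically that B ⫫ T | S, O. The probabilities below are transcribed from the tables above; the dictionary layout itself is just an implementation choice:

```python
from itertools import product

# CPTs transcribed from the tables above.
P_W = {"Clear": 0.7, "Rainy": 0.3}
P_T = {"Clear": {"Light": 0.3, "Heavy": 0.7}, "Rainy": {"Light": 0.7, "Heavy": 0.3}}
P_O = {"Light": {"No": 0.9, "Yes": 0.1}, "Heavy": {"No": 0.5, "Yes": 0.5}}
P_S = {("Clear", "No"): {"Low": 0.9, "High": 0.1}, ("Clear", "Yes"): {"Low": 0.2, "High": 0.8},
       ("Rainy", "No"): {"Low": 0.6, "High": 0.4}, ("Rainy", "Yes"): {"Low": 0.1, "High": 0.9}}
P_B = {("Low", "No"): {"No": 0.95, "Yes": 0.05}, ("Low", "Yes"): {"No": 0.4, "Yes": 0.6},
       ("High", "No"): {"No": 0.5, "Yes": 0.5}, ("High", "Yes"): {"No": 0.1, "Yes": 0.9}}

# Full joint: P(W, T, O, S, B) = P(W) P(T|W) P(O|T) P(S|W,O) P(B|S,O)
joint = {}
for w, t, o, s, b in product(P_W, ["Light", "Heavy"], ["No", "Yes"], ["Low", "High"], ["No", "Yes"]):
    joint[(w, t, o, s, b)] = (P_W[w] * P_T[w][t] * P_O[t][o]
                              * P_S[(w, o)][s] * P_B[(s, o)][b])

def p(**fixed):
    """Marginal probability of the partial assignment given as keyword args (w, t, o, s, b)."""
    keys = ("w", "t", "o", "s", "b")
    return sum(pr for assign, pr in joint.items()
               if all(assign[keys.index(k)] == v for k, v in fixed.items()))

# Check B ⫫ T | S, O: P(B=Yes | S, O, T) should not depend on T.
for s, o in product(["Low", "High"], ["No", "Yes"]):
    vals = [p(b="Yes", s=s, o=o, t=t) / p(s=s, o=o, t=t) for t in ["Light", "Heavy"]]
    print(f"S={s}, O={o}: P(B=Yes | S,O,T=Light)={vals[0]:.3f}, T=Heavy={vals[1]:.3f}")
```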
Markov Random Fields (MRF)
• These mathematical models allow us to represent and
reason about complex systems involving multiple variables,
while avoiding some of the limitations of other approaches
like Bayesian networks or decision trees
• Markov Random Fields are models for representing joint
probability distributions over a set of variables
• They capture the dependencies between these variables using
an explicit graph structure
• The basic idea is that each variable in the model is
influenced by its neighbors, but not by distant variables,
which simplifies complex dependencies
• In an MRF, the structure is an undirected graph, where
each node represents a random variable, and edges
represent dependencies between them
• Markov property: the conditional distribution of a node
(variable) depends only on its neighbors, not on the rest of
the graph
• Local Interactions: MRFs are particularly useful when
you need to model systems with local interactions,
where each node interacts primarily with its nearby
neighbors
Formal Mathematical Model
• The joint probability distribution for a set of variables in
a Markov Random Field can be expressed as:
P(X) = (1/Z) ∏_C ψ_C(X_C)
Where:
• Z is a normalization constant (partition function).
• C represents a clique (a set of neighboring variables) in
the graph.
• ψ_C(X_C) is a potential function associated with clique C that
expresses how likely a particular configuration of the
clique is
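To make the factorization concrete, here is a minimal sketch for a tiny 3-node chain MRF with made-up pairwise potentials (the graph, potentials, and values are illustrative assumptions): it multiplies the clique potentials for every configuration and normalizes by the partition function Z.

```python
from itertools import product

# Tiny chain MRF over binary variables: X1 - X2 - X3.
# Cliques are the edges (X1, X2) and (X2, X3), each with a pairwise potential
# that favours equal neighbouring values (an Ising-style smoothness prior).
def psi(a, b):
    return 2.0 if a == b else 1.0

# Unnormalized probability: product of the clique potentials.
def unnorm(x1, x2, x3):
    return psi(x1, x2) * psi(x2, x3)

# Partition function Z sums the unnormalized scores over all configurations.
Z = sum(unnorm(*cfg) for cfg in product([0, 1], repeat=3))

# Normalized joint distribution P(X) = (1/Z) * prod_C psi_C(X_C).
for cfg in product([0, 1], repeat=3):
    print(cfg, round(unnorm(*cfg) / Z, 4))
```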
Example in Action: Denoising an Image
• Initial Noisy Image: You start with a noisy image, where each
pixel has been randomly altered.
• Neighborhood Dependency: Assume that nearby pixels are
likely to have similar values. This reflects the real-world property
that neighboring pixels in an image often share color or intensity.
• Inference with MRF: You use the Markov Random Field to
compute the most likely configuration of pixel values by
considering both the noisy image and the dependencies between
neighboring pixels.
• Result: The output is a denoised version of the image, where the
pixel values have been adjusted to reflect the local dependencies,
removing the noise.
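A rough sketch of this idea for a binary image, using simple ICM-style (iterated conditional modes) updates rather than exact inference (the energy weights and the image here are illustrative assumptions, not from the text): each pixel is repeatedly set to the value that best agrees with both its noisy observation and its four neighbours.

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground-truth binary image (a simple square) plus salt-and-pepper noise.
truth = np.zeros((32, 32), dtype=int)
truth[8:24, 8:24] = 1
noisy = np.where(rng.random(truth.shape) < 0.1, 1 - truth, truth)

def denoise_icm(noisy, data_w=2.0, smooth_w=1.0, sweeps=5):
    """Greedy ICM: each pixel picks the label minimizing a local MRF energy
    (disagreement with its observation + disagreement with its 4 neighbours)."""
    x = noisy.copy()
    H, W = x.shape
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                best_label, best_energy = x[i, j], np.inf
                for label in (0, 1):
                    e = data_w * (label != noisy[i, j])
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W:
                            e += smooth_w * (label != x[ni, nj])
                    if e < best_energy:
                        best_label, best_energy = label, e
                x[i, j] = best_label
    return x

denoised = denoise_icm(noisy)
print("noisy pixels wrong:   ", int((noisy != truth).sum()))
print("denoised pixels wrong:", int((denoised != truth).sum()))
```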
Why an MRF Is Useful in This Scenario
• Local Dependencies: In images, the value of each
pixel is influenced by the pixels around it (spatial
dependencies), making MRFs an excellent fit for tasks
like image denoising.
• Efficient Representation: Instead of modeling
complex global dependencies (which would be
computationally expensive), MRFs allow you to focus on
local interactions and dependencies, simplifying the
problem.