EM Algorithm for Coin Toss Analysis

The EM algorithm is an iterative method to find maximum likelihood estimates of parameters in probabilistic models with latent variables. It alternates between performing an expectation (E) step, which computes the expectation of the log-likelihood evaluated using the current estimate for the parameters, and a maximization (M) step, which computes parameters maximizing the expected log-likelihood found on the E step. The EM algorithm is commonly used for problems like clustering and missing data imputation.


EM Algorithm

• It stands for the Expectation-Maximization Algorithm.
• It is used to find latent variables (variables that are not directly observable and are instead inferred from the values of other observed variables). A latent variable is a random variable that cannot be observed in either the training or the test phase.
• It is the basis for many unsupervised clustering algorithms.

Steps involved in EM Algorithm:

1. Initially, a set of initial parameter values is chosen (a set of incomplete data is given to the system).
2. Expectation or E-Step – here, we use the observed data and the current parameter estimates to estimate or guess the values of the missing/incomplete data.
3. Maximization or M-Step – here, we use the complete data generated in the preceding E-step to update the parameter values.
4. We check whether the values are converging. If they are converging, stop the process; otherwise repeat steps 2 and 3 until convergence occurs (see the sketch below).
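
A minimal sketch of this generic loop in Python (added for illustration; run_em, e_step_fn, and m_step_fn are hypothetical names, and concrete E- and M-step functions must be supplied for a specific model, such as the two-coin problem implemented later in this document):

def run_em(data, init_params, e_step_fn, m_step_fn, max_iter=100, tol=1e-6):
    """Generic EM driver: alternate E- and M-steps until the
    parameter estimates stop changing (or max_iter is reached)."""
    params = init_params
    for _ in range(max_iter):
        expectations = e_step_fn(data, params)   # E-step: complete the data
        new_params = m_step_fn(expectations)     # M-step: re-estimate parameters
        if max(abs(n - p) for n, p in zip(new_params, params)) < tol:
            return new_params                    # converged
        params = new_params
    return params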
Usage of EM Algorithm:

• Used to fill in missing data.
• Used for unsupervised clustering.
• Used to discover the values of latent variables.

Advantages:

• With each iteration, the likelihood is guaranteed not to decrease.
• The E-step and M-step are often easy to implement.

Disadvantages:

• Slow convergence.
• May converge only to a local optimum.

Application of EM Algorithm:

The latent variable model has several real-life applications in Machine learning:

• Used to estimate the parameters of a Gaussian density.
• Helpful for filling in missing data in a sample.
• It finds plenty of use in different domains such as Natural Language Processing (NLP), Computer Vision, etc.
• Used for image reconstruction in the fields of Medicine and Structural Engineering.
• Used for estimating the parameters of the Hidden Markov Model (HMM), and also for other mixture models such as Gaussian Mixture Models.

Example 1 (Two-Coin Problem)

• Assume that we have two coins, C1 and C2 (referred to below as coins A and B).
• Assume the bias of C1 is t1 (i.e., the probability of getting heads with C1).
• Assume the bias of C2 is t2 (i.e., the probability of getting heads with C2).
• We want to find t1 and t2 by performing a number of trials (i.e., coin tosses).

Solution:

First Experiment –

• We choose one of the two coins 5 times.
• Each time, we toss the chosen coin 10 times and record the flips (the label shows which coin was used):
B H T T T H H T H T H
A H H H H T H H H H H
A H T H H H H H T H H
B H T H T T T H H T T
A T H H H T H H H T H

t1 = (number of heads using coin A) / (total number of flips using coin A)

t2 = (number of heads using coin B) / (total number of flips using coin B)

Coin A      Coin B
            5H, 5T
9H, 1T
8H, 2T
            4H, 6T
7H, 3T
-------     -------
24H, 6T     9H, 11T

t1 = 24/(24+6) = 24/30 = 0.8

t2 = 9/(9+11) = 9/20 = 0.45
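
As a quick check (this snippet is not part of the original notes), the same complete-data ML estimates can be computed directly from the recorded sequences, since in this first experiment the coin used in each set is known:

# Complete-data ML estimates: the coin label of every set is known.
rolls  = ["HTTTHHTHTH", "HHHHTHHHHH", "HTHHHHHTHH",
          "HTHTTTHHTT", "THHHTHHHTH"]
labels = ["B", "A", "A", "B", "A"]

heads = {"A": 0, "B": 0}
flips = {"A": 0, "B": 0}
for roll, label in zip(rolls, labels):
    heads[label] += roll.count("H")
    flips[label] += len(roll)

print(heads["A"] / flips["A"])   # t1 = 24/30 = 0.8
print(heads["B"] / flips["B"])   # t2 = 9/20  = 0.45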

• We have two coins: A and B.
• The probabilities of heads are tA and tB.
• We have 5 measurement sets including 10 coin tosses in each set.
• If we know which coin was tossed in each set, we can calculate the ML estimates for tA and tB directly.
• If we don't know which coin was tossed in each set, the ML estimates cannot be calculated directly. → EM Algorithm
• The binomial distribution is used to calculate the likelihoods (the binomial coefficient is omitted because it cancels when the likelihoods are normalized):

L(C) = t^k * (1 - t)^(n-k)

where t = probability of heads for the particular coin
      n = number of flips of that coin
      k = number of heads obtained with that coin

EM Coin Problem Solved

• Assume that for the first coin (A) the probability of heads is 0.60.
• Assume that for the second coin (B) the probability of heads is 0.50.
• The first trial gives 5 heads and 5 tails.
• Use the binomial probability distribution for the likelihood of the flips.
• k, the number of heads, is 5, and n, the number of flips, is 10.

Likelihoods for the first set of flips:

L(A) = 0.6^5 * (1 - 0.6)^(10-5) = 0.0007963

L(B) = 0.5^5 * (1 - 0.5)^(10-5) = 0.0009766

P(A) = L(A) / (L(A) + L(B)) = 0.0007963 / (0.0007963 + 0.0009766) = 0.45

P(B) = L(B) / (L(A) + L(B)) = 0.0009766 / (0.0007963 + 0.0009766) = 0.55
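
These numbers can be verified with a short snippet (added for illustration; the formula is the same one used by the coin_likelihood helper in the full implementation below):

def coin_likelihood(roll, bias):
    # Binomial likelihood of the flips, without the binomial
    # coefficient (it cancels when P(A) and P(B) are normalized).
    num_heads = roll.count("H")
    return bias ** num_heads * (1 - bias) ** (len(roll) - num_heads)

first = "HTTTHHTHTH"               # 5 heads, 5 tails
L_A = coin_likelihood(first, 0.6)
L_B = coin_likelihood(first, 0.5)
print(L_A, L_B)                    # ~0.0007963, ~0.0009766
print(L_A / (L_A + L_B))           # P(A) ~ 0.45
print(L_B / (L_A + L_B))           # P(B) ~ 0.55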

In a similar fashion, find the probabilities P(A) and P(B) for all five sets of flips. The expected head and tail counts attributed to each coin are then:

L(H) = expected number of heads = P(coin) * (number of heads in the set)

L(T) = expected number of tails = P(coin) * (number of tails in the set)

Iteration 1:

                                         Coin A          Coin B
Set                      P(A)    P(B)    L(H)   L(T)     L(H)   L(T)
B  H T T T H H T H T H   0.45    0.55    2.2    2.2      2.8    2.8
A  H H H H T H H H H H   0.80    0.20    7.2    0.8      1.8    0.2
A  H T H H H H H T H H   0.73    0.27    5.9    1.5      2.1    0.5
B  H T H T T T H H T T   0.35    0.65    1.4    2.1      2.6    3.9
A  T H H H T H H H T H   0.65    0.35    4.5    1.9      2.5    1.1

For Coin A

∑ L (H) = 21.3 and ∑ L (T) = 8.6

t1 = 21.3 / (21.3 + 8.6) = 0.71

For Coin B

∑ L (H) = 11.7 and ∑ L (T) = 8.4

t2 = 11.7 / (11.7 + 8.4) = 0.58
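
These updated estimates, t1 = 0.71 and t2 = 0.58, become the coin biases for iteration 2, and the E- and M-steps are repeated. The Python implementation below runs this loop for several iterations.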


Python Code:
import random


def coin_em(rolls, theta_A=None, theta_B=None, maxiter=10):
    # Initial guess (random if no starting biases are given)
    theta_A = theta_A or random.random()
    theta_B = theta_B or random.random()
    thetas = [(theta_A, theta_B)]
    # Iterate: alternate the E-step and the M-step
    for c in range(maxiter):
        print("#%d:\t%0.2f %0.2f" % (c, theta_A, theta_B))
        heads_A, tails_A, heads_B, tails_B = e_step(rolls, theta_A, theta_B)
        theta_A, theta_B = m_step(heads_A, tails_A, heads_B, tails_B)
        thetas.append((theta_A, theta_B))
    return thetas, (theta_A, theta_B)


def e_step(rolls, theta_A, theta_B):
    """Produce the expected values for heads_A, tails_A, heads_B, tails_B
    over the rolls, given the current coin biases."""
    heads_A, tails_A = 0, 0
    heads_B, tails_B = 0, 0
    for trial in rolls:
        likelihood_A = coin_likelihood(trial, theta_A)
        likelihood_B = coin_likelihood(trial, theta_B)
        p_A = likelihood_A / (likelihood_A + likelihood_B)
        p_B = likelihood_B / (likelihood_A + likelihood_B)
        heads_A += p_A * trial.count("H")
        tails_A += p_A * trial.count("T")
        heads_B += p_B * trial.count("H")
        tails_B += p_B * trial.count("T")
    return heads_A, tails_A, heads_B, tails_B


def m_step(heads_A, tails_A, heads_B, tails_B):
    """Produce the values of theta that maximize the expected
    numbers of heads/tails."""
    theta_A = heads_A / (heads_A + tails_A)
    theta_B = heads_B / (heads_B + tails_B)
    return theta_A, theta_B


def coin_likelihood(roll, bias):
    # P(X | Z, theta): binomial likelihood of the observed flips
    numHeads = roll.count("H")
    flips = len(roll)
    return pow(bias, numHeads) * pow(1 - bias, flips - numHeads)


rolls = ["HTTTHHTHTH", "HHHHTHHHHH", "HTHHHHHTHH",
         "HTHTTTHHTT", "THHHTHHHTH"]
thetas, _ = coin_em(rolls, 0.6, 0.5, maxiter=6)

Output:

#0: 0.60 0.50
#1: 0.71 0.58
#2: 0.75 0.57
#3: 0.77 0.55
#4: 0.78 0.53
#5: 0.79 0.53
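
Iteration #1 reproduces the hand-calculated values t1 = 0.71 and t2 = 0.58 from above; running further iterations, the estimates level off near 0.80 and 0.52 for this data set.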
