EM Algorithm
• It stands for the Expectation–Maximization algorithm
• It is used to estimate latent variables (variables that are not directly observable but are instead inferred from the values of other, observed variables). A latent variable is a random variable that cannot be observed in either the training or the test phase
• It is the basis of many unsupervised clustering algorithms
Steps involved in EM Algorithm:
1. Initially, a set of starting parameter values is chosen (a set of incomplete data is given to the system).
2. Expectation or E-Step – Here, we use the observed data and the current parameter estimates to estimate (guess) the values of the missing/incomplete data.
3. Maximization or M-Step – Here, we use the complete data generated in the preceding E-step to update the parameter values.
4. We check whether the values are converging. If they are converging, stop the process; otherwise, repeat steps 2 and 3 until convergence occurs (a schematic version of this loop is sketched below).
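A schematic version of this loop in Python (a minimal sketch; run_em, e_step, and m_step are illustrative names, and the parameters are represented as a simple tuple):
def run_em(data, params, e_step, m_step, tol=1e-6, max_iter=100):
    """Generic EM loop: alternate E- and M-steps until convergence."""
    for _ in range(max_iter):
        expectations = e_step(data, params)   # E-step: estimate the missing data
        new_params = m_step(expectations)     # M-step: re-estimate the parameters
        # Convergence check: stop when no parameter moves by more than tol
        if all(abs(a - b) < tol for a, b in zip(params, new_params)):
            return new_params
        params = new_params
    return params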
Usage of EM Algorithm:
• Used to fill in missing data
• Used for unsupervised clustering
• Used to discover the values of latent variables
Advantages:
• With each iteration, the likelihood is guaranteed not to decrease
• The E-step and M-step are usually easy to implement
Disadvantages:
• Convergence can be slow
• It may converge only to a local optimum
Applications of EM Algorithm:
The latent variable model has several real-life applications in Machine Learning:
• Used to estimate the parameters of a Gaussian density function.
• Helpful for filling in missing data in a sample.
• Finds plenty of use in different domains such as Natural Language Processing (NLP), Computer Vision, etc.
• Used in image reconstruction in the fields of Medicine and Structural Engineering.
• Used for estimating the parameters of the Hidden Markov Model (HMM) and also for other mixture models such as Gaussian Mixture Models (see the sketch after this list).
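For instance, scikit-learn's GaussianMixture runs EM internally when fitting a Gaussian Mixture Model. A minimal sketch, assuming scikit-learn and NumPy are installed:
import numpy as np
from sklearn.mixture import GaussianMixture

# Draw 1-D samples from two Gaussians; EM inside fit() recovers the components
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0, 1, 200),
                    rng.normal(5, 1, 200)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.means_.ravel())   # approximately [0, 5] (component order may vary)
print(gmm.weights_)         # approximately [0.5, 0.5]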
Example 1 (Two-Coin Problem)
• Assume that we have two coins, C1 and C2 (referred to as coins A and B below)
• Assume the bias of C1 is t1 (i.e., the probability of getting heads with C1)
• Assume the bias of C2 is t2 (i.e., the probability of getting heads with C2)
• We want to find t1 and t2 by performing a number of trials (i.e., coin tosses)
Solution:
First Experiment –
• We choose one of the two coins 5 times
• Each time, we toss the chosen coin 10 times
Coin   Tosses
B      H T T T H H T H T H
A      H H H H T H H H H H
A      H T H H H H H T H H
B      H T H T T T H H T T
A      T H H H T H H H T H
t1 = Number of heads using C1 / total number of flips using C1
t2 = Number of heads using C2 / total number of flips using C2
Coin A            Coin B
-                 5H, 5T
9H, 1T            -
8H, 2T            -
-                 4H, 6T
7H, 3T            -
Total: 24H, 6T    Total: 9H, 11T
t1 = 24/(24+6) = 24/30 = 0.8
t2 = 9/(9+11) = 9/20 = 0.45
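Because the coin used in each set is known in this first experiment, the ML estimates are simply fractions of heads. A quick check in Python (the trials list restates the table above; the names are illustrative):
# Labeled trials: (coin, toss sequence)
trials = [("B", "HTTTHHTHTH"), ("A", "HHHHTHHHHH"), ("A", "HTHHHHHTHH"),
          ("B", "HTHTTTHHTT"), ("A", "THHHTHHHTH")]

def ml_bias(coin):
    # Concatenate all flips of this coin and take the fraction of heads
    flips = "".join(seq for c, seq in trials if c == coin)
    return flips.count("H") / len(flips)

print(ml_bias("A"), ml_bias("B"))   # 0.8 0.45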
• We have two coins: A and B
• The probabilities of heads are tA and tB
• We have 5 measurement sets, each containing 10 coin tosses
• If we know which coin was tossed in each set, we can calculate the ML estimates of tA and tB directly (as above)
• If we do not know which coin was tossed in each set, the ML estimates cannot be calculated directly → EM Algorithm
• Binomial distribution used to calculate the likelihoods:
L(C) = t^k (1 - t)^(n-k)
where t = probability of heads for coin C,
n = number of flips of that coin,
k = number of heads obtained with that coin.
(The binomial coefficient is omitted because it cancels when the likelihoods are normalized in the next step.)
EM Coin Problem Solved
• Assume that for the first coin (A) the probability of heads is 0.60
• Assume that for the second coin (B) the probability of heads is 0.50
• For the first trial there are 5 heads and 5 tails
• Use the binomial probability distribution for the likelihood
• k, the number of heads, is 5 and n, the number of flips, is 10
Likelihoods of the first trial under each coin:
L(A) = 0.6^5 (1 - 0.6)^(10-5) = 0.0007963
L(B) = 0.5^5 (1 - 0.5)^(10-5) = 0.0009766
P(A) = L(A) / (L(A) + L(B)) = 0.0007963 / (0.0007963 + 0.0009766) = 0.45
P(B) = L(B) / (L(A) + L(B)) = 0.0009766 / (0.0007963 + 0.0009766) = 0.55
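These numbers can be verified with a few lines of Python (a quick sanity check of the arithmetic above):
# First trial: 5 heads in 10 flips; assumed biases 0.6 (coin A) and 0.5 (coin B)
k, n = 5, 10
L_A = 0.6 ** k * (1 - 0.6) ** (n - k)   # approx. 0.0007963
L_B = 0.5 ** k * (1 - 0.5) ** (n - k)   # approx. 0.0009766
print(L_A / (L_A + L_B))                # P(A) approx. 0.45
print(L_B / (L_A + L_B))                # P(B) approx. 0.55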
In a similar fashion, compute P(A) and P(B) for every trial. The expected counts attributed to each coin are then:
L(H) = expected number of heads [posterior probability of the coin × number of heads in the trial]
L(T) = expected number of tails [posterior probability of the coin × number of tails in the trial]
Iteration 1:
                                         Coin A       Coin B
Tosses                  P(A)  P(B)   L(H)  L(T)   L(H)  L(T)
B H T T T H H T H T H   0.45  0.55   2.2   2.2    2.8   2.8
A H H H H T H H H H H   0.80  0.20   7.2   0.8    1.8   0.2
A H T H H H H H T H H   0.73  0.27   5.9   1.5    2.1   0.5
B H T H T T T H H T T   0.35  0.65   1.4   2.1    2.6   3.9
A T H H H T H H H T H   0.65  0.35   4.5   1.9    2.5   1.1
For Coin A
∑ L(H) = 21.3 and ∑ L(T) = 8.6
tA = 21.3 / (21.3 + 8.6) = 0.71
For Coin B
∑ L(H) = 11.7 and ∑ L(T) = 8.4
tB = 11.7 / (11.7 + 8.4) = 0.58
These updated biases become the starting point for the next iteration, and the E-step and M-step are repeated until the values converge.
Python Code:
import random

def coin_em(rolls, theta_A=None, theta_B=None, maxiter=10):
    # Initial guess (random if no starting value is supplied)
    theta_A = theta_A or random.random()
    theta_B = theta_B or random.random()
    thetas = [(theta_A, theta_B)]
    # Iterate the E-step and M-step
    for c in range(maxiter):
        print("#%d:\t%0.2f %0.2f" % (c, theta_A, theta_B))
        heads_A, tails_A, heads_B, tails_B = e_step(rolls, theta_A, theta_B)
        theta_A, theta_B = m_step(heads_A, tails_A, heads_B, tails_B)
        thetas.append((theta_A, theta_B))
    return thetas, (theta_A, theta_B)

def e_step(rolls, theta_A, theta_B):
    """Produce the expected values of heads_A, tails_A, heads_B, tails_B
    over the rolls, given the current coin biases."""
    heads_A, tails_A = 0, 0
    heads_B, tails_B = 0, 0
    for trial in rolls:
        likelihood_A = coin_likelihood(trial, theta_A)
        likelihood_B = coin_likelihood(trial, theta_B)
        # Posterior probability that each coin produced this trial
        p_A = likelihood_A / (likelihood_A + likelihood_B)
        p_B = likelihood_B / (likelihood_A + likelihood_B)
        # Attribute the observed heads/tails to each coin in proportion
        heads_A += p_A * trial.count("H")
        tails_A += p_A * trial.count("T")
        heads_B += p_B * trial.count("H")
        tails_B += p_B * trial.count("T")
    return heads_A, tails_A, heads_B, tails_B

def m_step(heads_A, tails_A, heads_B, tails_B):
    """Produce the values of theta that maximize the expected numbers of heads/tails."""
    theta_A = heads_A / (heads_A + tails_A)
    theta_B = heads_B / (heads_B + tails_B)
    return theta_A, theta_B

def coin_likelihood(roll, bias):
    # P(X | Z, theta): binomial likelihood of the roll given the coin's bias
    numHeads = roll.count("H")
    flips = len(roll)
    return pow(bias, numHeads) * pow(1 - bias, flips - numHeads)

rolls = ["HTTTHHTHTH", "HHHHTHHHHH", "HTHHHHHTHH",
         "HTHTTTHHTT", "THHHTHHHTH"]
thetas, _ = coin_em(rolls, 0.6, 0.5, maxiter=6)
Output:
#0: 0.60 0.50
#1: 0.71 0.58
#2: 0.75 0.57
#3: 0.77 0.55
#4: 0.78 0.53
#5: 0.79 0.53