Model
So, they are unable to capture higher-order correlations among amino acids.
Markov Chain assumption of independent events
The probabilities of states are assumed to be independent, which is not true in biology.
E.g., the probability of y must not depend on x, and vice versa.
P(x) … P(y)
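The independence assumption above can be illustrated with a small sketch. It is a deliberate simplification: it drops the state path entirely and keeps only per-column emission frequencies, which is roughly what the match states of a profile model contribute. The toy alignment and the function names (`column_frequencies`, `independent_prob`, `empirical_joint`) are made up for illustration and do not come from the notes.

```python
from collections import Counter

# Made-up toy alignment: the two columns always co-vary (D with K, E with V).
alignment = ["DK", "DK", "EV", "EV"]


def column_frequencies(seqs):
    """Estimate a separate residue distribution for each column."""
    freqs = []
    for col in range(len(seqs[0])):
        counts = Counter(s[col] for s in seqs)
        total = sum(counts.values())
        freqs.append({aa: c / total for aa, c in counts.items()})
    return freqs


def independent_prob(seq, freqs):
    """Probability of seq when every column is treated as independent."""
    p = 1.0
    for col, aa in enumerate(seq):
        p *= freqs[col].get(aa, 0.0)
    return p


def empirical_joint(seq, seqs):
    """Fraction of training sequences that equal seq exactly."""
    return sum(1 for s in seqs if s == seq) / len(seqs)


freqs = column_frequencies(alignment)
for seq in ("DK", "DV"):
    print(seq, independent_prob(seq, freqs), empirical_joint(seq, alignment))
```

Under the independent model, the never-observed pair "DV" scores exactly as well as the frequently observed pair "DK", because the model only sees the per-column marginals. This is the kind of correlation between positions that the assumption discards.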
Standard Machine Learning Problems
In the training problem, we need to watch out for local maxima: the model may not converge to
a truly optimal parameter set for a given training set. Second, since the model is only as good as
its training set, training may lead to over-fitting.
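The local-maxima issue can be explored with a minimal sketch. It assumes training is done with Baum-Welch (expectation-maximization), which the notes do not specify, and it uses a made-up binary observation sequence with a two-state model. All names here (`forward_backward`, `baum_welch_step`, `train`) are hypothetical. Each restart begins from a different random parameter set, and the final log-likelihoods can be compared across restarts.

```python
import math
import random


def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass; returns alpha, beta, scales, log-likelihood."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for j in range(n):
            prev = pi[j] if t == 0 else sum(alpha[t - 1][i] * A[i][j] for i in range(n))
            alpha[t][j] = prev * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]
    return alpha, beta, scale, sum(math.log(s) for s in scale)


def baum_welch_step(obs, pi, A, B):
    """One EM update; also returns the log-likelihood of the input parameters."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta, scale, loglik = forward_backward(obs, pi, A, B)
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        denom = sum(gamma[t][i] for t in range(T - 1))
        for j in range(n):
            num = sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                      / scale[t + 1] for t in range(T - 1))
            new_A[i][j] = num / denom
    new_B = []
    for j in range(n):
        total = sum(gamma[t][j] for t in range(T))
        new_B.append([sum(gamma[t][j] for t in range(T) if obs[t] == k) / total
                      for k in range(m)])
    return gamma[0][:], new_A, new_B, loglik


def random_dist(rng, k):
    """Random probability vector with entries bounded away from zero."""
    w = [rng.random() + 0.1 for _ in range(k)]
    return [x / sum(w) for x in w]


def train(obs, n_states, n_symbols, seed, n_iter=20):
    """Run EM from a seeded random start; return the log-likelihood history."""
    rng = random.Random(seed)
    pi = random_dist(rng, n_states)
    A = [random_dist(rng, n_states) for _ in range(n_states)]
    B = [random_dist(rng, n_symbols) for _ in range(n_states)]
    history = []
    for _ in range(n_iter):
        pi, A, B, loglik = baum_welch_step(obs, pi, A, B)
        history.append(loglik)
    return history


# Made-up observation sequence over two symbols (0 and 1).
obs = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0] * 2
histories = [train(obs, 2, 2, seed) for seed in range(5)]
best = max(h[-1] for h in histories)
for seed, h in enumerate(histories):
    print(f"restart {seed}: final log-likelihood {h[-1]:.4f}")
```

Within one restart, EM never lowers the training log-likelihood, but nothing guarantees that different restarts reach the same value. Keeping the best of several restarts is a common mitigation, not a guarantee of a global optimum. It also does nothing about over-fitting, which would need held-out data to detect.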
1. Open areas for research in HMMs in biology
Integration of structural information into profile HMMs.
Using structural information for a member of a protein family, when a structure exists, to improve
the parameterization of the HMM seems an almost obvious application, yet it has been extremely hard
to achieve in practice.
Model architecture
The architectures of HMMs have largely been chosen to be the simplest architectures that can fit the
observed data. Is this the best architecture to use? Can one use protein structure knowledge to make
better architecture decisions, or, in limited regions, to learn the architecture directly from the data?
Will these implied architectures have implications for our structural understanding?