Journal of Applied Probability, 2002
We consider the estimation of Markov transition matrices by Bayes’ methods. We obtain large and moderate deviation principles for the sequence of Bayesian posterior distributions.
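As a concrete illustration of Bayesian estimation of a transition matrix (the deviation-principle analysis itself is not reproduced), here is a minimal sketch placing independent Dirichlet priors on the rows, which are conjugate to the multinomial transition counts; the function name and the choice of a flat Dirichlet prior are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def dirichlet_posterior_rows(chain, n_states, alpha=1.0, rng=None):
    """Conjugate update: row i of P gets a Dirichlet(alpha + counts[i]) posterior,
    where counts[i, j] is the number of observed i -> j transitions."""
    rng = np.random.default_rng() if rng is None else rng
    counts = np.zeros((n_states, n_states))
    for s, t in zip(chain[:-1], chain[1:]):
        counts[s, t] += 1
    # Posterior mean and one posterior draw of the transition matrix
    post = alpha + counts
    post_mean = post / post.sum(axis=1, keepdims=True)
    post_draw = np.vstack([rng.dirichlet(post[i]) for i in range(n_states)])
    return post_mean, post_draw

chain = [0, 1, 1, 2, 0, 1, 2, 2, 0]          # toy observed path on states {0, 1, 2}
mean, draw = dirichlet_posterior_rows(chain, n_states=3)
```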
Statistica Sinica, 1997
The Bayesian bootstrap for Markov chains is the Bayesian analogue of the bootstrap method for Markov chains. We construct a random-weighted empirical distribution, based on i.i.d. exponential random variables, to simulate the posterior distribution of the transition probability, the stationary probability, and the first hitting time of a specific state of a finite-state ergodic Markov chain. Large-sample theory is developed, showing that with a matrix beta prior on the transition probability, the Bayesian bootstrap procedure is second-order consistent for approximating the pivot of the posterior distributions of the transition probability. The small-sample properties of the Bayesian bootstrap are also examined through a simulation study.
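A minimal sketch of the random-weighting idea described above: each observed transition receives an i.i.d. exponential weight, and the weighted transition counts are renormalized row by row to give one posterior draw of the transition matrix. The function name and the flat treatment of the prior are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

def bayesian_bootstrap_draw(chain, n_states, rng=None):
    """One random-weighted draw of the transition matrix: weight each observed
    transition by an i.i.d. Exp(1) variable, then renormalize the rows."""
    rng = np.random.default_rng() if rng is None else rng
    W = np.zeros((n_states, n_states))
    for s, t in zip(chain[:-1], chain[1:]):
        W[s, t] += rng.exponential()        # exponential weight per transition
    row_sums = W.sum(axis=1, keepdims=True)
    # Rows with no observed transitions are left as NaN rather than guessed.
    return np.divide(W, row_sums, out=np.full_like(W, np.nan), where=row_sums > 0)

chain = [0, 1, 1, 2, 0, 1, 2, 2, 0]
draws = [bayesian_bootstrap_draw(chain, 3) for _ in range(1000)]  # posterior sample
```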
‖P^m(x, ·) − Π(·)‖_TV → 0 as m → ∞.
Statistics & Probability Letters, 2014
In this paper we develop a statistical estimation technique to recover the transition kernel P of a Markov chain X = (X_m)_{m∈N} in the presence of censored data. We consider the situation where only a sub-sequence of X is available and the time gaps between observations are i.i.d. random variables. Under the assumption that neither the time gaps nor their distribution are known, we provide an estimation method which applies when some transitions in the initial Markov chain X are known to be infeasible. A consistent estimator of P is derived in closed form as the solution of a minimization problem. The asymptotic performance of the estimator is then discussed in theory and through numerical simulations.
Given a Markov chain with uncertain transition probabilities modelled in a Bayesian way, we investigate a technique for analytically approximating the mean transition frequency counts over a finite horizon. Conventional techniques for addressing this problem either require the enumeration of a set of generalized process "hyperstates" whose cardinality grows exponentially with the terminal horizon, or are limited to the two-state case and expressed in terms of hypergeometric series. Our approach makes use of a diffusion approximation technique for modelling the evolution of the information-state components of the hyperstate process. Interest in this problem stems from the policy evaluation step of policy iteration algorithms applied to Markov decision processes with uncertain transition probabilities.
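The analytic diffusion approximation itself is not reproduced here; as a point of reference, this is a brute-force Monte Carlo estimate of the quantity being approximated — the mean transition frequency counts over a finite horizon — averaging over Dirichlet-distributed uncertainty in the rows of the transition matrix. The Dirichlet prior and all names are illustrative assumptions.

```python
import numpy as np

def mean_transition_counts(alpha, horizon, x0=0, n_mc=2000, rng=None):
    """Monte Carlo estimate of E[N_ij]: draw P with row-wise Dirichlet(alpha[i])
    rows, simulate the chain for `horizon` steps, and average transition counts."""
    rng = np.random.default_rng() if rng is None else rng
    k = alpha.shape[0]
    mean_counts = np.zeros((k, k))
    for _ in range(n_mc):
        P = np.vstack([rng.dirichlet(alpha[i]) for i in range(k)])
        x, counts = x0, np.zeros((k, k))
        for _ in range(horizon):
            y = rng.choice(k, p=P[x])
            counts[x, y] += 1
            x = y
        mean_counts += counts / n_mc
    return mean_counts

alpha = np.array([[2.0, 1.0], [1.0, 3.0]])   # Dirichlet parameters per row
print(mean_transition_counts(alpha, horizon=10))
```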
2020
Note this is a random variable with expected value π(f) (i.e. the estimator is unbiased) and standard deviation of order O(1/√N). By the CLT, the error π̂(f) − π(f) has a limiting normal distribution as N → ∞. We can therefore estimate π(f) from samples (perhaps combined with regression techniques). The problem is that if π_u is complicated, it is very difficult to simulate i.i.d. random variables from π(·). The MCMC solution is to construct a Markov chain on X which has π(·) as a stationary distribution, i.e.

∫_X π(dx) P(x, dy) = π(dy).

Then for large n the distribution of X_n will be approximately stationary. We can set Z_1 = X_n, and obtain Z_2, Z_3, …, Z_n by rerunning the chain from fresh starts.

Remark. In practice, instead of starting a fresh Markov chain every time, we take the successive X_n's and average after a burn-in B, for example (N − B)^{-1} ∑_{i=B+1}^{N} f(X_i). We tend to ignore the dependence problem, as many of the mathematical issues are similar in either implementation.

Remark. There are other estimation methods, such as rejection sampling and importance sampling, but MCMC algorithms are the most widely applied.

2 MCMC and its construction

This section explains how MCMC algorithms are constructed. We first introduce reversibility.

Definition. A Markov chain on a state space X is reversible with respect to a probability distribution π(·) on X if

π(dx) P(x, dy) = π(dy) P(y, dx),  x, y ∈ X.

Proposition. If a Markov chain is reversible with respect to π(·), then π(·) is a stationary distribution for the chain.

Proof. By reversibility,

∫_{x∈X} π(dx) P(x, dy) = ∫_{x∈X} π(dy) P(y, dx) = π(dy) ∫_{x∈X} P(y, dx) = π(dy).

The simplest way to construct an MCMC algorithm satisfying reversibility is the Metropolis-Hastings algorithm.

2.1 The Metropolis-Hastings Algorithm. Suppose that π(·) has a (possibly unnormalized) density π_u. Let Q(x, ·) be essentially any other Markov chain whose transitions also have a (possibly unnormalized) density, i.e. Q(x, dy) ∝ q(x, y) dy. First choose some X_0. Given X_n, generate a proposal Y_{n+1} from Q(X_n, ·) and flip an independent biased coin with probability of heads α(X_n, Y_{n+1}), where

α(x, y) = min{1, [π_u(y) q(y, x)] / [π_u(x) q(x, y)]}  if π(x) q(x, y) ≠ 0,

and α(x, y) = 1 when π(x) q(x, y) = 0. If the coin lands heads, accept the proposal and set X_{n+1} = Y_{n+1}; if tails, reject the proposal and set X_{n+1} = X_n. Then replace n by n + 1 and repeat. The reason for this choice of α(x, y) is explained by the following.

Proposition. The Metropolis-Hastings algorithm produces a Markov chain {X_n} which is reversible with respect to π(·).

Proof. We want to show that for any x, y ∈ X,

π(dx) P(x, dy) = π(dy) P(y, dx).

It suffices to consider x ≠ y, for which

π(dx) P(x, dy) = c π_u(x) q(x, y) α(x, y) dx dy = c min{π_u(x) q(x, y), π_u(y) q(y, x)} dx dy,

where c = 1/∫π_u; this expression is symmetric in x and y, so the chain is reversible.

… where Ȳ_i = (1/J) ∑_j Y_{ij}. The Gibbs sampler then proceeds by updating the K + 3 variables according to the above conditional distributions. This is feasible since the conditional distributions are all easily simulated (IG and N).
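The Metropolis-Hastings update above is straightforward to implement. Below is a minimal sketch, assuming a one-dimensional target with unnormalized density π_u and a Gaussian random-walk proposal, for which q(x, y) = q(y, x) and the proposal densities cancel in α; the function name and parameters are illustrative.

```python
import numpy as np

def metropolis_hastings(pi_u, x0, n_steps, step=1.0, rng=None):
    """Random-walk Metropolis: symmetric proposal, so q(y, x)/q(x, y) = 1."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0
    chain = np.empty(n_steps)
    for n in range(n_steps):
        y = x + step * rng.normal()           # proposal Y_{n+1} ~ Q(X_n, .)
        alpha = min(1.0, pi_u(y) / pi_u(x))   # acceptance probability
        if rng.random() < alpha:              # coin lands heads: accept
            x = y
        chain[n] = x                          # tails: X_{n+1} = X_n
    return chain

# Example: unnormalized standard normal target; discard a burn-in B.
pi_u = lambda x: np.exp(-0.5 * x**2)
chain = metropolis_hastings(pi_u, x0=0.0, n_steps=10000)
B = 1000
print(chain[B:].mean())   # estimates pi(f) for f(x) = x
```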
Scandinavian Journal of Statistics, 2016
Bayesian shrinkage methods have generated a lot of interest in recent years, especially in the context of high-dimensional linear regression. Armagan et al. propose a Bayesian shrinkage approach using generalized double Pareto priors. They establish several useful properties of this approach, including the derivation of a tractable three-block Gibbs sampler to sample from the resulting posterior density. We show that the Markov operator corresponding to this three-block Gibbs sampler is not Hilbert-Schmidt. We propose a simpler two-block Gibbs sampler and show that the corresponding Markov operator is trace class (and hence Hilbert-Schmidt). Establishing the trace class property for the proposed two-block Gibbs sampler has several useful consequences. Firstly, it implies that the corresponding Markov chain is geometrically ergodic, which in turn implies the existence of a Markov chain CLT and enables computation of asymptotic standard errors for Markov chain based estimates of posterior quantities. Secondly, since the proposed Gibbs sampler uses two blocks, standard recipes in the literature can be used to construct a sandwich Markov chain (by inserting an appropriate extra step) to gain further efficiency and achieve faster convergence. The trace class property for the two-block sampler implies that the corresponding sandwich Markov chain is also trace class and thereby geometrically ergodic. Finally, it guarantees that all eigenvalues of the sandwich chain are dominated by the corresponding eigenvalues of the Gibbs sampling chain (with at least one strict domination). Our results demonstrate that a minor change in the structure of a Markov chain can lead to fundamental changes in its theoretical properties. We illustrate the improvement in efficiency and convergence resulting from our proposed Markov chains using simulated and real examples.
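To make the blocking terminology concrete, here is a generic two-block Gibbs sampler for a conjugate Bayesian linear regression (β in one block, σ² in the other). It illustrates the two-block structure only; it is not the generalized double Pareto sampler analysed in the paper, and the priors, names, and hyperparameters are assumptions.

```python
import numpy as np

def two_block_gibbs(X, y, n_iter=5000, a0=1.0, b0=1.0, tau2=100.0, rng=None):
    """Two-block Gibbs for y ~ N(X beta, sigma^2 I), beta ~ N(0, tau2 I),
    sigma^2 ~ InvGamma(a0, b0). Block 1: beta | sigma^2. Block 2: sigma^2 | beta."""
    rng = np.random.default_rng() if rng is None else rng
    n, p = X.shape
    sigma2 = 1.0
    draws = np.empty((n_iter, p + 1))
    XtX, Xty = X.T @ X, X.T @ y
    for t in range(n_iter):
        # Block 1: beta | sigma^2, y ~ N(m, V)
        V = np.linalg.inv(XtX / sigma2 + np.eye(p) / tau2)
        m = V @ (Xty / sigma2)
        beta = rng.multivariate_normal(m, V)
        # Block 2: sigma^2 | beta, y ~ InvGamma(a0 + n/2, b0 + ||y - X beta||^2 / 2)
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a0 + n / 2, 1.0 / (b0 + resid @ resid / 2))
        draws[t] = np.concatenate([beta, [sigma2]])
    return draws

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=50)
draws = two_block_gibbs(X, y)
```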
Bernoulli, 2012
Measure-valued Markov chains have raised interest in Bayesian nonparametrics since the seminal paper of Feigin and Tweedie (Math. Proc. Cambridge Philos. Soc. 105 (1989) 579-585), where a Markov chain having the law of the Dirichlet process as its unique invariant measure was introduced. In the present paper, we propose and investigate a new class of measure-valued Markov chains defined via exchangeable sequences of random variables. Asymptotic properties of this new class are derived, and applications related to Bayesian nonparametric mixture modeling, and to a generalization of the Markov chain proposed in the same 1989 paper, are discussed. These results and their applications highlight once again the interplay between Bayesian nonparametrics and the theory of measure-valued Markov chains.
2007
Hidden Markov models can be considered an extension of mixture models, allowing for dependent observations. In a hierarchical Bayesian framework, we show how reversible jump Markov chain Monte Carlo techniques can be used to estimate the parameters of a model, as well as the number of regimes. We consider a mixture of normal distributions characterized by different means and variances under each regime, extending the model proposed by Robert et al. (2000), which is based on a mixture of zero-mean normal distributions. Rosella Castellano and Luisa Scaccia, Università di Macerata.
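A minimal sketch of the model class described above (not the reversible jump sampler itself): a two-regime Gaussian hidden Markov model in which each regime has its own mean and variance. The transition matrix and parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_gaussian_hmm(A, mu, sigma, T, rng=None):
    """Simulate a Gaussian HMM: hidden regimes follow a Markov chain with
    transition matrix A; observation t is N(mu[state_t], sigma[state_t]^2)."""
    rng = np.random.default_rng() if rng is None else rng
    k = A.shape[0]
    states = np.empty(T, dtype=int)
    states[0] = rng.integers(k)
    for t in range(1, T):
        states[t] = rng.choice(k, p=A[states[t - 1]])
    obs = rng.normal(mu[states], sigma[states])
    return states, obs

A = np.array([[0.95, 0.05], [0.10, 0.90]])   # sticky two-regime chain
mu, sigma = np.array([0.0, 2.0]), np.array([1.0, 0.5])
states, obs = simulate_gaussian_hmm(A, mu, sigma, T=500)
```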
2018
We develop a model for Bayesian selection in high order Markov chains through an extension of the mixture transition distribution of Raftery (1985). We demonstrate two uses for the model: parsimonious approximation of high order dynamics by mixing lower order transition models, and model selection through over-specification and shrinkage via priors for sparse probability vectors. We discuss properties of the model and demonstrate its utility with simulation studies. We further apply the model to a data analysis from the high-order Markov chain literature and a novel application to pink salmon abundance time series.
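A minimal sketch of the mixture transition distribution idea the model extends, assuming Raftery's original form with a single shared transition matrix Q and lag weights lam; the paper's extension mixes lower-order transition models more generally, and all names here are illustrative.

```python
import numpy as np

def mtd_prob(history, Q, lam):
    """Mixture transition distribution: P(X_t = . | X_{t-1}, ..., X_{t-L})
    = sum_l lam[l] * Q[X_{t-(l+1)}, .], a mixture of single-lag transitions."""
    L = len(lam)
    return sum(lam[l] * Q[history[-(l + 1)]] for l in range(L))

# Example: 3-state chain, order L = 2, shared single-step matrix Q.
Q = np.array([[.7, .2, .1], [.1, .8, .1], [.2, .3, .5]])
lam = np.array([.6, .4])           # lag weights, summing to 1
print(mtd_prob([0, 2], Q, lam))    # distribution of X_t given X_{t-1}=2, X_{t-2}=0
```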