2016, arXiv (Cornell University)
This paper proposes a new type of recurrence in which we divide the Markov chain into intervals that start when the chain enters a subset A, then sample another subset B far away from A, and end when the chain again returns to A. The lengths of these intervals have the same distribution and, if A and B are far apart, are almost independent of each other. A and B may be any subsets of the state space that are far apart from each other and such that the movement between the subsets is repeated several times in a long Markov chain. The expected length of the intervals is used in a function that describes the mixing properties of the chain and improves our understanding of Markov chains. The paper proves a theorem that gives a bound on the variance of the estimate of π(B), the probability of B under the limiting density of the Markov chain. This may be used to find the length of the Markov chain needed to explore the state space sufficiently. It is shown that the lengths of the periods between successive entries of the Markov chain into A have a heavy-tailed distribution, which increases the upper bound on the variance of the estimate of π(B). The paper gives a general guideline on how to find the optimal scaling of parameters in the Metropolis-Hastings simulation algorithm, which implicitly determines the acceptance rate. We find examples where it is optimal to have a much smaller acceptance rate than what is generally recommended in the literature, and also examples where the optimal acceptance rate vanishes in the limit.
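As a rough illustration of the quantities involved (and not the paper's actual construction), the sketch below runs a random-walk Metropolis chain on an invented bimodal target, records the times at which the chain re-enters a set A around one mode, and forms the empirical estimate of π(B) for a set B around the other mode; the target density, the sets A and B, and all tuning constants are assumptions made purely for this example.

    import numpy as np

    rng = np.random.default_rng(0)

    def log_target(x):
        # invented bimodal target: two well-separated Gaussian modes
        return np.logaddexp(-0.5 * (x + 4.0) ** 2, -0.5 * (x - 4.0) ** 2)

    def in_A(x):   # subset A: neighbourhood of the left mode (assumed)
        return x < -3.0

    def in_B(x):   # subset B: neighbourhood of the right mode (assumed)
        return x > 3.0

    n_steps, step_size = 200_000, 2.5
    x = -4.0
    hits_B = 0
    entry_times = []                     # times at which the chain (re-)enters A
    was_in_A = in_A(x)
    for t in range(n_steps):
        y = x + step_size * rng.normal()                       # random-walk proposal
        if np.log(rng.random()) < log_target(y) - log_target(x):
            x = y                                              # Metropolis accept
        hits_B += in_B(x)
        now_in_A = in_A(x)
        if now_in_A and not was_in_A:
            entry_times.append(t)
        was_in_A = now_in_A

    pi_B_hat = hits_B / n_steps          # empirical estimate of pi(B)
    intervals = np.diff(entry_times)     # lengths of the periods between entries into A
    print(pi_B_hat, intervals.mean())

The spread of these interval lengths, and in particular their heavy right tail when the two sets are far apart, is what enters the variance bound discussed in the abstract.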
Annals of Statistics, 1996
The Markov chain simulation method has been successfully used in many problems, including some that arise in Bayesian statistics. We give a self-contained proof of the convergence of this method in general state spaces under conditions that are easy to verify.
Bottlenecks correspond to small cuts, so to prove rapid mixing we could try to prove that no small cut exists. The (approximate) dual of the cut problem is the above flow problem, so alternatively we can try to prove that the Markov chain has a good flow. Note that this is easier than proving that it has no small cut. (Later in the course we will show how to use cuts to prove lower bounds on the mixing time.) Definition 11.5 The length of a flow f is ℓ(f) = max_{p : f(p) > 0} |p|. In other words, the length of a flow is the length of the longest path that carries flow. Our goal is to prove the following theorem, due to [DS91, Si92], which gives an upper bound on the mixing time in terms of the cost (and length) of any flow. Theorem 11.6 For a lazy, ergodic, reversible Markov chain, the spectral gap satisfies Gap ≥ 1/(ρ(f)ℓ(f)) for any flow f. Corollary 11.7 The mixing time is bounded by τ_x(ε) ≤ ρ(f)ℓ(f)[ln((2π(x))^{-1}) + ln(ε^{-1})], and hence τ_mix ≤ const · ρ(f)ℓ(f) ln(π_min^{-1}), where π_min = min_x π(x).
Operations Research, 2008
We introduce and study a randomized quasi-Monte Carlo method for the simulation of Markov chains up to a random (and possibly unbounded) stopping time. The method simulates n copies of the chain in parallel, using a (d + 1)-dimensional, highly uniform point set of cardinality n, randomized independently at each step, where d is the number of uniform random numbers required at each transition of the Markov chain. The general idea is to obtain a better approximation of the state distribution, at each step of the chain, than with standard Monte Carlo. The technique can be used in particular to obtain a low-variance unbiased estimator of the expected total cost when state-dependent costs are paid at each step. It is generally more effective when the state space has a natural order related to the cost function.
2009
This book is an introduction to the modern approach to the theory of Markov chains. The main goal of this approach is to determine the rate of convergence of a Markov chain to the stationary distribution as a function of the size and geometry of the state space. The authors develop the key tools for estimating convergence times, including coupling, strong stationary times, and spectral methods. Whenever possible, probabilistic methods are emphasized.
Monte Carlo and Quasi-Monte Carlo Methods 2008, 2009
We study the convergence behavior of a randomized quasi-Monte Carlo (RQMC) method for the simulation of discrete-time Markov chains, known as array-RQMC. The goal is to estimate the expectation of a smooth function of the sample path of the chain. The method simulates n copies of the chain in parallel, using highly uniform point sets randomized independently at each step. The copies are sorted after each step, according to some multidimensional order, for the purpose of assigning the RQMC points to the chains. In this paper, we provide some insight on why the method works, explain what would need to be done to bound its convergence rate, discuss and compare different ways of realizing the sort and assignment, and report empirical experiments on the convergence rate of the variance and of the mean square discrepancy between the empirical and theoretical distribution of the states, as a function of n, for various types of discrepancies.
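A heavily simplified one-dimensional sketch of this sort-and-assign idea follows, using an assumed AR(1) chain and a randomly shifted stratified point set as a crude stand-in for the (d + 1)-dimensional highly uniform point sets of the actual array-RQMC method; none of these choices come from the paper.

    import numpy as np
    from statistics import NormalDist

    rng = np.random.default_rng(1)
    inv_norm = NormalDist().inv_cdf              # standard normal inverse CDF

    def sorted_rqmc_estimate(n=1024, T=50, rho=0.9):
        # n copies of the AR(1) chain X_{t+1} = rho * X_t + Z_{t+1}, all started at 0
        x = np.zeros(n)
        for _ in range(T):
            order = np.argsort(x)                        # sort the copies by state
            shift = max(rng.random(), 1e-12)             # common random shift, kept away from 0
            u = (np.arange(n) + shift) / n               # one stratified uniform per copy
            z = np.array([inv_norm(ui) for ui in u])     # map uniforms to normal innovations
            x[order] = rho * x[order] + z                # i-th point drives the i-th smallest copy
        return x.mean()                                  # estimate of E[X_T] (equal to 0 here)

    print(sorted_rqmc_estimate())

Repeating the estimate over independent randomizations and comparing its variance with that of ordinary Monte Carlo (independent innovations for every copy) reproduces, in miniature, the kind of empirical convergence-rate experiment reported in the paper.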
Bayesian Time Series Models
The Annals of Statistics, 1999
We study estimation in the class of stationary variable length Markov chains (VLMC) on a finite space. The processes in this class are still Markovian of higher order, but with memory of variable length, yielding a much bigger and structurally richer class of models than ordinary higher order Markov chains. From a more algorithmic view, the VLMC model class has attracted interest in information theory and machine learning, but statistical properties have not been explored very much. Provided that good estimation is available, the additional structural richness of the model class enhances predictive power by finding a better trade-off between model bias and variance and allows a better structural description, which can be of specific interest. The latter is exemplified with some DNA data. A version of the tree-structured context algorithm, proposed by Rissanen (1983) in an information-theoretical setup, is shown to have new good asymptotic properties for estimation in the class of VLMCs, even when the underlying model increases in dimensionality: consistent estimation of minimal state spaces and mixing properties of fitted models are given. We also propose a new bootstrap scheme based on fitted VLMCs. We show its validity for quite general stationary categorical time series and for a broad range of statistical procedures.
Computational Statistics & Data Analysis, 2008
In Bertail & Clémençon (2005a) a novel methodology for bootstrapping general Harris Markov chains has been proposed, which crucially exploits their renewal properties (when eventually extended via the Nummelin splitting technique) and has theoretical properties that surpass other existing methods within the Markovian framework (bmoving block bootstrap, sieve bootstrap etc...). This paper is devoted to discuss practical issues related to the implementation of this specific resampling method and to present various simulations studies for investigating the performance of the latter and comparing it to other bootstrap resampling schemes standing as natural candidates in the Markov setting. Résumé : Une nouvelle méthodologie pour "bootstrapper" des chaînes de Markov Harris récurrente à été proposée par Bertail et Clémençon (2005a). Cette méthode utilise de manière cruciale les propriétés de renouvellement des chaînes de Markov (éventuellement en étendant la chaîne via la technique de "splitting" introduite par Nummelin). Elle possède des propriétés asymptotiques (propriétés au second ordre) meilleures que celles obtenues pour les méthodes existantes dans un contexte markovien (bootstrap par block, sieve bootstrap etc...). L'objet de cet article est de discuter les questions pratiques d'implémentation de cette méthode de rééchantillonnage et de présenter diverses simulations pour en étudier les performances à distance finie. Nous comparons ces résultats avec ceux obtenus avec des méthodes concurrentes dans un cadre markovien.
Mathematics of Operations Research, 2001
We introduce a new class of density estimators, termed look-ahead density estimators, for performance measures associated with a Markov chain. Look-ahead density estimators are given for both transient and steady-state quantities. Look-ahead density estimators converge faster (especially in multi-dimensional problems) and empirically give visually superior results relative to more standard estimators, such as kernel density estimators. Several numerical examples that demonstrate the potential applicability of look-ahead density estimation are given.
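The core idea can be sketched for a hypothetical AR(1) chain with a known Gaussian transition density: the marginal density one step ahead is estimated by averaging the transition density over the simulated current states, rather than by kernel-smoothing the next-step samples. The chain, its parameters, and the evaluation grid below are assumptions made only for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    rho, sigma = 0.8, 1.0

    def transition_density(x, y):
        # known one-step transition density of the AR(1) chain X' = rho * x + sigma * Z
        return np.exp(-0.5 * ((y - rho * x) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    n, t = 5_000, 20
    x = rng.normal(size=n)               # n independent copies of X_0 (assumed N(0,1) start)
    for _ in range(t):
        x = rho * x + sigma * rng.normal(size=n)

    grid = np.linspace(-4.0, 4.0, 201)
    # look-ahead style estimate of the time-(t+1) marginal density on the grid:
    # average the known transition density over the simulated time-t states
    p_hat = transition_density(x[:, None], grid[None, :]).mean(axis=0)

Because the transition density is smooth in its second argument, the resulting estimate is smooth by construction and requires no bandwidth choice, which is the intuition behind the faster convergence described in the abstract.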
1999
This paper is concerned with improving the performance of Markov chain algorithms for Monte Carlo simulation. We propose a new algorithm for simulating from multivariate Gaussian densities. This algorithm combines ideas from Metropolis-coupled Markov chain Monte Carlo methods and from an existing algorithm based only on over-relaxation. The speed of convergence of the proposed and existing algorithms can be measured by the spectral radius of certain matrices. We present examples in which the proposed algorithm converges faster than the existing algorithm and the Gibbs sampler. We also derive an expression for the asymptotic variance of any linear combination of the variables simulated by the proposed algorithm. From this expression it follows that the proposed algorithm offers no asymptotic variance reduction compared with the existing algorithm. We extend the proposed algorithm to the non-Gaussian case and discuss its performance by means of examples from Bayesian image analysis. We find that better performance is obtained from a special case of the proposed algorithm, which is a modified version of the algorithm of , than from a Metropolis algorithm.
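For reference, a minimal sketch of the over-relaxation ingredient on its own (Adler-type over-relaxed updates of Gaussian full conditionals) is given below; it is not the combined algorithm proposed in the paper, and the bivariate Gaussian target, the correlation r, and the parameter alpha are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    r = 0.95            # correlation of an illustrative bivariate standard Gaussian target
    alpha = -0.9        # over-relaxation parameter in (-1, 0); alpha = 0 recovers the Gibbs sampler

    def overrelaxed_sweeps(x, n_iter=10_000):
        # each coordinate's full conditional is N(r * other, 1 - r^2); the over-relaxed
        # update leaves this conditional invariant while inducing negative correlation
        out = np.empty((n_iter, 2))
        for t in range(n_iter):
            for i in (0, 1):
                mu = r * x[1 - i]
                sd = np.sqrt(1.0 - r * r)
                x[i] = mu + alpha * (x[i] - mu) + sd * np.sqrt(1.0 - alpha ** 2) * rng.normal()
            out[t] = x
        return out

    samples = overrelaxed_sweeps(np.zeros(2))

With alpha close to -1 the sampler suppresses the random-walk behaviour of the ordinary Gibbs sampler on strongly correlated Gaussians, an effect of the kind the paper's spectral-radius comparisons measure.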
2020
Note that this estimator is a random variable with expected value π(f) (i.e. the estimator is unbiased) and standard deviation of order O(1/√N). Then, by the CLT, the error π̂(f) − π(f) has a limiting normal distribution as N → ∞. Therefore we can compute π(f) from samples (plus some regression techniques?). But the problem is that if π_u is complicated, then it is very difficult to simulate i.i.d. random variables from π(·). The MCMC solution is to construct a Markov chain on X which has π(·) as a stationary distribution, i.e. ∫_X π(dx) P(x, dy) = π(dy). Then for large n the distribution of X_n will be approximately stationary. We can set Z_1 = X_n and obtain Z_2, Z_3, ... in the same way repeatedly.

Remark. In practice, instead of starting a fresh Markov chain every time, we take the successive X_n's and use, for example, (N − B)^{-1} Σ_{i=B+1}^{N} f(X_i), where B is a burn-in. We tend to ignore the dependence problem, as many of the mathematical issues are similar in either implementation.

Remark. There are other ways of estimation, such as rejection sampling and importance sampling, but MCMC algorithms are the most widely applied.

2 MCMC and its construction

This section explains how an MCMC algorithm is constructed. We first introduce reversibility.

Definition. A Markov chain on state space X is reversible with respect to a probability distribution π(·) on X if π(dx)P(x, dy) = π(dy)P(y, dx) for all x, y ∈ X.

Proposition. If a Markov chain is reversible with respect to π(·), then π(·) is a stationary distribution for the chain.

Proof. By reversibility, ∫_{x∈X} π(dx)P(x, dy) = ∫_{x∈X} π(dy)P(y, dx) = π(dy) ∫_{x∈X} P(y, dx) = π(dy).

The simplest way to construct an MCMC algorithm satisfying reversibility is the Metropolis-Hastings algorithm.

2.1 The Metropolis-Hastings Algorithm. Suppose that π(·) has a (possibly unnormalized) density π_u. Let Q(x, ·) be essentially any other Markov chain whose transitions also have a (possibly unnormalized) density, i.e. Q(x, dy) ∝ q(x, y) dy. First choose some X_0. Then, given X_n, generate a proposal Y_{n+1} from Q(X_n, ·) and flip an independent biased coin with probability of heads equal to α(X_n, Y_{n+1}), where α(x, y) = min{1, [π_u(y) q(y, x)] / [π_u(x) q(x, y)]} when π_u(x) q(x, y) ≠ 0, and α(x, y) = 1 when π_u(x) q(x, y) = 0. If the coin lands heads, we accept the proposal and set X_{n+1} = Y_{n+1}; if it lands tails, we reject the proposal and set X_{n+1} = X_n. Then we replace n by n + 1 and repeat. The reason for this choice of α(x, y) is the following.

Proposition. The Metropolis-Hastings algorithm produces a Markov chain {X_n} which is reversible with respect to π(·).

Proof. We want to show that for any x, y ∈ X, π(dx)P(x, dy) = π(dy)P(y, dx).

... where Ȳ_i = (1/J) Σ_j Y_{ij}. The Gibbs sampler then proceeds by updating the K + 3 variables according to the above conditional distributions. This is feasible since the conditional distributions are all easily simulated (inverse-gamma and normal).
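A minimal random-walk Metropolis-Hastings sketch matching the construction above, assuming a symmetric Gaussian proposal (so q(x, y) = q(y, x) and the Hastings ratio reduces to π_u(y)/π_u(x)) and an invented unnormalized target π_u:

    import numpy as np

    rng = np.random.default_rng(4)

    def pi_u(x):
        # unnormalized target density (an invented example)
        return np.exp(-0.5 * x ** 2) * (1.0 + np.sin(3.0 * x) ** 2)

    def metropolis_hastings(n, x0=0.0, scale=1.0):
        x = x0
        chain = np.empty(n)
        for i in range(n):
            y = x + scale * rng.normal()             # proposal Y_{n+1} from Q(X_n, .)
            alpha = min(1.0, pi_u(y) / pi_u(x))      # acceptance probability alpha(x, y)
            if rng.random() < alpha:                 # heads: accept, X_{n+1} = Y_{n+1}
                x = y                                # (tails: reject, x stays at X_n)
            chain[i] = x
        return chain

    chain = metropolis_hastings(50_000)
    B = 5_000                                        # burn-in
    estimate = chain[B:].mean()                      # (N - B)^{-1} * sum_{i > B} f(X_i) with f(x) = x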
arXiv (Cornell University), 2016
We present several Monte Carlo strategies for simulating discrete-time Markov chains with continuous multi-dimensional state space; we focus on stratified techniques. We first analyze the variance of the calculation of the measure of a domain included in the unit hypercube, when stratified samples are used. We then show that each step of the simulation of a Markov chain can be reduced to the numerical integration of the indicator function of a subdomain of the unit hypercube. Our approach for Markov chains simulates N copies of the chain in parallel using stratified sampling and the copies are sorted after each step, according to their successive coordinates. We analyze variance reduction on examples of pricing of European and Asian options: enhanced efficiency of stratified strategies is shown.
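The first ingredient of that analysis, the variance of a stratified estimate of the measure of a domain in the unit hypercube, can be illustrated with a small sketch; the quarter-disc domain, the stratum counts, and the number of replications are arbitrary choices for this example.

    import numpy as np

    rng = np.random.default_rng(5)

    def indicator(u):
        # indicator of an illustrative subdomain of the unit square: a quarter disc
        return (u[..., 0] ** 2 + u[..., 1] ** 2 <= 1.0).astype(float)

    def plain_mc(n):
        # crude Monte Carlo estimate of the measure (here pi/4)
        return indicator(rng.random((n, 2))).mean()

    def stratified(m):
        # m * m strata with one uniform point per stratum (jittered grid)
        i, j = np.meshgrid(np.arange(m), np.arange(m), indexing="ij")
        u = np.stack([(i + rng.random((m, m))) / m,
                      (j + rng.random((m, m))) / m], axis=-1)
        return indicator(u).mean()

    n = 64 * 64
    var_mc = np.var([plain_mc(n) for _ in range(200)])
    var_st = np.var([stratified(64) for _ in range(200)])
    print(var_mc, var_st)    # the stratified variance is typically much smaller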
Methodology and Computing in Applied Probability, 2007
We introduce an estimate of the entropy E_{p^t}(log p^t) of the marginal density p^t of a (possibly inhomogeneous) Markov chain at time t ≥ 1. This estimate is based on a double Monte Carlo integration over simulated i.i.d. copies of the Markov chain, whose transition density kernel is supposed to be known. The technique is extended to compute the external entropy E_{p_1^t}(log p^t), where the p_1^t are the successive marginal densities of another Markov process at time t. We prove, under mild conditions, weak consistency and asymptotic normality of both estimators. Strong consistency is also obtained under stronger assumptions. These estimators can be used to study by simulation the convergence of p^t to its stationary distribution. Potential applications of this work are presented: (i) a diagnostic by simulation of the stability property of a Markovian dynamical system with respect to various initial conditions; (ii) a study of the rate in the Central Limit Theorem for i.i.d. random variables. Simulated examples are provided as illustration.
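A minimal sketch of the double Monte Carlo idea, assuming an AR(1) chain with a known Gaussian transition kernel and two independent sets of simulated copies, one for the outer expectation and one for the plug-in estimate of the marginal density; the chain and all parameters are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(6)
    rho, sigma = 0.7, 1.0

    def kernel(x, y):
        # known transition density of the chain X_t = rho * X_{t-1} + sigma * Z_t
        return np.exp(-0.5 * ((y - rho * x) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    n, T = 4_000, 15
    x = rng.normal(size=n)      # copies used for the outer expectation (assumed N(0,1) start)
    w = rng.normal(size=n)      # independent copies used for the plug-in marginal density
    for t in range(1, T + 1):
        w_prev = w.copy()
        x = rho * x + sigma * rng.normal(size=n)
        w = rho * w_prev + sigma * rng.normal(size=n)
        # inner integration: estimate p^t at each x_i from the independent copies at time t-1
        p_t_at_x = kernel(w_prev[None, :], x[:, None]).mean(axis=1)
        entropy_hat = np.log(p_t_at_x).mean()    # outer integration: estimate of E_{p^t}(log p^t)
        print(t, entropy_hat)                    # stabilizes as p^t approaches stationarity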
Computational Statistics, 2021
We consider versions of the Metropolis algorithm which avoid the inefficiency of rejections. We first illustrate that a natural Uniform Selection Algorithm might not converge to the correct distribution. We then analyse the use of Markov jump chains which avoid successive repetitions of the same state. After exploring the properties of jump chains, we show how they can exploit parallelism in computer hardware to produce more efficient samples. We apply our results to the Metropolis algorithm, to Parallel Tempering, to a Bayesian model, to a two-dimensional ferromagnetic 4×4 Ising model, and to a pseudomarginal MCMC algorithm.
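A minimal sketch of the jump-chain idea for a discrete nearest-neighbour Metropolis sampler follows; the discrete target and the reweighting by sampled geometric holding times are assumptions chosen for illustration, not the algorithms or models analysed in the paper.

    import numpy as np

    rng = np.random.default_rng(7)

    K = 30
    target = np.exp(-0.5 * ((np.arange(K) - 10.0) / 3.0) ** 2)   # invented unnormalized discrete target

    def jump_chain(n_jumps, x0=0):
        # Rejection-free ("jump chain") version of a nearest-neighbour Metropolis sampler:
        # each visited state is weighted by a geometric holding time, so the weighted
        # occupation frequencies reproduce those of the ordinary Metropolis chain.
        x = x0
        states, weights = [], []
        for _ in range(n_jumps):
            probs = {}
            for y in (x - 1, x + 1):                             # off-grid proposals are rejected
                if 0 <= y < K:
                    probs[y] = 0.5 * min(1.0, target[y] / target[x])
            escape = sum(probs.values())                         # probability of leaving x in one step
            states.append(x)
            weights.append(rng.geometric(escape))                # time the original chain spends at x
            ys = list(probs)
            x = ys[rng.choice(len(ys), p=np.array([probs[y] for y in ys]) / escape)]
        return np.array(states), np.array(weights)

    s, w = jump_chain(20_000)
    print(np.sum(w * s) / np.sum(w))    # weighted estimate of the target mean

Because the chain never repeats a state, every iteration produces a fresh proposal evaluation, and holding times and proposals can be generated in batches, which is the kind of parallel structure the paper exploits.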
2006
We develop explicit, general bounds for the probability that the empirical sample averages of a function of a Markov chain on a general alphabet will exceed the steady-state mean of that function by a given amount. Our bounds combine simple information-theoretic ideas together with techniques from optimization and some fairly elementary tools from analysis. In one direction, motivated by central problems in simulation, we develop bounds for the general class of "geometrically ergodic" Markov chains. These bounds take a form that is particularly suited to simulation problems, and they naturally lead to a new class of sampling criteria. These are illustrated by several examples. In another direction, we obtain a new bound for the important special class of Doeblin chains; this bound is optimal, in the sense that in the special case of independent and identically distributed random variables it essentially reduces to the classical Hoeffding bound.
arXiv (Cornell University), 2013
In this paper we study Markov chains associated with the Metropolis-Hastings algorithm. We consider conditions under which the sequence of successive densities of such a chain converges to the target density in total variation distance for any choice of the initial density. In particular, we prove that positivity of the proposal density is enough for the chain to converge. This work essentially presents a stand-alone proof that reversibility, together with positivity of the kernel, implies convergence.
Markov Chain Monte Carlo Simulations and Their Statistical Analysis, 2004
This article is a tutorial on Markov chain Monte Carlo simulations and their statistical analysis. The theoretical concepts are illustrated through many numerical assignments from the author's book on the subject. Computer code (in Fortran) is available for all subjects covered and can be downloaded from the web.
Mathematical Methods of Operations Research, 1997
Consider a finite state irreducible Markov reward chain. It is shown that there exist simulation estimates and confidence intervals for the expected first passage times and rewards as well as the expected average reward, with 100% coverage probability. The length of the confidence intervals converges to zero with probability one as the sample size increases; it also satisfies a large deviations property.