This work examines the principles governing stationary distributions of Markov chains, particularly as used in Markov chain Monte Carlo (MCMC) simulations. It identifies the properties of transition probability matrices that guarantee convergence of MCMC to an invariant distribution, emphasizing the role of eigenvalues and eigenvectors in achieving this convergence. It also highlights relevant mathematical methods, such as eigendecomposition, to connect the underlying structure of a Markov chain to its stationary properties.
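As a concrete illustration of the eigendecomposition view (a minimal sketch, not taken from the paper; the matrix P is illustrative): the stationary distribution is a left eigenvector of the transition matrix for eigenvalue 1, and the second-largest eigenvalue modulus governs the convergence rate.

```python
import numpy as np

# Stationary distribution via eigendecomposition: pi^T P = pi^T means
# pi is a left eigenvector of P for eigenvalue 1 (right eigenvector of P^T).
P = np.array([[0.5, 0.5, 0.0],
              [0.25, 0.5, 0.25],
              [0.0, 0.5, 0.5]])
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
pi /= pi.sum()
print(pi)                      # [0.25, 0.5, 0.25]
# The second-largest |eigenvalue| controls the geometric convergence rate.
print(sorted(np.abs(w))[-2])   # 0.5 for this chain
```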
2020
Note that the estimator π̂(f) is a random variable with expected value π(f) (i.e. the estimator is unbiased) and standard deviation of order O(1/√N). By the CLT, the error π̂(f) − π(f) has a limiting normal distribution as N → ∞. Therefore we can compute π(f) by simulating samples. The problem is that if π_u is complicated, it is very difficult to simulate i.i.d. random variables from π(·). The MCMC solution is to construct a Markov chain on X which has π(·) as a stationary distribution, i.e.

∫_X π(dx) P(x, dy) = π(dy).

Then for large n the distribution of X_n will be approximately stationary. We can set Z_1 = X_n and obtain Z_2, Z_3, … by repeating the procedure.

Remark. In practice, instead of starting a fresh Markov chain every time, we take the successive X_n's; for example, we use the estimator (N − B)^{-1} Σ_{i=B+1}^{N} f(X_i), where B is a burn-in period. We tend to ignore the dependence problem, as many of the mathematical issues are similar in either implementation.

Remark. There are other methods of estimation, such as "rejection sampling" and "importance sampling", but MCMC algorithms are the most widely applied.

2 MCMC and its construction

This section explains how an MCMC algorithm is constructed. We first introduce reversibility.

Definition. A Markov chain on state space X is reversible with respect to a probability distribution π(·) on X if

π(dx) P(x, dy) = π(dy) P(y, dx),  x, y ∈ X.

Proposition. If a Markov chain is reversible with respect to π(·), then π(·) is a stationary distribution for the chain.

Proof. By reversibility,

∫_{x∈X} π(dx) P(x, dy) = ∫_{x∈X} π(dy) P(y, dx) = π(dy) ∫_{x∈X} P(y, dx) = π(dy).

The simplest way to construct an MCMC algorithm satisfying reversibility is the Metropolis–Hastings algorithm.

2.1 The Metropolis–Hastings algorithm

Suppose that π(·) has a (possibly unnormalized) density π_u. Let Q(x, ·) be essentially any other Markov chain whose transitions also have a (possibly unnormalized) density, i.e. Q(x, dy) ∝ q(x, y) dy. First choose some X_0. Then, given X_n, generate a proposal Y_{n+1} from Q(X_n, ·) and flip an independent biased coin with probability of heads equal to α(X_n, Y_{n+1}), where

α(x, y) = min{1, [π_u(y) q(y, x)] / [π_u(x) q(x, y)]}  when π_u(x) q(x, y) ≠ 0,

and α(x, y) = 1 when π_u(x) q(x, y) = 0. If the coin is heads, we accept the proposal and set X_{n+1} = Y_{n+1}; if it is tails, we reject the proposal and set X_{n+1} = X_n. Then we replace n by n + 1 and repeat. The reason for this choice of α(x, y) is explained by the following.

Proposition. The Metropolis–Hastings algorithm produces a Markov chain {X_n} which is reversible with respect to π(·).

Proof. We want to show that for any x, y ∈ X, π(dx) P(x, dy) = π(dy) P(y, dx). It suffices to consider x ≠ y, where (up to the normalizing constant of π_u) π(dx) P(x, dy) = π_u(x) q(x, y) α(x, y) dx dy = min{π_u(x) q(x, y), π_u(y) q(y, x)} dx dy, which is symmetric in x and y.

[…] where Ȳ_i = J^{-1} Σ_j Y_{ij}. The Gibbs sampler then proceeds by updating the K + 3 variables according to the above conditional distributions. This is feasible since the conditional distributions are all easily simulated (IG and N).
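Returning to the Metropolis–Hastings algorithm described above, here is a minimal sketch of the random-walk Metropolis special case: the proposal is symmetric, so q(x, y) = q(y, x) cancels and α(x, y) = min{1, π_u(y)/π_u(x)}. The target pi_u and all names are illustrative, not from the source.

```python
import numpy as np

def metropolis_hastings(pi_u, x0, n_steps, step=1.0, rng=None):
    """Random-walk Metropolis: a special case of Metropolis-Hastings
    with a symmetric Gaussian proposal, so q(x, y) = q(y, x) cancels
    in the acceptance ratio alpha(x, y) = min(1, pi_u(y)/pi_u(x))."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0
    chain = np.empty(n_steps)
    for n in range(n_steps):
        y = x + step * rng.normal()          # proposal Y_{n+1} ~ Q(X_n, .)
        alpha = min(1.0, pi_u(y) / pi_u(x))  # acceptance probability
        if rng.random() < alpha:             # "coin is heads": accept
            x = y
        chain[n] = x                         # on tails, X_{n+1} = X_n
    return chain

# Example: estimate pi(f) for f(x) = x^2 under an unnormalized N(0,1)
# target, discarding a burn-in B as in the remark above.
pi_u = lambda x: np.exp(-0.5 * x * x)
chain = metropolis_hastings(pi_u, x0=0.0, n_steps=50_000)
B = 5_000
print(chain[B:].mean(), (chain[B:] ** 2).mean())  # approx 0 and 1
```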
Electronic Journal of Linear Algebra, 2004
For an irreducible stochastic matrix T of order n, a certain condition number κ_j(T) that measures the sensitivity of the j-th entry of the corresponding stationary distribution under perturbation of T is considered. A lower bound on κ_j is produced in terms of the directed graph of T, and the case of equality in that bound is characterized. All directed graphs D are also characterized such that κ_j(T) is bounded from above as T ranges over the set of irreducible stochastic matrices having directed graph D. For those D for which κ_j is bounded, a tight upper bound on κ_j is given in terms of information contained in D.
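The exact definition of κ_j(T) is given in the paper; as a hedged numerical companion (a sketch, not the paper's construction), one can probe this sensitivity directly by perturbing T and recomputing the stationary vector. The matrices P and E below are illustrative.

```python
import numpy as np

def stationary(P):
    """Stationary distribution of an irreducible stochastic matrix:
    left eigenvector of P for eigenvalue 1, normalized to sum to 1."""
    w, V = np.linalg.eig(P.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return v / v.sum()

# Probe the sensitivity of the j-th stationary entry to a perturbation E
# (rows of E sum to zero, so P + E remains stochastic for small entries).
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
E = np.zeros_like(P); E[0, 0], E[0, 1] = -1e-6, 1e-6
j = 2
print((stationary(P + E)[j] - stationary(P)[j]) / 1e-6)  # finite-difference slope
```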
Linear Algebra and its Applications, 2016
Computational procedures for the stationary probability distribution, the group inverse of the Markovian kernel, and the mean first passage times of a finite irreducible Markov chain are developed using perturbations. The derivation of these expressions involves the solution of systems of linear equations and, inevitably, matrix inverses. Using a perturbation technique, starting from a simple base where no such derivations are formally required, we update a sequence of matrices, linking the solution procedures via generalized matrix inverses and matrix and vector multiplications. Four different algorithms are given, some modifications are discussed, and numerical comparisons are made using a test example. The derivations are based upon the ideas outlined in Hunter.
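For reference, the group inverse mentioned here can be computed directly via Meyer's identity; the following is a baseline sketch (the paper's perturbation-based updates avoid exactly this kind of full inversion), with an illustrative helper name.

```python
import numpy as np

def group_inverse_and_pi(P):
    """Stationary vector pi and the group inverse A# of A = I - P,
    via Meyer's identity A# = (I - P + e pi^T)^{-1} - e pi^T."""
    n = P.shape[0]
    w, V = np.linalg.eig(P.T)
    pi = np.real(V[:, np.argmin(np.abs(w - 1.0))]); pi /= pi.sum()
    e_piT = np.outer(np.ones(n), pi)
    A_sharp = np.linalg.inv(np.eye(n) - P + e_piT) - e_piT
    return pi, A_sharp
```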
2009
This book is an introduction to the modern approach to the theory of Markov chains. The main goal of this approach is to determine the rate of convergence of a Markov chain to the stationary distribution as a function of the size and geometry of the state space. The authors develop the key tools for estimating convergence times, including coupling, strong stationary times, and spectral methods. Whenever possible, probabilistic methods are emphasized.
ESAIM: Probability and Statistics, 1999
We study the convergence to equilibrium of n-samples of independent Markov chains in discrete and continuous time. They are defined as Markov chains on the n-fold Cartesian product of the initial state space by itself, and they converge to the direct product of n copies of the initial stationary distribution. Sharp estimates for the convergence speed are given in terms of the spectrum of the initial chain. A cutoff phenomenon occurs in the sense that as n tends to infinity, the total variation distance between the distribution of the chain and the asymptotic distribution tends to 1 or 0 at all times. As an application, an algorithm is proposed for producing an n-sample of the asymptotic distribution of the initial chain, with an explicit stopping test.
2006
We describe the life, times and legacy of Andrei Andreevich Markov (1856–1922), and his writings on what became known as Markov chains. One focus is on his first paper [27] of 1906 on this topic, which already contains important contractivity principles embodied in the Markov–Dobrushin coefficient of ergodicity, which in fact makes an explicit appearance in that paper. The contractivity principles are shown directly to underpin a number of results of the later theory. The coefficient is especially useful as a condition number in measuring the effect of perturbation of a stochastic matrix on the stationary distribution (sensitivity analysis). Some recent work in this direction is reviewed from the standpoint of the paper [53], presented at the first of the present series of conferences [63].
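The Markov–Dobrushin coefficient referred to here has a simple closed form for a stochastic matrix P: half the maximal L1 distance between rows. A short sketch (the function name is illustrative):

```python
import numpy as np

def dobrushin(P):
    """Markov-Dobrushin ergodicity coefficient:
    tau(P) = (1/2) * max over row pairs (i, k) of ||P[i] - P[k]||_1.
    tau(P) < 1 gives contraction in total variation, which is why it
    serves as a condition number in perturbation bounds."""
    n = P.shape[0]
    return 0.5 * max(np.abs(P[i] - P[k]).sum()
                     for i in range(n) for k in range(n))
```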
arXiv (Cornell University), 2020
We discuss a non-reversible, lifted Markov-chain Monte Carlo (MCMC) algorithm for particle systems in which the direction of proposed displacements is changed deterministically. This algorithm sweeps through directions analogously to the popular MCMC sweep methods for particle or spin indices. Direction-sweep MCMC can be applied to a wide range of original reversible or non-reversible Markov chains, such as the Metropolis algorithm or the event-chain Monte Carlo algorithm. For a single two-dimensional dipole, we consider direction-sweep MCMC in the limit where restricted equilibrium is reached among the accessible configurations before changing the direction. We show rigorously that direction-sweep MCMC leaves the stationary probability distribution unchanged, and that it profoundly modifies the Markov-chain trajectory. Long excursions, with persistent rotation in one direction, alternate with long sequences of rapid zigzags resulting in persistent rotation in the opposite direction in the limit of small direction increments. The mapping to a Langevin equation then yields the exact scaling of excursions while the zigzags are described through a non-linear differential equation that is solved exactly. We show that the direction-sweep algorithm can have shorter mixing times than the algorithms with random updates of directions. We point out possible applications of direction-sweep MCMC in polymer physics and in molecular simulation.
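A hedged sketch of the sweep idea (not the paper's lifted event-chain algorithm, and all names are illustrative): each fixed-direction kernel below is a symmetric random-walk Metropolis move, hence preserves the target, and composing such kernels over a deterministic sweep of directions also preserves it.

```python
import numpy as np

def direction_sweep_metropolis(pi_u, x0, n_sweeps, steps_per_dir,
                               dtheta, step=0.5, rng=None):
    """Metropolis moves restricted to a current 2D direction d, with d
    rotated by a fixed increment dtheta after every block of moves
    (deterministic sweep) rather than resampled at random."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    theta, samples = 0.0, []
    for _ in range(n_sweeps):
        d = np.array([np.cos(theta), np.sin(theta)])
        for _ in range(steps_per_dir):
            y = x + step * rng.normal() * d        # symmetric move along d
            if rng.random() < min(1.0, pi_u(y) / pi_u(x)):
                x = y
            samples.append(x.copy())
        theta += dtheta                            # deterministic direction sweep
    return np.array(samples)
```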
2011
Questions are posed regarding the influence that the column sums of the transition probabilities of a stochastic matrix (with row sums all one) have on the stationary distribution, the mean first passage times and the Kemeny constant of the associated irreducible discrete time Markov chain. Some new relationships, including some inequalities, and partial answers to the questions, are given using a special generalized matrix inverse that has not previously been considered in the literature on Markov chains.
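For context, the Kemeny constant mentioned here is K = Σ_j π_j m_ij with m_jj = 0, which is independent of the start state i; a compact way to compute it uses the eigenvalue identity below (a sketch with an illustrative function name).

```python
import numpy as np

def kemeny_constant(P):
    """Kemeny constant K = sum_j pi_j * m_ij (independent of the start
    state i, with the convention m_jj = 0), computed via the identity
    K = sum over eigenvalues lambda != 1 of 1/(1 - lambda)."""
    w = np.linalg.eigvals(P)
    w = np.delete(w, np.argmin(np.abs(w - 1.0)))  # drop the eigenvalue 1
    return float(np.real(np.sum(1.0 / (1.0 - w))))
```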
Linear Algebra and Its Applications, 2006
A measure of the "mixing time" or "time to stationarity" in a finite irreducible discrete time Markov chain is considered. The statistic η_i = Σ_{j=1}^{m} π_j m_ij, where {π_j} is the stationary distribution and m_ij is the mean first passage time from state i to state j of the Markov chain, is shown to be independent of the state i that the chain starts in (so that η_i = η for all i); it is minimal in the case of a periodic chain, yet can be arbitrarily large in a variety of situations. An application concerning the effect perturbations of transition probabilities have on the stationary distributions of Markov chains leads to a new bound, involving η, for the 1-norm of the difference between the stationary probability vectors of the original and the perturbed chain. When η is large, the stationary distribution of the Markov chain is very sensitive to perturbations of the transition probabilities.
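The independence of η_i from i can be checked numerically via the Kemeny–Snell fundamental matrix. A sketch, assuming the convention m_jj = 1/π_j (the mean return time; with m_jj = 0 the sum differs by exactly 1):

```python
import numpy as np

def eta(P):
    """eta_i = sum_j pi_j * m_ij via the Kemeny-Snell fundamental matrix
    Z = (I - P + e pi^T)^{-1}, with m_ij = (z_jj - z_ij)/pi_j for i != j
    and m_jj = 1/pi_j. Every entry of the returned vector is equal."""
    n = P.shape[0]
    w, V = np.linalg.eig(P.T)
    pi = np.real(V[:, np.argmin(np.abs(w - 1.0))]); pi /= pi.sum()
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    M = (np.diag(Z)[None, :] - Z) / pi[None, :]    # m_ij for i != j
    M += np.diag(1.0 / pi)                         # m_jj = 1/pi_j
    return M @ pi                                  # all entries equal eta
```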
2007
The effective application of Markov chains has received much attention, raising many theoretical and applied problems. In this paper, we approach one of these problems: finding the long-run behavior of Markov chains with extremely large state spaces by investigating the structure of the Markov graph in order to reduce the complexity of computation. We focus on accessing finite-state Markov chain theory via graph theory. We present some basic results on state classification and a small project modelling the structure and the moving process of a finite-state Markov chain. The project is based on the observation that finite-state Markov chain theory cannot be studied in depth without a clear sense of the structure and movement of the chain.
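The graph-theoretic state classification alluded to here reduces to finding strongly connected components of the transition graph. A hedged sketch (not the paper's project; function names are illustrative):

```python
import numpy as np
from scipy.sparse.csgraph import connected_components

def communicating_classes(P, tol=0.0):
    """Classify states through the directed graph of P: there is an edge
    i -> j whenever P[i, j] > tol. Strongly connected components are the
    communicating classes; a class is closed (hence recurrent in a finite
    chain) iff no probability mass leaves it."""
    n_comp, labels = connected_components(P > tol, directed=True,
                                          connection='strong')
    closed = []
    for c in range(n_comp):
        inside = labels == c
        closed.append(np.allclose(P[inside][:, inside].sum(axis=1), 1.0))
    return labels, closed
```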
Linear Algebra and its Applications, 1986
Techniques for updating the stationary distribution of a finite irreducible Markov chain following a rank one perturbation of its transition matrix are discussed. A variety of situations where such perturbations may arise are presented together with suitable procedures for the derivation of the related stationary distributions.
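One exact update of this kind follows from the group inverse of I − P; the sketch below is a generic derivation under stated assumptions, not necessarily the paper's procedure, and the function name is illustrative.

```python
import numpy as np

def stationary_rank_one_update(P, pi, a, b):
    """Update the stationary vector after a rank-one perturbation
    P' = P + a b^T, assuming b^T e = 0 (rows still sum to 1) and that P'
    remains irreducible. With A# the group inverse of A = I - P:
    pi'^T = pi^T + [pi^T a / (1 - b^T A# a)] b^T A#."""
    n = P.shape[0]
    e_piT = np.outer(np.ones(n), pi)
    A_sharp = np.linalg.inv(np.eye(n) - P + e_piT) - e_piT
    s = pi @ a / (1.0 - b @ A_sharp @ a)
    return pi + s * (b @ A_sharp)
```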
Monte Carlo and Quasi-Monte Carlo Methods 2008, 2009
We study the convergence behavior of a randomized quasi-Monte Carlo (RQMC) method for the simulation of discrete-time Markov chains, known as array-RQMC. The goal is to estimate the expectation of a smooth function of the sample path of the chain. The method simulates n copies of the chain in parallel, using highly uniform point sets randomized independently at each step. The copies are sorted after each step, according to some multidimensional order, for the purpose of assigning the RQMC points to the chains. In this paper, we provide some insight on why the method works, explain what would need to be done to bound its convergence rate, discuss and compare different ways of realizing the sort and assignment, and report empirical experiments on the convergence rate of the variance and of the mean square discrepancy between the empirical and theoretical distribution of the states, as a function of n, for various types of discrepancies.
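A toy sketch of the array-RQMC loop for a one-dimensional chain x' = φ(x, u), using stratified uniforms with a random shift as a stand-in for the highly uniform randomized point sets of the actual method; all names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def array_rqmc_toy(phi, x0, n_chains, n_steps, rng=None):
    """n copies of the chain advance in parallel; at each step the states
    are sorted (the 1D case of the multidimensional sort) and matched to
    a freshly randomized stratified point set."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.full(n_chains, x0, dtype=float)
    for _ in range(n_steps):
        x.sort()                                          # sort, then assign points
        u = (np.arange(n_chains) + rng.random(n_chains)) / n_chains
        x = phi(x, u)                                     # i-th point drives i-th chain
    return x

# Example: an AR(1)-type chain x' = 0.9 x + Phi^{-1}(u).
x = array_rqmc_toy(lambda x, u: 0.9 * x + norm.ppf(u), 0.0, 1024, 100)
print(x.mean(), x.var())   # approx stationary N(0, 1/(1 - 0.81))
```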
Electronic Communications in Probability, 2000
The key condition of the result is the spatial symmetry and polynomial decay of the Green's function of the chain.
Proceedings of the 9th …, 2007
We consider a stationary distribution of a finite, irreducible, homogeneous Markov chain. Our aim is to perturb the transition probability matrix using approximations to find regions of feasibility and optimality for a given basis when the chain is optimized using linear programming. We also explore the application of perturbation bounds and analyze their effects on the construction of optimal policies.
2007
The derivation of mean first passage times in Markov chains involves the solution of a family of linear equations. By exploring the solution of a related set of equations, using suitable generalized inverses of the Markovian kernel I − P, where P is the transition matrix of a finite irreducible Markov chain, we derive elegant new results for finding the mean first passage times. As a by-product we obtain the stationary distribution of the Markov chain without the necessity of any further computational procedures. Standard techniques in the literature, using for example Kemeny and Snell's fundamental matrix Z, require the initial derivation of the stationary distribution followed by the computation of Z = (I − P + eπ^T)^{-1}, where e^T = (1, 1, …, 1) and π^T is the stationary probability vector. The procedures of this paper involve only the inversion of a matrix of simple structure, based upon known characteristics of the Markov chain together with simple elementary vectors; no prior computations are required. Various possible families of matrices are explored, leading to different related procedures.
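For comparison, here is the textbook route that this paper streamlines: solving the defining first-passage equations directly, one target state at a time, with the stationary distribution falling out of the mean return times. This is a baseline sketch, not the paper's generalized-inverse procedure.

```python
import numpy as np

def mean_first_passage(P):
    """Mean first passage times from the defining linear equations
    m_ij = 1 + sum_{k != j} p_ik m_kj, solved for one target state j at
    a time; the stationary distribution is the by-product
    pi_j = 1/m_jj (reciprocal mean return time)."""
    n = P.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        idx = [k for k in range(n) if k != j]
        A = np.eye(n - 1) - P[np.ix_(idx, idx)]
        M[idx, j] = np.linalg.solve(A, np.ones(n - 1))
        M[j, j] = 1.0 + P[j, idx] @ M[idx, j]   # mean return time to j
    pi = 1.0 / np.diag(M)
    return M, pi
```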
1996
In this note we explicitly determine the steady state probabilities for an arbitrary circular Markov chain by successively reducing the number of states of the Markov chain.
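Successive state reduction of this kind is the idea behind GTH-style elimination, sketched below for a general irreducible chain (the paper's circular-chain reduction is a special case in spirit; this is not its explicit formula).

```python
import numpy as np

def gth(P):
    """Steady state by successive state elimination: censor the last
    state, fold its transitions back into the remaining states, repeat,
    then recover the stationary vector by back-substitution."""
    P = P.astype(float).copy()
    n = P.shape[0]
    for k in range(n - 1, 0, -1):
        s = P[k, :k].sum()                              # mass leaving state k downward
        P[:k, :k] += np.outer(P[:k, k], P[k, :k] / s)   # fold state k away
    pi = np.ones(n)
    for k in range(1, n):
        pi[k] = pi[:k] @ P[:k, k] / P[k, :k].sum()
    return pi / pi.sum()
```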
arXiv (Cornell University), 2016
This paper proposes a new type of recurrence in which the Markov chain is divided into intervals that start when the chain enters a subset A, continue while it samples another subset B far away from A, and end when the chain returns to A. The lengths of these intervals have the same distribution and, if A and B are far apart, are almost independent of each other. A and B may be any subsets of the state space that are far apart from each other and such that the movement between the subsets is repeated several times in a long Markov chain. The expected length of the intervals is used in a function that describes the mixing properties of the chain and improves our understanding of Markov chains. The paper proves a theorem that bounds the variance of the estimate of π(A), the probability of A under the limiting density of the Markov chain. This may be used to find the length of the Markov chain that is needed to explore the state space sufficiently. It is shown that the lengths of the periods between successive entries of the chain into A have a heavy-tailed distribution, which increases the upper bound for the variance of the estimate of π(A). The paper gives a general guideline on how to find the optimal scaling of parameters in the Metropolis-Hastings simulation algorithm, which implicitly determines the acceptance rate. We find examples where it is optimal to have a much smaller acceptance rate than what is generally recommended in the literature, and also examples where the optimal acceptance rate vanishes in the limit.
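A small diagnostic in the spirit of this construction (a sketch only; the paper's intervals and variance bound are more refined, and the names are illustrative): split a trajectory at entries into a set A and inspect the interval lengths alongside the empirical estimate of π(A).

```python
import numpy as np

def entry_intervals(chain, in_A):
    """Interval lengths between successive entries of the chain into a
    set A, plus the empirical estimate of pi(A) (fraction of time in A).
    Heavy-tailed interval lengths signal slow exploration."""
    flags = np.array([in_A(x) for x in chain])
    entries = np.flatnonzero(flags[1:] & ~flags[:-1]) + 1  # times of entry into A
    return np.diff(entries), flags.mean()
```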
Comptes Rendus Mathematique
The restoration of a hidden process X from an observed process Y is often performed in the framework of hidden Markov chains (HMC). HMC have been recently generalized to triplet Markov chains (TMC). In the TMC model one introduces a third random chain U and assumes that the triplet T = (X, U, Y) is a Markov chain (MC). TMC generalize HMC but still enable the development of efficient Bayesian algorithms for restoring X from Y. This paper lists some recent results concerning TMC; in particular, we recall how TMC can be used to model hidden semi-Markov Chains or deal with non-stationary HMC.