Proceedings of the National Academy of Sciences
20 pages
In 1953, Lloyd Shapley defined the model of stochastic games, the first general dynamic model of a game, and proved that competitive stochastic games have a discounted value. In 1982, Jean-François Mertens and Abraham Neyman proved that competitive stochastic games admit a robust solution concept, the value, which is equal to the limit of the discounted values as the discount rate goes to 0. Both contributions were published in PNAS. In the present paper, we provide a tractable formula for the value of competitive stochastic games.
2018
In a zero-sum stochastic game, at each stage, two adversarial players make decisions and receive a stage payoff determined by them and by a random variable representing the state of nature. The total payoff is the discounted sum of the stage payoffs. Assume that the players are very patient and use optimal strategies. We then prove that, at any point in the game, players get essentially the same expected payoff: the payoff is constant. This solves a conjecture by Sorin, Venel and Vigeral (2010). The proof relies on the semi-algebraic approach for discounted stochastic games introduced by Bewley and Kohlberg (1976), on the theory of Markov chains with rare transitions, initiated by Freidlin and Wentzell (1984), and on some variational inequalities for value functions inspired by the recent work of Davini, Fathi, Iturriaga and Zavidovique (2016).
Dynamic Games and Applications, 2013
To celebrate the 60th anniversary of the seminal paper "Stochastic Games" of L.S. Shapley [16], Dynamic Games and Applications is proud to publish this special issue. Shapley's paper on stochastic games has had a tremendous scientific impact on the theory and applications of dynamic games, and research in these areas remains very active. In addition, as can be seen from the content of this volume, the theoretical model as well as the potential for applications are developing in new directions, including continuous-time frameworks, links with evolutionary games, algorithmic game theory, economics, and social networks. The idea of devoting a special issue to the 60th anniversary of Shapley's paper [16] emerged a few years ago, and the decision was taken in 2011. Since then, we have had the great pleasure of seeing Lloyd Shapley awarded the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2012 "for the theory of stable allocations and the practice of market design," jointly with Alvin Roth. This is the occasion to recall the importance of Shapley's contributions in other areas of game theory, such as:
• core for TU and NTU cooperative games;
• equivalence principle for large economies;
• matching (with David Gale);
2012
We provide a direct, elementary proof for the existence of $\lim_{\lambda \to 0} v_\lambda$, where $v_\lambda$ is the value of the $\lambda$-discounted finite two-person zero-sum stochastic game.

1 Introduction

Two-person zero-sum stochastic games were introduced by Shapley [4]. They are described by a 5-tuple $(\Omega, I, J, q, g)$, where $\Omega$ is a finite set of states, $I$ and $J$ are finite sets of actions, $g : \Omega \times I \times J \to [0, 1]$ is the payoff, $q : \Omega \times I \times J \to \Delta(\Omega)$ the transition and, for any finite set $X$, $\Delta(X)$ denotes the set of probability distributions over $X$. The functions $g$ and $q$ are bilinearly extended to $\Omega \times \Delta(I) \times \Delta(J)$. The stochastic game with initial state $\omega \in \Omega$ and discount factor $\lambda \in (0, 1]$ is denoted by $\Gamma_\lambda(\omega)$ and is played as follows: at stage $m \ge 1$, knowing the current state $\omega_m$, the players choose actions $(i_m, j_m) \in I \times J$; their choice produces a stage payoff $g(\omega_m, i_m, j_m)$ and influences the transition: a new state $\omega_{m+1}$ is chosen according to the probability distribution $q(\cdot \mid \omega_m, i_m, j_m)$. At the end of the game, player 1 receives $\sum_{m \ge 1} \lambda (1 - \lambda)^{m-1} g(\omega_m, i_m, j_m)$ from player 2. The game $\Gamma_\lambda(\omega)$ has a value $v_\lambda(\omega)$, and $v_\lambda = (v_\lambda(\omega))_{\omega \in \Omega}$ is the unique fixed point of the so-called Shapley operator [4], i.e. $v_\lambda = \Phi(\lambda, v_\lambda)$, where for all $f \in \mathbb{R}^\Omega$:
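The excerpt above breaks off before the operator is written out. As a rough illustration of the fixed-point characterization, the Python sketch below assumes the standard form of the Shapley operator for this normalization, namely that Φ(λ, f)(ω) is the value of the matrix game with entries λ g(ω, i, j) + (1 − λ) Σ_{ω'} q(ω' | ω, i, j) f(ω'). It also assumes each player has exactly two actions, so that the matrix game has a closed-form value. The function names and the test game (the classical Big Match, whose discounted value is known to be 1/2) are illustrative choices and do not come from the text.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a, b), (c, d) = m
    maxmin = max(min(a, b), min(c, d))
    minmax = min(max(a, c), max(b, d))
    if maxmin >= minmax:
        # A pure saddle point exists.
        return maxmin
    # No pure saddle point: classical mixed-strategy formula.
    return (a * d - b * c) / (a + d - b - c)


def shapley_operator(lam, f, g, q):
    """Assumed standard form: val[lam * g + (1 - lam) * E_q(f)], state by state."""
    n = len(f)
    out = []
    for w in range(n):
        m = [[lam * g[w][i][j]
              + (1 - lam) * sum(q[w][i][j][t] * f[t] for t in range(n))
              for j in range(2)]
             for i in range(2)]
        out.append(value_2x2(m))
    return out


def discounted_value(lam, g, q, tol=1e-12, max_iter=100000):
    """Iterate the operator from f = 0; it is a (1 - lam)-contraction."""
    f = [0.0] * len(g)
    for _ in range(max_iter):
        new = shapley_operator(lam, f, g, q)
        if max(abs(x - y) for x, y in zip(new, f)) < tol:
            return new
        f = new
    return f


# Illustrative data (not from the text): the Big Match.
# State 0 is the non-absorbing state; states 1 and 2 are absorbing
# with payoffs 0 and 1 respectively.
STAY, ABS0, ABS1 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
BIG_MATCH_G = [
    [[1, 0], [0, 1]],   # state 0: row 0 continues, row 1 absorbs
    [[0, 0], [0, 0]],   # state 1: payoff 0 forever
    [[1, 1], [1, 1]],   # state 2: payoff 1 forever
]
BIG_MATCH_Q = [
    [[STAY, STAY], [ABS0, ABS1]],
    [[ABS0, ABS0], [ABS0, ABS0]],
    [[ABS1, ABS1], [ABS1, ABS1]],
]

v = discounted_value(0.1, BIG_MATCH_G, BIG_MATCH_Q)
print(v)  # the first entry should be close to 0.5
```

Restricting to two actions per player keeps the sketch free of external dependencies; for larger action sets the call to `value_2x2` would have to be replaced by a linear-programming solution of the matrix game.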
Stochastic and Differential Games, 1999
This paper treats stochastic games. Nonzero-sum average payoff stochastic games with arbitrary state spaces and stopping games are considered. Such game models fit well into some studies in economic theory and operations research. A correlation of the players' strategies, involving "public signals", is allowed in the nonzero-sum average payoff stochastic games. The main result is an extension of the correlated equilibrium theorem, proved recently by Nowak and Raghavan for dynamic games with discounting, to average payoff stochastic games. Stopping games are a special model of stochastic games. A version of Dynkin's game related to the observation of a Markov process with a random priority assignment mechanism of states is presented in the paper. Both zero-sum and nonzero-sum games are considered. The paper also provides a brief overview of the theory of nonzero-sum stochastic games and stopping games, which is very far from being complete.
Annales de l'Institut Henri Poincaré, Probabilités et Statistiques
In a zero-sum stochastic game, at each stage, two adversarial players make decisions and receive a stage payoff determined by them and by a random variable representing the state of nature. The total payoff is the discounted sum of the stage payoffs. Assume that the players are very patient and use optimal strategies. We then prove that, at any point in the game, players get essentially the same expected payoff: the payoff is constant. This solves a conjecture by Sorin, Venel and Vigeral (2010). The proof relies on the semi-algebraic approach for discounted stochastic games introduced by Bewley and Kohlberg (1976), on the theory of Markov chains with rare transitions, initiated by Freidlin and Wentzell (1984), and on some variational inequalities for value functions inspired by the recent work of Davini, Fathi, Iturriaga and Zavidovique (2016).
International Journal of Game Theory, 2015
We study the links between the values of stochastic games with varying stage duration $h$, the corresponding Shapley operators $T$ and $T_h = hT + (1 - h)\mathrm{Id}$, and the solution of the evolution equation $\dot{f}_t = (T - \mathrm{Id}) f_t$. Considering general nonexpansive maps, we establish two kinds of results, under both the discounted and the finite length frameworks, that apply to the class of "exact" stochastic games. First, for a fixed length or discount factor, the value converges as the stage duration goes to 0. Second, the asymptotic behavior of the value as the length goes to infinity, or as the discount factor goes to 0, does not depend on the stage duration. In addition, these properties imply the existence of the value of the finite length or discounted continuous time game (associated with a continuous time jointly controlled Markov process), as the limit of the value of any discretization with vanishing mesh.
Theory and Decision Library, 1991
Stochastic games were first formulated by Shapley in 1953. In his fundamental paper, Shapley [13] established the existence of value and optimal stationary strategies for zero-sum discounted stochastic games with finitely many states and actions for the two players. A positive stochastic game with countable state space and finite action spaces consists of the following objects:
1. State space $S$, the set of nonnegative integers.
2. Finite action spaces $A$ ($B$) for players I (II).
3. The space of mixed strategies $P(A)$ ($P(B)$) on the action spaces $A$ ($B$) for players I (II).
4. Nonnegative (immediate) reward function $r(s, a, b)$.
5. Markovian transition $q(t \mid s, a, b)$, where $q(t \mid s, a, b)$ is the chance of moving from state $s$ to state $t$ when actions $a, b$ are chosen by players I and II in the current state $s$.
When playing a stochastic game, the players, in selecting their actions for the $k$th day, could use all the information available to them up to that day, namely the partial history up to the $k$th day, given by $(s_1, a_1, b_1, s_2, a_2, b_2, \ldots, s_{k-1}, a_{k-1}, b_{k-1}, s_k)$. Thus a strategy $P$ for player I is a sequence $(P_1, P_2, \ldots)$ where $P_k$ selects a mixed strategy (an element of $P(A)$) for the $k$th day. We can classify stochastic games with additional structure on the immediate rewards and transition probabilities. The law of motion is said to be controlled by one player (say player II) if $q(t \mid s, i, j) = q(t \mid s, j)$ for all $i$. We call a stochastic
arXiv: Optimization and Control, 2018
Stochastic games are a classical model in game theory in which two opponents interact and the environment changes in response to the players' behavior. The central solution concepts for these games are the discounted values and the value, which represent what playing the game is worth to the players for different levels of impatience. In the present manuscript, we provide algorithms for computing exact expressions for the discounted values and for the value, which are polynomial in the number of pure stationary strategies of the players. This result considerably improves all the existing algorithms, including the most efficient one, due to Hansen, Koucký, Lauritzen, Miltersen and Tsigaridas (STOC 2011).
Stochastic Processes and their Applications, 2016
In this paper we consider two-person zero-sum risk-sensitive stochastic dynamic games with Borel state and action spaces and bounded reward. The term risk-sensitive refers to the fact that instead of the usual risk neutral optimization criterion we consider the exponential certainty equivalent. The discounted reward case on a finite and an infinite time horizon is considered, as well as the ergodic reward case. Under continuity and compactness conditions we prove that the value of the game exists and solves the Shapley equation and we show the existence of optimal (non-stationary) strategies. In the ergodic reward case we work with a local minorization property and a Lyapunov condition and show that the value of the game solves the Poisson equation. Moreover, we prove the existence of optimal stationary strategies. A simple example highlights the influence of the risk-sensitivity parameter. Our results generalize findings in [1] and answer an open question posed there.
Proceedings of the 43rd annual ACM symposium on Theory of computing - STOC '11, 2011
Shapley's discounted stochastic games, Everett's recursive games and Gillette's undiscounted stochastic games are classical models of game theory describing two-player zero-sum games of potentially infinite duration. We describe algorithms for exactly solving these games. When the number of positions of the game is constant, our algorithms run in polynomial time.