2012
We provide a direct, elementary proof for the existence of $\lim_{\lambda\to 0} v_\lambda$, where $v_\lambda$ is the value of the $\lambda$-discounted finite two-person zero-sum stochastic game. 1. Introduction. Two-person zero-sum stochastic games were introduced by Shapley [4]. They are described by a 5-tuple $(\Omega, I, J, q, g)$, where $\Omega$ is a finite set of states, $I$ and $J$ are finite sets of actions, $g : \Omega \times I \times J \to [0,1]$ is the payoff, $q : \Omega \times I \times J \to \Delta(\Omega)$ the transition and, for any finite set $X$, $\Delta(X)$ denotes the set of probability distributions over $X$. The functions $g$ and $q$ are bilinearly extended to $\Omega \times \Delta(I) \times \Delta(J)$. The stochastic game with initial state $\omega \in \Omega$ and discount factor $\lambda \in (0,1]$ is denoted by $\Gamma_\lambda(\omega)$ and is played as follows: at stage $m \geq 1$, knowing the current state $\omega_m$, the players choose actions $(i_m, j_m) \in I \times J$; their choice produces a stage payoff $g(\omega_m, i_m, j_m)$ and influences the transition: a new state $\omega_{m+1}$ is drawn according to the probability distribution $q(\cdot \mid \omega_m, i_m, j_m)$. At the end of the game, player 1 receives $\sum_{m \geq 1} \lambda (1-\lambda)^{m-1} g(\omega_m, i_m, j_m)$ from player 2. The game $\Gamma_\lambda(\omega)$ has a value $v_\lambda(\omega)$, and $v_\lambda = (v_\lambda(\omega))_{\omega \in \Omega}$ is the unique fixed point of the so-called Shapley operator [4], i.e. $v_\lambda = \Phi(\lambda, v_\lambda)$, where for all $f \in \mathbb{R}^\Omega$:
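As a concrete illustration of the fixed-point characterization above, here is a minimal sketch (not taken from the paper) of value iteration on the Shapley operator for a small discounted game. The arrays g[s, i, j] and q[s, i, j, s'] encoding payoffs and transitions are hypothetical inputs, and the local matrix games are solved with a standard linear program.

```python
# Sketch only: value iteration on the Shapley operator, assuming hypothetical
# inputs g[s, i, j] (stage payoffs in [0, 1]) and q[s, i, j, s'] (transitions).
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value of the zero-sum matrix game with payoff matrix A (row player maximizes)."""
    m, n = A.shape
    # Variables: (v, x_1, ..., x_m).  Maximize v  <=>  minimize -v.
    c = np.concatenate(([-1.0], np.zeros(m)))
    # For every column j:  v - sum_i x_i A[i, j] <= 0.
    A_ub = np.hstack((np.ones((n, 1)), -A.T))
    b_ub = np.zeros(n)
    A_eq = np.concatenate(([0.0], np.ones(m))).reshape(1, -1)   # x is a distribution
    b_eq = np.array([1.0])
    bounds = [(None, None)] + [(0.0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return -res.fun

def shapley_operator(lam, f, g, q):
    """Phi(lambda, f): solve the local matrix game at every state."""
    S = g.shape[0]
    out = np.empty(S)
    for s in range(S):
        local = lam * g[s] + (1 - lam) * (q[s] @ f)   # shape (|I|, |J|)
        out[s] = matrix_game_value(local)
    return out

def discounted_value(lam, g, q, tol=1e-10):
    """Fixed point v_lambda of the (1 - lambda)-contracting Shapley operator."""
    v = np.zeros(g.shape[0])
    while True:
        w = shapley_operator(lam, v, g, q)
        if np.max(np.abs(w - v)) < tol:
            return w
        v = w
```

Since the operator is a (1 − λ)-contraction for the sup norm, the iteration converges to v_λ from any starting point.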
2018
In a zero-sum stochastic game, at each stage, two adversarial players take decisions and receive a stage payoff determined by them and by a random variable representing the state of nature. The total payoff is the discounted sum of the stage payoffs. Assume that the players are very patient and use optimal strategies. We then prove that, at any point in the game, players get essentially the same expected payoff: the payoff is constant. This solves a conjecture by Sorin, Venel and Vigeral (2010). The proof relies on the semi-algebraic approach for discounted stochastic games introduced by Bewley and Kohlberg (1976), on the theory of Markov chains with rare transitions, initiated by Freidlin and Wentzell (1984), and on some variational inequalities for value functions inspired by the recent work of Davini, Fathi, Iturriaga and Zavidovique (2016).
International Journal of Game Theory, 2015
We study the links between the values of stochastic games with varying stage duration $h$, the corresponding Shapley operators $T$ and $T_h = hT + (1-h)\mathrm{Id}$, and the solution of the evolution equation $\dot f_t = (T - \mathrm{Id}) f_t$. Considering general nonexpansive maps, we establish two kinds of results, in both the discounted and the finite-length frameworks, that apply to the class of "exact" stochastic games. First, for a fixed length or discount factor, the value converges as the stage duration goes to 0. Second, the asymptotic behavior of the value as the length goes to infinity, or as the discount factor goes to 0, does not depend on the stage duration. In addition, these properties imply the existence of the value of the finite-length or discounted continuous-time game (associated with a continuous-time jointly controlled Markov process), as the limit of the value of any discretization with vanishing mesh.
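To make the role of $T_h$ concrete: since $T_h = \mathrm{Id} + h(T - \mathrm{Id})$, iterating $T_h$ is an explicit Euler scheme for the evolution equation above. The toy nonexpansive map below is a made-up illustration, not a Shapley operator from the paper; it only shows that the discrete iterates stabilize as the stage duration h shrinks.

```python
# Sketch only, with an invented nonexpansive map on R^3 (sup norm):
# iterating T_h = h*T + (1 - h)*Id for t/h steps approximates the solution
# of  f'_t = (T - Id) f_t  at time t, and stabilizes as h -> 0.
import numpy as np

def T(f):
    # Coordinate averaging (nonexpansive) plus a fixed drift (an isometry).
    avg = 0.5 * (np.roll(f, 1) + np.roll(f, -1))
    return avg + np.array([1.0, 0.0, -1.0])

def iterate_Th(f0, h, t):
    f = np.array(f0, dtype=float)
    for _ in range(int(round(t / h))):
        f = h * T(f) + (1 - h) * f          # one step of T_h
    return f

f0 = np.zeros(3)
for h in (0.5, 0.1, 0.01):
    print(h, iterate_Th(f0, h, t=2.0))      # iterates converge as the mesh h vanishes
```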
2016
We study the existence of different notions of value in two-person zero-sum repeated games where the state evolves and players receive signals. We provide some examples showing that the limsup value (and the uniform value) may not exist in general. Then we show the existence of the value for any Borel payoff function if the players observe a public signal including the actions played. We also prove two other positive results without assumptions on the signaling structure: the existence of the value in any game and the existence of the uniform value in recursive games with nonnegative payoffs.
We study the existence of different notions of values in two-person zero-sum repeated games where the state evolves and players receive signals. We provide some examples showing that the limsup value and the uniform value may not exist in general. Then, we show the existence of the value for any Borel payoff function if the players observe a public signal including the actions played. We also prove two other positive results without assumptions on the signaling structure: the existence of the sup-value and the existence of the uniform value in recursive games with non-negative payoffs.
Journal of Computer and System Sciences, 2004
We consider two-player games played for an infinite number of rounds, with ω-regular winning conditions. The games may be concurrent, in that the players choose their moves simultaneously and independently, and probabilistic, in that the moves determine a probability distribution for the successor state. We introduce quantitative game µ-calculus, and we show that the maximal probability of winning such games can be expressed as fixpoint formulas in this calculus. We develop the arguments both for deterministic and for probabilistic concurrent games; as a special case, we solve probabilistic turn-based games with ω-regular winning conditions, a problem that was also open. We also characterize the optimality, and the memory requirements, of the winning strategies. In particular, we show that while memoryless strategies suffice for winning games with safety and reachability conditions, Büchi conditions require the use of strategies with infinite memory. The existence of optimal strategies, as opposed to ε-optimal, is only guaranteed in games with safety winning conditions.
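As a hedged sketch of the turn-based special case mentioned above: for a reachability condition, the maximal winning probability is the least fixpoint of a monotone one-step operator, and plain iteration approximates it from below. The tiny game encoded here is invented purely for illustration.

```python
# Sketch only: least-fixpoint iteration for reachability in a turn-based
# stochastic game with 'max', 'min' and probabilistic ('avg') states.
def reach_value(states, owner, succ, prob, target, n_iter=1000):
    """owner[s] in {'max', 'min', 'avg'}; succ[s] lists successors;
    prob[s] is the distribution over succ[s] for 'avg' states; target is a set."""
    v = {s: (1.0 if s in target else 0.0) for s in states}
    for _ in range(n_iter):
        new = {}
        for s in states:
            if s in target:
                new[s] = 1.0
            elif owner[s] == 'max':
                new[s] = max(v[t] for t in succ[s])
            elif owner[s] == 'min':
                new[s] = min(v[t] for t in succ[s])
            else:  # probabilistic state
                new[s] = sum(p * v[t] for t, p in zip(succ[s], prob[s]))
        v = new
    return v

# Made-up example: from s0 the max player picks a successor, s1 is probabilistic.
states = ['s0', 's1', 'goal', 'sink']
owner = {'s0': 'max', 's1': 'avg', 'goal': 'max', 'sink': 'max'}
succ = {'s0': ['s1', 'sink'], 's1': ['goal', 'sink'], 'goal': ['goal'], 'sink': ['sink']}
prob = {'s1': [0.7, 0.3]}
print(reach_value(states, owner, succ, prob, {'goal'}))   # s0 gets value 0.7
```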
Israel Journal of Mathematics, 2001
We consider two-person zero-sum stochastic games. The recursive formula for the values $v_\lambda$ (resp. $v_n$) of the discounted (resp. finitely repeated) version can be written in terms of a single basic operator $\Phi(\alpha, f)$, where $\alpha$ is the weight on the present payoff and $f$ the future payoff. We give sufficient conditions in terms of $\Phi(\alpha, f)$ and its derivative at 0 for $\lim v_n$ and $\lim v_\lambda$ to exist and to be equal. We apply these results to obtain such convergence properties for absorbing games with compact action spaces and incomplete information games.
Pacific Journal of Mathematics, 1975
Some sufficient conditions are given to show the existence of equilibrium points with finite spectrum for nonzero-sum two-person continuous games on the unit square. We also examine the question of uniqueness of the equilibrium point for such games. 1. Players I and II secretly choose an $x$ and a $y$ in the closed interval $[0,1]$. Player I receives $K_1(x, y)$ and player II receives $K_2(x, y)$, where $K_1, K_2$ are continuous on the unit square. The following theorem is classical in game theory ([3], see page 156): there exists a pair of probability distributions $(F^0, G^0)$, called a Nash equilibrium point, satisfying $K_1(F^0, G^0) \geq K_1(x, G^0)$ for all $x$ in $0 \leq x \leq 1$ and $K_2(F^0, G^0) \geq K_2(F^0, y)$ for all $y$ in $0 \leq y \leq 1$, where $K_1(F, G) = \iint K_1(x, y)\,dF(x)\,dG(y)$ and $K_1(x, G) = \int K_1(x, y)\,dG(y)$, etc. Let $\mathscr{S}$ be the set of such pairs $(F^0, G^0)$. One can ask the following questions. When does an $(F^0, G^0) \in \mathscr{S}$
Proceedings of the National Academy of Sciences
In 1953, Lloyd Shapley defined the model of stochastic games, which was the first general dynamic model of a game to be defined, and proved that competitive stochastic games have a discounted value. In 1982, Jean-François Mertens and Abraham Neyman proved that competitive stochastic games admit a robust solution concept, the value, which is equal to the limit of the discounted values as the discount rate goes to 0. Both contributions were published in PNAS. In the present paper, we provide a tractable formula for the value of competitive stochastic games.
Dynamic Games and Applications, 2013
To celebrate the 60th anniversary of the seminal paper "Stochastic Games" of L.S. Shapley [16], Dynamic Games and Applications is proud to publish this special issue. Shapley's paper on stochastic games has had a tremendous scientific impact on the theory and applications of dynamic games, and research in these domains is still very active. In addition, as can be seen from the content of this volume, the theoretical model as well as the potential for applications develops in new directions, including the continuous-time framework, links with evolutionary games, algorithmic game theory, economics, and social networks. The idea to devote a special issue to celebrating the 60th anniversary of Shapley's paper [16] emerged a few years ago, and the decision was taken in 2011. Since then, we had the great pleasure of seeing the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2012 awarded to Lloyd Shapley "for the theory of stable allocations and the practice of market design," jointly with Alvin Roth. This is the occasion to recall the importance of Shapley's contributions in other areas of game theory, such as: • core for TU and NTU cooperative games; • equivalence principle for large economies; • matching (with David Gale);
Universitext, 2019
Zero-sum games are two-person games where the players have opposite evaluations of outcomes, hence the sum of the payoff functions is 0. In this kind of strategic interaction the players are antagonists; this induces pure conflict and there is no room for cooperation. It is thus enough to consider the payoff of player 1 (which player 2 wants to minimize). A finite zero-sum game is represented by a real-valued matrix A. The first important result in game theory is called the minmax theorem and was proved by von Neumann [224] in 1928. It states that one can associate a number v(A) to this matrix and a way of playing for each player such that each can guarantee this amount. This corresponds to the notions of "value" and "optimal strategies". This chapter introduces some general notations and concepts that apply to any zero-sum game, and provides various proofs and extensions of the minmax theorem. Some consequences of this result are then given. Finally, a famous learning procedure (fictitious play) is defined and we show that the empirical average of the stage strategies of each player converges to the set of optimal strategies.
2.2 Value and Optimal Strategies
Definition 2.2.1 A zero-sum game G in strategic form is defined by a triple (I, J, g), where I (resp. J) is the non-empty set of strategies of player 1 (resp. player 2) and g : I × J −→ R is the payoff function of player 1. The interpretation is as follows: player 1 chooses i in I and player 2 chooses j in J, in an independent way (for instance simultaneously). The payoff of player 1 is then g(i, j) and that of player 2 is −g(i, j): this means that the evaluations of the outcome induced by the joint choice (i, j) are opposite for the two players. Player 1
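A brief sketch of the fictitious-play procedure mentioned in this excerpt, for a zero-sum matrix game; the rock-paper-scissors matrix is just an illustrative choice, not one from the book. In zero-sum games the empirical mixed strategies converge to optimal strategies and the empirical value to v(A).

```python
# Sketch only: fictitious play for a zero-sum matrix game (row player maximizes).
import numpy as np

def fictitious_play(A, n_steps=20000):
    m, n = A.shape
    count1 = np.zeros(m)      # counts of player 1's past pure strategies
    count2 = np.zeros(n)      # counts of player 2's past pure strategies
    count1[0] += 1
    count2[0] += 1
    for _ in range(n_steps):
        x = count1 / count1.sum()          # empirical mixed strategy of player 1
        y = count2 / count2.sum()          # empirical mixed strategy of player 2
        count1[np.argmax(A @ y)] += 1      # best reply of player 1 to y
        count2[np.argmin(x @ A)] += 1      # best reply of player 2 to x
    x, y = count1 / count1.sum(), count2 / count2.sum()
    return x, y, x @ A @ y

A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])           # rock-paper-scissors
print(fictitious_play(A))                  # strategies approach (1/3, 1/3, 1/3), value 0
```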
Theory and Decision Library, 1991
Stochastic games were first formulated by Shapley in 1953. In his fundamental paper, Shapley [13] established the existence of the value and of optimal stationary strategies for zero-sum β-discounted stochastic games with finitely many states and actions for the two players. A positive stochastic game with countable state space and finite action spaces consists of the following objects:
1. State space S, the set of nonnegative integers.
2. Finite action spaces A (B) for players I (II).
3. The spaces of mixed strategies P(A) (P(B)) on the action spaces A (B) for players I (II).
4. A nonnegative (immediate) reward function r(s, a, b).
5. A Markovian transition q(t|s, a, b), where q(t|s, a, b) is the chance of moving from state s to state t when actions a, b are chosen by players I and II in the current state s.
When playing a stochastic game, the players, in selecting their actions for the k-th day, can use all the information available to them up to that day, namely the partial history $(s_1, a_1, b_1, s_2, a_2, b_2, \ldots, s_{k-1}, a_{k-1}, b_{k-1}, s_k)$. Thus a strategy P for player I is a sequence $(P_1, P_2, \ldots)$ where $P_k$ selects a mixed strategy (an element of P(A)) for the k-th day. We can classify stochastic games with additional structure on the immediate rewards and transition probabilities. The law of motion is said to be controlled by one player (say player II) if q(t|s, a, b) = q(t|s, b) for all a. We call a stochastic
Journal of Dynamics and Games, 2015
We show that by coupling two well-behaved exit-time problems one can construct two-person zero-sum stochastic games with finite state space having oscillating discounted values. This unifies and generalizes recent examples due to Vigeral (2013) and Ziliotto (2013). Contents: 1. Introduction; 2. A basic model: 2.1. 0 player, 2.2. 1 player, 2.3. 2 players; 3. Reversibility: 3.1. Two examples, 3.2. The discounted framework; 4. Some regular configurations of order 1/2: 4.1. A regular configuration with 0 players and countable state space, 4.2. A regular configuration with one player, finitely many states, compact action space and continuous transition, 4.3. A regular configuration with two players and finitely many states and actions; 5. Some oscillating configurations of order 1/2: 5.1.
Annals of Operations Research, 2007
The aim of the present paper is to study a one-point solution concept for bicooperative games. For these games introduced by Bilbao (Cooperative Games on Combinatorial Structures, 2000), we define a one-point solution called the Shapley value, since this value can be interpreted in a similar way to the classical Shapley value for cooperative games. The main result of the paper is an axiomatic characterization of this value.
Mathematics of Operations Research, 1997
The "potential approach" to value theory for finite games was introduced by Hart and Mas-Colell (1989). Here this approach is extended to non-atomic games. On appropriate spaces of differentiable games there is a unique potential operator, that generates the Aumann and Shapley (1974) value. As a corollary we obtain the uniqueness of the Aumann-Shapley value on certain subspaces of games. Next, the potential approach is applied to the weighted case, leading to "weighted non-atomic values". It is further shown that the asymptotic weighted value is well-defined, and that it coincides with the weighted value generated by the potential.
Journal of Mathematical Analysis and Applications, 2008
We focus our attention on the argument developed by R.J. Aumann and L.S. Shapley in a proposition used to prove the existence of a value on a certain class of games. Since we have found such an argument to be inexact, we have devised a supplementary construction which should be incorporated in the original proof to make it run in a satisfactory way.
Studies in Economic Theory, 1991
International Journal of Game Theory, 2005
We propose a dynamic process leading to the Shapley value of TU games or any solution satisfying Inessential Game (IG) and Continuity (CONT), based on a modified version of Hamiache's notion of an associated game.
Dynamic Games and Applications, 2012
Consider a two-person zero-sum stochastic game with a Borel state space S, compact metric action spaces A, B, and a transition probability q such that the integral under q of every bounded measurable function depends measurably on the initial state s and continuously on the players' actions (a, b). Suppose that the payoff is a bounded function f of the infinite histories of states and actions. Assume, finally, that f is measurable with respect to the product of the Borel topologies (of the coordinate spaces) and lower semicontinuous with respect to the product of the discrete topologies. Then the game has a value and player II has a subgame-perfect optimal strategy.