2018, arXiv: Optimization and Control
Stochastic games are a classical model in game theory in which two opponents interact and the environment changes in response to the players' behavior. The central solution concepts for these games are the discounted values and the value, which represent what playing the game is worth to the players for different levels of impatience. In the present manuscript, we provide algorithms for computing exact expressions for the discounted values and for the value, which are polynomial in the number of pure stationary strategies of the players. This result considerably improves all the existing algorithms, including the most efficient one, due to Hansen, Koucký, Lauritzen, Miltersen and Tsigaridas (STOC 2011).
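As a numerical contrast to the exact algorithms described above, here is a minimal sketch of Shapley's classical value iteration: iterate the one-shot operator whose stage games combine current rewards with discounted continuation values. This only approximates the discounted value rather than computing an exact expression for it; the function names, the toy game data, and the restriction to 2x2 stage games are our own simplifications.

```python
# Value iteration with the Shapley operator on a toy discounted stochastic
# game.  All game data here is invented for illustration; this sketch
# approximates the discounted value numerically and does not reproduce the
# exact symbolic algorithms the papers above describe.

def matrix_game_value(M):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = M
    lower = max(min(a, b), min(c, d))   # row player's security level
    upper = min(max(a, c), max(b, d))   # column player's security level
    if lower == upper:                  # saddle point in pure strategies
        return lower
    # Closed-form mixed value for 2x2 games without a saddle point.
    return (a * d - b * c) / (a + d - b - c)

def shapley_iteration(reward, trans, gamma, iters=200):
    """reward[s][i][j]: stage payoff; trans[s][i][j][t]: transition prob.

    Each state is assumed to have exactly 2 actions per player (a
    simplification so matrix_game_value above suffices).
    """
    n = len(reward)
    v = [0.0] * n
    for _ in range(iters):
        # Apply the Shapley operator: solve the stage game at each state.
        v = [matrix_game_value(
                [[reward[s][i][j] + gamma * sum(trans[s][i][j][t] * v[t]
                                                for t in range(n))
                  for j in range(2)] for i in range(2)])
             for s in range(n)]
    return v
```

For a single-state game with matching-pennies rewards, the discounted value is 0 at every discount factor, which the iteration recovers.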
Proceedings of the 43rd Annual ACM Symposium on Theory of Computing (STOC '11), 2011
Shapley's discounted stochastic games, Everett's recursive games and Gillette's undiscounted stochastic games are classical models of game theory describing two-player zero-sum games of potentially infinite duration. We describe algorithms for exactly solving these games. When the number of positions of the game is constant, our algorithms run in polynomial time.
ZOR Zeitschrift für Operations Research (Methods and Models of Operations Research), 1991
We consider finite state, finite action stochastic games over an infinite time horizon. We survey algorithms for the computation of minimax optimal stationary strategies in the zero-sum case, and of Nash equilibria in stationary strategies in the nonzero-sum case. We also survey those theoretical results that pave the way towards future development of algorithms. Summary (translated from German): In this paper, infinite-horizon stochastic games with finite state and action spaces are studied. An overview is given of algorithms for the computation of optimal stationary minimax strategies in zero-sum games and of stationary Nash equilibrium strategies in nonzero-sum games. Some theoretical results are presented that are useful for the further development of algorithms. This paper is based on the invited lectures given by the authors at the 12th Symposium for Operations Research in Passau, 1987. We are indebted to M. Abbad, Evangelista Fe, F. Thuijsman and O.J. Vrieze for valuable comments and discussion. Any remaining errors of either misinterpretation or of omission are the authors' alone.
Applied Mathematics and Computation, 2015
Game theory (GT) is an essential formal tool for interacting entities; however, computing equilibria in GT is a hard problem. When the same game can be played repeatedly over time, the problem becomes even more complicated. The existence of multiple game states makes the problem of computing equilibria in such games extremely difficult. In this paper, we approach this problem by first proposing a method to compute a nonempty subset of approximate (up to any precision) subgame-perfect equilibria in repeated games. We then demonstrate how to extend this method to approximate all subgame-perfect equilibria in a repeated game, and also to solve more complex games, such as Markov chain games and stochastic games. We observe that in stochastic games, our algorithm requires additional strong assumptions to become tractable, while in repeated and Markov chain games it allows approximating all subgame-perfect equilibria reasonably fast and under considerably weaker assumptions than previous methods.
2009
Zero-sum stochastic games are easy to solve, as they can be cast as simple Markov decision processes. This is, however, not the case with general-sum stochastic games. A fairly general optimization problem formulation is available for general-sum stochastic games in [10]. However, the optimization problem has a non-linear objective and non-linear constraints with special structure, and algorithms for computationally solving such problems are not available in the literature. We present in this paper a simple and robust algorithm for the numerical solution of general-sum stochastic games with assured convergence to a Nash equilibrium.
We consider the problem of finding stationary Nash equilibria (NE) in a finite discounted general-sum stochastic game. We first generalize a non-linear optimization problem from Filar and Vrieze [2004] to an N-player setting and break down this problem into simpler sub-problems that ensure there is no Bellman error for a given state and an agent. We then provide a characterization of solution points of these sub-problems that correspond to Nash equilibria of the underlying game, and for this purpose we derive a set of necessary and sufficient SG-SP (Stochastic Game-Sub-Problem) conditions. Using these conditions, we develop two actor-critic algorithms: OFF-SGSP (model-based) and ON-SGSP (model-free). Both algorithms use a critic that estimates the value function for a fixed policy and an actor that performs descent in the policy space using a descent direction that avoids local minima. We establish that both algorithms converge, in self-play, to the equilibria of a certain ordinary differential equation (ODE), whose stable limit points coincide with stationary NE of the underlying general-sum stochastic game. On a single-state non-generic game (see Hart and Mas-Colell [2005]) as well as on a synthetic two-player game setup with 810,000 states, we establish that ON-SGSP consistently outperforms the NashQ [Hu and Wellman, 2003] and FFQ [Littman, 2001] algorithms.
Journal of the ACM, 2013
Ye [2011] showed recently that the simplex method with Dantzig's pivoting rule, as well as Howard's policy iteration algorithm, solve discounted Markov decision processes (MDPs), with a constant discount factor, in strongly polynomial time. More precisely, Ye showed that both algorithms terminate after at most O((mn/(1−γ)) log(n/(1−γ))) iterations, where n is the number of states, m is the total number of actions in the MDP, and 0 < γ < 1 is the discount factor. We improve Ye's analysis in two respects. First, we improve the bound given by Ye and show that Howard's policy iteration algorithm actually terminates after at most O((m/(1−γ)) log(n/(1−γ))) iterations. Second, and more importantly, we show that the same bound applies to the number of iterations performed by the strategy iteration (or strategy improvement) algorithm, a generalization of Howard's policy iteration algorithm used for solving 2-player turn-based stochastic games with discounted zero-sum rewards. This provides...
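Howard's policy iteration, whose iteration count the abstract above bounds, can be sketched on a toy discounted MDP as follows. The MDP data and the fixed-point policy evaluation are illustrative simplifications of our own; a production implementation would solve the linear evaluation system exactly.

```python
# Minimal Howard policy iteration for a discounted MDP.  The toy MDP in the
# usage example is invented; the fixed-point evaluation loop stands in for
# an exact linear solve to keep the sketch dependency-free.

def policy_iteration(P, R, gamma):
    """P[s][a][t]: transition probability, R[s][a]: expected reward."""
    n = len(P)
    policy = [0] * n
    while True:
        # Policy evaluation: iterate v <- R_pi + gamma * P_pi v to a fixed
        # point (in practice, solve this linear system directly).
        v = [0.0] * n
        for _ in range(500):
            v = [R[s][policy[s]] + gamma * sum(P[s][policy[s]][t] * v[t]
                                               for t in range(n))
                 for s in range(n)]
        # Policy improvement: greedy switch in every state (Howard's rule).
        new_policy = [max(range(len(P[s])),
                          key=lambda a: R[s][a] + gamma * sum(
                              P[s][a][t] * v[t] for t in range(n)))
                      for s in range(n)]
        if new_policy == policy:
            return policy, v
        policy = new_policy
```

On a two-state example where state 0 can stay for reward 0 or move to state 1 for reward 1, and state 1 loops with reward 2, the optimal policy at γ = 0.9 moves immediately, with values v(1) = 2/(1 − 0.9) = 20 and v(0) = 1 + 0.9 · 20 = 19.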
Proceedings of the National Academy of Sciences
In 1953, Lloyd Shapley defined the model of stochastic games, which were the first general dynamic model of a game to be defined, and proved that competitive stochastic games have a discounted value. In 1982, Jean-François Mertens and Abraham Neyman proved that competitive stochastic games admit a robust solution concept, the value, which is equal to the limit of the discounted values as the discount rate goes to 0. Both contributions were published in PNAS. In the present paper, we provide a tractable formula for the value of competitive stochastic games.
2006
Koller, Megiddo and von Stengel showed how to efficiently compute minimax strategies for two-player extensive-form zero-sum games with imperfect information but perfect recall using linear programming and avoiding conversion to normal form. Their algorithm has been used by AI researchers for constructing prescriptive strategies for concrete, often fairly large games. Koller and Pfeffer pointed out that the strategies obtained by the algorithm are not necessarily sequentially rational and that this deficiency is often problematic for the practical applications. We show how to remove this deficiency by modifying the linear programs constructed by Koller, Megiddo and von Stengel so that pairs of strategies forming a sequential equilibrium are computed. In particular, we show that a sequential equilibrium for a two-player zero-sum game with imperfect information but perfect recall can be found in polynomial time. In addition, the equilibrium we find is normal-form perfect. We also describe an extension of our technique to general-sum games which is likely to prove practical, even though it is not polynomial-time.
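The sequence-form linear programs discussed above are too involved to reproduce here. As a deliberately simpler, dependency-free stand-in for LP-based minimax computation, fictitious play (Robinson, 1951) brackets the minimax value of a normal-form zero-sum game; the function name and the bracketing interface are our own choices, not anything from the paper.

```python
# Fictitious play on a zero-sum matrix game.  This is NOT the sequence-form
# LP of Koller, Megiddo and von Stengel; it is a classical iterative scheme
# that approximates the same minimax value for normal-form games.

def fictitious_play(M, rounds=20000):
    """Return (lower, upper) bounds bracketing the value of game M
    (row player maximizes)."""
    m, n = len(M), len(M[0])
    row_counts = [1] + [0] * (m - 1)   # arbitrary initial pure strategies
    col_counts = [1] + [0] * (n - 1)
    for _ in range(rounds):
        # Each player best-responds to the opponent's empirical mixture.
        i = max(range(m), key=lambda r: sum(M[r][c] * col_counts[c]
                                            for c in range(n)))
        j = min(range(n), key=lambda c: sum(M[r][c] * row_counts[r]
                                            for r in range(m)))
        row_counts[i] += 1
        col_counts[j] += 1
    tr, tc = sum(row_counts), sum(col_counts)
    # Best-response payoffs against the empirical mixtures bracket the value.
    upper = max(sum(M[r][c] * col_counts[c] for c in range(n))
                for r in range(m)) / tc
    lower = min(sum(M[r][c] * row_counts[r] for r in range(m))
                for c in range(n)) / tr
    return lower, upper
```

On matching pennies, whose value is 0, the two bounds close in on 0 as the round count grows; an LP delivers the exact value in one solve, which is why the LP route matters for the large games the paper targets.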
Adaptive Agents and Multi-Agents Systems, 2015
We consider the problem of finding stationary Nash equilibria (NE) in a finite discounted general-sum stochastic game. We first generalize a non-linear optimization problem from [9] to a general N-player game setting. Next, we break down the optimization problem into simpler sub-problems that ensure there is no Bellman error for a given state and an agent. We then provide a characterization of solution points of these sub-problems that correspond to Nash equilibria of the underlying game, and for this purpose we derive a set of necessary and sufficient SG-SP (Stochastic Game-Sub-Problem) conditions. Using these conditions, we develop two provably convergent algorithms. The first algorithm, OFF-SGSP, is centralized and model-based, i.e., it assumes complete information of the game. The second algorithm, ON-SGSP, is an online model-free algorithm. We establish that both algorithms converge, in self-play, to the equilibria of a certain ordinary differential equation (ODE), whose stable limit points coincide with stationary NE of the underlying general-sum stochastic game. On a single-state non-generic game [12] as well as on a synthetic two-player game setup with 810,000 states, we establish that ON-SGSP consistently outperforms the NashQ [16] and FFQ [21] algorithms.
Journal of Optimization Theory and Applications, 1990
This paper addresses the problem of computation of cooperative equilibria in discounted stochastic sequential games. The proposed approach contains as a special case the method of Green and Porter (developed originally for repeated oligopoly games), but it is more general than the latter in the sense that it generates nontrivial equilibrium solutions for a much larger class of dynamic games. This fact is demonstrated on two examples, one concerned with duopolistic economics and the other with fishery management.
arXiv (Cornell University), 2021
We consider a class of hierarchical noncooperative N-player games where the ith player solves a parametrized stochastic mathematical program with equilibrium constraints (MPEC), with the caveat that the implicit form of the ith player's MPEC is convex in the player's strategy, given rival decisions. Few, if any, general-purpose schemes exist for computing equilibria even for deterministic specializations of such games. We develop computational schemes in two distinct regimes: (a) Monotone regimes. When player-specific implicit problems are convex, the necessary and sufficient equilibrium conditions are given by a stochastic inclusion. Under a monotonicity assumption on the operator, we develop a variance-reduced stochastic proximal-point scheme that achieves deterministic rates of convergence in terms of solving proximal-point problems in monotone/strongly monotone regimes, and the schemes are characterized by optimal or near-optimal sample-complexity guarantees. Finally, the generated sequences are shown to be convergent to an equilibrium in an almost-sure sense in both monotone and strongly monotone regimes; (b) Potentiality. When the implicit form of the game admits a potential function, we develop an asynchronous relaxed inexact smoothed proximal best-response framework. However, any such avenue is impeded by the need to efficiently compute an approximate solution of an MPEC with a strongly convex implicit objective. To this end, we consider the smoothed counterpart of this game, where each player's problem is smoothed via randomized smoothing. Notably, under suitable assumptions, we show that an equilibrium of the η-smoothed game is an η-approximate Nash equilibrium of the original game. Our proposed scheme produces a sequence that converges almost surely to an η-approximate Nash equilibrium in both relaxed and unrelaxed settings.
This scheme is reliant on solving the proximal problem, a stochastic MPEC whose implicit form has a strongly convex objective, with increasing accuracy in finite time. The smoothing framework allows for developing a variance-reduced zeroth-order scheme for such problems that admits a fast rate of convergence. Numerical studies on a class of multi-leader multi-follower games suggest that variance-reduced proximal schemes provide significantly better accuracy with far lower run-times. The relaxed best-response scheme scales well with problem size and generally displays more stability than its unrelaxed counterpart.
2018
We suggest a new algorithm for two-person zero-sum undiscounted stochastic games focusing on stationary strategies. Given a positive real ε, let us call a stochastic game ε-ergodic if its values from any two initial positions differ by at most ε. The proposed new algorithm outputs, for every ε > 0, in finite time either a pair of stationary strategies for the two players guaranteeing that the values from any initial positions are within an ε-range, or identifies two initial positions u and v and corresponding stationary strategies for the players proving that the game values starting from u and v are at least ε/24 apart. In particular, the above result shows that if a stochastic game is ε-ergodic, then there are stationary strategies for the players proving 24ε-ergodicity. This result strengthens and provides a constructive version of an existenti...
1993
Abstract: This paper presents algorithms for finding mixed-strategy equilibria in multistage noncooperative games of incomplete information (like probabilistic blindfold chess, where at every opportunity a player can perform different moves with some probability). These algorithms accept input games in extensive form. Our main result is an algorithm for computing sequential equilibrium, which is the most widely accepted notion of equilibrium (for mixed strategies of noncooperative probabilistic games) in mainstream economic game theory. Previously, there were no known algorithms for computing sequential equilibrium strategies (except for the special case of single-stage games).
Stochastic and Differential Games, 1999
This paper treats stochastic games. Nonzero-sum average-payoff stochastic games with arbitrary state spaces, as well as stopping games, are considered. Such game models fit well with some studies in economic theory and operations research. A correlation of strategies of the players, involving "public signals," is allowed in the nonzero-sum average-payoff stochastic games. The main result is an extension of the correlated equilibrium theorem, proved recently by Nowak and Raghavan for dynamic games with discounting, to average-payoff stochastic games. Stopping games are a special model of stochastic games. A version of Dynkin's game, related to the observation of a Markov process with a random priority assignment mechanism of states, is presented in the paper. Both zero-sum and nonzero-sum games are considered. The paper also provides a brief overview of the theory of nonzero-sum stochastic games and stopping games, which is still far from complete.
2012
We study stochastic two-player games where the goal of one player is to achieve precisely a given expected value of the objective function, while the goal of the opponent is the opposite. Potential applications for such games include controller synthesis problems where the optimisation objective is to maximise or minimise a given payoff function while respecting a strict upper or lower bound, respectively. We consider a number of objective functions including reachability, ω-regular, discounted reward, and total reward.
arXiv: Optimization and Control, 2017
This work considers a stochastic Nash game in which each player solves a parameterized stochastic optimization problem. In deterministic regimes, best-response schemes have been shown to be convergent under a suitable spectral property associated with the proximal best-response map. However, a direct application of this scheme to stochastic settings requires obtaining exact solutions to stochastic optimization at each iteration. Instead, we propose an inexact generalization in which an inexact solution is computed via an increasing number of projected stochastic gradient steps. Based on this framework, we present three inexact best-response schemes: (i) First, we propose a synchronous scheme where all players simultaneously update their strategies; (ii) Subsequently, we extend this to a randomized setting where a subset of players is randomly chosen to update their strategies while the others keep their strategies invariant; (iii) Finally, we propose an asynchronous scheme, where ea...
Dynamic Games and Applications, 2013
To celebrate the 60th anniversary of the seminal paper "Stochastic Games" of L.S. Shapley [16], Dynamic Games and Applications is proud to publish this special issue. Shapley's paper on stochastic games has had a tremendous scientific impact on the theory and applications of dynamic games, and there is still very active research in these domains. In addition, as can be seen from the content of this volume, the theoretical model as well as the potential for applications develops in new directions, including the continuous-time framework, links with evolutionary games, algorithmic game theory, economics, and social networks. The idea to devote a special issue to the 60th anniversary of Shapley's paper [16] emerged a few years ago, and the decision was taken in 2011. Since then, we had the great pleasure to enjoy the attribution to Lloyd Shapley of the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 2012, "for the theory of stable allocations and the practice of market design," jointly with Alvin Roth. This is the occasion to recall the importance of Shapley's contributions in other areas of game theory, such as: the core for TU and NTU cooperative games; the equivalence principle for large economies; and matching (with David Gale).
2005
With the increasing reliance on game theory as a foundation for auctions and electronic commerce, efficient algorithms for computing equilibria in multiplayer general-sum games are of great theoretical and practical interest. The computational complexity of finding a Nash equilibrium for a one-shot bimatrix game is a well-known open problem. This paper treats a related but distinct problem—that of finding a Nash equilibrium for an average-payoff repeated bimatrix game, and presents a polynomial-time algorithm.
Lecture Notes in Computer Science, 2011
In this paper, we consider two-player zero-sum stochastic mean payoff games with perfect information, modeled by a digraph with black, white, and random vertices. These BWR-games are polynomially equivalent to the classical Gillette games, which include many well-known subclasses, such as cyclic games, simple stochastic games, stochastic parity games, and Markov decision processes. They can also be used to model parlor games such as Chess or Backgammon. It is a long-standing open question whether a polynomial algorithm exists that solves BWR-games. In fact, a pseudo-polynomial algorithm for these games with an arbitrary number of random nodes would already imply their polynomial solvability. Currently, only two classes are known to have such a pseudo-polynomial algorithm: BW-games (the case with no random nodes) and ergodic BWR-games (in which the game's value does not depend on the initial position) with a constant number of random nodes. In this paper, we show that the existence of a pseudo-polynomial algorithm for BWR-games with a constant number of random vertices implies smoothed polynomial complexity and the existence of absolute and relative polynomial-time approximation schemes. In particular, we obtain smoothed polynomial complexity and derive absolute and relative approximation schemes for BW-games and ergodic BWR-games (assuming a technical requirement about the probabilities at the random nodes).
Computational Management Science, 2007
In this paper we review a number of algorithms to compute Nash equilibria in deterministic linear quadratic differential games. We review both the open-loop and the feedback information case. In both cases we address both the finite and the infinite planning horizon.