MPI-INF/MPI-SWS or Local Campus Event
Action
<< Previous EntryNext Entry >>New Event EntryEdit this EntryLogin to DB (to update)
What and Who
Title:
Stochastic games with perfect information: the existence of canonical form
Speaker:
Khaled Elbassioni
From:
Max-Planck-Institut für Informatik - D1
If other:
Speakers Bio:
Event Type:
AG1 Mittagsseminar (own work)
Visibility:
D1, D4, RG1, MMCI, D3, D5, SWS
(please join our public mailing lists)
Date, Time and Location
Date:
24 November 2009
Language:
English
Weekday
Tuesday
Level:
AG Audience
Time:
13 : 15 -
Duration:
30 Minutes
Building:
E1 4 - MPI-INF
Location:
Saarbrücken
Room:
024
AbstractWe consider the well-known class of two-player, zero-sum, turn-based, stochastic games with perfect information introduced in the classical work of Gillette (1957). In such a game, there are two players White or maximizer and Black or minimizer, which control a finite set of states $V$, partitioned into two sets, each controlled by one of the players. At each state $u\in V$, the controlling player chooses one of a finite set of possible actions, according to which a real-valued reward is paid by Black to White, and then a new state is reached with a certain probability which depends on the action. The play continues forever, and White's (Black's) objective is to maximize (respectively, to minimize) her limiting average payoff.
The fact that a saddle point exists in pure positional strategies was proved by Gillette in 1957 and Liggett and Lippman in 1969
by considering the discounted version, in which the payoff of White is discounted by a factor $\gb^i$ at step $i$ and then proceeding to the limit as the discount factor $\gb\in[0,1)$ goes to $1$.

This class of games lies in the complexity class NP$\cap$CoNP and contains the well-known cyclic games (when $V_R$ is empty) and
Markov decision processes (when $V_B$ or $V_W$ is empty).

We show that every such game can be reduced by a potential transformation to a canonical form in which locally optimal strategies are globally optimal, and hence the value for every initial position and the optimal strategies of both players are obvious.

In the discounted case the transformed game values are also ergodic (i.e. do not depend on the initial position) and the corresponding potentials can be found in polynomial time.

Yet, this time tends to infinity, as $\gb \rightarrow 1^-$.


This is joint work with Endre Boros, Vladimir Gurvich and Kazuhisa Makino.
Contact
Name(s):
Khaled Elbassioni
EMail:
--email address not disclosed on the web
Phone
Video Broadcast
Video Broadcast
No
To Location:
To Building:
To Room:
Tags, Category, Keywords and additional notes
Created:Khaled Elbassioni, 10/17/2009 02:05 PMLast modified:Uwe Brahm/MPII/DE, 11/24/2009 06:00 AM