Solving Zero-sum Games through Reinforcement Learning - IEEE CoG2022 Tutorial III


Tutorial III of the 2022 IEEE Conference on Games, given by Yaodong Yang and Le Cong Dinh, titled "Solving Zero-sum Games through Reinforcement Learning".

Recent advances in multi-agent reinforcement learning have introduced a new learning paradigm built around population-based training. The idea is to consider the structure of a game not at the micro-level of individual actions but at the meta-level of which agents to train against in any given game or situation.

A typical framework for population-based training is the Policy Space Response Oracle (PSRO) method, where, at each iteration, a new reinforcement learning agent is trained as the best response to a Nash mixture of agents from the opponent populations. PSRO methods provably converge to Nash, correlated, and coarse correlated equilibria in N-player games; in particular, they have shown remarkable performance in solving large-scale zero-sum games.
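
As a rough illustration of this loop, below is a minimal double-oracle-style sketch on a small matrix game, assuming the full payoff matrix is known and pure strategies stand in for trained RL policies; the function names (`meta_nash`, `psro`) and the fictitious-play meta-solver are illustrative choices, not the tutorial's code. In full PSRO, the best-response step would instead train an RL agent against the opponents' Nash mixture.

```python
# Minimal PSRO/double-oracle sketch on rock-paper-scissors.
# Assumption: pure strategies play the role of RL policies, and the
# meta-game Nash is approximated with fictitious play.
import numpy as np

# Row player's payoff matrix (zero-sum: the column player receives -A).
A = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])

def meta_nash(M, iters=2000):
    """Approximate Nash equilibrium of the restricted zero-sum meta-game
    via fictitious play (each player best-responds to the opponent's
    empirical mixture)."""
    r, c = M.shape
    row_counts, col_counts = np.zeros(r), np.zeros(c)
    row_counts[0] = col_counts[0] = 1
    for _ in range(iters):
        row_counts[np.argmax(M @ (col_counts / col_counts.sum()))] += 1
        col_counts[np.argmin((row_counts / row_counts.sum()) @ M)] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

def psro(A, iterations=5):
    rows, cols = [0], [0]              # initial populations: one strategy each
    for _ in range(iterations):
        M = A[np.ix_(rows, cols)]      # restricted meta-game payoffs
        p, q = meta_nash(M)
        # Oracle step: best response to the opponent's Nash mixture.
        # (In real PSRO this is an RL training run, not an argmax.)
        br_row = int(np.argmax(A[:, cols] @ q))
        br_col = int(np.argmin(p @ A[rows, :]))
        if br_row not in rows: rows.append(br_row)
        if br_col not in cols: cols.append(br_col)
    return rows, cols

print(psro(A))  # populations grow until they cover the Nash support
```

Starting from a single strategy per player, the populations expand to all three actions of rock-paper-scissors, whose Nash equilibrium mixes over the full action set.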

In this tutorial, the speakers introduce the basic idea of PSRO methods, the necessity of PSRO methods for solving real-world games such as chess, recent results on solving N-player games and mean-field games, how to promote behavioral diversity during training, and the relationship of PSRO methods to conventional no-regret methods.

Finally, two further frameworks are introduced: Neural Auto-Curricula, a meta-PSRO framework in which a PSRO-like solution algorithm is itself learned purely from data (learning to learn), and Online Double Oracle, a new PSRO framework that inherits the benefits of both population-based methods and no-regret methods.
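
To make the combination concrete, here is a hedged sketch of the online-double-oracle idea on the same matrix game: a no-regret learner (multiplicative weights) runs over the current population, while the population is periodically enlarged with best responses to the opponent's average play. The parameter names and schedule (`eta`, `expand_every`) are illustrative assumptions, not the method's published hyperparameters.

```python
# Sketch of the online-double-oracle idea: no-regret play (multiplicative
# weights) over a growing restricted strategy set, on rock-paper-scissors.
import numpy as np

A = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])   # row player's payoffs, zero-sum

def online_double_oracle(A, rounds=3000, eta=0.1, expand_every=100):
    rows, cols = [0], [0]                       # current populations
    w_r, w_c = np.ones(1), np.ones(1)           # MWU weights over populations
    avg_r, avg_c = np.zeros(A.shape[0]), np.zeros(A.shape[1])
    for t in range(1, rounds + 1):
        p, q = w_r / w_r.sum(), w_c / w_c.sum()
        avg_r[rows] += p                        # running average of play
        avg_c[cols] += q                        # in the full game
        # Multiplicative-weights update on the restricted meta-game.
        M = A[np.ix_(rows, cols)]
        w_r *= np.exp(eta * (M @ q)); w_r /= w_r.sum()   # row maximizes
        w_c *= np.exp(-eta * (p @ M)); w_c /= w_c.sum()  # column minimizes
        if t % expand_every == 0:               # oracle step: add best responses
            br_r = int(np.argmax(A @ (avg_c / t)))
            br_c = int(np.argmin((avg_r / t) @ A))
            if br_r not in rows: rows.append(br_r); w_r = np.append(w_r, w_r.mean())
            if br_c not in cols: cols.append(br_c); w_c = np.append(w_c, w_c.mean())
    return avg_r / rounds, avg_c / rounds       # average play approaches Nash

print(online_double_oracle(A))  # both averages approach the uniform mixture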

