The team-adversary game models many real-world scenarios in which a team of agents cooperates to compete against an adversary. Decision-making in this type of game is challenging because the team's joint action space is combinatorial: it grows exponentially with the number of team members. This growth also prevents existing equilibrium-finding algorithms from solving team-adversary games efficiently. To address the combinatorial action space, we propose CFR-MIX, a novel framework based on Counterfactual Regret Minimization (CFR). First, we propose a new strategy representation that replaces the traditional joint action strategy with the individual action strategies of all team members, which significantly reduces the strategy space. To maintain cooperation between team members, we propose a strategy consistency relationship. We then transform this strategy consistency relationship into a regret consistency relationship, so that the equilibrium strategy can be computed with the new strategy representation under the CFR framework. To guarantee the regret consistency relationship, we propose a product-form decomposition method over cumulative regret values. To implement this decomposition, CFR-MIX employs a mixing layer under the CFR framework to obtain the final decision strategy for the team, i.e., the Nash equilibrium strategy. Finally, we conduct experiments on games in different domains. Extensive results show that CFR-MIX significantly outperforms state-of-the-art algorithms. We hope it can help teams make decisions in large-scale team-adversary games.
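To make the size argument concrete, the following is a minimal Python sketch, not the CFR-MIX implementation. It assumes that each team member keeps its own cumulative regrets, that each member's strategy comes from standard regret matching, and that the team's joint strategy is the product of the individual strategies. The function names and the regret values are hypothetical, chosen only for illustration; the paper's mixing layer and its exact regret consistency relationship are not reproduced here.

```python
from itertools import product
from math import prod


def regret_matching(cumulative_regrets):
    """Standard regret matching: play in proportion to positive regret."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total <= 0.0:
        # No positive regret: fall back to the uniform strategy.
        return [1.0 / len(positive)] * len(positive)
    return [p / total for p in positive]


def joint_from_individual(individual_strategies):
    """Joint strategy as the product of the members' individual strategies."""
    joint = {}
    action_ranges = [range(len(s)) for s in individual_strategies]
    for joint_action in product(*action_ranges):
        joint[joint_action] = prod(
            s[a] for s, a in zip(individual_strategies, joint_action)
        )
    return joint


def representation_sizes(num_members, num_actions):
    """Entries stored per decision point: (joint form, individual form)."""
    return num_actions ** num_members, num_members * num_actions


# Hypothetical cumulative regrets: two team members, three actions each.
member_regrets = [[2.0, -1.0, 2.0], [0.0, 3.0, 1.0]]
individual = [regret_matching(r) for r in member_regrets]
joint = joint_from_individual(individual)
```

Under these assumptions, `representation_sizes(5, 4)` gives 1024 entries for the joint form against 20 for the individual form, which is the exponential-versus-linear gap the abstract refers to. The product form only covers joint strategies in which members randomize independently, which is why a consistency relationship is needed to keep the team's behavior coordinated.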
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).