[1]
D. S. Gonzalez, M. Garzon, J. S. Dibangoye, and C. Laugier, Human-like decision-making for automated driving in highways, in Proc. IEEE Intelligent Transportation Systems Conf. (ITSC), Auckland, New Zealand, 2019, pp. 2087–2094.
[2]
A. Tazoniero, R. Gonçalves, and F. Gomide, Decision making strategies for real-time train dispatch and control, in Analysis and Design of Intelligent Systems Using Soft Computing Techniques, P. Melin, O. Castillo, E. Gomez Ramírez, J. Kacprzyk, and W. Pedrycz, Eds. Heidelberg, Germany: Springer, 2007, pp. 195–204.
[4]
W. L. Waugh Jr., Mechanisms for collaboration in emergency management: ICS, NIMS, and the problem with command and control, in The Collaborative Public Manager: New Ideas for the Twenty-First Century, R. O’Leary and L. B. Bingham, Eds. Washington, DC, USA: Georgetown University Press, 2009, pp. 157–175.
[12]
D. Zha, J. Xie, W. Ma, S. Zhang, X. Lian, X. Hu, and J. Liu, DouZero: Mastering DouDizhu with self-play deep reinforcement learning, in Proc. 38th Int. Conf. Machine Learning, virtual, 2021, pp. 12333–12344.
[14]
C. Berner, G. Brockman, B. Chan, V. Cheung, P. Dębiak, C. Dennison, D. Farhi, Q. Fischer, S. Hashme, C. Hesse, et al., Dota 2 with large scale deep reinforcement learning, arXiv preprint arXiv: 1912.06680, 2019.
[15]
D. Ye, G. Chen, W. Zhang, S. Chen, B. Yuan, B. Liu, J. Chen, Z. Liu, F. Qiu, H. Yu, et al., Towards playing full MOBA games with deep reinforcement learning, in Proc. 34th Int. Conf. Neural Information Processing Systems, virtual, 2020, pp. 621–632.
[19]
M. Zinkevich, M. Johanson, M. Bowling, and C. Piccione, Regret minimization in games with incomplete information, in Proc. 21st Annu. Conf. Neural Information Processing Systems, Vancouver, Canada, 2007, pp. 1729–1736.
[20]
Z. Wang, C. Mu, S. Hu, C. Chu, and X. Li, Modelling the dynamics of regret minimization in large agent populations: A master equation approach, in Proc. 31st Int. Joint Conf. Artificial Intelligence, Vienna, Austria, 2022, pp. 23–29.
[21]
J. Heinrich, M. Lanctot, and D. Silver, Fictitious self-play in extensive-form games, in Proc. 32nd Int. Conf. Machine Learning, Lille, France, 2015, pp. 805–813.
[23]
D. J. Strouse, K. R. McKee, M. M. Botvinick, E. Hughes, and R. Everett, Collaborating with humans without human data, in Proc. 35th Annu. Conf. Neural Information Processing Systems, virtual, 2021, pp. 14502–14515.
[26]
M. Jaderberg, V. Dalibard, S. Osindero, W. M. Czarnecki, J. Donahue, A. Razavi, O. Vinyals, T. Green, I. Dunning, K. Simonyan, et al., Population based training of neural networks, arXiv preprint arXiv: 1711.09846, 2017.
[27]
Z. Wu, K. Li, H. Xu, Y. Zang, B. An, and J. Xing, L2E: Learning to exploit your opponent, in Proc. Int. Joint Conf. Neural Networks (IJCNN), Padua, Italy, 2022, pp. 1–8.
[28]
J. N. Foerster, Y. M. Assael, N. de Freitas, and S. Whiteson, Learning to communicate with deep multi-agent reinforcement learning, in Proc. 30th Conf. Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 2016, pp. 2145–2153.
[29]
N. Rabinowitz, F. Perbet, F. Song, C. Zhang, S. M. Ali Eslami, and M. Botvinick, Machine theory of mind, in Proc. 35th Int. Conf. Machine Learning, Stockholm, Sweden, 2018, pp. 4218–4227.
[32]
A. Hussein, M. M. Gaber, E. Elyan, and C. Jayne, Imitation learning: A survey of learning methods, ACM Comput. Surv., vol. 50, no. 2, p. 21, 2017.
[33]
D. Wang, E. Churchill, P. Maes, X. Fan, B. Shneiderman, Y. Shi, and Q. Wang, From human-human collaboration to human-AI collaboration: Designing AI systems that can work together with people, in Proc. Extended Abstracts of the 2020 CHI Conf. Human Factors in Computing Systems, Honolulu, HI, USA, 2020, pp. 1–6.
[35]
A. Dengel, L. Devillers, and L. M. Schaal, Augmented human and human-machine co-evolution: Efficiency and ethics, in Reflections on Artificial Intelligence for Humanity, Cham, Switzerland: Springer, 2021, pp. 203–227.
[36]
J. Perolat, B. Scherrer, B. Piot, and O. Pietquin, Approximate dynamic programming for two-player zero-sum Markov games, in Proc. 32nd Int. Conf. Machine Learning, Lille, France, 2015, pp. 1321–1329.
[38]
F. Doshi-Velez and B. Kim, Towards a rigorous science of interpretable machine learning, arXiv preprint arXiv: 1702.08608, 2017.
[39]
M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should I trust you?”: Explaining the predictions of any classifier, in Proc. 22nd ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, San Francisco, CA, USA, 2016, pp. 1135–1144.
[41]
C. Yu, Y. Gu, Z. Yang, X. Yi, H. Luo, and Y. Shi, Tap, dwell or gesture?: Exploring head-based text entry techniques for HMDs, in Proc. 2017 CHI Conf. Human Factors in Computing Systems, Denver, CO, USA, 2017, pp. 4479–4488.
[42]
K. Lee, L. M. Smith, and P. Abbeel, PEBBLE: Feedback-efficient interactive reinforcement learning via relabeling experience and unsupervised pre-training, in Proc. 38th Int. Conf. Machine Learning, virtual, 2021, pp. 6152–6163.