The pervasive uncertainty and dynamic nature of real-world environments pose significant challenges for the widespread deployment of machine-driven Intelligent Decision-Making (IDM) systems. IDM must therefore be able to continuously acquire new skills and generalize effectively across a broad range of applications. Advancing Artificial General Intelligence (AGI) that transcends task and application boundaries is critical for improving IDM. Recent studies have extensively investigated the Transformer neural architecture as a foundation model for a wide variety of tasks, including computer vision, natural language processing, and reinforcement learning. We propose that a Foundation Decision Model (FDM) can be developed by formulating diverse decision-making tasks as sequence-decoding tasks with the Transformer architecture, offering a promising path toward expanding IDM applications to complex real-world situations. In this paper, we discuss the efficiency and generalization gains that a foundation decision model offers IDM, and we explore its potential applications in multi-agent game AI, production scheduling, and robotics. Finally, we present a case study of our FDM implementation, DigitalBrain (DB1), a 1.3-billion-parameter model that achieves human-level performance on 870 tasks spanning text generation, image captioning, video game playing, robotic control, and the traveling salesman problem. As a foundation decision model, DB1 represents an initial step toward more autonomous and efficient real-world IDM applications.
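To make the sequence-decoding formulation concrete, the sketch below shows one common way to cast a decision-making task as autoregressive sequence decoding: states and actions are embedded as interleaved tokens, and a causal Transformer predicts the next action from the history. This is a minimal illustration of the general idea, not the authors' DB1 implementation; the class name, dimensions, and toy behavior-cloning loss are all illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' DB1 code): a decision-making task
# framed as sequence decoding with a causal Transformer.
import torch
import torch.nn as nn

class TinyFoundationDecisionModel(nn.Module):
    """Autoregressive Transformer over interleaved (state, action) tokens."""

    def __init__(self, state_dim, n_actions, d_model=64, n_heads=4,
                 n_layers=2, max_len=128):
        super().__init__()
        self.state_emb = nn.Linear(state_dim, d_model)      # continuous states -> tokens
        self.action_emb = nn.Embedding(n_actions, d_model)  # discrete actions -> tokens
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.action_head = nn.Linear(d_model, n_actions)    # decode the next action

    def forward(self, states, actions):
        # states: (B, T, state_dim), actions: (B, T) integer action ids
        B, T, _ = states.shape
        # Interleave tokens as s_1, a_1, s_2, a_2, ...
        tokens = torch.stack((self.state_emb(states), self.action_emb(actions)), dim=2)
        tokens = tokens.reshape(B, 2 * T, -1)
        pos = torch.arange(2 * T, device=states.device)
        tokens = tokens + self.pos_emb(pos)
        # Causal mask: each position attends only to earlier tokens.
        mask = torch.triu(
            torch.full((2 * T, 2 * T), float("-inf"), device=states.device),
            diagonal=1,
        )
        h = self.backbone(tokens, mask=mask)
        # Predict action a_t from the hidden state at token s_t (even positions).
        return self.action_head(h[:, 0::2])

# Toy usage: behavior cloning on random trajectories from a 4-action task.
model = TinyFoundationDecisionModel(state_dim=8, n_actions=4)
states = torch.randn(2, 10, 8)
actions = torch.randint(0, 4, (2, 10))
logits = model(states, actions)  # (2, 10, 4) next-action logits
loss = nn.functional.cross_entropy(logits.reshape(-1, 4), actions.reshape(-1))
loss.backward()
```

Because every modality is reduced to a token sequence, the same decoding loop could in principle serve text, control, and combinatorial tasks alike, which is what allows a single model to span the task families listed in the abstract.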