Embodied intelligence, which integrates physical interaction capabilities with cognitive computation in real-world scenarios, offers a promising path toward Artificial General Intelligence (AGI). Recently, the field of embodied intelligence has expanded rapidly, empowering robotics, autonomous driving, intelligent manufacturing, and other domains. This paper presents a comprehensive survey of the evolution of embodied intelligence, tracing its journey from philosophical roots to contemporary advancements. We emphasize significant progress in the integration of perceptual, cognitive, and behavioral components, rather than treating these elements in isolation. Despite these advancements, several challenges remain, including hardware limitations, model generalization, physical world understanding, multimodal integration, and ethical considerations, all of which are critical to developing robust and reliable embodied intelligence systems. To address these challenges, we outline future research directions, emphasizing Large Perception-Cognition-Behavior (PCB) models, physical intelligence, and morphological intelligence. Central to these perspectives is a general agent framework termed Bcent, which integrates perception, cognition, and behavior dynamics. Bcent aims to enhance the adaptability, robustness, and intelligence of embodied systems, in line with ongoing progress in robotics, autonomous systems, healthcare, and other fields.
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).