Article | Open Access

Growing from Exploration: A Self-Exploring Framework for Robots Based on Foundation Models

Shoujie Li 1, Ran Yu 1, Tong Wu 1, Junwen Zhong 2, Xiao-Ping Zhang 1, and Wenbo Ding 1,3 (corresponding author)
1 Shenzhen Ubiquitous Data Enabling Key Lab, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
2 Department of Physics and Chemistry, Faculty of Science and Technology, University of Macau, Macau 999078, China
3 RISC-V International Open Source Laboratory, Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, China

Shoujie Li, Ran Yu, and Tong Wu contributed equally to this work.


Abstract

Intelligent robots are the ultimate goal of the robotics field. Existing works leverage learning-based or optimization-based methods to accomplish human-defined tasks. However, enabling robots to explore diverse environments autonomously remains an unresolved challenge. In this work, we propose a framework named GExp, which endows robots with the capability to explore and learn autonomously without human intervention. To achieve this goal, we devise self-exploration, knowledge-base-building, and closed-loop feedback modules based on foundation models. Inspired by the way infants interact with the world, GExp encourages robots to understand and explore the environment through a series of self-generated tasks. During exploration, the robot acquires skills from its experiences that remain useful for future tasks. GExp thus provides robots with the ability to solve complex tasks through self-exploration. Unlike previous studies that rely on in-context examples for few-shot learning, GExp is independent of prior interaction knowledge and human intervention, allowing it to adapt directly to new scenarios. In addition, we propose a workflow for deploying the real-world robot system with self-learned skills as an embodied assistant. Project website: GExp.com.
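To make the described workflow concrete, the Python sketch below shows one way a self-exploration loop of this kind could be organized: a module that proposes self-generated tasks, an execution step with success feedback, and a knowledge base that stores verified skills for later reuse. This is a minimal illustrative sketch only; all names (Skill, SkillLibrary, propose_tasks, execute_task) are hypothetical stand-ins for the foundation-model and robot-execution components described in the abstract, not the authors' actual implementation.

# Hypothetical sketch of a GExp-style self-exploration loop.
# Names and structure are illustrative assumptions, not the paper's API.

from dataclasses import dataclass, field


@dataclass
class Skill:
    """A reusable skill distilled from a successful exploration episode."""
    name: str
    code: str          # executable policy code generated by the foundation model
    description: str   # natural-language summary used for later retrieval


@dataclass
class SkillLibrary:
    """Knowledge base that stores skills acquired during exploration."""
    skills: list = field(default_factory=list)

    def add(self, skill: Skill) -> None:
        self.skills.append(skill)

    def retrieve(self, task: str) -> list:
        # Naive keyword retrieval; a real system might use embedding similarity.
        return [s for s in self.skills if any(w in s.description for w in task.split())]


def propose_tasks(scene_description: str) -> list:
    """Stand-in for the foundation model proposing self-generated tasks."""
    return [f"explore the objects in: {scene_description}"]


def execute_task(task: str, prior_skills: list) -> tuple:
    """Stand-in for code generation, robot execution, and closed-loop feedback."""
    success = True  # in a real system this comes from the feedback module
    log = f"executed '{task}' reusing {len(prior_skills)} prior skills"
    return success, log


def exploration_loop(scene_description: str, n_rounds: int = 3) -> SkillLibrary:
    """Run several rounds of self-generated tasks and keep verified skills."""
    library = SkillLibrary()
    for _ in range(n_rounds):
        for task in propose_tasks(scene_description):
            prior = library.retrieve(task)            # reuse relevant past skills
            success, log = execute_task(task, prior)
            already_known = task in {s.name for s in library.skills}
            if success and not already_known:          # only store verified, new skills
                library.add(Skill(name=task, code="<generated code>", description=log))
    return library


if __name__ == "__main__":
    lib = exploration_loop("a table with a cup and a drawer")
    print(f"learned {len(lib.skills)} skills")

Under these assumptions, the loop mirrors the abstract's three modules: task proposal (self-exploration), skill storage and retrieval (knowledge-base building), and a success signal gating what is kept (closed-loop feedback).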

CAAI Artificial Intelligence Research
Article number: 9150037
Cite this article:
Li S, Yu R, Wu T, et al. Growing from Exploration: A Self-Exploring Framework for Robots Based on Foundation Models. CAAI Artificial Intelligence Research, 2024, 3: 9150037. https://doi.org/10.26599/AIR.2024.9150037

Received: 24 January 2024
Revised: 23 February 2024
Accepted: 22 March 2024
Published: 04 July 2024
© The author(s) 2024.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
