Publications
Open Access Research Article
Leave It to Large Language Models! Correction and Planning with Memory Integration
Cyborg and Bionic Systems 2024, 5: 0087
Published: 27 March 2024

As humans, we naturally break a daily task down into individual steps, and we can provide feedback or dynamically adjust the plan when we encounter obstacles. Similarly, our aim is to enable agents to comprehend and carry out natural language instructions in a more efficient and cost-effective manner. For example, in Vision–Language Navigation (VLN) tasks, the agent needs to understand instructions such as “go to the table by the fridge”. This understanding allows the agent to navigate to the table and to infer that the destination is likely to be in the kitchen. The traditional VLN approach mainly involves training models on large amounts of labeled data for task planning in unseen environments; however, manual labeling makes this approach costly. Since large language models (LLMs) already acquire extensive commonsense knowledge during pre-training, some researchers have started using LLMs as decision modules in embodied tasks, and this line of work demonstrates the LLMs’ ability to reason out a logical sequence of subtasks from global information. However, executing subtasks often runs into issues such as obstacles that hinder progress and changes in the state of the target object. Even one mistake can cause the subsequent subtasks to fail, which makes it challenging to complete an instruction with a single plan. Therefore, we propose Correction and Planning with Memory Integration (CPMI), a new LLM-centric approach for embodied tasks. In more detail, CPMI's auxiliary modules support dynamic planning by the LLM-centric planner: they provide the agent with memory and generalized-experience mechanisms that fully utilize the LLM's capabilities, allowing it to improve its performance during execution. Finally, experimental results on public datasets demonstrate that we achieve the best performance in the few-shot scenario, improving task efficiency while increasing the success rate.
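The sketch below illustrates the kind of plan, execute, and correct loop the abstract describes: a planner queries an LLM for subtasks, executes them, and on failure records the error in memory and re-plans. It is a minimal toy under stated assumptions, not the authors' CPMI implementation; the names llm, execute_subtask, and Memory, and the stubbed behaviors, are hypothetical stand-ins.

```python
# Hypothetical sketch of an LLM-centric plan/execute/correct loop with memory.
# `llm` and `execute_subtask` are stubs; a real system would call a model and
# act in an embodied environment.

from dataclasses import dataclass, field


@dataclass
class Memory:
    """Stores past failures so the planner can avoid repeating them."""
    failures: list = field(default_factory=list)

    def as_prompt(self) -> str:
        if not self.failures:
            return "No prior failures."
        return "Avoid these past failures: " + "; ".join(self.failures)


def llm(prompt: str) -> list:
    """Stub standing in for a real LLM call."""
    plan = ["go to the kitchen", "walk to the fridge", "stop at the table"]
    # Pretend the model reads the failure memory and drops blocked steps.
    return [step for step in plan if step not in prompt]


def execute_subtask(subtask: str) -> bool:
    """Stub executor; pretend one step is blocked by an obstacle."""
    return subtask != "walk to the fridge"


def run_instruction(instruction: str, max_retries: int = 3) -> bool:
    memory = Memory()
    for attempt in range(max_retries):
        plan = llm(f"Instruction: {instruction}\n{memory.as_prompt()}\nPlan subtasks:")
        for subtask in plan:
            if not execute_subtask(subtask):
                # Correction: record the failure and re-plan instead of aborting.
                memory.failures.append(f"'{subtask}' failed on attempt {attempt + 1}")
                break
        else:
            return True  # every subtask succeeded
    return False


if __name__ == "__main__":
    print(run_instruction("go to the table by the fridge"))  # True after one correction
```

In this toy run, the first plan fails at the blocked step, the failure is written to memory, and the second query produces a plan that avoids it; the memory prompt is the only channel through which the correction reaches the planner.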

Open Access Research Article
Exploring into the Unseen: Enhancing Language-Conditioned Policy Generalization with Behavioral Information
Cyborg and Bionic Systems 2024, 5: 0084
Published: 26 January 2024

Generalizing policies learned by agents in known environments to unseen domains is an essential challenge in advancing reinforcement learning. Recently, language-conditioned policies have underscored the pivotal role of linguistic information in cross-environment settings. Integrating both environmental and textual information into the observation space enables agents to accomplish similar tasks across different scenarios. However, for entities that appear in observations with the same name but different forms of motion (e.g., an immovable mage and a fleeing mage), existing methods cannot learn the entities' motion information well and therefore face ambiguity caused by motion. To tackle this challenge, we propose the entity mapper with multi-modal attention based on behavior prediction (EMMA-BBP) framework, comprising a motion-behavior prediction module and a text-matching module. The behavior prediction module determines the motion information of the entities present in the environment, eliminating the semantic ambiguity of that motion information. The text-matching module matches the text provided by the environment with the observed behavior of each entity, thus filtering out misleading textual information. EMMA-BBP has been tested in the demanding MESSENGER environment, doubling the generalization ability of EMMA.
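The sketch below is a toy illustration of the two ideas the abstract names: labeling an entity's motion from its recent observations and matching that label, together with the entity's name, against the environment text. It is not the EMMA-BBP architecture, which relies on multi-modal attention and learned behavior prediction; the Entity, predict_behavior, and match_text names and the simple heuristics are hypothetical.

```python
# Hypothetical sketch: disambiguate same-name entities (e.g., two "mage"
# entities) by predicted motion, then match them against a text description.

from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entity:
    name: str
    positions: List[Tuple[int, int]]  # recent (x, y) observations, oldest first


def predict_behavior(entity: Entity) -> str:
    """Label an entity as 'immovable' or 'fleeing' from its observed positions."""
    if len(entity.positions) < 2:
        return "unknown"
    return "immovable" if entity.positions[0] == entity.positions[-1] else "fleeing"


def match_text(entities: List[Entity], description: str) -> Optional[Entity]:
    """Return the entity whose name and predicted behavior both appear in the text."""
    for entity in entities:
        if entity.name in description and predict_behavior(entity) in description:
            return entity
    return None


if __name__ == "__main__":
    mages = [
        Entity("mage", [(2, 2), (2, 2), (2, 2)]),  # stays put: immovable
        Entity("mage", [(5, 5), (6, 5), (7, 6)]),  # moves away: fleeing
    ]
    target = match_text(mages, "avoid the fleeing mage")
    print(predict_behavior(target) if target else "no match")  # -> fleeing
```

The point of the toy is only that motion labels resolve the ambiguity between two entities sharing a name, which is the problem the behavior prediction and text-matching modules address.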
