Real-time vehicle prediction is crucial in autonomous driving, as it allows the driver or the vehicle to adjust in advance and take smoother driving actions to avoid potential collisions. This study proposes a physics-enhanced residual learning (PERL)-based predictive control method to mitigate traffic oscillations in mixed traffic composed of connected and automated vehicles (CAVs) and human-driven vehicles (HDVs). The proposed framework comprises a prediction model and a CAV controller. The prediction model forecasts the future behavior of the preceding vehicle from the observed behavior of the vehicles ahead. The PERL model combines physics-based information (i.e., traffic wave properties) with data-driven features extracted by deep learning, yielding accurate predictions of the preceding vehicle's behavior, especially its speed fluctuations, and allowing sufficient time for the vehicle or driver to respond. For the CAV controller, we employ a model predictive control (MPC) model that accounts for the dynamics of the CAV and its following vehicles, improving safety and comfort for the entire platoon. The proposed method is deployed on an autonomous vehicle through vehicle-in-the-loop (ViL) testing and compared with real driving data and three benchmark models. The experimental results validate the proposed method in terms of damping traffic oscillations and enhancing the safety and fuel efficiency of the CAV and its following vehicles in mixed traffic, even in the presence of uncertain human-driven vehicle dynamics and actuator lag.
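As a rough illustration of the residual-learning idea in this abstract, the sketch below pairs a physics term with a small network that learns only the discrepancy. The Newell-style wave shift, the function names, and the network sizes are all assumptions for illustration, not the paper's implementation:

```python
import torch
import torch.nn as nn

class SpeedResidualNet(nn.Module):
    """Small MLP that learns the gap between the physics prediction and
    the observed future speed profile (layer sizes are illustrative)."""
    def __init__(self, history_len=10, horizon=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(history_len, 64), nn.ReLU(),
            nn.Linear(64, horizon),
        )

    def forward(self, speed_history):        # (batch, history_len)
        return self.net(speed_history)       # (batch, horizon)

def physics_predict(speed_history, horizon=5, wave_delay=3):
    """Crude stand-in for the traffic-wave term: reuse the speed observed
    `wave_delay` steps ago, held constant over the prediction horizon."""
    base = speed_history[:, -wave_delay]            # (batch,)
    return base.unsqueeze(1).expand(-1, horizon)    # (batch, horizon)

def perl_predict(speed_history, residual_net, horizon=5):
    """PERL-style prediction = physics term + learned residual."""
    return physics_predict(speed_history, horizon) + residual_net(speed_history)

# Training would minimize the MSE between perl_predict(...) and the true
# future speeds, so the network only models what the physics term misses.
```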
Model-based reinforcement learning (RL) is expected to achieve higher sample efficiency than model-free RL by exploiting a virtual environment model. However, obtaining a sufficiently accurate representation of the environmental dynamics is challenging because of uncertainties in complex systems and environments, and an inaccurate environment model may degrade the sample efficiency and performance of model-based RL. Furthermore, although model-based RL can improve sample efficiency, it often still requires substantial training time to learn from scratch, potentially limiting its advantage over model-free approaches. To address these challenges, this paper introduces a knowledge-informed model-based residual reinforcement learning framework that enhances learning efficiency by infusing established expert knowledge into the learning process, avoiding the cold-start problem of training from zero. Our approach integrates traffic expert knowledge into a virtual environment model, employing the intelligent driver model (IDM) for the basic dynamics and neural networks for the residual dynamics, thereby ensuring adaptability to complex scenarios. We also propose a novel strategy that combines traditional control methods with residual RL, enabling efficient learning and policy optimization without learning from scratch. The proposed approach is applied to connected automated vehicle (CAV) trajectory control tasks for the dissipation of stop-and-go waves in mixed traffic flows. The experimental results demonstrate that our approach enables the CAV agent to achieve superior trajectory control performance compared with the baseline agents in terms of sample efficiency, traffic flow smoothness, and traffic mobility.
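A minimal sketch of the physics-plus-residual environment model described above: the IDM supplies the base acceleration and a learned callable supplies the correction. The state layout, time stepping, and parameter values are illustrative defaults, not the paper's:

```python
import numpy as np

def idm_accel(v, dv, s, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4):
    """Intelligent Driver Model acceleration, the physics prior of the
    virtual environment model. v: ego speed (m/s), dv: approach rate
    (ego speed minus leader speed, m/s), s: bumper-to-bumper gap (m)."""
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * np.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)

def model_step(state, residual_fn, dt=0.1):
    """One step of the learned environment model: IDM base dynamics plus
    a neural residual (here a stand-in callable); leader speed held fixed."""
    v, s, dv = state
    a = idm_accel(v, dv, s) + residual_fn(state)
    v_next = max(0.0, v + a * dt)
    s_next = s - dv * dt                 # gap shrinks at the approach rate
    dv_next = dv + (v_next - v)          # leader speed assumed constant
    return (v_next, s_next, dv_next)

# Usage with a zero residual (a trained network would replace the lambda):
print(model_step((25.0, 40.0, 2.0), residual_fn=lambda st: 0.0))
```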
Trajectory prediction for heterogeneous traffic agents plays a crucial role in ensuring the safety and efficiency of automated driving in highly interactive traffic environments. Many studies in this area have adopted physics-based approaches because they clearly interpret the dynamic evolution of trajectories, but such methods often suffer from limited accuracy. Recent learning-based methods achieve better performance, yet they cannot be fully trusted because they insufficiently incorporate physical constraints. To mitigate the limitations of purely physics-based and purely learning-based approaches, this study proposes a kinematics-aware multigraph attention network (KA-MGAT) that embeds physics models within a deep learning framework to improve the learning process of neural networks. In addition, we propose a residual prediction module that further refines the trajectory predictions and addresses the limitations arising from the simplified assumptions of kinematic models. We evaluate the proposed model on two challenging trajectory datasets, ApolloScape and NGSIM. The experimental results demonstrate that our model outperforms various kinematics-agnostic models in terms of prediction accuracy and learning efficiency.
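The residual prediction module can be pictured as a learned correction on top of a kinematic rollout. In this sketch a constant-velocity model stands in for the paper's kinematic prior, and a plain linear head stands in for the multigraph attention encoder's output; all names and shapes are assumptions:

```python
import torch
import torch.nn as nn

def kinematic_rollout(pos, vel, horizon=12, dt=0.1):
    """Constant-velocity rollout as a stand-in kinematic prior.
    pos, vel: (batch, 2) tensors. Returns (batch, horizon, 2)."""
    steps = torch.arange(1, horizon + 1, dtype=pos.dtype) * dt
    return pos.unsqueeze(1) + vel.unsqueeze(1) * steps.view(1, -1, 1)

class ResidualRefiner(nn.Module):
    """Predicts per-step (dx, dy) corrections to the kinematic prior from
    an agent embedding (which a multigraph attention encoder would
    produce; here it is simply an input tensor)."""
    def __init__(self, embed_dim=64, horizon=12):
        super().__init__()
        self.head = nn.Linear(embed_dim, horizon * 2)
        self.horizon = horizon

    def forward(self, agent_embedding):            # (batch, embed_dim)
        out = self.head(agent_embedding)
        return out.view(-1, self.horizon, 2)       # (batch, horizon, 2)

# Final trajectory = physics-consistent prior + learned refinement:
# traj = kinematic_rollout(pos, vel) + refiner(embedding)
```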
Despite significant progress in autonomous vehicles (AVs), the development of driving policies that ensure both AV safety and traffic flow efficiency has not yet been fully explored. In this paper, we propose an enhanced human-in-the-loop reinforcement learning method, termed the Human as AI mentor-based deep reinforcement learning (HAIM-DRL) framework, which facilitates safe and efficient autonomous driving in mixed traffic platoons. Drawing inspiration from the human learning process, we first introduce an innovative learning paradigm, Human as AI mentor (HAIM), that effectively injects human intelligence into AI. In this paradigm, the human expert serves as a mentor to the AI agent: while the agent is allowed to sufficiently explore uncertain environments, the human expert can take control in dangerous situations and demonstrate correct actions to avoid potential accidents. The agent is also guided to minimize traffic flow disturbance, thereby optimizing traffic flow efficiency. Specifically, HAIM-DRL leverages data collected from free exploration and partial human demonstrations as its two training sources. Remarkably, we circumvent the intricate process of manually designing reward functions; instead, we directly derive proxy state-action values from the partial human demonstrations to guide the agent's policy learning. Additionally, we employ a minimal intervention technique to reduce the human mentor's cognitive load. Comparative results show that HAIM-DRL outperforms traditional methods in terms of driving safety, sampling efficiency, mitigation of traffic flow disturbance, and generalizability to unseen traffic scenarios.
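A minimal sketch of the intervention loop described above, assuming a Gym-style environment interface and treating `is_dangerous`, `human_policy`, and the flat transition buffer as illustrative placeholders; the paper's intervention criterion and proxy state-action value machinery are not reproduced here:

```python
def haim_rollout(env, agent_policy, human_policy, is_dangerous, buffer,
                 max_steps=1000):
    """Collect one episode under the Human-as-AI-mentor paradigm: the agent
    explores freely, but the mentor overrides its action in states judged
    dangerous. Transitions from both sources are stored; the human-labeled
    ones can later supply proxy state-action values for policy learning."""
    state, _ = env.reset()
    for _ in range(max_steps):
        action = agent_policy(state)
        from_human = is_dangerous(state, action)
        if from_human:                      # mentor takes over and demonstrates
            action = human_policy(state)
        next_state, reward, terminated, truncated, _ = env.step(action)
        buffer.append((state, action, reward, next_state, from_human))
        state = next_state
        if terminated or truncated:
            break
    return buffer
```

Keeping the override conditional on a danger check, rather than constant supervision, is what allows the minimal intervention idea: the mentor's cognitive load scales with how often the agent actually approaches unsafe states.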