Sort:
Open Access Issue
Offline Reinforcement Learning with Constrained Hybrid Action Implicit Representation Towards Wargaming Decision-Making
Tsinghua Science and Technology 2024, 29(5): 1422-1440
Published: 02 May 2024
Abstract PDF (13 MB) Collect
Downloads:346

Reinforcement Learning (RL) has emerged as a promising data-driven solution for wargaming decision-making. However, two domain challenges still exist: (1) dealing with discrete-continuous hybrid wargaming control and (2) accelerating RL deployment with rich offline data. Existing RL methods fail to handle these two issues simultaneously, thereby we propose a novel offline RL method targeting hybrid action space. A new constrained action representation technique is developed to build a bidirectional mapping between the original hybrid action space and a latent space in a semantically consistent way. This allows learning a continuous latent policy with offline RL with better exploration feasibility and scalability and reconstructing it back to a needed hybrid policy. Critically, a novel offline RL optimization objective with adaptively adjusted constraints is designed to balance the alleviation and generalization of out-of-distribution actions. Our method demonstrates superior performance and generality across different tasks, particularly in typical realistic wargaming scenarios.

Total 1