Full Length Article | Open Access

Reinforcement learning-based missile terminal guidance of maneuvering targets with decoys

Tianbo DENG a, Hao HUANG b, Yangwang FANG b, Jie YAN b, Haoyu CHENG b
a School of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China
b Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an 710072, China

Abstract

In this paper, a missile terminal guidance law based on a new Deep Deterministic Policy Gradient (DDPG) algorithm is proposed to intercept a maneuvering target equipped with an infrared decoy. First, to deal with the issue that the missile cannot accurately distinguish the target from the decoy, the energy center method is employed to obtain the equivalent energy center (called the virtual target) of the target and decoy, and the engagement model between the missile and the virtual target is established. Then, an improved DDPG algorithm based on a trusted-search strategy is proposed, which significantly improves the training efficiency of the original DDPG algorithm. Furthermore, by combining the established model, the network trained with the improved DDPG algorithm, and the reward function, an intelligent missile terminal guidance scheme is proposed. Specifically, a heuristic reward function is designed for training and learning in combat scenarios. Finally, the effectiveness and robustness of the proposed guidance law are verified by Monte Carlo tests, and the simulation results obtained by the proposed scheme are compared with those of other methods to further demonstrate its superior performance.
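To make the two ideas in the abstract concrete, the following minimal Python sketch illustrates (a) an equivalent energy center ("virtual target") computed as the radiant-intensity-weighted centroid of the target and decoy positions, and (b) a simple heuristic shaped reward that penalizes relative range and line-of-sight rate. The weighting scheme, function names, and gains are illustrative assumptions, not the paper's actual formulation.

```python
# Illustrative sketch only (assumed formulation, not taken from the paper).
import numpy as np

def virtual_target(target_pos, decoy_pos, target_intensity, decoy_intensity):
    """Equivalent energy center of target and decoy: intensity-weighted centroid (assumed)."""
    total = target_intensity + decoy_intensity
    w_t = target_intensity / total
    w_d = decoy_intensity / total
    return w_t * np.asarray(target_pos, dtype=float) + w_d * np.asarray(decoy_pos, dtype=float)

def heuristic_reward(rel_distance, los_rate, k_r=1e-3, k_q=10.0):
    """Illustrative shaped reward: smaller range and smaller LOS rate score higher (assumed gains)."""
    return -k_r * rel_distance - k_q * abs(los_rate)

# Example with made-up numbers: the seeker effectively tracks the weighted centroid.
vt = virtual_target([1000.0, 500.0], [980.0, 520.0], target_intensity=3.0, decoy_intensity=1.0)
print(vt)
print(heuristic_reward(rel_distance=800.0, los_rate=0.02))
```

In such a sketch, the reward would be evaluated at each guidance step against the virtual target state and fed to the DDPG agent as the training signal; the trusted-search modification described in the paper concerns how the agent explores during training and is not reproduced here.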


Chinese Journal of Aeronautics
Pages 309-324
Cite this article:
DENG T, HUANG H, FANG Y, et al. Reinforcement learning-based missile terminal guidance of maneuvering targets with decoys. Chinese Journal of Aeronautics, 2023, 36(12): 309-324. https://doi.org/10.1016/j.cja.2023.05.028


Received: 08 November 2022
Revised: 16 December 2022
Accepted: 27 February 2023
Published: 02 June 2023
© 2023 Chinese Society of Aeronautics and Astronautics.

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
