Full Length Article | Open Access

Reinforcement learning based UAV formation control in GPS-denied environment

Bodi MA (a), Zhenbao LIU (a,b), Feihong JIANG (a,c), Wen ZHAO (a), Qingqing DANG (a), Xiao WANG (a), Junhong ZHANG (a,c), Lina WANG (a)
(a) School of Civil Aviation, Northwestern Polytechnical University, Xi’an 710000, China
(b) Research & Development Institute in Shenzhen, Northwestern Polytechnical University, Shenzhen 518000, China
(c) Flight Control Division, AVIC The First Aircraft Institute, Xi’an 710089, China

Abstract

Highly accurate positioning is a crucial prerequisite for multi-Unmanned Aerial Vehicle (UAV) close-formation flight, which relies on it for target tracking, formation keeping, and collision avoidance. Although the position of a UAV can be obtained through the Global Positioning System (GPS), it is difficult for a UAV to obtain highly accurate positioning data in a GPS-denied environment (e.g., a GPS jamming area, suburb, urban canyon, or mountain area); this may cause it to miss a tracking target or collide with another UAV. In particular, UAV close-formation control in GPS-denied environments is difficult owing to low-accuracy position estimates, the close distance between vehicles, and the nonholonomic dynamics of a UAV. In this paper, we develop an innovative UAV formation localization method to address formation localization in GPS-denied environments, and we design a novel reinforcement learning based algorithm to achieve an efficient and robust controller. First, a novel Lidar-based localization algorithm is developed to measure the position of each aircraft in the formation flight. In our solution, each UAV is equipped with Lidar as the position measurement sensor instead of a GPS module. The k-means algorithm is implemented to calculate the center point position of each UAV. A novel formation position vector matching method is proposed to match center points with UAVs in the formation and estimate their position information. Second, a reinforcement learning based UAV formation control algorithm is developed that selects the optimal policy to control the UAV swarm to start and keep flying in a close formation of a specific geometry. Third, an innovative collision risk evaluation module is proposed to keep the formation group collision-free. Finally, a novel experience replay method is provided to enhance the learning efficiency. Experimental results validate the accuracy, effectiveness, and robustness of the proposed scheme.
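
To make the localization step more concrete, the following is a minimal sketch of how k-means center points might be computed from Lidar returns and then matched to formation slots. It is not the authors' implementation: the abstract does not give the algorithm's details, so the 2D point representation, the farthest-point initialization, the brute-force minimum-distance matching, and all names (`kmeans_centers`, `match_centers_to_slots`, `nominal_slots`) are assumptions made for illustration only.

```python
import math
from itertools import permutations


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans_centers(points, k, max_iters=100):
    """Plain Lloyd's k-means on 2D points; returns k cluster centers."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        updated = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centers[i]  # keep the old center if a cluster is empty
            for i, c in enumerate(clusters)
        ]
        if updated == centers:  # converged
            break
        centers = updated
    return centers


def match_centers_to_slots(centers, nominal_slots):
    """Hypothetical stand-in for position vector matching: choose the
    assignment of centers to formation slots with the smallest total
    distance. Brute force, so only suitable for small formations."""
    n = len(nominal_slots)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(
            math.dist(centers[perm[s]], nominal_slots[s]) for s in range(n)
        ),
    )
    return [centers[best[s]] for s in range(n)]


# Synthetic scan: four Lidar returns around each of three neighbouring UAVs.
true_positions = [(5.0, 0.0), (0.0, 5.0), (-5.0, 0.0)]
offsets = [(-0.2, 0.0), (0.2, 0.0), (0.0, -0.2), (0.0, 0.2)]
scan = [(x + dx, y + dy) for x, y in true_positions for dx, dy in offsets]

# Nominal slot positions from the desired geometry (made-up values),
# deliberately listed in a different order than the clusters.
nominal_slots = [(-4.5, 0.5), (4.6, -0.3), (0.4, 5.3)]

centers = kmeans_centers(scan, k=3)
assigned = match_centers_to_slots(centers, nominal_slots)
```

Here `assigned[s]` holds the estimated position of the UAV occupying formation slot `s`. A real system would work on 3D point clouds, handle missing or merged clusters, and replace the brute-force search with a scalable assignment method.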

Chinese Journal of Aeronautics
Pages 281-296
Cite this article:
MA B, LIU Z, JIANG F, et al. Reinforcement learning based UAV formation control in GPS-denied environment. Chinese Journal of Aeronautics, 2023, 36(11): 281-296. https://doi.org/10.1016/j.cja.2023.07.006


Received: 16 October 2022
Revised: 06 November 2022
Accepted: 03 January 2023
Published: 13 July 2023
© 2023 Chinese Society of Aeronautics and Astronautics.

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
