Full Length Article | Open Access

A reinforcement learning approach to vehicle coordination for structured advanced air mobility

Sabrullah Deniz, Yufei Wu, Yang Shi, Zhenbo Wang
Department of Mechanical, Aerospace, and Biomedical Engineering, The University of Tennessee, Knoxville, TN 37996, USA

HIGHLIGHTS

· A novel deep reinforcement learning approach to safe and efficient AAM traffic separation.

· A new MARL framework for AAM vehicle coordination in merging and intersection scenarios.

· Trade-off studies that reveal the impact of network design and hyperparameter choices on algorithm performance.

· Extensive simulations that demonstrate the performance of the proposed methods.


Abstract

Advanced Air Mobility (AAM) is an emerging concept for highly automated air transportation of passengers and cargo at low altitudes within urban, suburban, and rural regions, with the overarching goal of improving the efficiency and environmental sustainability of aviation. In such a complex aviation environment, traffic management and control are essential technologies for safe and effective AAM operations. One of the most difficult obstacles in envisioned AAM systems is vehicle coordination at merging points and intersections, and the escalating demand for air mobility services, particularly in urban areas, adds significant complexity to these operations. In this study, we propose a novel multi-agent reinforcement learning (MARL) approach for managing high-density AAM operations in structured airspace. Our approach provides effective guidance to AAM vehicles, ensuring conflict avoidance, mitigating traffic congestion, reducing travel time, and maintaining safe separation. Specifically, intelligent learning-based algorithms are developed to provide speed guidance for each AAM vehicle, ensuring secure merging into air corridors and safe passage through intersections. To validate the effectiveness of the proposed model, we conduct training and evaluation in BlueSky, an open-source air traffic control simulation environment. Simulations involving thousands of aircraft and real-world data demonstrate the promising potential of MARL to enable safe and efficient AAM operations.
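To make the formulation described in the abstract concrete, the minimal, self-contained sketch below frames merge-point speed guidance as a multi-agent decision process: each vehicle observes its own distance to the merge point and speed along with its neighbor's, selects a discrete speed command, and receives a reward that trades travel time against separation at the merge. Every detail here (class and variable names, the three-action command set, the flight envelope, and the reward weights) is an illustrative assumption, not the authors' implementation; in the paper, training and evaluation take place in BlueSky rather than in a toy environment.

```python
import numpy as np

class ToyMergeEnv:
    """Two vehicles approach a shared merge point on separate corridors.
    Each agent observes its own distance-to-merge and speed plus the other
    vehicle's, and picks a speed command: decelerate, hold, or accelerate."""

    SEP_MIN = 0.5                         # assumed separation minimum at the merge (km)
    DT = 1.0                              # simulation time step (s)
    DELTA_V = np.array([-2.0, 0.0, 2.0])  # speed change per discrete action (m/s)

    def reset(self):
        # Randomized initial distances to the merge (km) and speeds (m/s).
        self.dist = np.random.uniform(8.0, 12.0, size=2)
        self.spd = np.random.uniform(20.0, 30.0, size=2)
        return self._obs()

    def _obs(self):
        # Per-agent observation: [own dist, own speed, other dist, other speed].
        return np.array([
            [self.dist[0], self.spd[0], self.dist[1], self.spd[1]],
            [self.dist[1], self.spd[1], self.dist[0], self.spd[0]],
        ])

    def step(self, actions):
        # Apply speed commands, clipped to an assumed flight envelope.
        self.spd = np.clip(self.spd + self.DELTA_V[actions], 15.0, 35.0)
        self.dist -= self.spd * self.DT / 1000.0  # advance toward the merge (km)
        done = bool((self.dist <= 0.0).any())     # a vehicle reached the merge point
        gap = abs(self.dist[0] - self.dist[1])    # along-corridor spacing proxy
        # Reward shape: a small per-step time penalty encourages progress; a
        # large penalty applies if spacing at the merge is unsafe (weights assumed).
        rew = np.full(2, -0.01)
        if done and gap < self.SEP_MIN:
            rew -= 1.0
        return self._obs(), rew, done

# Example rollout with a random policy standing in for trained agents.
env = ToyMergeEnv()
obs, done = env.reset(), False
while not done:
    actions = np.random.randint(0, 3, size=2)  # one discrete action per agent
    obs, rew, done = env.step(actions)
```

In the actual study, a learned policy would replace the random actions in the rollout, mapping each agent's observation to a speed command, with BlueSky supplying the vehicle dynamics and traffic scenarios.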

Green Energy and Intelligent Transportation
Article number: 100157
Cite this article:
Deniz S, Wu Y, Shi Y, et al. A reinforcement learning approach to vehicle coordination for structured advanced air mobility. Green Energy and Intelligent Transportation, 2024, 3(2): 100157. https://doi.org/10.1016/j.geits.2024.100157


Received: 19 July 2023
Revised: 09 November 2023
Accepted: 10 November 2023
Published: 03 April 2024
© 2024 The Authors.

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
