Open Access

QAR Data Imputation Using Generative Adversarial Network with Self-Attention Mechanism

School of Computer Science and Technology, Tiangong University, Tianjin 300387, China
Institute of Aviation Safety, China Academy of Civil Aviation Science and Technology, Beijing 100028, China

Abstract

Quick Access Recorder (QAR), an important device for storing flight parameter data, holds a large amount of valuable data and comprehensively records the real state of an airline flight. However, the recorded data contain missing values caused by factors such as weather and equipment anomalies. These missing values seriously hinder the analysis of QAR data by aeronautical engineers, for example in flight scenario reproduction and flight safety status assessment. Imputing missing values in QAR data is therefore crucial and can further guarantee airline flight safety. QAR data are also multivariate, multiprocess, and temporal. We therefore propose two imputation models for missing values in QAR data: A-AEGAN (“A” denotes attention mechanism, “AE” denotes autoencoder, and “GAN” denotes generative adversarial network) and SA-AEGAN (“SA” denotes self-attention mechanism). Specifically, we apply a generative adversarial network to impute missing values in QAR data. An improved gated recurrent unit is introduced as the neural unit of the GAN to capture the temporal relationships in QAR data. In addition, we modify the basic structure of the GAN by using an autoencoder as the generator and a recurrent neural network as the discriminator; the missing values are then imputed through the adversarial relationship between generator and discriminator. We introduce an attention mechanism in the autoencoder, which maintains the correlation among QAR data and improves the capability of the model to capture the features of QAR data and impute missing data. Furthermore, we integrate a self-attention mechanism into the model to capture the relationships between different parameters within the QAR data. Experimental results on real datasets demonstrate that the model can impute the missing values in QAR data reasonably and with excellent results.
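The abstract does not give the loss functions or the attention formulation, so the following is only a minimal NumPy sketch of two building blocks commonly used in GAN-based imputation, not the authors' implementation. It assumes a binary observation mask, a stand-in array `x_hat` in place of a trained generator's output, and plain scaled dot-product self-attention without learned projections. The names `fill_missing`, `masked_mse`, and `self_attention`, and all numeric values, are hypothetical.

```python
import numpy as np

def fill_missing(x, x_hat, mask):
    """Keep observed values; use the generator output where values are missing."""
    return np.where(mask, x, x_hat)

def masked_mse(x, x_hat, mask):
    """Mean squared reconstruction error over observed entries only."""
    n_observed = int(mask.sum())
    if n_observed == 0:
        return 0.0
    diff = np.where(mask, x - x_hat, 0.0)  # missing entries contribute zero
    return float((diff ** 2).sum() / n_observed)

def self_attention(tokens):
    """Scaled dot-product self-attention over the rows of `tokens`.

    Assumption: no learned query/key/value projections, for brevity.
    """
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ tokens, weights

# Toy series: 4 time steps x 3 flight parameters, NaN marks a missing value.
x = np.array([[1.0, 10.0, np.nan],
              [2.0, np.nan, 0.5],
              [3.0, 12.0, 0.7],
              [np.nan, 13.0, 0.9]])
mask = ~np.isnan(x)

# Stand-in for a generator's reconstruction (hypothetical values).
x_hat = np.array([[1.1, 10.0, 0.3],
                  [2.0, 11.0, 0.5],
                  [3.0, 12.0, 0.6],
                  [4.0, 13.0, 0.9]])

imputed = fill_missing(x, x_hat, mask)
loss = masked_mse(x, x_hat, mask)

# Attend across parameters: each row of imputed.T is one parameter's series.
context, weights = self_attention(imputed.T)
```

Transposing the series before calling `self_attention` makes each flight parameter a token, so the attention weights form a parameter-by-parameter matrix. This is one plausible reading of "the relationship between different parameters"; the full paper should be consulted for the actual design.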

Big Data Mining and Analytics
Pages 12-28
Cite this article:
Zhao J, Rong C, Dang X, et al. QAR Data Imputation Using Generative Adversarial Network with Self-Attention Mechanism. Big Data Mining and Analytics, 2024, 7(1): 12-28. https://doi.org/10.26599/BDMA.2023.9020001

Received: 08 August 2022
Revised: 13 February 2023
Accepted: 06 March 2023
Published: 25 December 2023
© The author(s) 2023.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
