QAR Data Imputation Using Generative Adversarial Network with Self-Attention Mechanism

Jingqi Zhao; Chuitian Rong; Xin Dang; Huabo Sun

doi:10.26599/BDMA.2023.9020001


Open Access


Jingqi Zhao¹, Chuitian Rong¹, Xin Dang¹, Huabo Sun²

¹ School of Computer Science and Technology, Tiangong University, Tianjin 300387, China

² Institute of Aviation Safety, China Academy of Civil Aviation Science and Technology, Beijing 100028, China


Abstract

The Quick Access Recorder (QAR), an important device for storing flight parameter data, holds a large amount of valuable data and comprehensively records the real state of a flight. However, the recorded data contain missing values caused by factors such as weather and equipment anomalies. These missing values seriously hinder the analysis of QAR data by aeronautical engineers, for example in flight scenario reproduction and flight safety status assessment. Imputing the missing values in QAR data is therefore crucial and helps safeguard airline flight safety. QAR data are also multivariate, multiprocess, and temporal. To handle these properties, we propose two imputation models for the missing values of QAR data: A-AEGAN (“A” denotes attention mechanism, “AE” denotes autoencoder, and “GAN” denotes generative adversarial network) and SA-AEGAN (“SA” denotes self-attention mechanism). Specifically, we apply a generative adversarial network to impute missing values in QAR data. An improved gated recurrent unit is introduced as the neural unit of the GAN so that the model can capture the temporal relationships in QAR data. In addition, we modify the basic structure of the GAN by using an autoencoder as the generator and a recurrent neural network as the discriminator; the missing values are imputed through the adversarial relationship between the two. We introduce an attention mechanism in the autoencoder, which maintains the correlation among QAR data and improves the capability of the model to capture QAR data features and impute missing data. Furthermore, we improve the proposed model by integrating a self-attention mechanism that captures the relationships between different parameters within the QAR data. Experimental results on real datasets demonstrate that the proposed models can reasonably impute the missing values in QAR data and achieve excellent results.
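As a rough illustration of how a generator's output is typically used in GAN-based imputation, the sketch below keeps every observed value and takes the generated value only where a reading is missing. It is a minimal sketch and not the authors' implementation: the function name `merge_imputation`, the 0/1 mask convention, and the toy numbers are all assumptions made for this example.

```python
def merge_imputation(observed, generated, mask):
    """Fill gaps in `observed` with values from `generated`.

    observed:  list of records, NaN where a reading is missing.
    generated: generator output of the same shape (hypothetical values here).
    mask:      1 where the reading was observed, 0 where it is missing.
    """
    return [
        # A conditional is used instead of m*x + (1-m)*g because 0 * NaN is NaN.
        [x if m == 1 else g for x, g, m in zip(x_row, g_row, m_row)]
        for x_row, g_row, m_row in zip(observed, generated, mask)
    ]


nan = float("nan")
X = [[1.0, nan], [nan, 4.0]]        # toy series with two gaps
M = [[1, 0], [0, 1]]                # matching mask
G_out = [[0.9, 2.1], [2.8, 4.2]]    # stand-in for a generator's reconstruction
X_hat = merge_imputation(X, G_out, M)
```

Observed entries pass through unchanged, so only the missing positions depend on the generator.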

Keywords

multivariate time series; data imputation; self-attention; Generative Adversarial Network (GAN)


| Symbol | Definition |
| --- | --- |
| $X$ | Multivariate time series (QAR data) |
| $M$ | Mask matrix |
| $\delta$ | Time lag matrix |
| $t_{i}$ | $i$-th timestamp |
| $x_{t_{i}}$ | Records at $t_{i}$ |
| $G$ | Generator in GAN |
| $D$ | Discriminator in GAN |
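To make the mask matrix $M$ and the time lag matrix $\delta$ concrete, the following sketch derives both from a small series with gaps. The table above only names these quantities, so the update rule here is an assumption: it follows the convention commonly used in GRU-D-style models, where the lag restarts after an observed value and accumulates across consecutive missing ones. The function name `mask_and_time_lag` and the toy data are hypothetical.

```python
import math


def mask_and_time_lag(series, timestamps):
    """Return (mask, lag) for a multivariate series with NaN for missing values.

    series:     list of T records, each a list of d floats (NaN marks a gap).
    timestamps: list of T increasing timestamps t_i.
    """
    n_steps, n_vars = len(series), len(series[0])
    # mask[i][j] = 1 if variable j is observed at t_i, else 0.
    mask = [[0 if math.isnan(v) else 1 for v in row] for row in series]
    # The lag at the first timestamp is defined as 0.
    lag = [[0.0] * n_vars for _ in range(n_steps)]
    for i in range(1, n_steps):
        step = timestamps[i] - timestamps[i - 1]
        for j in range(n_vars):
            if mask[i - 1][j] == 1:
                # Previous value was observed: the lag restarts.
                lag[i][j] = step
            else:
                # Previous value was missing: the lag keeps accumulating.
                lag[i][j] = step + lag[i - 1][j]
    return mask, lag


nan = float("nan")
X = [[1.0, 5.0], [nan, 6.0], [nan, nan], [4.0, 8.0]]
t = [0, 1, 3, 4]
M, delta = mask_and_time_lag(X, t)
# M     -> [[1, 1], [0, 1], [0, 0], [1, 1]]
# delta -> [[0.0, 0.0], [1, 1], [3, 2], [4, 3]]
```

In the first column the value is missing at timestamps 1 and 3, so by timestamp 4 the lag has grown to 4 time units since the last observation at timestamp 0.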

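| Missing rate (%) | Climb: GRU-D | Climb: BRITS | Climb: E²GAN | Climb: A-AEGAN | Climb: SA-AEGAN | Cruise: GRU-D | Cruise: BRITS | Cruise: E²GAN | Cruise: A-AEGAN | Cruise: SA-AEGAN | Descent: GRU-D | Descent: BRITS | Descent: E²GAN | Descent: A-AEGAN | Descent: SA-AEGAN |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | 0.904 | 0.853 | 0.782 | 0.412 | 0.327 | 0.266 | 0.241 | 0.182 | 0.097 | 0.032 | 1.372 | 1.253 | 0.925 | 0.526 | 0.392 |
| 20 | 1.031 | 0.961 | 0.954 | 0.533 | 0.425 | 0.513 | 0.473 | 0.347 | 0.206 | 0.091 | 1.804 | 1.677 | 1.203 | 0.704 | 0.544 |
| 30 | 1.238 | 1.279 | 1.204 | 0.740 | 0.614 | 0.792 | 0.796 | 0.539 | 0.371 | 0.215 | 2.533 | 2.272 | 1.627 | 1.082 | 0.852 |
| 40 | 1.677 | 1.592 | 1.615 | 1.135 | 0.962 | 1.124 | 1.062 | 0.752 | 0.568 | 0.384 | 3.175 | 2.931 | 2.114 | 1.578 | 1.396 |
| 50 | 2.469 | 2.370 | 2.231 | 1.697 | 1.431 | 1.397 | 1.439 | 1.046 | 0.833 | 0.615 | 4.022 | 3.702 | 2.738 | 2.040 | 1.851 |
| 60 | 3.052 | 3.105 | 2.935 | 2.307 | 1.981 | 2.106 | 1.972 | 1.620 | 1.376 | 1.132 | 5.317 | 4.815 | 3.485 | 2.597 | 2.273 |
| 70 | 4.082 | 3.984 | 3.877 | 3.159 | 2.804 | 2.994 | 2.875 | 2.417 | 2.175 | 1.721 | 6.725 | 5.991 | 4.402 | 3.472 | 3.081 |
| 80 | 5.461 | 5.334 | 5.028 | 4.232 | 3.942 | 4.012 | 3.804 | 3.379 | 3.062 | 2.589 | 8.306 | 7.225 | 5.979 | 4.951 | 4.327 |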

| Number of iterations | GRU-D | BRITS | E²GAN | A-AEGAN | SA-AEGAN |
| --- | --- | --- | --- | --- | --- |
| 1 | 0.6833 | 0.7057 | 0.7136 | 0.7399 | 0.8377 |
| 2 | 0.7091 | 0.6891 | 0.7036 | 0.7474 | 0.7932 |
| 3 | 0.6930 | 0.7168 | 0.7086 | 0.7593 | 0.7894 |
| 4 | 0.6875 | 0.6953 | 0.7211 | 0.7531 | 0.8126 |
| 5 | 0.6901 | 0.7015 | 0.7167 | 0.7443 | 0.7807 |
| Average | 0.6926 | 0.7016 | 0.7125 | 0.7488 | 0.8027 |

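| $\lambda$ | Climb: A-AEGAN | Climb: SA-AEGAN | Cruise: A-AEGAN | Cruise: SA-AEGAN | Descent: A-AEGAN | Descent: SA-AEGAN |
| --- | --- | --- | --- | --- | --- | --- |
| 0.1 | 2.836 | 2.982 | 2.576 | 2.663 | 3.411 | 3.632 |
| 0.5 | 2.273 | 2.327 | 1.998 | 1.919 | 2.852 | 3.100 |
| 1 | 1.732 | 1.611 | 1.331 | 1.186 | 2.226 | 2.210 |
| 5 | 1.075 | 0.903 | 0.662 | 0.521 | 1.553 | 1.434 |
| 10 | 0.671 | 0.574 | 0.256 | 0.184 | 0.967 | 0.876 |
| 15 | 0.537 | 0.493 | 0.203 | 0.093 | 0.702 | 0.668 |
| 20 | 0.633 | 0.421 | 0.321 | 0.232 | 0.929 | 0.541 |
| 25 | 0.892 | 0.562 | 0.658 | 0.625 | 1.265 | 0.705 |
| 30 | 1.346 | 0.872 | 1.203 | 1.255 | 1.832 | 1.054 |
| 35 | 2.060 | 1.411 | 1.757 | 1.901 | 2.361 | 1.499 |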

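| $\alpha$ | Climb: A-AEGAN | Climb: SA-AEGAN | Cruise: A-AEGAN | Cruise: SA-AEGAN | Descent: A-AEGAN | Descent: SA-AEGAN |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 2.971 | 2.904 | 2.208 | 1.972 | 3.376 | 3.056 |
| 2 | 2.451 | 2.237 | 1.573 | 1.386 | 2.980 | 2.614 |
| 3 | 1.933 | 1.623 | 1.126 | 0.903 | 2.437 | 2.093 |
| 4 | 1.425 | 1.246 | 0.739 | 0.534 | 1.915 | 1.655 |
| 5 | 1.137 | 0.960 | 0.565 | 0.382 | 1.574 | 1.398 |
| 6 | 1.398 | 1.214 | 0.696 | 0.529 | 1.796 | 1.542 |
| 7 | 1.785 | 1.638 | 1.085 | 0.938 | 2.214 | 1.898 |
| 8 | 2.206 | 2.132 | 1.534 | 1.396 | 2.733 | 2.471 |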