Event Temporal Relation Extraction with Attention Mechanism and Graph Neural Network

Xiaoliang Xu; Tong Gao; Yuxiang Wang; Xinle Xuan

doi:10.26599/TST.2020.9010063


Open Access


Department of Computer Science and Engineering, Hangzhou Dianzi University, Hangzhou 310018, China

Hangzhou Sanhui Digital Information Technology Co., Ltd, Hangzhou 310018, China


Abstract

Event temporal relation extraction is an important task in natural language processing, and the development of deep learning has produced many models for it. However, most existing methods cannot accurately capture the degree of association between tokens and events, and they do not integrate event-related information effectively. In this paper, we first propose an event information integration model that combines multilayer bidirectional long short-term memory (Bi-LSTM) with an attention mechanism. This model improves extraction performance but leaves room for further optimization, so we also propose a novel relational graph attention network that incorporates edge attributes. In this approach, we build a semantic dependency graph through dependency parsing, model the graph together with its edge attributes using top-k attention mechanisms to learn hidden semantic contextual representations, and finally predict event temporal relations. We evaluate the proposed models on the TimeBank-Dense dataset. Compared with previous baselines, the Micro-F1 scores of our two models improve by 3.9% and 14.5%, respectively.
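The abstract names top-k attention over a dependency graph with edge attributes but gives no equations, so the following is only an illustrative sketch of that general idea, not the authors' architecture. It assumes scaled dot-product scoring, assumes that an edge attribute enters by adding a label vector to the neighbour vector, and treats the parse as undirected. The function name `topk_edge_attention`, its parameters, and the toy vectors are all hypothetical.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def topk_edge_attention(node_vecs, edges, edge_type_vecs, target, k):
    """Aggregate the k highest-scoring neighbours of `target`.

    node_vecs:      list of d-dimensional token vectors (hypothetical inputs)
    edges:          (head, dependent, label) triples from a dependency parse
    edge_type_vecs: dict mapping a dependency label to a d-dimensional vector
    Returns (aggregated vector, {neighbour index: attention weight}).
    """
    h_t = node_vecs[target]
    scale = math.sqrt(len(h_t))
    candidates = []
    for head, dep, label in edges:
        # Assumption: edges are undirected, so either endpoint may be the target.
        if head == target:
            other = dep
        elif dep == target:
            other = head
        else:
            continue
        # Assumption: edge-aware message = neighbour vector + edge-label vector.
        msg = [n + e for n, e in zip(node_vecs[other], edge_type_vecs[label])]
        candidates.append((dot(h_t, msg) / scale, other, msg))

    if not candidates:
        # Isolated node: fall back to its own representation.
        return list(h_t), {}

    # Keep only the k neighbours with the highest scores.
    top = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]

    # Softmax over the surviving scores (shifted by the max for stability).
    max_score = top[0][0]
    exps = [math.exp(score - max_score) for score, _, _ in top]
    total = sum(exps)
    weights = [e / total for e in exps]

    out = [
        sum(w * msg[i] for w, (_, _, msg) in zip(weights, top))
        for i in range(len(h_t))
    ]
    attn = {other: w for w, (_, other, _) in zip(weights, top)}
    return out, attn


# Toy example: token 0 has two dependents with different dependency labels.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1, "nsubj"), (0, 2, "advcl")]
labels = {"nsubj": [0.0, 0.0], "advcl": [0.5, 0.0]}
out, attn = topk_edge_attention(nodes, edges, labels, target=0, k=1)
print(out, attn)
```

With `k=1` only the single best-scoring neighbour contributes. A larger `k` spreads the weight over more neighbours, which is the trade-off a top-k mechanism exposes. In a trained model the vectors and any projection matrices would be learned rather than fixed as they are here.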

Keywords

temporal relation extraction; neural network; attention mechanism; graph attention network


Tsinghua Science and Technology

Volume 27, Issue 1, February 2022

Pages 79-90

DOI: 10.26599/TST.2020.9010063

Cite this article:

Xu X, Gao T, Wang Y, et al. Event Temporal Relation Extraction with Attention Mechanism and Graph Neural Network. Tsinghua Science and Technology, 2022, 27(1): 79-90. https://doi.org/10.26599/TST.2020.9010063