In recent years, great success has been achieved in many natural language processing (NLP) tasks, e.g., named entity recognition (NER), especially in high-resource languages such as English, thanks in part to the considerable amount of labeled resources: the more labeled data available, the better the word representations that can be learned. However, most low-resource languages lack the abundant labeled data of English, so their word representations are weaker and NER performance suffers. In this paper, we propose the converse attention network (CAN), which augments word representations in low-resource languages with knowledge transferred from a high-resource language, thereby improving NER performance in low-resource languages. CAN first translates sentences in a low-resource language into English using an attention-based translation module. During translation, CAN obtains attention matrices that align word representations in the high-resource and low-resource language spaces. CAN then uses these attention matrices to augment the word representations learned in the low-resource space with those learned in the high-resource space. Experiments on four low-resource NER datasets show that CAN achieves consistent and significant performance improvements, demonstrating its effectiveness.
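The augmentation step described in the abstract can be sketched in a few lines: given an attention matrix that aligns each low-resource word with the words of its English translation, the English-space representations are projected back onto the source words and mixed into the low-resource representations. This is a minimal illustrative sketch, not the authors' implementation; the shapes, the dot-product attention, and the mixing weight `alpha` are all assumptions.

```python
import numpy as np

# Hypothetical dimensions: a 4-word low-resource sentence, a 5-word
# English translation, 8-dimensional word representations.
m, n, d = 4, 5, 8
rng = np.random.default_rng(0)
src_repr = rng.standard_normal((m, d))  # low-resource word representations
tgt_repr = rng.standard_normal((n, d))  # high-resource (English) representations

# Attention matrix from the translation module: row i is a distribution
# over the n English words that source word i aligns with.
# (Dot-product attention is an assumption for illustration.)
scores = src_repr @ tgt_repr.T                          # (m, n) alignment scores
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)                 # softmax over English words

# Project English-space representations onto each source word and mix
# them into the low-resource representations (mixing weight assumed).
aligned = attn @ tgt_repr                               # (m, d)
alpha = 0.5
augmented = (1 - alpha) * src_repr + alpha * aligned    # (m, d) augmented reps
```

In the paper's full model, the attention matrices come from a trained attention-based translation module rather than raw dot products, and the augmented representations feed a downstream NER tagger.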
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).