Open Access

Leveraging Integrated Learning for Open-Domain Chinese Named Entity Recognition

Jin Diao¹, Zhangbing Zhou¹,², Guangli Shi¹

1 School of Information Engineering, China University of Geosciences, Beijing 100083, China

2 TELECOM SudParis, Evry 91011, France


Abstract

Named entity recognition (NER) is a fundamental technique in natural language processing that serves as a prerequisite for tasks such as natural language question reasoning, text matching, and semantic text similarity. Compared with English NER, Chinese NER is harder because of the noise introduced by the complex meanings, diverse structures, and ambiguous semantic boundaries of the Chinese language. In addition, compared with specific domains, open-domain entity types are more complex and changeable, and the number of entities is considerably larger, which makes open-domain Chinese NER more difficult still. Existing open-domain NER methods have low recognition rates. Therefore, this paper proposes a method based on the bidirectional long short-term memory conditional random field (BiLSTM-CRF) model, which leverages integrated learning to improve the efficiency of Chinese NER. Compared with single models, including CRF, BiLSTM-CRF, and gated recurrent unit-CRF (GRU-CRF), the proposed method significantly improves the accuracy of open-domain Chinese NER.

Keywords

Chinese named entity recognition; integrated learning; open-domain

International Journal of Crowd Science

Volume 6, Issue 2, June 2022

Pages 74-79

DOI: 10.26599/IJCS.2022.9100015

Cite this article:

Diao J, Zhou Z, Shi G. Leveraging Integrated Learning for Open-Domain Chinese Named Entity Recognition. International Journal of Crowd Science, 2022, 6(2): 74-79. https://doi.org/10.26599/IJCS.2022.9100015

Item	Content
Operating system	CentOS
Python	3.7
TensorFlow	1.13.1
CPU	Intel(R) Xeon(R) Silver 4110 CPU @ 2.10 GHz
GPU	Tesla P4 (8 GB)
Memory (RAM)	32 GB

Entity	Precision	Recall	F1	# phrases
Overall (CNIL-BiLSTM)	90.02%	88.27%	89.14%	-
LOC	89.84%	83.29%	86.44%	56498
NUM	98.05%	96.17%	97.10%	136001
ORG	74.40%	76.78%	75.57%	16995
PER	82.28%	78.18%	80.18%	47366
TIM	93.00%	90.62%	91.80%	74096

Model	Precision	Recall	F1
CRF	75.25%	75.92%	75.58%
BiLSTM-CRF	85.67%	81.36%	86.44%
GRU-CRF	86.06%	84.92%	85.49%
Cascaded HMM	80.32%	79.45%	79.88%
CNIL-BiLSTM	90.02%	88.27%	89.14%
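
Because F1 is the harmonic mean of precision and recall, each row of the model comparison above can be checked against its own figures. The sketch below recomputes F1 from the reported precision and recall; the function names and the 0.02-point tolerance are illustrative choices, not part of the paper. Every row agrees with its reported F1 to within rounding except BiLSTM-CRF, where the recomputed value is about 83.46% against a reported 86.44%. Since F1 cannot exceed both precision and recall, at least one of the three BiLSTM-CRF figures appears to be misprinted, and the table alone does not show which.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the model comparison table.
reported = {
    "CRF": (75.25, 75.92, 75.58),
    "BiLSTM-CRF": (85.67, 81.36, 86.44),
    "GRU-CRF": (86.06, 84.92, 85.49),
    "Cascaded HMM": (80.32, 79.45, 79.88),
    "CNIL-BiLSTM": (90.02, 88.27, 89.14),
}


def inconsistent_rows(rows, tolerance=0.02):
    """Names of models whose reported F1 differs from the recomputed value.

    The tolerance (in percentage points) is an assumed allowance for rounding.
    """
    return [
        name
        for name, (p, r, f1) in rows.items()
        if abs(f1_score(p, r) - f1) > tolerance
    ]


for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.2f}%, reported = {f1:.2f}%")

print("Rows to double-check:", inconsistent_rows(reported))
```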

Algorithm 1　CNIL-BiLSTM
Input: text statement sequence $X$
Output: probability $Y$ of the label sequence
1 Initialize statement sequences $X_{1}$ and $X_{2}$ according to label categories;
2 Adopt the BiLSTM-CRF model for the $X_{1}$ sequence;
3 Initialize the state transition matrix $N_{i,j}$ and randomly initialize the parameter $\theta$;
4 foreach epoch do
5　　foreach batch do
6　　　Forward propagation of the BiLSTM:
7　　　　Forward propagation of the backward state;
8　　　　Forward propagation of the forward state;
9　　　Forward and backward propagation of the CRF layer;
10　　　Backward propagation of the BiLSTM:
11　　　　Backward propagation of the backward state;
12　　　　Backward propagation of the forward state;
13　　　Update parameters $N_{i,j}$ and $\theta$;
14　　end
15 end
16 Adopt the BiLSTM-CRF model for the $X_{2}$ sequence and repeat steps 3 to 15;
17 Fuse the results of the two models by the relative majority voting method;
18 Output the final sequence probability $Y$;
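
The fusion in step 17 can be illustrated with a minimal sketch of relative majority (plurality) voting over per-token labels. This is not the authors' implementation: the function name `plurality_vote`, the BIO-style tags, and the example predictions are hypothetical, and the tie-break rule (the earliest model in the list wins) is an assumption, since the algorithm does not state how disagreements between exactly two models are resolved.

```python
from collections import Counter
from typing import List, Sequence


def plurality_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Fuse per-token label predictions from several models.

    For each token, the label with the most votes wins (relative majority).
    Ties go to the label proposed by the earliest model in `predictions`;
    this tie-break is an assumption, not something the source specifies.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same number of tokens")

    fused = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # The earliest model whose label reaches the top vote count wins.
        fused.append(next(label for label in labels if counts[label] == best))
    return fused


# Hypothetical predictions for a four-token sentence.
model_a = ["B-LOC", "I-LOC", "O", "B-PER"]
model_b = ["B-LOC", "O", "O", "B-PER"]
model_c = ["B-ORG", "I-LOC", "O", "O"]

print(plurality_vote([model_a, model_b, model_c]))
print(plurality_vote([model_b, model_a]))
```

With only two models, as in Algorithm 1, every disagreement is a tie, so the fused output is decided entirely by the tie-break rule. A practical system would likely need a more informed rule than list order, for example one based on model confidence or on which label categories each model was trained for.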