Open Access

Knowledge Error Detection via Textual and Structural Joint Learning

Xiaoyu Wang¹, Xiang Ao², Fuwei Zhang², Zhao Zhang², Qing He¹

¹ Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou 450003, China; also with the Key Laboratory of AI Safety, Chinese Academy of Sciences (CAS), Beijing 100190, China, and the Key Lab of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing 100190, China

² Key Laboratory of AI Safety, CAS, Beijing 100190, China; also with the Key Lab of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing 100190, China

Abstract

Knowledge graphs are essential tools for representing real-world facts and find wide application across many domains. However, the construction process often introduces noise and errors, which can degrade the performance of downstream applications. Current methods for knowledge graph error detection focus primarily on graph structure and overlook textual information. This paper therefore proposes a novel error detection framework that combines structural and textual information. The framework uses a confidence module to detect errors while generating knowledge embeddings. The approach outperforms baseline methods in error detection and link prediction experiments, and achieves state-of-the-art performance on the error detection task.

Keywords

Knowledge Graph (KG); error detection; textual information; confidence score

Big Data Mining and Analytics

Volume 8, Issue 1, February 2025

Pages 233-240

DOI: 10.26599/BDMA.2024.9020040

Cite this article:

Wang X, Ao X, Zhang F, et al. Knowledge Error Detection via Textual and Structural Joint Learning. Big Data Mining and Analytics, 2025, 8(1): 233-240. https://doi.org/10.26599/BDMA.2024.9020040
