SL-CMTGP: An effective knowledge interaction matching model between multitasks for large-scale biomedical ontologies

Donglei Sun¹, Qing Lv¹, Pei-Wei Tsai², Xingsi Xue³, Kai Zhang², Ketao Dai¹

¹ School of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan, Shanxi, 030024, China.

² Department of Computing Technologies, Swinburne University of Technology, Melbourne, 3000, Australia.

³ Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou, Fujian, 350118, China.


Abstract

Biomedical ontologies encapsulate the vast knowledge of the medical domain, facilitating communication and data exchange. However, the heterogeneity of these ontologies often impedes knowledge exchange, especially for large-scale biomedical ontologies. Partition-based Biomedical Ontology Matching (BOM) addresses this issue by dividing extensive ontologies into manageable sub-ontologies and identifying equivalence relationships among heterogeneous entities. Recently, Genetic Programming (GP) has been widely employed as an effective technique for optimizing and combining ontology similarity features (SFs). Nevertheless, traditional GP methods struggle with these matching tasks because the partitioned sub-ontologies have numerous and complex SFs. To tackle these challenges, this paper proposes an efficient multi-task matching model for large-scale BOM problems. First, an anchor-based partitioning method is introduced, which reduces the search space while retaining more informative sub-ontologies, ensuring high-quality subsequent matching. Second, a novel self-learning compact multi-task genetic programming (SL-CMTGP) method is proposed for constructing entity SFs. This method autonomously explores correlations among different matching tasks and leverages an implicit knowledge transfer mechanism to perform evolutionary operations, significantly enhancing BOM matching quality while reducing computational complexity. Finally, a new approximate evaluation metric is introduced to better guide the evolutionary algorithm, addressing the bias problem and helping individual tasks escape local optima. Experimental evaluations were conducted on six test cases from the Anatomy, Large Biomedical Ontologies, and Disease and Phenotype tracks of the Ontology Alignment Evaluation Initiative (OAEI). The results demonstrate that the proposed method consistently achieves high-quality matching outcomes and significantly improves BOM efficiency across different test cases.
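
As a rough illustration of what "combining similarity features with GP" can mean, the sketch below evaluates one candidate expression tree over two simple label-based SFs. It is not the paper's SL-CMTGP method: the two feature functions, the operator set (`max`, `min`, `avg`), the tuple-based tree encoding, and the example labels are all assumptions chosen for illustration. In a GP setting, trees like `candidate` would be the individuals that evolution mutates and recombines.

```python
from difflib import SequenceMatcher

# Two illustrative similarity features (SFs) over entity labels.
# Both are assumptions for this sketch, not the SFs used in the paper.
def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] via difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated tokens, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

SIMILARITY_FEATURES = {"char": char_similarity, "token": token_jaccard}

# Hypothetical operator set for internal tree nodes.
OPERATORS = {
    "max": max,
    "min": min,
    "avg": lambda x, y: (x + y) / 2.0,
}

def evaluate(tree, a: str, b: str) -> float:
    """Evaluate an expression tree on a pair of entity labels.

    A leaf is the name of an SF; an internal node is (op, left, right).
    """
    if isinstance(tree, str):
        return SIMILARITY_FEATURES[tree](a, b)
    op, left, right = tree
    return OPERATORS[op](evaluate(left, a, b), evaluate(right, a, b))

# One candidate individual: max(token, avg(char, token)).
candidate = ("max", "token", ("avg", "char", "token"))
score = evaluate(candidate, "heart valve", "valve of heart")
print(f"combined similarity: {score:.3f}")
```

A matcher built this way would typically accept a pair of entities as equivalent when the combined score exceeds a threshold; the fitness of a tree would then depend on how well those decisions agree with a reference or approximate alignment.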

Keywords

biomedical ontology matching; ontology partitioning; genetic programming; evolutionary multi-task; fitness function

Big Data Mining and Analytics

Cite this article:

Sun D, Lv Q, Tsai P-W, et al. SL-CMTGP: An effective knowledge interaction matching model between multitasks for large-scale biomedical ontologies. Big Data Mining and Analytics, 2025, https://doi.org/10.26599/BDMA.2025.9020004