Abstract
Biomedical ontologies encapsulate the vast knowledge within the medical domain, facilitating communication and data exchange. However, the heterogeneity of these ontologies often impedes knowledge exchange, especially among large-scale biomedical ontologies. Partitioning-based Biomedical Ontology Matching (BOM) addresses this issue by dividing extensive ontologies into manageable sub-ontologies and identifying equivalence relationships among heterogeneous entities. Recently, Genetic Programming (GP) has been widely employed as an effective technique for optimizing and combining ontology similarity features (SFs). Nevertheless, traditional GP methods struggle with these matching tasks because the partitioned sub-ontologies involve numerous and complex SFs. To tackle these challenges, this paper proposes an efficient multi-task matching model for large-scale BOM problems. First, an anchor-based partitioning method is introduced, which reduces the search space while retaining more informative sub-ontologies, ensuring high-quality subsequent matching. Second, a novel self-learning compact multi-task genetic programming (SL-CMTGP) method is proposed for constructing entity SFs. This method autonomously explores correlations among different matching tasks and leverages an implicit knowledge transfer mechanism to perform evolutionary operations, significantly enhancing matching quality while reducing computational complexity. Finally, a new approximate evaluation metric is introduced to better guide the evolutionary search, addressing the bias problem and helping individual tasks escape local optima. Experimental evaluations were conducted on six test cases from the Anatomy, Large Biomedical Ontologies, and Disease and Phenotype tracks of the Ontology Alignment Evaluation Initiative (OAEI). The results demonstrate that the proposed method consistently achieves high-quality matching outcomes and significantly improves BOM efficiency across different test cases.