Open Access

Impact of Domain Knowledge and Multi-Modality on Intelligent Molecular Property Prediction: A Systematic Survey

Peng Cheng National Laboratory, Shenzhen 518000, China, and also with School of Future Technology, South China University of Technology, Guangzhou 511442, China
Peng Cheng National Laboratory, Shenzhen 518000, China, and also with School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
Peng Cheng National Laboratory, Shenzhen 518000, China

Abstract

Accurate prediction of molecular properties is essential to drug development, particularly in virtual screening and compound optimization. Numerous recently introduced deep learning-based methods have shown remarkable potential for Molecular Property Prediction (MPP), especially in improving accuracy and providing insight into molecular structures. Yet two critical questions arise: does integrating domain knowledge improve the accuracy of MPP, and does multi-modal data fusion yield more precise results than methods that rely on a single data source? To explore these questions, we comprehensively review and quantitatively analyze recent deep learning methods across various benchmarks. We find that integrating molecular information significantly improves MPP for both regression and classification tasks. Specifically, regression improvements, measured by reductions in Root Mean Square Error (RMSE), are up to 4.0%, while classification improvements, measured by the area under the receiver operating characteristic curve (ROC-AUC), are up to 1.7%. We also find that augmenting 2D graphs with 3D information improves classification performance, as measured by ROC-AUC, by up to 13.2%, and that enriching 2D graphs with 1D SMILES improves multi-modal learning performance on regression tasks by up to 9.1%. These two consolidated insights offer crucial guidance for future advances in drug discovery.
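The two metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes that the reported percentages are relative changes against a baseline score; the abstract does not state the convention, so `relative_change_pct` is a hypothetical helper, and the toy numbers below are invented for illustration and are not results from the survey.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_change_pct(baseline, candidate):
    """Signed relative change in percent (assumed convention, see lead-in)."""
    return 100.0 * (candidate - baseline) / baseline


# Toy regression data (invented): lower RMSE is better.
y_true = [1.0, 2.0, 3.0, 4.0]
baseline_rmse = rmse(y_true, [1.5, 2.5, 2.5, 4.5])
candidate_rmse = rmse(y_true, [1.4, 2.4, 2.6, 4.4])
rmse_change = relative_change_pct(baseline_rmse, candidate_rmse)

# Toy classification data (invented): higher ROC-AUC is better.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A negative `rmse_change` indicates a reduction in error, which is the direction of improvement for regression, whereas for ROC-AUC a positive change is the improvement.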

References

[1]

J. Shen and C. A Nicolaou, Molecular property prediction: Recent trends in the era of artificial intelligence, Drug Discov Today Technol., vol. 32–33, pp. 29–36, 2019.

[2]

Z. Li, M. Jiang, S. Wang, and S. Zhang, Deep learning methods for molecular representation and property prediction, Drug Discov. Today, vol. 27, no. 12, p. 103373, 2022.

[3]
H. Ma, C. Yan, Y. Guo, S. Wang, Y. Wang, H. Sun, and J. Huang, Improving molecular property prediction on limited data with deep multi-label learning, in Proc. 2020 IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM ), Seoul, Republic of Korea, 2020, pp. 2779–2784.
[4]

X. Lin, Z. Quan, Z. J. Wang, H. Huang, and X. Zeng, A novel molecular representation with BiGRU neural networks for learning atom, Brief. Bioinform., vol. 21, no. 6, pp. 2099–2111, 2020.

[5]

Q. Lv, G. Chen, L. Zhao, W. Zhong, and C. Y. C. Chen, Mol2Context-vec: Learning molecular representation from context awareness for drug discovery, Brief. Bioinform., vol. 22, no. 6, p. bbab317, 2021.

[6]

S. Han, H. Fu, Y. Wu, G. Zhao, Z. Song, F. Huang, Z. Zhang, S. Liu, and W. Zhang, HimGNN: A novel hierarchical molecular graph representation learning framework for property prediction, Brief. Bioinform., vol. 24, no. 5, p. bbad305, 2023.

[7]

G. Bouritsas, F. Frasca, S. Zafeiriou, and M. M. Bronstein, Improving graph neural network expressivity via subgraph isomorphism counting, IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 1, pp. 657–668, 2023.

[8]
Y. Song, S. Zheng, Z. Niu, Z. H. Fu, Y. Lu, and Y. Yang, Communicative representation learning on attributed molecular graphs, in Proc. 29 th Int. Joint Conf. Artificial Intelligence, Yokohama, Japan, 2020, pp. 2831–2838.
[9]
H. Li, D. Zhao, and J. Zeng, KPGT: Knowledge-guided pre-training of graph transformer for molecular property prediction, in Proc. 28 th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Washington, DC, USA, 2022, pp. 857–867.
[10]

Z. Xiong, D. Wang, X. Liu, F. Zhong, X. Wan, X. Li, Z. Li, X. Luo, K. Chen, H. Jiang, et al., Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism, J. Med. Chem., vol. 63, no. 16, pp. 8749–8760, 2020.

[11]
Y. Rong, Y. Bian, T. Xu, W. Xie, Y. Wei, W. Huang, and J. Huang, Self-supervised graph transformer on large-scale molecular data, in Proc. 34 th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, pp. 12559–12571.
[12]

J. Ross, B. Belgodere, V. Chenthamarakshan, I. Padhi, Y. Mroueh, and P. Das, Large-scale chemical language representations capture molecular structure and properties, Nat. Mach. Intell., vol. 4, no. 12, pp. 1256–1264, 2022.

[13]
S. Yin and G. Zhong, LGI-GT: Graph transformers with local and global operators interleaving, in Proc. 32 nd Int. Joint Conf. Artificial Intelligence, Macao, China, 2023, pp. 4504–4512.
[14]
S. Luo, T. Chen, Y. Xu, S. Zheng, T. Y. Liu, L. Wang, and D. He, One transformer can understand both 2D & 3D molecular data, arXiv preprint arXiv: 2210.01765, 2023.
[15]

X. Zeng, H. Xiang, L. Yu, J. Wang, K. Li, R. Nussinov, and F. Cheng, Accurate prediction of molecular properties and drug targets using a self-supervised image representation learning framework, Nat. Mach. Intell., vol. 4, no. 11, pp. 1004–1016, 2022.

[16]

J. H. Chen and Y. J. Tseng, Different molecular enumeration influences in deep learning: An example using aqueous solubility, Brief. Bioinform., vol. 22, no. 3, p. bbaa092, 2021.

[17]

S. Liu, J. Li, K. C. Bennett, B. Ganoe, T. Stauch, M. Head-Gordon, A. Hexemer, D. Ushizima, and T. Head-Gordon, Multiresolution 3D-DenseNet for chemical shift prediction in NMR crystallography, J. Phys. Chem. Lett., vol. 10, no. 16, p. 4558–4565, 2019.

[18]
S. Liu, H. Wang, W. Liu, J. Lasenby, H. Guo, and J. Tang, Pre-training molecular graph representation with 3D geometry, arXiv preprint arXiv: 2110.07728, 2022.
[19]
S. Li, J. Zhou, T. Xu, D. Dou, and H. Xiong, GeomGCL: Geometric graph contrastive learning for molecular property prediction, in Proc. 36 th AAAI Conf. Artificial Intelligence, Virtual Event, 2022, pp. 4541–4549.
[20]
J. Zhu, Y. Xia, L. Wu, S. Xie, T. Qin, W. Zhou, H. Li, and T. Y. Liu, Unified 2D and 3D pre-training of molecular representations, in Proc. 28 th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Washington, DC, USA, 2022, pp. 2626–2636.
[21]
Z. Guo, W. Yu, C. Zhang, M. Jiang, and N. V. Chawla, GraSeq: Graph and sequence fusion learning for molecular property prediction, in Proc. 29 th ACM Int. Conf. Information & Knowledge Management, Virtual Event, 2020, pp. 435–443.
[22]

Y. Wang, J. Wang, Z. Cao, and A. B. Farimani, Molecular contrastive learning of representations via graph neural networks, Nat. Mach. Intell., vol. 4, no. 3, pp. 279–287, 2022.

[23]

Y. Fang, Q. Zhang, N. Zhang, Z. Chen, X. Zhuang, X. Shao, X. Fan, and H. Chen, Knowledge graph-enhanced molecular contrastive learning with functional prompt, Nat. Mach. Intell., vol. 5, no. 5, pp. 542–553, 2023.

[24]

H. Li, R. Zhang, Y. Min, D. Ma, D. Zhao, and J. Zeng, A knowledge-guided pre-training framework for improving molecular representation learning, Nat. Commun., vol. 14, no. 1, p. 7568, 2023.

[25]
Z. Hao, C. Lu, Z. Huang, H. Wang, Z. Hu, Q. Liu, E. Chen, and C. Lee, ASGN: An active semi-supervised graph neural network for molecular property prediction, in Proc. 26 th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, Virtual Event, 2020, pp. 731–752.
[26]
F. Y. Sun, J. Hoffmann, V. Verma, and J. Tang, InfoGraph: Unsupervised and semi-supervised graph-level representation learning via mutual information maximization, arXiv preprint arXiv: 1908.01000, 2020.
[27]

D. Zhang, W. Feng, Y. Wang, Z. Qi, Y. Shan, and J. Tang, DropConn: Dropout connection based random GNNs for molecular property prediction, IEEE Trans. Knowl. Data Eng., vol. 36, no. 2, pp. 518–529, 2024.

[28]
Y. Sun, Y. Chen, W. Ma, W. Huang, K. Liu, Z. Ma, W. Y. Ma, and Y. Lan, PEMP: Leveraging physics properties to enhance molecular property prediction, in Proc. 31 st ACM Int. Conf. Information & Knowledge Management, Atlanta, GA, USA, 2022, pp. 3505–3513.
[29]
W. Chen, A. Tripp, and J. M. Hernández-Lobato, Meta-learning adaptive deep kernel Gaussian processes for molecular property prediction, arXiv preprint arXiv: 2205.02708, 2023.
[30]
X. Zhuang, Q. Zhang, B. Wu, K. Ding, Y. Fang, and H. Chen, Graph sampling-based meta-learning for molecular property prediction, arXiv preprint arXiv: 2306.16780, 2023.
[31]

S. Biswas, Y. Chung, J. Ramirez, H. Wu, and W. H. Green, Predicting critical properties and acentric factors of fluids using multitask machine learning, J. Chem. Inf. Model., vol. 63, no. 15, pp. 4574–4588, 2023.

[32]

Z. Tan, Y. Li, W. Shi, and S. Yang, A multitask approach to learn molecular properties, J. Chem. Inf. Model., vol. 61, no. 8, pp. 3824–3834, 2021.

[33]

Z. Wu, B. Ramsundar, E. N. Feinberg, J. Gomes, C. Geniesse, A. S. Pappu, K. Leswing, and V. Pande, MoleculeNet: A benchmark for molecular machine learning, Chem. Sci., vol. 9, no. 2, pp. 513–530, 2018.

[34]

D. Weininger, SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules, J. Chem. Inf. Comput. Sci., vol. 28, no. 1, pp. 31–36, 1988.

[35]
D. Weininger, A. Weininger, and J. L. Weininger, SMILES. 2. Algorithm for generation of unique smiles notation, J. Chem. Inf. Comput. Sci., vol. 29, no. 2, pp. 97–101, 1989.
[36]
D. Weininger, SMILES. 3. DEPICT. Graphical depiction of chemical structures, J. Chem. Inf. Comput. Sci., vol. 30, no. 3, pp. 237–243, 1990.
[37]

D. Rogers and M. Hahn, Extended-connectivity fingerprints, J. Chem. Inf. Model., vol. 50, no. 5, pp. 742–754, 2010.

[38]

J. L. Durant, B. A. Leland, D. R. Henry, and J. G. Nourse, Reoptimization of mdl keys for use in drug discovery, J. Chem. Inf. Comput. Sci., vol. 42, no. 6, pp. 1273–1280, 2002.

[39]

M. Krenn, F. Häse, A. Nigam, P. Friederich, and A. Aspuru-Guzik, Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation, Mach. Learn.: Sci. Technol., vol. 1, no. 4, p. 045024, 2020.

[40]
A. D. McNaught and A. Wilkinson, Compendium of Chemical Terminology, 2nd ed. Oxford, UK: Blackwell Science, 1997.
[41]

S. R. Heller, A. McNaught, I. Pletnev, S. Stein, and D. Tchekhovskoi, InChi, the IUPAC international chemical identifier, J. Cheminform., vol. 7, no. 1, p. 23, 2015.

[42]
G. Landrum, RDKit: A software suite for cheminformatics, computational chemistry, and predictive modeling, Greg Landrum, vol. 8, p. 31, 2013.
[43]

W. L. DeLano, PyMOL: An open-source molecular graphics tool, CCP4 Newsl. Protein Crystallogr, vol. 40, no. 1, pp. 82–92, 2002.

[44]

J. Sunseri and D. R. Koes, Libmolgrid: Graphics processing unit accelerated molecular gridding for deep learning applications, J. Chem. Inf. Model., vol. 60, no. 3, pp. 1079–1084, 2020.

[45]

J. Degen, C. Wegscheid-Gerlach, A. Zaliani, and M. Rarey, On the art of compiling and using ‘drug-like’ chemical fragment spaces, ChemMedChem, vol. 3, no. 10, pp. 1503–1507, 2008.

[46]
X. Q. Lewell, D. B. Judd, S. P. Watson, and M. M. Hann, RECAP-retrosynthetic combinatorial analysis procedure: A powerful new technique for identifying privileged molecular fragments with useful applications in combinatorial chemistry, J. Chem. Inf. Comput. Sci., vol. 38, no. 3, pp. 511–522, 1998.
[47]

G. W. Bemis and M. A. Murcko, The properties of known drugs. 1. Molecular frameworks, J. Med. Chem., vol. 39, no. 15, pp. 2887–2893, 1996.

[48]

T. Liu, M. Naderi, C. Alvin, S. Mukhopadhyay, and M. Brylinski, Break down in order to build up: Decomposing small molecules for fragment-based drug design with eMolFrag, J. Chem. Inf. Model., vol. 57, no. 4, pp. 627–631, 2017.

[49]
F. Kruger, N. Stiefl, and G. A. Landrum, rdScaffoldNetwork: The scaffold network implementation in RDKit, J. Chem. Inf. Model., vol. 60, no. 7, pp. 3331–3335, 2020.
[50]

C. K. Wu, X. C. Zhang, Z. J. Yang, A. P. Lu, T. J. Hou, and D. S. Cao, Learning to SMILES: Ban-based strategies to improve latent representation learning from molecules, Brief. Bioinform., vol. 22, no. 6, p. bbab327, 2021.

[51]

K. Yang, K. Swanson, W. Jin, C. Coley, P. Eiden, H. Gao, A. Guzman-Perez, T. Hopper, B. Kelley, M. Mathea, et al., Analyzing learned molecular representations for property prediction, J. Chem. Inf. Model., vol. 59, no. 8, pp. 3370–3388, 2019.

[52]

Y. Ji, G. Wan, Y. Zhan, and B. Du, Metapath-fused heterogeneous graph network for molecular property prediction, Inf. Sci., vol. 629, pp. 155–168, 2023.

[53]

X. Fang, L. Liu, J. Lei, D. He, S. Zhang, J. Zhou, F. Wang, H. Wu, and H. Wang, Geometry-enhanced molecular representation learning for property prediction, Nat. Mach. Intell., vol. 4, no. 2, pp. 127–134, 2022.

[54]
G. Zhou, Z. Gao, Q. Ding, H. Zheng, H. Xu, Z. Wei, L. Zhang, and G. Ke, Uni-Mol: A universal 3D molecular representation learning framework, chemRxiv. doi: 10.26434/chemrxiv-2022-jjm0j-v4.
[55]
S. Chithrananda, G. Grand, and B. Ramsundar, ChemBERTa: Large-scale self-supervised pretraining for molecular property prediction, arXiv preprint arXiv: 2010.09885, 2020.
[56]
A. Yüksel, E. Ulusoy, A. Ünlü, and T. Doǧan, SELFormer: Molecular representation learning via SELFIES language models, Mach. Learn. : Sci. Technol., vol. 4, no. 2, p. 025035, 2023.
[57]
X. C. Zhang, J. C. Yi, G. P. Yang, C. K. Wu, T. J. Hou, and D. S. Cao, ABC-Net: A divide-and-conquer based deep learning architecture for SMILES recognition from molecular images, Brief. Bioinform., vol. 23, no. 2, p. bbac033, 2022.
[58]

S. Liu, W. Nie, C. Wang, J. Lu, Z. Qiao, L. Liu, J. Tang, C. Xiao, and A. Anandkumar, Multi-modal molecule structure–text model for text-based retrieval and editing, Nat. Mach. Intell., vol. 5, no. 12, pp. 1447–1457, 2023.

[59]
P. Liu, X. Qiu, X. Chen, S. Wu, and X. Huang, Multi-timescale long short-term memory neural network for modelling sentences and documents, in Proc. 2015 Conf. Empirical Methods in Natural Language Processing, Lisbon, Portugal, 2015, pp. 2326–2335.
[60]
J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, Gated feedback recurrent neural networks, in Proc. 32 nd Int. Conf. Int. Conf. Machine Learning, Lille, France, 2015, pp. 2067–2075.
[61]

A. L. Nazarova, L. Yang, K. Liu, A. Mishra, R. K. Kalia, K. I. Nomura, A. Nakano, P. Vashishta, and P. Rajak, Dielectric polymer property prediction using recurrent neural networks with optimizations, J. Chem. Inf. Model., vol. 61, no. 5, pp. 2175–2186, 2021.

[62]

Z. Wang, Y. Su, W. Shen, S. Jin, J. H. Clark, J. Ren, and X. Zhang, Predictive deep learning models for environmental properties: The direct calculation of octanol–water partition coefficients from molecular graphs, Green Chem., vol. 21, no. 16, pp. 4555–4565, 2019.

[63]

M. Withnall, E. Lindelöf, O. Engkvist, and H. Chen, Building attention and edge message passing neural networks for bioactivity and physical–chemical property prediction, J. Cheminform., vol. 12, no. 1, p. 1, 2020.

[64]

P. Li, Y. Li, C. Y. Hsieh, S. Zhang, X. Liu, H. Liu, S. Song, and X. Yao, TrimNet: Learning molecular representation from triplet messages for biomedicine, Brief. Bioinform., vol. 22, no. 4, p. bbaa266, 2021.

[65]
X. Zhang, C. Chen, Z. Meng, Z. Yang, H. Jiang, and X. Cui, CoAtGIN: Marrying convolution and attention for graph-based molecule property prediction, in Proc. 2022 IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM ), Las Vegas, NV, USA, 2022, pp. 374–379.
[66]

X. Fan, M. Gong, Y. Wu, A. K. Qin, and Y. Xie, Propagation enhanced neural message passing for graph representation learning, IEEE Trans. Knowl. Data Eng., vol. 35, no. 2, pp. 1952–1964, 2023.

[67]

Y. Li, P. Li, X. Yang, C. Y. Hsieh, S. Zhang, X. Wang, R. Lu, H. Liu, and X. Yao, Introducing block design in graph neural networks for molecular properties prediction, Chem. Eng. J., vol. 414, p. 128817, 2021.

[68]
H. Ma, Y. Bian, Y. Rong, W. Huang, T. Xu, W. Xie, G. Ye, and J. Huang, Multi-view graph neural networks for molecular property prediction, arXiv preprint arXiv: 2005.13607, 2020.
[69]

X. Liu, X. Wang, J. Wu, and K. Xia, Hypergraph-based persistent cohomology (HPC) for molecular representations in drug design, Brief. Bioinform., vol. 22, no. 5, p. bbaa411, 2021.

[70]
J. Feng, Z. Wang, Y. Li, B. Ding, Z. Wei, and H. Xu, MGMAE: Molecular representation learning by reconstructing heterogeneous graphs with a high mask ratio, in Proc. 31 st ACM Int. Conf. Information & Knowledge Management, Atlanta, GA, USA, 2022, pp. 509–519.
[71]

T. Hasebe, Knowledge-embedded message-passing neural networks: Improving molecular property prediction with human knowledge, ACS Omega, vol. 6, no. 42, pp. 27955–27967, 2021.

[72]
S. Yang, Z. Li, G. Song, and L. Cai, Deep molecular representation learning via fusing physical and chemical information, in Proc. Annu. Conf. Neural Information Processing Systems, Virtual Event, 2021, pp. 16346–16357.
[73]

X. Zang, X. Zhao, and B. Tang, Hierarchical molecular graph self-supervised learning for property prediction, Commun. Chem., vol. 6, no. 1, p. 34, 2023.

[74]

N. Liu, S. Jian, D. Li, Y. Zhang, Z. Lai, and H. Xu, Hierarchical adaptive pooling by capturing high-order dependency for graph representation learning, IEEE Trans. Knowl. Data Eng., vol. 35, no. 4, pp. 3952–3965, 2023.

[75]

J. Gao, J. Gao, X. Ying, M. Lu, and J. Wang, Higher-order interaction goes neural: A substructure assembling graph attention network for graph classification, IEEE Trans. Knowl. Data Eng., vol. 35, no. 2, pp. 1594–1608, 2023.

[76]

X. B. Ye, Q. Guan, W. Luo, L. Fang, Z. R. Lai, and J. Wang, Molecular substructure graph attention network for molecular property identification in drug discovery, Pattern Recog., vol. 128, p. 108659, 2022.

[77]

W. Zhu, Y. Zhang, D. Zhao, J. Xu, and L. Wang, HiGNN: A hierarchical informative graph neural network for molecular property prediction equipped with feature-wise attention, J. Chem. Inf. Model., vol. 63, no. 1, pp. 43–55, 2023.

[78]
C. Lu, Q. Liu, C. Wang, Z. Huang, P. Lin, and L. He, Molecular property prediction: A multilevel quantum interactions modeling perspective, in Proc. 33 rd AAAI Conf. Artificial Intelligence, Honolulu, HI, USA, 2019, pp. 1052–1060.
[79]
M. Fey, J. G. Yuen, and F. Weichert, Hierarchical inter-message passing for learning on molecular graphs, arXiv preprint arXiv: 2006.12179, 2020.
[80]
F. Wu, D. Radev, and S. Z. Li, Molformer: Motif-based transformer on 3D heterogeneous molecular graphs, in Proc. 37 th AAAI Conf. Artificial Intelligence, Washington, DC, USA, 2023, pp. 5312–5320.
[81]
F. B. Fuchs, D. E. Worrall, V. Fischer, and M. Welling, SE(3)-transformers: 3D roto-translation equivariant attention networks, arXiv preprint arXiv: 2006.10503, 2020.
[82]
K. T. Schütt, O. T. Unke, and M. Gastegger, Equivariant message passing for the prediction of tensorial properties and molecular spectra, arXiv preprint arXiv: 2102.03150, 2021.
[83]
J. Brandstetter, R. Hesselink, E. van der Pol, E. J. Bekkers, and M. Welling, Geometric and physical quantities improve E(3) equivariant message passing, arXiv preprint arXiv: 2110.02905, 2022.
[84]
J. Gasteiger, F. Becker, and S. Günnemann, GemNet: Universal directional graph neural networks for molecules, arXiv preprint arXiv: 2106.08903, 2022.
[85]
J. Gasteiger, S. Giri, J. T. Margraf, and S. Günnemann, Fast and uncertainty-aware directional message passing for non-equilibrium molecules, arXiv preprint arXiv: 2011.14115, 2022.
[86]
M. Shuaibi, A. Kolluru, A. Das, A. Grover, A. Sriram, Z. Ulissi, and C. L. Zitnick, Rotation invariant graph neural networks using spin convolutions, arXiv preprint arXiv: 2106.09575, 2021.
[87]
S. Wang, Y. Guo, Y. Wang, H. Sun, and J. Huang, SMILES-BERT: Large scale unsupervised pre-training for molecular property prediction, in Proc. 10 th ACM Int. Conf. Bioinformatics, Computational Biology and Health Informatics, Niagara Falls, NY, USA, 2019, pp. 429–436.
[88]
Y. Wang, X. Chen, Y. Min, and J. Wu, MolCloze: A unified cloze-style self-supervised molecular structure learning model for chemical property prediction, in Proc. 2021 IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM ), Houston, TX, USA, 2021, pp. 2896–2903.
[89]

B. Winter, C. Winter, J. Schilling, and A. Bardow, A smile is all you need: Predicting limiting activity coefficients from smiles with natural language processing, Digit Discov, vol. 1, no. 6, pp. 859–869, 2022.

[90]

J. Su, M. Ahmed, Y. Lu, S. Pan, W. Bo, and Y. Liu, RoFormer: Enhanced transformer with rotary position embedding, Neurocomputing, vol. 568, p. 127063, 2024.

[91]
Ł. Maziarka, T. Danel, S. Mucha, K. Rataj, J. Tabor, and S. Jastrzebski, Molecule attention transformer, arXiv preprint arXiv: 2002.08264, 2020.
[92]
W. Park, W. Chang, D. Lee, J. Kim, and S. W. Hwang, GRPE: Relative positional encoding for graph transformer, arXiv preprint arXiv: 2201.12787, 2022.
[93]
M. S. Hussain, M. J. Zaki, and D. Subramanian, Global self-attention as a replacement for graph convolution, in Proc. 28 th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Washington, DC, USA, 2022, pp. 655–665.
[94]
D. Masters, J. Dean, K. Klaser, Z. Li, S. Maddrell-Mander, A. Sanders, H. Helal, D. Beker, L. Rampášek, and D. Beaini, GPS++: An optimised hybrid MPNN/transformer for molecular property prediction, arXiv preprint arXiv: 2212.02229, 2022.
[95]
Z. Chen, H. Tan, T. Wang, T. Shen, T. Lu, Q. Peng, C. Cheng, and Y. Qi, Graph propagation transformer for graph representation learning, arXiv preprint arXiv: 2305.11424, 2023.
[96]

G. P. Ren, K. J. Wu, and Y. He, Enhancing molecular representations via graph transformation layers, J. Chem. Inf. Model., vol. 63, no. 9, pp. 2679–2688, 2023.

[97]

J. Gao, Z. Shen, Y. Xie, J. Lu, Y. Lu, S. Chen, Q. Bian, Y. Guo, L. Shen, J. Wu, et al., TransFoxMol: Predicting molecular property with focused attention, Brief. Bioinform., vol. 24, no. 5, p. bbad306, 2023.

[98]

Y. Jiang, S. Jin, X. Jin, X. Xiao, W. Wu, X. Liu, Q. Zhang, X. Zeng, G. Yang, and Z. Niu, Pharmacophoric-constrained heterogeneous graph transformer model for molecular property prediction, Commun. Chem., vol. 6, no. 1, p. 60, 2023.

[99]

M. Hirohara, Y. Saito, Y. Koda, K. Sato, and Y. Sakakibara, Convolutional neural network based on smiles representation of compounds for detecting chemical motif, BMC Bioinformatics, vol. 19, no. S19, p. 526, 2018.

[100]

P. Jiang, Y. Chi, X. S. Li, Z. Meng, X. Liu, X. S. Hua, and K. Xia, Molecular persistent spectral image (Mol-PSI) representation for machine learning models in drug design, Brief. Bioinform., vol. 23, no. 1, p. bbab527, 2022.

[101]
D. Kuzminykh, D. Polykovskiy, A. Kadurin, A. Zhebrak, I. Baskov, S. Nikolenko, R. Shayakhmetov, and A. Zhavoronkov, 3D molecular representations based on the wave transform for convolutional neural networks, Mol. Pharm., vol. 15, no. 10, pp. 4378–4385, 2018.
[102]
H. Cai, H. Zhang, D. Zhao, J. Wu, and L. Wang, FP-GNN: A versatile deep learning architecture for enhanced molecular property prediction, Brief. Bioinform., vol. 23, no. 6, p. bbac408, 2022.
[103]

X. Wang, Z. Li, M. Jiang, S. Wang, S. Zhang, and Z. Wei, Molecule property prediction based on spatial graph embedding, J. Chem. Inf. Model., vol. 59, no. 9, pp. 3817–3828, 2019.

[104]

J. Liu, X. Lei, Y. Zhang, and Y. Pan, The prediction of molecular toxicity based on BiGRU and GraphSAGE, Comput. Biol. Med., vol. 153, p. 106524, 2023.

[105]
Y. Luo, K. Yang, M. Hong, X. Liu, and Z. Nie, MolFM: A multimodal molecular foundation model, arXiv preprint arXiv: 2307.09484, 2023.
[106]
Y. Sun, M. Islam, E. Zahedi, M. Kuenemann, H. Chouaib, and P. Hu, Molecular property prediction based on bimodal supervised contrastive learning, in Proc. 2022 IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM ), Las Vegas, NV, USA, 2022, pp. 394–397.
[107]
P. Liu, Y. Ren, J. Tao, and Z. Ren, GIT-Mol: A multi-modal large language model for molecular science with graph, image, and text, arXiv preprint arXiv: 2308.06911, 2024.
[108]

Q. Tang, F. Nie, Q. Zhao, and W. Chen, A merged molecular representation deep learning method for blood–brain barrier permeability prediction, Brief. Bioinform., vol. 23, no. 5, p. bbac357, 2022.

[109]

T. Zhang, S. Chen, A. Wulamu, X. Guo, Q. Li, and H. Zheng, TransG-Net: Transformer and graph neural network based multi-modal data fusion network for molecular properties prediction, Appl. Intell., vol. 53, no. 12, pp. 16077–16088, 2023.

[110]

D. Chen, K. Gao, D. D. Nguyen, X. Chen, Y. Jiang, G. W. Wei, and F. Pan, Algebraic graph-assisted bidirectional transformers for molecular property prediction, Nat. Commun., vol. 12, no. 1, p. 3521, 2021.

[111]

W. X. Shen, X. Zeng, F. Zhu, Y. L. Wang, C. Qin, Y. Tan, Y. Y. Jiang, and Y. Z. Chen, Out-of-the-box deep learning prediction of pharmaceutical properties by broadly learned knowledge-based molecular representations, Nat. Mach. Intell., vol. 3, no. 4, pp. 334–343, 2021.

[112]
Y. Liu, L. Wang, M. Liu, X. Zhang, B. Oztekin, and S. Ji, Spherical message passing for 3D molecular graphs, arXiv preprint arXiv: 2102.05013, 2022.
[113]

Z. Wang, M. Liu, Y. Luo, Z. Xu, Y. Xie, L. Wang, L. Cai, Q. Qi, Z. Yuan, T. Yang, et al., Advanced graph and sequence neural networks for molecular property prediction and drug discovery, Bioinformatics, vol. 38, no. 9, pp. 2579–2586, 2022.

[114]
J. Zhu, Y. Xia, L. Wu, S. Xie, W. Zhou, T. Qin, H. Li, and T. Y. Liu, Dual-view molecular pre-training, in Proc. 29 th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Long Beach, CA, USA, 2023, pp. 3615–3627.
[115]
M. Sun, J. Xing, H. Wang, B. Chen, and J. Zhou, MoCL: Data-driven molecular fingerprint via knowledge-aware contrastive learning from molecular graph, in Proc. 27 th ACM SIGKDD Conf. Knowledge Discovery & Data Mining, Singapore, 2021, p. 3585–3594.
[116]

Y. Wang, R. Magar, C. Liang, and A. B. Farimani, Improving molecular contrastive learning via faulty negative mitigation and decomposed fragment contrast, J. Chem. Inf. Model., vol. 62, no. 11, pp. 2713–2725, 2022.

[117]
Y. You, T. Chen, Y. Sui, T. Chen, Z. Wang, and Y. Shen, Graph contrastive learning with augmentations, in Proc. 34 th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, pp. 5812–5823.
[118]
J. Xia, C. Zhao, B. Hu, Z. Gao, C. Tan, Y. Liu, S. Li, and S. Z. Li, Mole-BERT: Rethinking pre-training graph neural networks for molecules, chemRxiv. doi: 10.26434/chemrxiv-2023-dngg4.
[119]

Z. Wu, D. Jiang, J. Wang, X. Zhang, H. Du, L. Pan, C. Y. Hsieh, D. Cao, and T. Hou, Knowledge-based BERT: A method to extract molecular features like computational chemists, Brief. Bioinform., vol. 23, no. 3, p. bbac131, 2022.

[120]

Z. Wu, J. Wang, H. Du, D. Jiang, Y. Kang, D. Li, P. Pan, Y. Deng, D. Cao, C. Y. Hsieh, et al., Chemistry-intuitive explanation of graph neural networks for molecular property prediction with substructure masking, Nat. Commun., vol. 14, no. 1, p. 2585, 2023.

[121]
S. Kim, J. Nam, J. Kim, H. Lee, S. Ahn, and J. Shin, Fragment-based multi-view molecular contrastive learning, in Proc. ICLR 2023, https://openreview.net/forum?id=9lGwd4q8KJc, 2024.
[122]
F. Wu, H. Qin, S. Li, S. Z. Li, X. Zhan, and J. Xu, InstructBio: A large-scale semi-supervised learning paradigm for biochemical problems, arXiv preprint arXiv: 2304.03906, 2023.
[123]
Q. Lv, G. Chen, Z. Yang, W. Zhong, and C. Y. C. Chen, Meta learning with graph attention networks for low-data drug discovery, IEEE Trans. Neural Netw. Learn. Syst.
[124]
J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in Proc. 2019 Conf. North American Chapter of the Association for Computational Linguistics : Human Language Technologies, Volume 1 (Long and Short Papers ), Minneapolis, MN, USA, 2019, pp. 4171–4186.
[125]
L. Floridi and M. Chiriatti, GPT-3: Its nature, scope, limits, and consequences, Minds Mach., vol. 30, pp. 681–694, 2020.
[126]
X. C. Zhang, C. K. Wu, Z. J. Yang, Z. X. Wu, J. C. Yi, C. Y. Hsieh, T. J. Hou, and D. S. Cao, MG-BERT: Leveraging unsupervised atomic representation learning for molecular property prediction, Brief. Bioinform., vol. 22, no. 6, p. bbab152, 2021.
[127]
W. Ahmad, E. Simon, S. Chithrananda, G. Grand, and B. Ramsundar, ChemBERTa-2: Towards chemical foundation models, arXiv preprint arXiv: 2209.01712, 2022.
[128]

R. Irwin, S. Dimitriadis, J. He, and E. J. Bjerrum, Chemformer: A pre-trained transformer for computational chemistry, Mach. Learn.: Sci. Technol., vol. 3, no. 1, p. 015022, 2022.

[129]
W. Hu, B. Liu, J. Gomes, M. Zitnik, P. Liang, V. Pande, and J. Leskovec, Strategies for pre-training graph neural networks, arXiv preprint arXiv: 1905.12265, 2020.
[130]
J. Godwin, M. Schaarschmidt, A. Gaunt, A. Sanchez-Gonzalez, Y. Rubanova, P. Veličković, J. Kirkpatrick, and P. Battaglia, Simple GNN regularisation for 3D molecular property prediction & beyond, arXiv preprint arXiv: 2106.07971, 2022.
[131]
S. Liu, H. Guo, and J. Tang, Molecular geometry pretraining with SE(3)-invariant denoising distance matching, arXiv preprint arXiv: 2206.13602, 2023.
[132]
S. Feng, Y. Ni, Y. Lan, Z. M. Ma, and W. Y. Ma, Fractional denoising for 3D molecular pre-training, in Proc. 40 th Int. Conf. Machine Learning, Honolulu, HI, USA, 2023, pp. 9938–9961.
[133]
R. Jiao, J. Han, W. Huang, Y. Rong, and Y. Liu, Energy-motivated equivariant pretraining for 3D molecular graphs, in Proc. 37 th AAAI Conf. Artificial Intelligence, Washington, DC, USA, 2023, pp. 8096–8104.
[134]
X. Gao, W. Gao, W. Xiao, Z. Wang, C. Wang, and L. Xiang, Supervised pretraining for molecular force fields and properties prediction, arXiv preprint arXiv: 2211.14429, 2022.
[135]
X. Wang, H. Zhao, W. W. Tu, and Q. Yao, Automated 3D pre-training for molecular property prediction, in Proc. 29 th ACM SIGKDD Conf. Knowledge Discovery and Data Mining, Long Beach, CA, USA, 2023, pp. 2419–2430.
[136]
L. Zeng, L. Li, and J. Li, MolKD: Distilling cross-modal knowledge in chemical reactions for molecular property prediction, arXiv preprint arXiv: 2305.01912, 2023.
[137]
J. Broberg, M. Bånkestad, and E. Ylipää, Pre-training transformers for molecular property prediction using reaction prediction, arXiv preprint arXiv: 2207.02724, 2022.
[138]

X. C. Zhang, C. K. Wu, J. C. Yi, X. X. Zeng, C. Q. Yang, A. P. Lu, T. J. Hou, and D. S. Cao, Pushing the boundaries of molecular property prediction for drug discovery with multitask learning BERT enhanced by SMILES enumeration, Research, vol. 2022, p. 0004, 2022.

[139]

H. Abdel-Aty and I. R. Gould, Large-scale distributed training of transformers for chemical fingerprinting, J. Chem. Inf. Model., vol. 62, no. 20, pp. 4852–4862, 2022.

[140]

Z. Zheng, Y. Tan, H. Wang, S. Yu, T. Liu, and C. Liang, CasANGCL: Pre-training and fine-tuning model based on cascaded attention network and graph contrastive learning for molecular property prediction, Brief. Bioinform., vol. 24, no. 1, p. bbac566, 2023.

[141]
X. Guan and D. Zhang, T-MGCL: Molecule graph contrastive learning based on transformer for molecular property prediction, IEEE/ACM Trans. Comput. Biol. Bioinform., vol. 20, no. 6, pp. 3851–3862, 2023.
[142]

H. Liu, Y. Huang, X. Liu, and L. Deng, Attention-wise masked graph contrastive learning for predicting molecular property, Brief. Bioinform., vol. 23, no. 5, p. bbac303, 2022.

[143]

S. Lin, C. Liu, P. Zhou, Z. Y. Hu, S. Wang, R. Zhao, Y. Zheng, L. Lin, E. Xing, and X. Liang, Prototypical graph contrastive learning, IEEE Trans. Neural Netw. Learn. Syst., vol. 35, no. 2, pp. 2747–2758, 2024.

[144]
J. Cui, H. Chai, Y. Gong, Y. Ding, Z. Hua, C. Gao, and Q. Liao, MocGCL: Molecular graph contrastive learning via negative selection, in Proc. 2023 Int. Joint Conf. Neural Networks (IJCNN), Gold Coast, Australia, 2023, pp. 1–8.
[145]
K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick, Momentum contrast for unsupervised visual representation learning, in Proc. 2020 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Seattle, WA, USA, 2020, pp. 9726–9735.
[146]
M. J. Zaki and W. Meira Jr, Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge, UK: Cambridge University Press, 2014.
[147]
Y. Wang, Y. Min, E. Shao, and J. Wu, Molecular graph contrastive learning with parameterized explainable augmentations, in Proc. 2021 IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM), Houston, TX, USA, 2021, pp. 1558–1563.
[148]
M. Liu, Y. Yang, X. Gong, L. Liu, and Q. Liu, HierMRL: Hierarchical structure-aware molecular representation learning for property prediction, in Proc. 2022 IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM), Las Vegas, NV, USA, 2022, pp. 386–389.
[149]

J. Wang, J. Guan, and S. Zhou, Molecular property prediction by contrastive learning with attention-guided positive sample selection, Bioinformatics, vol. 39, no. 5, p. btad258, 2023.

[150]
K. Moon, H. J. Im, and S. Kwon, 3D graph contrastive learning for molecular property prediction, Bioinformatics, vol. 39, no. 6, p. btad371, 2023.
[151]
T. Kuang, Y. Ren, and Z. Ren, 3D-mol: A novel contrastive learning framework for molecular property prediction with 3D information, arXiv preprint arXiv: 2309.17366, 2024.
[152]

X. Wu, J. Duan, Y. Pan, and M. Li, Medical knowledge graph: Data sources, construction, reasoning, and applications, Big Data Mining and Analytics, vol. 6, no. 2, pp. 201–217, 2023.

[153]
R. Hua, X. Wang, C. Cheng, Q. Zhu, and X. Zhou, A chemical domain knowledge-aware framework for multi-view molecular property prediction, in Proc. 7th China Conf. Knowledge Graph and Semantic Computing Evaluations, Qinhuangdao, China, 2022, pp. 1–11.
[154]
Y. Fang, Q. Zhang, H. Yang, X. Zhuang, S. Deng, W. Zhang, M. Qin, Z. Chen, X. Fan, and H. Chen, Molecular contrastive learning with chemical element knowledge graph, in Proc. 36th AAAI Conf. Artificial Intelligence, Virtual Event, 2022, pp. 3968–3976.
[155]
M. Xu, H. Wang, B. Ni, H. Guo, and J. Tang, Self-supervised graph-level representation learning with local and global structure, in Proc. 38th Int. Conf. Machine Learning, Virtual Event, 2021, pp. 11548–11558.
[156]

X. Xu, C. Deng, Y. Xie, and S. Ji, Group contrastive self-supervised learning on graphs, IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 3, pp. 3169–3180, 2023.

[157]
X. Shen, Y. Liu, Y. Wu, and L. Xie, MoLGNN: Self-supervised motif learning graph neural network for drug discovery, in Proc. Machine Learning for Molecules Workshop at NeurIPS 2020, https://ml4molecules.github.io/papers2020/ML4Molecules_2020_paper_4.pdf, 2024.
[158]
X. Luo, W. Ju, M. Qu, Y. Gu, C. Chen, M. Deng, X. S. Hua, and M. Zhang, CLEAR: Cluster-enhanced contrast for self-supervised graph representation learning, IEEE Trans. Neural Netw. Learn. Syst., vol. 35, no. 1, pp. 899–912, 2024.
[159]
R. Benjamin, U. Singer, and K. Radinsky, Graph neural networks pretraining through inherent supervision for molecular property prediction, in Proc. 31st ACM Int. Conf. Information & Knowledge Management, Atlanta, GA, USA, 2022, pp. 2903–2912.
[160]
G. Shi, Y. Zhu, J. K. Liu, and X. Li, HeGCL: Advance self-supervised learning in heterogeneous graph-level representation, IEEE Trans. Neural Netw. Learn. Syst.
[161]

A. Xie, Z. Zhang, J. Guan, and S. Zhou, Self-supervised learning with chemistry-aware fragmentation for effective molecular property prediction, Brief. Bioinform., vol. 24, no. 5, p. bbad296, 2023.

[162]

Z. Ji, R. Shi, J. Lu, F. Li, and Y. Yang, ReLMole: Molecular representation learning based on two-level graph similarities, J. Chem. Inf. Model., vol. 62, no. 22, pp. 5361–5372, 2022.

[163]
R. Hadsell, S. Chopra, and Y. LeCun, Dimensionality reduction by learning an invariant mapping, in Proc. 2006 IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 2006, pp. 1735–1742.
[164]
G. A. Pinheiro, J. L. F. Da Silva, and M. G. Quiles, SMICLR: Contrastive learning on multiple molecular representations for semisupervised and unsupervised representation learning, J. Chem. Inf. Model., vol. 62, no. 17, pp. 3948–3960, 2022.
[165]
C. Zhang, X. Yan, and Y. Liu, Pseudo-siamese neural network based graph and sequence representation learning for molecular property prediction, in Proc. 2022 IEEE Int. Conf. Bioinformatics and Biomedicine (BIBM), Las Vegas, NV, USA, 2022, pp. 3911–3913.
[166]
H. Stärk, D. Beaini, G. Corso, P. Tossou, C. Dallago, S. Günnemann, and P. Liò, 3D infomax improves GNNs for molecular property prediction, in Proc. 39th Int. Conf. Machine Learning, Baltimore, MD, USA, 2022, pp. 20479–20502.
[167]
Y. Zhu, D. Chen, Y. Du, Y. Wang, Q. Liu, and S. Wu, Molecular contrastive pretraining with collaborative featurizations, J. Chem. Inf. Model., vol. 64, no. 4, pp. 1112–1122, 2024.
[168]
A. Tarvainen and H. Valpola, Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, in Proc. 31st Int. Conf. Advances in Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 1195–1204.
[169]

J. Chen, Y. W. Si, C. W. Un, and S. W. I. Siu, Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network, J. Cheminform., vol. 13, no. 1, p. 93, 2021.

[170]
D. Berthelot, N. Carlini, I. Goodfellow, A. Oliver, N. Papernot, and C. Raffel, MixMatch: A holistic approach to semi-supervised learning, in Proc. 33rd Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2019, pp. 5049–5059.
[171]

K. Yu, S. Visweswaran, and K. Batmanghelich, Semi-supervised hierarchical drug embedding in hyperbolic space, J. Chem. Inf. Model., vol. 60, no. 12, pp. 5647–5657, 2020.

[172]
H. Ma, F. Jiang, Y. Rong, Y. Guo, and J. Huang, Robust self-training strategy for various molecular biology prediction tasks, in Proc. 13th ACM Int. Conf. Bioinformatics, Computational Biology and Health Informatics, Northbrook, IL, USA, 2022, pp. 1–5.
[173]
Z. Zhang and M. R. Sabuncu, Generalized cross entropy loss for training deep neural networks with noisy labels, in Proc. 32nd Int. Conf. Neural Information Processing Systems, Montréal, Canada, 2018, pp. 8792–8802.
[174]
G. Liu, T. Zhao, E. Inae, T. Luo, and M. Jiang, Semi-supervised graph imbalanced regression, arXiv preprint arXiv: 2305.12087, 2023.
[175]
A. R. Zamir, A. Sax, W. Shen, L. Guibas, J. Malik, and S. Savarese, Taskonomy: Disentangling task transfer learning, in Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 3712–3722.
[176]

X. Li, X. Yan, Q. Gu, H. Zhou, D. Wu, and J. Xu, DeepChemStable: Chemical stability prediction with an attention-based graph convolution network, J. Chem. Inf. Model., vol. 59, no. 3, pp. 1044–1049, 2019.

[177]
X. Chen and K. He, Exploring simple siamese representation learning, in Proc. 2021 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Nashville, TN, USA, 2021, pp. 15745–15753.
[178]

H. Li, X. Zhao, S. Li, F. Wan, D. Zhao, and J. Zeng, Improving molecular property prediction through a task similarity enhanced transfer learning strategy, iScience, vol. 25, no. 10, p. 105231, 2022.

[179]

W. Ju, Z. Liu, Y. Qin, B. Feng, C. Wang, Z. Guo, X. Luo, and M. Zhang, Few-shot molecular property prediction via hierarchically structured learning on relation graphs, Neural Netw., vol. 163, pp. 122–131, 2023.

[180]
C. Q. Nguyen, C. Kreatsoulas, and K. M. Branson, Meta-learning GNN initializations for low-resource molecular property prediction, arXiv preprint arXiv: 2003.05996, 2020.
[181]

L. Torres, J. P. Arrais, and B. Ribeiro, Few-shot learning via graph embeddings with convolutional networks for low-data molecular property prediction, Neural Comput. Appl., vol. 35, no. 18, pp. 13167–13185, 2023.

[182]
H. S. de Ocáriz Borde and F. Barbero, Graph neural network expressivity and meta-learning for molecular property regression, arXiv preprint arXiv: 2209.13410, 2022.
[183]

K. P. Ham and L. Sael, Evidential meta-model for molecular property prediction, Bioinformatics, vol. 39, no. 10, p. btad604, 2023.

[184]
Z. Meng, Y. Li, P. Zhao, Y. Yu, and I. King, Meta-learning with motif-based task augmentation for few-shot molecular property prediction, in Proc. 2023 SIAM Int. Conf. Data Mining (SDM), Minneapolis-St. Paul Twin Cities, MN, USA, 2023, pp. 811–819.
[185]
Z. Guo, C. Zhang, W. Yu, J. Herr, O. Wiest, M. Jiang, and N. V. Chawla, Few-shot graph learning for molecular property prediction, in Proc. Web Conf. 2021, Ljubljana, Slovenia, 2021, pp. 2559–2567.
[186]
Y. Wang, A. Abuduweili, Q. Yao, and D. Dou, Property-aware relation networks for few-shot molecular property prediction, arXiv preprint arXiv: 2107.07994, 2021.
[187]
S. Yao, Z. Feng, J. Song, L. Jia, Z. Zhong, and M. Song, Chemical property relation guided few-shot molecular property prediction, in Proc. 2022 Int. Joint Conf. Neural Networks (IJCNN), Padua, Italy, 2022, pp. 1–8.
[188]
J. Dong, N. N. Wang, Z. J. Yao, L. Zhang, Y. Cheng, D. Ouyang, A. P. Lu, and D. S. Cao, ADMETlab: A platform for systematic ADMET evaluation based on a comprehensively collected ADMET database, J. Cheminform., vol. 10, p. 29, 2018.
[189]

D. van Tilborg, A. Alenicheva, and F. Grisoni, Exposing the limitations of molecular machine learning with activity cliffs, J. Chem. Inf. Model., vol. 62, no. 23, pp. 5938–5951, 2022.

[190]
Y. Ji, L. Zhang, J. Wu, B. Wu, L. Li, L. K. Huang, T. Xu, Y. Rong, J. Ren, D. Xue, et al., DrugOOD: Out-of-distribution dataset curator and benchmark for AI-aided drug discovery–a focus on affinity prediction problems with noise annotations, in Proc. 37th AAAI Conf. Artificial Intelligence, Washington, DC, USA, 2023, pp. 8023–8031.
[191]

S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schütt, and K. R. Müller, Machine learning of accurate energy-conserving molecular force fields, Sci. Adv., vol. 3, no. 5, p. e1603015, 2017.

[192]
C. Morris, N. M. Kriege, F. Bause, K. Kersting, P. Mutzel, and M. Neumann, TUDataset: A collection of benchmark datasets for learning with graphs, arXiv preprint arXiv: 2007.08663, 2020.
[193]
W. Hu, M. Fey, M. Zitnik, Y. Dong, H. Ren, B. Liu, M. Catasta, and J. Leskovec, Open graph benchmark: Datasets for machine learning on graphs, in Proc. 34th Int. Conf. Neural Information Processing Systems, Vancouver, Canada, 2020, pp. 22118–22133.
[194]

A. Wojtuch, T. Danel, S. Podlewska, and Ł. Maziarka, Extended study on atomic featurization in graph neural networks for molecular property prediction, J. Cheminform., vol. 15, no. 1, p. 81, 2023.

[195]

Z. Zeng, Y. Yao, Z. Liu, and M. Sun, A deep-learning system bridging molecule structure and biomedical text with comprehension comparable to human professionals, Nat. Commun., vol. 13, no. 1, p. 862, 2022.

[196]
K. Xu, W. Hu, J. Leskovec, and S. Jegelka, How powerful are graph neural networks? arXiv preprint arXiv: 1810.00826, 2019.
[197]
B. Su, D. Du, Z. Yang, Y. Zhou, J. Li, A. Rao, H. Sun, Z. Lu, and J. R. Wen, A molecular multimodal foundation model associating molecule graphs with natural language, arXiv preprint arXiv: 2209.05481, 2022.
[198]
X. Tang, A. Tran, J. Tan, and M. B. Gerstein, MolLM: A unified language model to integrate biomedical text with 2D and 3D molecular representations, bioRxiv. doi: 10.1101/2023.11.25.568656.
Big Data Mining and Analytics
Pages 858-888
Cite this article:
Kuang T, Liu P, Ren Z. Impact of Domain Knowledge and Multi-Modality on Intelligent Molecular Property Prediction: A Systematic Survey. Big Data Mining and Analytics, 2024, 7(3): 858-888. https://doi.org/10.26599/BDMA.2024.9020028

Received: 18 February 2024
Revised: 10 April 2024
Accepted: 15 April 2024
Published: 28 August 2024
© The author(s) 2024.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
