Identifying cancer driver genes is essential for elucidating the intricate mechanisms underlying cancer development, progression, and therapeutic interventions. Abundant omics data and interactome networks provided by numerous large databases enable the application of graph deep learning techniques, which incorporate network structure into the deep learning framework. However, most existing models focus on a single network and therefore overlook the incompleteness and noise of its interactions. Moreover, class imbalance among the samples used for driver gene identification hampers model performance. To address these issues, we propose MMGN, a novel deep learning framework that integrates multiplex networks and pan-cancer multiomics data using graph neural networks combined with negative sample inference to discover cancer driver genes. MMGN not only enhances gene feature learning through mutual information and a consensus regularizer, but also balances the positive and negative samples used for model training. The reliability of MMGN is verified using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). We believe MMGN has the potential to provide new prospects in precision oncology and may find broader applications in predicting biomarkers for other complex diseases. Implementations of MMGN can be found at https://github.com/xingyili/MMGN.
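To make the two evaluation metrics concrete, the following is a minimal sketch of how AUROC and AUPRC can be computed for a ranked list of genes. It is not taken from the MMGN repository: the function names `auroc` and `average_precision`, and the toy labels and scores, are assumptions chosen for illustration. AUPRC is approximated here by average precision, one common estimator, and tied scores are not given special treatment in that function.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision at each positive hit,
    walking the samples in descending score order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return total / hits


# Hypothetical toy data: 1 = known driver gene, 0 = non-driver.
toy_labels = [1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

print("AUROC:", auroc(toy_labels, toy_scores))
print("AUPRC (average precision):", average_precision(toy_labels, toy_scores))
```

Because AUPRC depends on the ratio of positives to negatives while AUROC does not, reporting both is informative when driver genes are a small minority of all genes.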