Open Access

Multiplex Networks and Pan-Cancer Multiomics-Based Driver Gene Identification Using Graph Neural Networks

Xingyi Li¹, Junming Li², Jun Hao³, Xingyu Liao³, Min Li⁴, Xuequn Shang³

¹ School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China, and also with the Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518063, China

² School of Software, Northwestern Polytechnical University, Xi’an 710072, China, and also with the Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518063, China

³ School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China

⁴ School of Computer Science and Engineering, Central South University, Changsha 410083, China

Abstract

Identifying cancer driver genes is essential for elucidating the mechanisms underlying cancer development and progression and for guiding therapeutic interventions. Abundant omics data and interactome networks from large public databases enable graph deep learning techniques, which incorporate network structure into the deep learning framework. However, most existing models focus on a single network and thus neglect the incompleteness and noise of its interactions. Moreover, class imbalance among the samples available for driver gene identification hampers model performance. To address these issues, we propose MMGN, a novel deep learning framework that integrates multiplex networks and pan-cancer multiomics data using graph neural networks combined with negative sample inference to discover cancer driver genes. MMGN not only enhances gene feature learning through mutual information and a consensus regularizer, but also balances the positive and negative samples used for model training. The reliability of MMGN is verified in terms of the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). We believe MMGN has the potential to provide new prospects in precision oncology and may find broader applications in predicting biomarkers for other complex diseases. An implementation of MMGN is available at https://github.com/xingyili/MMGN.
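
For readers less familiar with the two evaluation metrics named above, the following is a minimal pure-Python sketch of how AUROC and AUPRC can be computed for a ranked list of genes. It is not the MMGN evaluation code: the function names `auroc` and `average_precision` are hypothetical, the labels and scores are made-up toy values, and AUPRC is approximated here by average precision, which is one common estimator.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive is scored
    above a randomly chosen negative (tied pairs count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: the mean of precision@k taken at the rank k of each
    positive. Assumes no tied scores between positives and negatives."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return precision_sum / hits


# Toy, imbalanced example (illustrative values only): 2 "driver" genes
# (label 1) among 8 genes, with hypothetical model scores.
labels = [1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.30, 0.25, 0.20, 0.10, 0.05]

print(f"AUROC = {auroc(labels, scores):.4f}")
print(f"AUPRC (average precision) = {average_precision(labels, scores):.4f}")
```

On imbalanced data such as this toy list, a single negative ranked above a positive lowers average precision more than it lowers AUROC, which is one reason the two metrics are often reported together.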

Keywords

cancer driver gene; multiplex networks; pan-cancer multiomics data; graph neural networks; negative sample inference

Electronic Supplementary Material

BDMA-2024-0084_ESM.pdf (328.3 KB)

Big Data Mining and Analytics

Volume 7, Issue 4, December 2024

Pages 1262-1272

DOI: 10.26599/BDMA.2024.9020043

Cite this article:

Li X, Li J, Hao J, et al. Multiplex Networks and Pan-Cancer Multiomics-Based Driver Gene Identification Using Graph Neural Networks. Big Data Mining and Analytics, 2024, 7(4): 1262-1272. https://doi.org/10.26599/BDMA.2024.9020043
