Open Access

Multiplex Networks and Pan-Cancer Multiomics-Based Driver Gene Identification Using Graph Neural Networks

School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China, and also with the Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518063, China
School of Software, Northwestern Polytechnical University, Xi’an 710072, China, and also with the Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518063, China
School of Computer Science, Northwestern Polytechnical University, Xi’an 710072, China
School of Computer Science and Engineering, Central South University, Changsha 410083, China

Abstract

Identifying cancer driver genes is essential for elucidating the mechanisms underlying cancer development and progression and for guiding therapeutic interventions. Abundant omics data and interactome networks, available from many large databases, enable the use of graph deep learning techniques that incorporate network structure into the deep learning framework. However, most existing models focus on a single network and therefore overlook the incompleteness and noise of its interactions. Moreover, class imbalance among the samples used for driver gene identification hampers model performance. To address these two issues, we propose MMGN, a novel deep learning framework that integrates multiplex networks and pan-cancer multiomics data using graph neural networks combined with negative sample inference to discover cancer driver genes. MMGN enhances gene feature learning through mutual information and a consensus regularizer, and it balances the positive and negative classes used for model training. The reliability of MMGN has been verified using the Area Under the Receiver Operating Characteristic Curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC). We believe MMGN has the potential to open new prospects in precision oncology and may find broader application in predicting biomarkers for other complex diseases. An implementation of MMGN is available at https://github.com/xingyili/MMGN.

Electronic Supplementary Material

BDMA-2024-0084_ESM.pdf (328.3 KB)

Big Data Mining and Analytics
Pages 1262-1272
Cite this article:
Li X, Li J, Hao J, et al. Multiplex Networks and Pan-Cancer Multiomics-Based Driver Gene Identification Using Graph Neural Networks. Big Data Mining and Analytics, 2024, 7(4): 1262-1272. https://doi.org/10.26599/BDMA.2024.9020043