Open Access

DeepRetention: A Deep Learning Approach for Intron Retention Detection

Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China

Zhenpeng Wu and Jiantao Zheng contributed equally to this paper.

Abstract

As the least understood mode of alternative splicing, Intron Retention (IR) is emerging as an active research area and has attracted increasing attention in studies of gene regulation and disease. Existing methods detect IR exclusively based on one or a few predefined metrics that describe local or summarized characteristics of retained introns. These metrics cannot describe the pattern of sequencing depth of intronic reads, which is an intuitive and informative characteristic of retained introns. We hypothesize that incorporating the distribution pattern of intronic reads will improve the accuracy of IR detection. Here we present DeepRetention, a novel approach that detects IR by modeling the pattern of sequencing depth of introns. Because no gold standard IR dataset exists, we first compare DeepRetention with two state-of-the-art methods, iREAD and IRFinder, on simulated RNA-seq datasets with retained introns. The results show that DeepRetention outperforms both methods. Next, DeepRetention performs well when applied to third-generation long-read RNA-seq data, to which IRFinder and iREAD are not applicable. Further, using an RNA-seq dataset from Alzheimer’s Disease (AD) samples, we show that the IRs predicted by DeepRetention are biologically meaningful: the differential IRs are significantly associated with AD according to statistical evaluation of an AD-specific functional gene network, and their parent genes are enriched in AD-related functions. In summary, DeepRetention detects IR from a new perspective and provides a valuable tool for IR analysis.
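
The motivation stated in the abstract, that summary metrics cannot capture the pattern of intronic sequencing depth, can be illustrated with a small sketch. This is not the DeepRetention implementation. The function `depth_profile`, the bin count, and the two toy introns are hypothetical and chosen only for illustration. The sketch compares two introns with identical mean depth but very different coverage patterns.

```python
from statistics import mean

def depth_profile(depth, n_bins=8):
    """Resample per-base depth into n_bins mean-depth bins, scaled so the peak is 1.

    Hypothetical helper for illustration only; not taken from DeepRetention.
    """
    if len(depth) < n_bins:
        raise ValueError("intron shorter than the number of bins")
    # Bin boundaries spread evenly over the intron length.
    edges = [round(i * len(depth) / n_bins) for i in range(n_bins + 1)]
    bins = [mean(depth[edges[i]:edges[i + 1]]) for i in range(n_bins)]
    peak = max(bins)
    return [b / peak if peak > 0 else 0.0 for b in bins]

# Two hypothetical 80-bp introns, both with a mean depth of 10 reads per base.
uniform = [10] * 80             # even coverage across the whole intron
spike = [0] * 70 + [80] * 10    # all reads piled up in the last 10 bp

same_mean = mean(uniform) == mean(spike)     # a mean-depth metric cannot tell them apart
uniform_profile = depth_profile(uniform)     # flat profile
spike_profile = depth_profile(spike)         # signal only in the final bin
```

A metric based on mean depth scores both toy introns identically, while the binned profiles differ in every bin. Even coverage across an intron is more consistent with genuine retention than a localized pile-up of reads, which is the kind of distinction a model of the depth pattern is intended to capture.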

Big Data Mining and Analytics
Pages 115-126
Cite this article:
Wu Z, Zheng J, Liu J, et al. DeepRetention: A Deep Learning Approach for Intron Retention Detection. Big Data Mining and Analytics, 2023, 6(2): 115-126. https://doi.org/10.26599/BDMA.2022.9020023


Received: 11 March 2022
Revised: 07 June 2022
Accepted: 06 July 2022
Published: 25 January 2023
© The author(s) 2023.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
