Open Access

A Novel Structural Measure Separating Non-Coding RNAs from Genomic Backgrounds

School of Information Technology, Middle Georgia State University, Macon, GA 31206, USA.
Department of Plant Biology and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA.
Department of Computer Science and Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA.

Abstract

RNA secondary structure has become the most exploitable feature for ab initio detection of non-coding RNA (ncRNA) genes from genome sequences. Previous methods based on Minimum Free Energy (MFE) identify ncRNAs by measuring the stability and certainty of a sequence's fold. However, these methods perform unevenly across different ncRNA species. Novel, reliable structural measures would therefore help in developing effective ncRNA gene-finding tools. This paper introduces a new RNA structural measure based on a novel RNA secondary structure ensemble constrained by characteristics of native RNA tertiary structures. The new measure achieves a leap in performance over previous structure-based methods. Test results on standard ncRNA benchmark datasets demonstrate that it can effectively separate most ncRNA families from genomic backgrounds.
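
To make the idea of a fold-stability measure concrete, the following is a minimal sketch of the general approach used by earlier stability-based methods, not the measure introduced in this paper. It assumes a toy stability proxy (the maximum number of nested base pairs, computed with a Nussinov-style dynamic program) in place of a real MFE calculation, and it compares a sequence against shuffled copies of itself as a stand-in for a genomic background. The function names `max_base_pairs` and `stability_z_score`, the minimum loop length, and the shuffle count are all illustrative assumptions.

```python
import random
from statistics import mean, pstdev

# Watson-Crick pairs plus the G-U wobble pair allowed in this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style dynamic program).

    A crude proxy for fold stability; it is NOT a free-energy model.
    """
    n = len(seq)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j stays unpaired
            # case: position j pairs with k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


def stability_z_score(seq, n_shuffles=100, seed=0):
    """Z-score of the toy stability proxy against shuffled copies of seq.

    Shuffling preserves only mononucleotide composition (an assumption made
    for brevity; dinucleotide-preserving shuffles are a stricter control).
    """
    rng = random.Random(seed)
    observed = max_base_pairs(seq)
    background = []
    for _ in range(n_shuffles):
        letters = list(seq)
        rng.shuffle(letters)
        background.append(max_base_pairs("".join(letters)))
    spread = pstdev(background)
    if spread == 0:
        return 0.0  # background is uniform, so no separation can be measured
    return (observed - mean(background)) / spread


if __name__ == "__main__":
    hairpin = "GGGGGGAAAACCCCCC"
    print(max_base_pairs(hairpin), stability_z_score(hairpin))
```

A larger positive z-score means the sequence pairs more extensively than its shuffled counterparts. A real pipeline would replace the base-pair count with a thermodynamic folding program and would likely use a background drawn from actual genomic sequence.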

Tsinghua Science and Technology
Pages 474-483
Cite this article:
Wang Y, Malmberg RL, Cai L. A Novel Structural Measure Separating Non-Coding RNAs from Genomic Backgrounds. Tsinghua Science and Technology, 2015, 20(5): 474-483. https://doi.org/10.1109/TST.2015.7297746


Received: 24 June 2015
Accepted: 24 July 2015
Published: 13 October 2015
© The author(s) 2015