Abstract
A growing number of RNA sequences are now known to exist in some distribution with two or more different stable structures. Recent algorithms attempt to reconstruct such mixtures using the list of nucleotides in a sequence in conjunction with auxiliary experimental footprinting data. In this paper, we demonstrate some challenges which remain in addressing this problem; in particular, we consider the difficulty of reconstructing a mixture of two RNA structures across a spectrum of different relative abundances. Although progress has been made in identifying the stable structures present, it remains nontrivial to predict the relative abundance of each within the experimentally sampled mixture. Because the ratio of structures present can change depending on experimental conditions, it is the footprinting data, and not the sequence, which must encode information on changes in the relative abundance. Here, we use simulated experimental data to demonstrate that there exist RNA sequences and relative abundance combinations which cannot be recovered by current methods. We then prove that this is not a single exception, but rather part of the rule. In particular, we show, using a Nussinov–Jacobson model, that recovering the relative abundances is difficult for a large proportion of RNA structure pairs. Lastly, we use information theory to establish a framework for quantifying how useful auxiliary data is in predicting the relative abundance of a structure. Together, these results demonstrate that aspects of the problem of reconstructing a mixture of RNA structures from experimental data remain open.
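For readers unfamiliar with the Nussinov–Jacobson model named above, the following is a minimal sketch of its classical base-pair maximization recursion. It is not the authors' implementation: the function name `max_base_pairs`, the inclusion of G-U wobble pairs, and the `min_loop` parameter (minimum number of unpaired nucleotides inside a hairpin) are assumptions made for illustration only.

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of non-crossing base pairs in seq.

    min_loop is a hypothetical parameter: the minimum number of unpaired
    nucleotides required between the two ends of any base pair.
    """
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j] (inclusive).
    # Entries with j < i stay 0 and act as the empty-interval base case.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: nucleotide j is unpaired.
            score = best[i][j - 1]
            # Case 2: nucleotide j pairs with some k, leaving at least
            # min_loop unpaired positions strictly between k and j.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]
```

Under this simplified energy model, every pair contributes equally, so many distinct structures can share the same optimal score. That degeneracy is one informal way to see why a sequence alone cannot determine relative abundances within a mixture.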
Acknowledgements
The authors thank the anonymous referees for valuable comments improving the exposition of the paper and quality of the results. This work was supported by funds from the National Institutes of Health (R01GM126554 to CEH) and the National Science Foundation (DMS1344199 to CEH).
Cite this article
Greenwood, T., Heitsch, C.E. On the Problem of Reconstructing a Mixture of RNA Structures. Bull Math Biol 82, 133 (2020). https://doi.org/10.1007/s11538-020-00804-0