A Sampling Approach for Multiple RNA Interaction - Finding Sub-optimal Solutions Fast

Saad Mneimneh, Syed Ali Ahmed

Abstract

The interaction of two RNA molecules involves a complex interplay between folding and binding, which has motivated recent developments in RNA-RNA interaction algorithms. These algorithms, however, cannot predict interaction structures when more than two RNAs are involved. Our recent formulation of the multiple RNA interaction problem is based on a combinatorial optimization problem called Pegs and Rubber Bands, and has been successful in predicting structures that involve more than two RNAs. Even then, the optimal solution obtained does not necessarily correspond to the actual biological structure. Moreover, the structure produced by interacting RNAs may not be unique to begin with. Multiple solutions, and thus sub-optimal ones, are therefore needed. Our previous approach generated multiple sub-optimal solutions by exhaustive enumeration; here we extend it by developing a sampling approach for multiple RNA interaction. Because relatively few samples are needed to reveal solutions that are sufficiently different, sampling provides a much faster alternative. By clustering the sampled solutions, we obtain representatives that correspond to the biologically observed structures. Specifically, our results for the U2-U6 complex and its introns in the spliceosome of yeast, and for the CopA-CopT complex in E. coli, are consistent with published biological structures.
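
The clustering step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding of a solution as a set of inter-RNA base pairs, the use of Jaccard distance, the greedy leader clustering, the threshold value, and all function names are assumptions made purely for illustration.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical encoding: one inter-RNA base pair is (rna_a, pos_a, rna_b, pos_b),
# and one sampled solution is the set of base pairs it contains.
Pair = Tuple[str, int, str, int]
Solution = FrozenSet[Pair]


def jaccard_distance(a: Solution, b: Solution) -> float:
    """1 - |a & b| / |a | b|; two empty solutions are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster(solutions: List[Solution], threshold: float) -> List[List[int]]:
    """Greedy leader clustering (an assumed, simple scheme).

    Each solution joins the first cluster whose leader (its first member)
    lies within `threshold`; otherwise it starts a new cluster.
    Returns clusters as lists of indices into `solutions`.
    """
    clusters: List[List[int]] = []
    for i, sol in enumerate(solutions):
        for members in clusters:
            if jaccard_distance(sol, solutions[members[0]]) <= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def medoid(solutions: List[Solution], members: List[int]) -> int:
    """Index of the member with the smallest total distance to the others."""
    return min(
        members,
        key=lambda i: sum(
            jaccard_distance(solutions[i], solutions[j]) for j in members
        ),
    )


def representatives(solutions: List[Solution], threshold: float = 0.5) -> List[int]:
    """One representative (medoid) index per cluster."""
    return [medoid(solutions, c) for c in cluster(solutions, threshold)]
```

Given a list of sampled solutions, `representatives(samples)` returns one index per cluster. Under these assumptions, the indexed solutions play the role of the cluster representatives that are compared against published structures.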


Paper Citation


in Harvard Style

Mneimneh S. and Ahmed S. (2016). A Sampling Approach for Multiple RNA Interaction - Finding Sub-optimal Solutions Fast. In Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016) ISBN 978-989-758-170-0, pages 75-84. DOI: 10.5220/0005707900750084


in Bibtex Style

@conference{bioinformatics16,
author={Saad Mneimneh and Syed Ali Ahmed},
title={A Sampling Approach for Multiple RNA Interaction - Finding Sub-optimal Solutions Fast},
booktitle={Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016)},
year={2016},
pages={75-84},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005707900750084},
isbn={978-989-758-170-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016)
TI - A Sampling Approach for Multiple RNA Interaction - Finding Sub-optimal Solutions Fast
SN - 978-989-758-170-0
AU - Mneimneh S.
AU - Ahmed S.
PY - 2016
SP - 75
EP - 84
DO - 10.5220/0005707900750084