243.70 respectively in datasets C and D. We argue
that RNASP maintains a desirable mean runtime even
as the sequence length increases, which makes it well
suited to designing longer sequences in a shorter time.
Modena recorded the longest mean runtime, while
incaRNAfbinv could not solve any sequence within the
60-second time limit.
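The runtime comparison above can be illustrated with a minimal sketch of how a mean runtime under a 60-second limit might be measured. The names `timed_solve`, `mean_runtime`, and the `solver(target, time_limit)` interface are hypothetical and are not taken from the RNASP code base. Averaging only over solved targets is also an assumption, since the averaging rule is not restated here.

```python
import time
from statistics import mean


def timed_solve(solver, target, time_limit=60.0):
    """Run one design attempt and report (solved, elapsed_seconds).

    `solver` is a hypothetical callable taking (target, time_limit) and
    returning a sequence string, or None if it gives up.
    """
    start = time.perf_counter()
    sequence = solver(target, time_limit)
    elapsed = time.perf_counter() - start
    # Assumption: a run counts as solved only if it finishes within the limit.
    solved = sequence is not None and elapsed <= time_limit
    return solved, elapsed


def mean_runtime(solver, targets, time_limit=60.0):
    """Return (mean seconds over solved targets, or None; number solved)."""
    runs = [timed_solve(solver, t, time_limit) for t in targets]
    solved_times = [elapsed for solved, elapsed in runs if solved]
    avg = mean(solved_times) if solved_times else None
    return avg, len(solved_times)
```

A solver that never returns a sequence within the limit yields `(None, 0)`, which corresponds to a tool that solves no target under the constraint.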
8 CONCLUSIONS AND FUTURE WORK
In this research, we have shown that self-play can be
used to model an agent that learns to design RNA
sequences that fold to match a given target structure.
By performing state evaluation using a deep neural
network, we have shown that RNASP can learn to design
RNA sequences with desirable energy and GC content
values.
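As a minimal sketch of two quantities mentioned above, the following computes the GC content of a candidate sequence and checks whether it can form the base pairs required by a target structure in dot-bracket notation. The function names are hypothetical, and allowing the G-U wobble pair is an assumption. Folding and energy evaluation (for example with the ViennaRNA package) are deliberately left out, so this is not the RNASP evaluation itself.

```python
# Watson-Crick pairs plus the G-U wobble pair (assumed to be allowed).
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}


def gc_content(sequence):
    """Fraction of bases in the sequence that are G or C."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return sum(base in "GC" for base in seq) / len(seq)


def paired_positions(structure):
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(sequence, structure):
    """True if every paired position holds bases that can pair."""
    if len(sequence) != len(structure):
        return False
    seq = sequence.upper()
    return all((seq[i], seq[j]) in VALID_PAIRS
               for i, j in paired_positions(structure))
```

Compatibility is only a necessary condition: a compatible sequence may still fold into a different minimum free energy structure, which is why a folding step is needed to confirm a design.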
RNASP recorded the best score on two benchmark
datasets and the best runtime on longer sequences.
As future research, it would be interesting to
investigate whether other encoding schemes, different
value network architectures, or novel learning methods
would yield better results.
In addition, extending self-play to other real-world
problems such as drug design, genetics, protein
folding and protein-protein interaction remains an
interesting direction for future research.
9 SUPPLEMENTARY MATERIAL
The code and data used in this research are available at
https://github.com/kobonyo/sprna
ACKNOWLEDGEMENTS
The authors would like to acknowledge the support
of the French Embassy in Kenya and Strathmore
University in Nairobi. The two entities facilitated a
working environment that enabled the success of this
research.
REFERENCES
Andronescu, M., Fejes, A. P., Hutter, F., Hoos, H. H., and
Condon, A. (2004). A new algorithm for rna sec-
ondary structure design. Journal of molecular biology,
336(3):607–624.
Bai, Y. and Jin, C. (2020). Provable self-play algorithms
for competitive reinforcement learning. In Interna-
tional conference on machine learning, pages 551–
560. PMLR.
Berner, C., Brockman, G., Chan, B., Cheung, V., Debiak,
P., Dennison, C., Farhi, D., Fischer, Q., Hashme, S.,
Hesse, C., et al. (2019). Dota 2 with large scale deep
reinforcement learning. arXiv preprint 1912.06680.
Brockman, G., Cheung, V., Pettersson, L., Schneider, J.,
Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI
Gym.
Busch, A. and Backofen, R. (2006). Info-rna a fast approach
to inverse rna folding. Bioinformatics, 22(15):1823–
1831.
Cazenave, T., Chen, Y.-C., Chen, G.-W., Chen, S.-Y., Chiu,
X.-D., Dehos, J., Elsa, M., Gong, Q., Hu, H., Khali-
dov, V., et al. (2020). Polygames: Improved zero
learning. ICGA Journal, pages 1–13.
Cazenave, T. and Fournier, T. (2020). Monte carlo inverse
folding. arXiv preprint 2005.09961.
Esmaili-Taheri, A. and Ganjtabesh, M. (2015). Erd: a fast
and reliable tool for rna design including constraints.
BMC bioinformatics, 16(1):1–11.
Garcia-Martin, J. A., Clote, P., and Dotu, I. (2013).
Rnaifold: a constraint programming algorithm for
rna inverse folding and molecular design. Jour-
nal of bioinformatics and computational biology,
11(02):1350001.
Hofacker, I. L., Fontana, W., Stadler, P. F., Bonhoeffer,
L. S., Tacker, M., and Schuster, P. (1994). Fast folding
and comparison of rna secondary structures. Monatshefte
für Chemie/Chemical Monthly, 125(2):167–188.
Ioffe, S. and Szegedy, C. (2015). Batch normalization: Ac-
celerating deep network training by reducing internal
covariate shift. In International conference on ma-
chine learning, pages 448–456. PMLR.
Kleinkauf, R., Houwaart, T., Backofen, R., and Mann, M.
(2015a). antarna–multi-objective inverse folding of
pseudoknot rna using ant-colony optimization. BMC
bioinformatics, 16(1):1–7.
Kleinkauf, R., Mann, M., and Backofen, R. (2015b). an-
tarna: ant colony-based rna sequence design. Bioin-
formatics, 31(19):3114–3121.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Im-
agenet classification with deep convolutional neural
networks. Advances in neural information processing
systems, 25:1097–1105.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998).
Gradient-based learning applied to document recogni-
tion. IEEE, 86(11):2278–2324.
Levin, A., Lis, M., Ponty, Y., O’Donnell, C. W., Devadas,
S., Berger, B., and Waldispühl, J. (2012). A global
sampling approach to designing and reengineering
rna secondary structures. Nucleic acids research,
40(20):10041–10052.
Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C.,
Tafer, H., Flamm, C., Stadler, P. F., and Hofacker, I. L.
(2011). Viennarna package 2.0. Algorithms for molec-
ular biology, 6(1):1–14.