Machine Learning Studies of Non-coding RNAs based on Artificially Constructed Training Data

Authors: Mirele C. S. F. Costa (1); João Victor A. Oliveira (2); Waldeyr M. C. da Silva (1, 3); Rituparno Sen (4); Jörg Fallmann (4); Peter F. Stadler (4, 5, 6, 7) and Maria Emília M. T. Walter (1)

Affiliations: 1 University of Brasília (UnB), Brazil ; 2 Federal Institute of Brasília (IFB), Brazil ; 3 Federal Institute of Goiás (IFG), Brazil ; 4 Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany ; 5 German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany ; 6 Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany ; 7 Santa Fe Institute, Santa Fe, U.S.A.

Keyword(s): Small Nucleolar RNAs (snoRNAs), Non-Coding RNA Inference, Machine Learning, Chordate Genome.

Abstract: Machine learning (ML) methods are often used to identify members of non-coding RNA classes such as microRNAs or snoRNAs. However, ML methods have not been successfully used for homology search tasks. A systematic evaluation of ML in homology search requires large, controlled test sets with known ground truth, and thus methods to construct large, realistic artificial data sets. Here we describe a method for producing arbitrarily large and diverse sets of snoRNA sequences based on artificial evolution. These sets are then used to evaluate supervised ML methods (Support Vector Machine, Artificial Neural Network, and Random Forest) for snoRNA detection in a chordate genome. Our results indicate that ML approaches can indeed be competitive for homology search as well.
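
As a rough illustration of what "artificial evolution" of sequences can look like, the sketch below repeatedly applies random point substitutions to a seed RNA while leaving two motif regions untouched. It is not the authors' procedure, which the abstract does not specify. The seed sequence, the mutation rate, the choice to protect a C-box-like (UGAUGA) and a D-box-like (CUGA) motif, and the function names `mutate` and `evolve` are all assumptions made for this example.

```python
import random

ALPHABET = "ACGU"

def mutate(seq, rate, protected, rng):
    """Copy seq, substituting each unprotected base with probability `rate`."""
    out = []
    for i, base in enumerate(seq):
        if i not in protected and rng.random() < rate:
            # Substitute with one of the three other bases.
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)

def evolve(seed, generations, children, rate, protected, rng):
    """Return the final generation of a tree of mutated descendants of seed."""
    population = [seed]
    for _gen in range(generations):
        population = [
            mutate(parent, rate, protected, rng)
            for parent in population
            for _child in range(children)
        ]
    return population

# Hypothetical toy seed, not a real snoRNA.
SEED = "GGAUGAUGACUUACGGAUUCACCGUACUGACC"
c_start = SEED.find("UGAUGA")   # C-box-like motif
d_start = SEED.rfind("CUGA")    # D-box-like motif
PROTECTED = set(range(c_start, c_start + 6)) | set(range(d_start, d_start + 4))

rng = random.Random(0)  # fixed seed so the run is reproducible
family = evolve(SEED, generations=4, children=3, rate=0.05,
                protected=PROTECTED, rng=rng)
print(len(family), "artificial sequences generated")
```

Because the generated sequences descend from a known seed, their class membership is known by construction, which is the property the abstract asks of a ground truth test set. A realistic generator would likely also need insertions, deletions, and structure-aware constraints, which this sketch omits.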

CC BY-NC-ND 4.0


Paper citation in several formats:
Costa, M.; Oliveira, J.; C. da Silva, W.; Sen, R.; Fallmann, J.; Stadler, P. and Walter, M. (2021). Machine Learning Studies of Non-coding RNAs based on Artificially Constructed Training Data. In Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021) - BIOINFORMATICS; ISBN 978-989-758-490-9; ISSN 2184-4305, SciTePress, pages 176-183. DOI: 10.5220/0010346000002865

@conference{bioinformatics21,
author={Mirele C. S. F. Costa and João Victor A. Oliveira and Waldeyr M. {C. da Silva} and Rituparno Sen and Jörg Fallmann and Peter F. Stadler and Maria Emília M. T. Walter},
title={Machine Learning Studies of Non-coding RNAs based on Artificially Constructed Training Data},
booktitle={Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021) - BIOINFORMATICS},
year={2021},
pages={176-183},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010346000002865},
isbn={978-989-758-490-9},
issn={2184-4305},
}

TY - CONF

JO - Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021) - BIOINFORMATICS
TI - Machine Learning Studies of Non-coding RNAs based on Artificially Constructed Training Data
SN - 978-989-758-490-9
IS - 2184-4305
AU - Costa, M.
AU - Oliveira, J.
AU - C. da Silva, W.
AU - Sen, R.
AU - Fallmann, J.
AU - Stadler, P.
AU - Walter, M.
PY - 2021
SP - 176
EP - 183
DO - 10.5220/0010346000002865
PB - SciTePress