The Composition of Dense Neural Networks and Formal Grammars for Secondary Structure Analysis

Semyon Grigorev, Polina Lunina

2019

Abstract

We propose a way to combine formal grammars and artificial neural networks for processing biological sequences. Formal grammars encode the secondary structure of the sequence, and neural networks deal with mutations and noise. In contrast to the classical approach, in which probabilistic grammars are used for secondary structure modeling, we propose to use arbitrary (not probabilistic) grammars, which simplifies grammar creation. Instead of modeling the structure of the whole sequence, we create a grammar that describes only features of the secondary structure. Then we use undirected matrix-based parsing to extract features: the fact that some substring can be derived from some nonterminal is a feature. After that, we use a dense neural network to process the features. In this paper, we describe in detail all the parts of our recipe: a grammar, a parsing algorithm, and a network architecture. We discuss possible improvements and future work. Finally, we provide the results of tRNA and 16S rRNA processing, which show the applicability of our idea to real problems.
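The feature-extraction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy grammar, the names `TERMINAL_RULES`, `BINARY_RULES`, `cyk_table`, and `stem_features`, and the use of a plain CYK-style tabular parser in place of the matrix-based parser from the paper are all assumptions made for illustration. The sketch marks, for every substring, whether it can be derived from a hypothetical nonterminal `Stem` (a complementary base pair closed around a non-empty interior) and flattens those facts into a 0/1 vector of the kind a dense network could take as input.

```python
# Hypothetical toy grammar in Chomsky normal form (NOT the grammar from the paper).
# Any  -> any non-empty string of bases
# Stem -> x Any y, where (x, y) is a complementary pair
BASES = "ACGU"
PAIRS = [("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")]

# (head, terminal): each base derives itself, and Any derives every single base.
TERMINAL_RULES = [(b, b) for b in BASES] + [("Any", b) for b in BASES]

# (head, left, right): Stem -> x R_y and R_y -> Any y encode Stem -> x Any y.
BINARY_RULES = [("Any", "Any", "Any")]
for x, y in PAIRS:
    BINARY_RULES.append(("Stem", x, "R_" + y))
    BINARY_RULES.append(("R_" + y, "Any", y))


def cyk_table(seq):
    """Return table[i][j]: the set of nonterminals deriving seq[i..j] inclusive."""
    n = len(seq)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(seq):
        for head, term in TERMINAL_RULES:
            if term == ch:
                table[i][i].add(head)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):  # split point: seq[i..k] and seq[k+1..j]
                for head, left, right in BINARY_RULES:
                    if left in table[i][k] and right in table[k + 1][j]:
                        table[i][j].add(head)
    return table


def stem_features(seq, nonterminal="Stem"):
    """Flatten the upper triangle of the parsing table into a 0/1 feature vector."""
    table = cyk_table(seq)
    n = len(seq)
    return [
        1 if nonterminal in table[i][j] else 0
        for i in range(n)
        for j in range(i, n)
    ]


features = stem_features("GAAAC")
print(features)
```

Because the grammar describes only a local feature rather than the whole sequence, the parser is asked about every substring, and each (substring, nonterminal) fact becomes one input position; a real grammar would describe more realistic stems than this toy one.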

Paper Citation


in Harvard Style

Grigorev S. and Lunina P. (2019). The Composition of Dense Neural Networks and Formal Grammars for Secondary Structure Analysis. In Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - Volume 3: BIOINFORMATICS; ISBN 978-989-758-353-7, SciTePress, pages 234-241. DOI: 10.5220/0007472302340241


in Bibtex Style

@conference{bioinformatics19,
author={Semyon Grigorev and Polina Lunina},
title={The Composition of Dense Neural Networks and Formal Grammars for Secondary Structure Analysis},
booktitle={Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - Volume 3: BIOINFORMATICS},
year={2019},
pages={234-241},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007472302340241},
isbn={978-989-758-353-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - Volume 3: BIOINFORMATICS
TI - The Composition of Dense Neural Networks and Formal Grammars for Secondary Structure Analysis
SN - 978-989-758-353-7
AU - Grigorev S.
AU - Lunina P.
PY - 2019
SP - 234
EP - 241
DO - 10.5220/0007472302340241
PB - SciTePress