Backward Pattern Matching on Elastic Degenerate Strings

Petr Procházka, Ondřej Cvacho, Luboš Krčál, Jan Holub

2021

Abstract

Recently, the concept of Elastic Degenerate Strings (EDS) was introduced as a way of representing a sequenced population of the same species. Several on-line Elastic Degenerate String Matching (EDSM) algorithms have been presented so far, and some of them come with a practical implementation. We propose a new on-line EDSM algorithm, BNDM-EDS. Our algorithm combines two traditional algorithms, BNDM and Shift-And, both adapted to the specifics of Elastic Degenerate Strings. BNDM-EDS runs in O(Nm⌈m/w⌉) worst-case time, which implies O(Nm) time for small patterns (m ≤ w), where m is the length of the searched pattern, N is the size of the EDS, and w is the size of the computer word. The algorithm uses O(N + n) space, where n is the length of the EDS. BNDM-EDS requires a simple preprocessing step with O(m) time and space. Experimental results on real genomic data show the superiority of BNDM-EDS over state-of-the-art algorithms.
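For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of the classic Shift-And and BNDM algorithms on ordinary (non-degenerate) strings. It is not the BNDM-EDS algorithm itself and does not handle elastic degenerate segments; the function names `shift_and` and `bndm` are illustrative choices, and Python integers stand in for the w-bit machine word assumed in the complexity bounds.

```python
def shift_and(pattern, text):
    """Classic Shift-And: return start positions of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    # Bit i of masks[c] is set when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << i)
    accept = 1 << (m - 1)  # bit marking a fully matched pattern
    state = 0
    hits = []
    for j, c in enumerate(text):
        # Extend every active prefix by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(c, 0)
        if state & accept:
            hits.append(j - m + 1)
    return hits


def bndm(pattern, text):
    """Classic BNDM: scan each window backward, skipping when possible."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Masks for the reversed pattern: pattern[i] maps to bit m - 1 - i.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << (m - 1 - i))
    full = (1 << m) - 1
    top = 1 << (m - 1)  # set when the scanned suffix is a pattern prefix
    hits = []
    pos = 0
    while pos <= n - m:
        j, last, d = m, m, full
        while d != 0:
            d &= masks.get(text[pos + j - 1], 0)
            j -= 1
            if d & top:
                if j > 0:
                    last = j          # remember the safe shift distance
                else:
                    hits.append(pos)  # the whole window matched
            d = (d << 1) & full
        pos += last
    return hits


print(shift_and("ACA", "GACACA"))
print(bndm("ACA", "GACACA"))
```

Both functions report the same occurrences; BNDM differs in that it reads each window right to left and can skip ahead by more than one position when no pattern prefix is found.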


Paper Citation


in Harvard Style

Procházka P., Cvacho O., Krčál L. and Holub J. (2021). Backward Pattern Matching on Elastic Degenerate Strings. In Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021) - Volume 3: BIOINFORMATICS; ISBN 978-989-758-490-9, SciTePress, pages 50-59. DOI: 10.5220/0010243600002865


in Bibtex Style

@conference{bioinformatics21,
author={Petr Procházka and Ondřej Cvacho and Luboš Krčál and Jan Holub},
title={Backward Pattern Matching on Elastic Degenerate Strings},
booktitle={Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021) - Volume 3: BIOINFORMATICS},
year={2021},
pages={50-59},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010243600002865},
isbn={978-989-758-490-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2021) - Volume 3: BIOINFORMATICS
TI - Backward Pattern Matching on Elastic Degenerate Strings
SN - 978-989-758-490-9
AU - Procházka P.
AU - Cvacho O.
AU - Krčál L.
AU - Holub J.
PY - 2021
SP - 50
EP - 59
DO - 10.5220/0010243600002865
PB - SciTePress