Fast Exact String to D-Texts Alignments

Njagi Mwaniki, Erik Garrison, Nadia Pisanti

2023

Abstract

A D-string is a degenerate string that represents a set of similar, aligned strings by collapsing their common fragments and highlighting their variants. D-strings can represent a multiple sequence alignment (MSA) or a pangenome. In this paper we propose a new, fast, and exact method to align a string to a D-string. In recent years, aligning a sequence to a pangenome has become a central problem in computational genomics and pangenomics. A fast and accurate solution to this problem can serve as a building block for many crucial tasks, such as read correction, MSA, genome assembly, and variant calling. An implementation of our tool is publicly available on GitHub at https://github.com/urbanslug/dsa.
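To make the notion of a degenerate string concrete, the following is a minimal Python sketch. It is not the method proposed in the paper and it is not taken from the dsa tool. It assumes a toy representation in which a degenerate string is a list of segments, and each segment is a list of alternative strings (one alternative for a shared fragment, several for a variant site). The names `expand` and `occurs` are hypothetical, and the matching shown is brute-force enumeration, which is exponential in the number of variant sites and only useful for illustration.

```python
from itertools import product

# Toy representation (an assumption for illustration): a degenerate string is
# a list of segments; each segment lists the alternative strings at that site.
# A segment with one alternative is a fragment shared by all sequences.
d_string = [["AC"], ["G", "T"], ["TA"], ["C", "A"]]


def expand(d):
    """Enumerate every plain string represented by the degenerate string."""
    return ["".join(choice) for choice in product(*d)]


def occurs(pattern, d):
    """Brute-force exact matching: does pattern occur in any expansion?"""
    return any(pattern in s for s in expand(d))


if __name__ == "__main__":
    print(expand(d_string))
    print(occurs("TTAA", d_string))
    print(occurs("GGG", d_string))
```

Here the two variant sites yield four represented strings. Enumerating them is exactly the blow-up that a dedicated string-to-D-string alignment method is meant to avoid.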

Paper Citation


in Harvard Style

Mwaniki N., Garrison E. and Pisanti N. (2023). Fast Exact String to D-Texts Alignments. In Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 3: BIOINFORMATICS; ISBN 978-989-758-631-6, SciTePress, pages 70-79. DOI: 10.5220/0011666900003414


in Bibtex Style

@conference{bioinformatics23,
author={Njagi Mwaniki and Erik Garrison and Nadia Pisanti},
title={Fast Exact String to D-Texts Alignments},
booktitle={Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 3: BIOINFORMATICS},
year={2023},
pages={70-79},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011666900003414},
isbn={978-989-758-631-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2023) - Volume 3: BIOINFORMATICS
TI - Fast Exact String to D-Texts Alignments
SN - 978-989-758-631-6
AU - Mwaniki N.
AU - Garrison E.
AU - Pisanti N.
PY - 2023
SP - 70
EP - 79
DO - 10.5220/0011666900003414
PB - SciTePress