Author: Roman Snytsar

Affiliation: Microsoft Research, One Microsoft Way, Redmond WA 98052, U.S.A.

Keyword(s): Parallel Processing, Vectorization, Bioinformatics, FM-Index.

Abstract: Many modern sequence alignment tools implement fast string matching using a space-efficient data structure called an FM-index. The succinct nature of this data structure presents unique challenges for algorithm designers. In this paper, we explore the opportunities for parallelizing exact and inexact matching, and present an efficient solution for the Occ portion of the algorithm that exploits the instruction-level parallelism of modern CPUs. Our implementation computes all eight Occ values required for one step of the inexact match algorithm in a single pass. We showcase the algorithm's performance in a multi-core genome aligner and discuss the effects of memory prefetching.
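
To make the single-pass idea concrete, here is a minimal scalar sketch in Python. It is not the paper's vectorized implementation. It assumes that the eight Occ values are the counts of the four nucleotides A, C, G, T at the two bounds of a suffix-array interval, and the names `ALPHABET` and `occ_pair` are hypothetical, introduced only for illustration. The sketch scans a BWT string once and returns both sets of counts.

```python
from typing import Dict, Tuple

# Assumption: a four-letter nucleotide alphabet; other symbols (e.g. "$") are skipped.
ALPHABET = "ACGT"


def occ_pair(bwt: str, lo: int, hi: int) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Count every symbol of ALPHABET in bwt[:lo] and in bwt[:hi] with one scan.

    Returns two dictionaries (four counts each, eight values in total).
    Requires 0 <= lo <= hi <= len(bwt).
    """
    if not 0 <= lo <= hi <= len(bwt):
        raise ValueError("expected 0 <= lo <= hi <= len(bwt)")

    counts = dict.fromkeys(ALPHABET, 0)
    at_lo = None
    for i, ch in enumerate(bwt[:hi]):
        if i == lo:
            # Snapshot the running counts before bwt[lo] is consumed.
            at_lo = dict(counts)
        if ch in counts:
            counts[ch] += 1
    if at_lo is None:
        # lo == hi: the loop never reached index lo, so both bounds coincide.
        at_lo = dict(counts)
    return at_lo, counts


# Usage: the string below is an arbitrary example, not derived from a real genome.
low_counts, high_counts = occ_pair("AGCTTA$C", 2, 6)
```

A practical FM-index would typically store sampled checkpoint counts and scan only the short stretch between a checkpoint and the query position; the scan over the whole prefix here just keeps the sketch short.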

CC BY-NC-ND 4.0

Paper citation in several formats:
Snytsar, R. (2019). Vectorized Character Counting for Faster Pattern Matching. In Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - BIOINFORMATICS; ISBN 978-989-758-353-7; ISSN 2184-4305, SciTePress, pages 149-154. DOI: 10.5220/0007258201490154

@conference{bioinformatics19,
author={Roman Snytsar},
title={Vectorized Character Counting for Faster Pattern Matching},
booktitle={Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - BIOINFORMATICS},
year={2019},
pages={149-154},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007258201490154},
isbn={978-989-758-353-7},
issn={2184-4305},
}

TY - CONF

JO - Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2019) - BIOINFORMATICS
TI - Vectorized Character Counting for Faster Pattern Matching
SN - 978-989-758-353-7
IS - 2184-4305
AU - Snytsar, R.
PY - 2019
SP - 149
EP - 154
DO - 10.5220/0007258201490154
PB - SciTePress