Author: Kemal Efe

Affiliation: Retired Professor of Computer Engineering, Turkey

Keyword(s): Genome Sequencing, K-Mer Counting, De Bruijn Graph, De Novo Assembly, Next Generation Sequencing.

Abstract: Due to the sheer size of the input data, k-mer counting is a memory-intensive task. Existing methods to parallelize k-mer counting cannot guarantee equal block sizes. Consequently, when the largest block is too large for a processor’s local memory, the entire computation fails. This paper shows how to partition the input into approximately equal-sized blocks, each of which can be processed independently. Initially, we consider how to map k-mers into a number of independent blocks such that block sizes follow a truncated normal distribution. Then, we show how to modify the mapping function to obtain an approximately uniform distribution. To prove the claimed statistical properties of block sizes, we refer to the central limit theorem, along with certain properties of Pascal’s quadrinomial triangle. This analysis yields a tight upper bound on block sizes, which can be controlled by changing certain parameters of the mapping function. Since the running time of the resulting algorithm is O(1) per k-mer, partitioning can be performed efficiently while reading the input data from the storage medium.
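
The abstract does not state the mapping function itself. As a rough illustration only, the sketch below assumes one mapping that would be consistent with the abstract's references to the central limit theorem and Pascal's quadrinomial triangle: each base is given a code from 0 to 3, and a k-mer's block id is the sum of its base codes. The encoding `CODE` and the names `block_ids` and `quadrinomial_row` are hypothetical and are not taken from the paper. Under this assumption, the number of distinct k-mers that land in each block is a row of the quadrinomial triangle (bell-shaped for large k), and the block id can be updated in O(1) per k-mer with a sliding window. The paper's modification that yields an approximately uniform distribution is not reproduced here.

```python
# Hypothetical base encoding; the abstract does not specify one.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def block_ids(seq, k):
    """Yield (start_index, block_id) for every k-mer in seq.

    Assumed mapping: block_id is the sum of the base codes of the k-mer.
    After the first window, each id is derived from the previous one by
    adding the incoming base and subtracting the outgoing base, so the
    cost is O(1) per k-mer.
    """
    if k <= 0 or len(seq) < k:
        return
    s = sum(CODE[b] for b in seq[:k])
    yield 0, s
    for i in range(k, len(seq)):
        s += CODE[seq[i]] - CODE[seq[i - k]]
        yield i - k + 1, s


def quadrinomial_row(k):
    """Coefficients of (1 + x + x^2 + x^3)^k.

    Entry j is the number of distinct k-mers whose base codes sum to j,
    i.e. the number of distinct k-mers mapped to block j under the
    assumed mapping.
    """
    row = [1]
    for _ in range(k):
        new = [0] * (len(row) + 3)
        for i, c in enumerate(row):
            for d in range(4):
                new[i + d] += c
        row = new
    return row


if __name__ == "__main__":
    print(list(block_ids("ACGTA", 3)))
    print(quadrinomial_row(3))
```

With this mapping there are 3k + 1 possible block ids, and the middle blocks hold far more distinct k-mers than the extreme ones, which is the kind of imbalance the abstract's second, approximately uniform mapping is meant to remove.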

CC BY-NC-ND 4.0

Paper citation in several formats:
Efe, K. (2018). Robust K-Mer Partitioning for Parallel Counting. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - BIOINFORMATICS; ISBN 978-989-758-280-6; ISSN 2184-4305, SciTePress, pages 146-153. DOI: 10.5220/0006638801460153

@conference{bioinformatics18,
author={Kemal Efe},
title={Robust K-Mer Partitioning for Parallel Counting},
booktitle={Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - BIOINFORMATICS},
year={2018},
pages={146-153},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006638801460153},
isbn={978-989-758-280-6},
issn={2184-4305},
}

TY - CONF

JO - Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) - BIOINFORMATICS
TI - Robust K-Mer Partitioning for Parallel Counting
SN - 978-989-758-280-6
IS - 2184-4305
AU - Efe, K.
PY - 2018
SP - 146
EP - 153
DO - 10.5220/0006638801460153
PB - SciTePress