Paper: Efficient k-mer Indexing with Application to Mapping-free SNP Genotyping

Authors: Mattia Marcolin, Francesco Andreace and Matteo Comin

Affiliation: Department of Information Engineering, University of Padua, Via Gradenigo 6/a, 35100 Padua, Italy

Keyword(s): Genotyping, Mapping-free, k-mers Set Representation.

Abstract: Advances in sequencing technologies and computational methods have enabled rapid and accurate identification of genetic variants. Accurate genotype calls and allele frequency estimates are crucial for population genomics analyses. One of the most demanding steps in the genotyping pipeline is mapping reads to the human reference genome. Recently, mapping-free methods such as Lava and VarGeno have been proposed for the genotyping problem. They are reported to run 30 times faster than a standard alignment-based genotyping pipeline while achieving comparable accuracy. Moreover, these methods can include known genomic variants in the reference, making read mapping and genotyping variant-aware. However, to run they require a large k-mer database, of about 60 GB, to be loaded in memory. In this paper we study the problem of genotyping using new efficient data structures based on k-mer set compression, and we present a fast mapping-free genotyping tool named GenoLight. GenoLight reports accuracy similar to the standard pipeline, but it is up to 8 times faster. GenoLight also uses 5 to 10 times less memory than the other mapping-free tools, and it can be run on a laptop. Availability: https://github.com/CominLab/GenoLight.

CC BY-NC-ND 4.0

Paper citation in several formats:
Marcolin, M., Andreace, F. and Comin, M. (2022). Efficient k-mer Indexing with Application to Mapping-free SNP Genotyping. In Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - BIOINFORMATICS; ISBN 978-989-758-552-4; ISSN 2184-4305, SciTePress, pages 62-70. DOI: 10.5220/0010985700003123

@conference{bioinformatics22,
author={Mattia Marcolin and Francesco Andreace and Matteo Comin},
title={Efficient k-mer Indexing with Application to Mapping-free SNP Genotyping},
booktitle={Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - BIOINFORMATICS},
year={2022},
pages={62-70},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010985700003123},
isbn={978-989-758-552-4},
issn={2184-4305},
}

TY - CONF
JO - Proceedings of the 15th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2022) - BIOINFORMATICS
TI - Efficient k-mer Indexing with Application to Mapping-free SNP Genotyping
SN - 978-989-758-552-4
IS - 2184-4305
AU - Marcolin, M.
AU - Andreace, F.
AU - Comin, M.
PY - 2022
SP - 62
EP - 70
DO - 10.5220/0010985700003123
PB - SciTePress