Authors: Li Jiangyu ; Liu Yang ; Wang Xiaolei ; Mao Yiqing ; Wang Yumin and Zhao Dongsheng

Affiliation: Academy of Military Medical Sciences, China

ISBN: 978-989-8565-35-8

Keyword(s): High-Throughput Sequencing, Metagenomics, RINS, Hadoop, MapReduce.

Related Ontology Subjects/Areas/Topics: Algorithms and Software Tools ; Bioinformatics ; Biomedical Engineering ; Next Generation Sequencing ; Sequence Analysis

Abstract: The volume of sequencing data has grown rapidly in recent years with the development of high-throughput sequencing technology. Parallel computing is an important way to accelerate the processing of such large volumes of sequence data. RINS is a pipeline for identifying nonhuman sequences in deep sequencing datasets. It uses user-provided microbial reference genomes to reduce the number of reads to be processed and so improve processing speed. However, all of its steps run serially, so RINS slows down sharply as the sequencing data and the set of reference genomes grow. In this article, we report a pipeline that processes sequencing data in parallel using Hadoop. Runtime comparisons on the same dataset show that Hadoop-RINS is significantly faster than RINS while producing the same computation result.
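
The map/reduce idea behind the abstract can be illustrated with a toy sketch. This is not the authors' implementation and does not use Hadoop: it splits reads into chunks, processes the chunks concurrently, and merges the per-chunk results. The k-mer overlap test is a stand-in for a real aligner, and the names `kmers`, `map_chunk`, `reduce_pairs`, `count_hits`, and the value of `K` are all hypothetical choices made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

K = 4  # hypothetical k-mer length, chosen only for this toy example


def kmers(seq, k=K):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def map_chunk(chunk, references):
    """Map step: emit (reference_name, 1) for every reference that
    shares at least one k-mer with a read in this chunk."""
    pairs = []
    for read in chunk:
        read_kmers = kmers(read)
        for name, ref_kmers in references.items():
            if read_kmers & ref_kmers:
                pairs.append((name, 1))
    return pairs


def reduce_pairs(mapped):
    """Reduce step: sum the emitted counts per reference name."""
    counts = Counter()
    for pairs in mapped:
        for name, n in pairs:
            counts[name] += n
    return dict(counts)


def count_hits(reads, reference_seqs, chunk_size=2, workers=2):
    """Split reads into chunks, map the chunks concurrently, then reduce."""
    references = {name: kmers(seq) for name, seq in reference_seqs.items()}
    chunks = [reads[i:i + chunk_size] for i in range(0, len(reads), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(lambda c: map_chunk(c, references), chunks))
    return reduce_pairs(mapped)


if __name__ == "__main__":
    # Made-up reads and references, for illustration only.
    reads = ["ACGTAC", "TTTTTT", "GGGACG", "CCCCCC"]
    refs = {"virusA": "ACGTACGT", "virusB": "TTTTGGGG"}
    print(count_hits(reads, refs))
```

Because each read is handled independently in the map step, the merged counts do not depend on how the reads are chunked. This is the property that lets a parallel run return the same result as a serial one.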

Paper citation in several formats:
Jiangyu, L.; Yang, L.; Xiaolei, W.; Yiqing, M.; Yumin, W. and Dongsheng, Z. (2013). Hadoop-RINS - A Hadoop Accelerated Pipeline for Rapid Nonhuman Sequence Identification. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013), ISBN 978-989-8565-35-8, pages 296-299. DOI: 10.5220/0004239602960299

@conference{bioinformatics13,
author={Li Jiangyu and Liu Yang and Wang Xiaolei and Mao Yiqing and Wang Yumin and Zhao Dongsheng},
title={Hadoop-RINS - A Hadoop Accelerated Pipeline for Rapid Nonhuman Sequence Identification},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)},
year={2013},
pages={296-299},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004239602960299},
isbn={978-989-8565-35-8},
}

TY - CONF

JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)
TI - Hadoop-RINS - A Hadoop Accelerated Pipeline for Rapid Nonhuman Sequence Identification
SN - 978-989-8565-35-8
AU - Jiangyu, L.
AU - Yang, L.
AU - Xiaolei, W.
AU - Yiqing, M.
AU - Yumin, W.
AU - Dongsheng, Z.
PY - 2013
SP - 296
EP - 299
DO - 10.5220/0004239602960299
