A Deep Learning Method to Impute Missing Values and Compress Genome-wide Polymorphism Data in Rice

Tanzila Islam, Chyon Kim, Hiroyoshi Iwata, Hiroyuki Shimono, Akio Kimura, Hein Zaw, Chitra Raghavan, Hei Leung, Rakesh Singh

Abstract

Imputing missing values and compressing genome-wide DNA polymorphism data are challenging tasks in genomic data analysis. Missing data, that is, information absent from a dataset, directly affects the performance of data analysis. The aim of this work is to develop a deep learning model named Autoencoder Genome Imputation and Compression (AGIC), which imputes missing values and compresses genome-wide polymorphism data using a separated neural network model to reduce computational time. The research takes on the construction of an autoencoder-based model for genomic analysis and thus combines agricultural and information sciences. Moreover, to our knowledge, missing value imputation and genome-wide polymorphism data compression with a separated stacking autoencoder model have not been reported before. The main contributions are: (1) missing value imputation for genome-wide polymorphism data, and (2) compression of genome-wide polymorphism data from rice DNA. To demonstrate the use of the AGIC model, real genome-wide polymorphism data from a rice MAGIC population are used.
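The general idea of using one autoencoder for both imputation and compression can be illustrated with a minimal sketch. This is not the AGIC architecture described in the paper: it assumes genotypes coded as 0/1/2 with `None` marking missing calls, uses a single linear hidden layer rather than separated stacked networks, and all function names (`train_autoencoder`, `impute`, `compress`) are hypothetical. The reconstruction loss is masked so that only observed genotypes drive training; the hidden code serves as the compressed representation.

```python
import random

def scale(row):
    # Assumed coding: 0/1/2 genotypes mapped to [0, 1]; missing -> 0.5 (neutral).
    return [0.5 if g is None else g / 2.0 for g in row]

def encode(x, enc):
    # Centre inputs at 0.5 so a missing marker contributes nothing to the code.
    return [sum((xj - 0.5) * enc[j][k] for j, xj in enumerate(x))
            for k in range(len(enc[0]))]

def decode(h, dec, bias):
    return [bias[j] + sum(h[k] * dec[k][j] for k in range(len(h)))
            for j in range(len(bias))]

def train_autoencoder(rows, n_hidden=2, epochs=200, lr=0.05, seed=0):
    """Fit a tiny linear autoencoder by SGD on a masked squared error."""
    rng = random.Random(seed)
    n_feat = len(rows[0])
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_feat)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_feat)] for _ in range(n_hidden)]
    bias = [0.5] * n_feat
    for _ in range(epochs):
        for row in rows:
            x = scale(row)
            h = encode(x, enc)
            y = decode(h, dec, bias)
            # Error is zeroed at missing positions (the mask).
            err = [0.0 if g is None else yj - xj for g, yj, xj in zip(row, y, x)]
            # Back-propagate to the hidden code before updating the decoder.
            grad_h = [sum(dec[k][j] * err[j] for j in range(n_feat))
                      for k in range(n_hidden)]
            for k in range(n_hidden):
                for j in range(n_feat):
                    dec[k][j] -= lr * h[k] * err[j]
            for j in range(n_feat):
                bias[j] -= lr * err[j]
                for k in range(n_hidden):
                    enc[j][k] -= lr * (x[j] - 0.5) * grad_h[k]
    return enc, dec, bias

def impute(rows, enc, dec, bias):
    """Keep observed genotypes; fill missing ones with the rounded reconstruction."""
    filled = []
    for row in rows:
        y = decode(encode(scale(row), enc), dec, bias)
        filled.append([min(2, max(0, round(2 * yj))) if g is None else g
                       for g, yj in zip(row, y)])
    return filled

def compress(row, enc):
    """Return the hidden code, a shorter vector standing in for the full row."""
    return encode(scale(row), enc)

# Toy data (made up for illustration): 4 lines x 6 markers, None = missing call.
rows = [[0, 0, 2, 2, None, 1],
        [0, 0, 2, 2, 0, 1],
        [2, 2, 0, 0, 2, None],
        [2, 2, 0, None, 2, 1]]
enc, dec, bias = train_autoencoder(rows)
filled = impute(rows, enc, dec, bias)
code = compress(rows[0], enc)
```

Here `filled` has the same shape as `rows` with every `None` replaced by a genotype in {0, 1, 2}, and `code` holds two numbers in place of six markers. A practical model would use nonlinear layers, far more markers, and held-out masking to measure imputation accuracy.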


Paper Citation


in Harvard Style

Islam T., Kim C., Iwata H., Shimono H., Kimura A., Zaw H., Raghavan C., Leung H. and Singh R. (2021). A Deep Learning Method to Impute Missing Values and Compress Genome-wide Polymorphism Data in Rice. In Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, ISBN 978-989-758-490-9, pages 101-109. DOI: 10.5220/0010233901010109


in Bibtex Style

@conference{bioinformatics21,
author={Tanzila Islam and Chyon Kim and Hiroyoshi Iwata and Hiroyuki Shimono and Akio Kimura and Hein Zaw and Chitra Raghavan and Hei Leung and Rakesh Singh},
title={A Deep Learning Method to Impute Missing Values and Compress Genome-wide Polymorphism Data in Rice},
booktitle={Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS},
year={2021},
pages={101-109},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010233901010109},
isbn={978-989-758-490-9},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS
TI - A Deep Learning Method to Impute Missing Values and Compress Genome-wide Polymorphism Data in Rice
SN - 978-989-758-490-9
AU - Islam T.
AU - Kim C.
AU - Iwata H.
AU - Shimono H.
AU - Kimura A.
AU - Zaw H.
AU - Raghavan C.
AU - Leung H.
AU - Singh R.
PY - 2021
SP - 101
EP - 109
DO - 10.5220/0010233901010109