Authors: Ivan V. Kulakovskiy 1 ; Victor G. Levitsky 2 ; Dmitry G. Oschepkov 3 ; Ilya E. Vorontsov 4 and Vsevolod J. Makeev 5

Affiliations: 1 Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Institute of General Genetics and Russian Academy of Sciences, Russian Federation ; 2 Institute of Cytology and Genetics of the Siberian Division of Russian Academy of Sciences, Faculty of Natural Sciences and Novosibirsk State University, Russian Federation ; 3 Institute of Cytology and Genetics of the Siberian Division of Russian Academy of Sciences, Russian Federation ; 4 Vavilov Institute of General Genetics, Russian Academy of Sciences and Moscow Institute of Physics and Technology, Russian Federation ; 5 Vavilov Institute of General Genetics, Russian Academy of Sciences, State Research Institute of Genetics and Selection of Industrial Microorganisms and Moscow Institute of Physics and Technology, Russian Federation

ISBN: 978-989-8565-35-8

Keyword(s): Motif Discovery, Transcription Factor Binding Sites, TFBS Models, Positional Weight Matrices, PWM, ChIP Seq, Dinucleotide Composition.

Related Ontology Subjects/Areas/Topics: Algorithms and Software Tools ; Bioinformatics ; Biomedical Engineering ; Next Generation Sequencing ; Sequence Analysis

Abstract: Identification and subsequent analysis of DNA sequence motifs recognized by transcription factors is an important component of studying transcriptional regulation in higher eukaryotes. In particular, motif discovery methods are applied to construct transcription factor binding site (TFBS) models. The TFBS models are then used to predict putative binding sites in genomic regions of interest. The most popular TFBS model is a positional weight matrix (PWM). A PWM is usually constructed from nucleotide positional frequencies estimated from a gapless multiple local alignment of experimentally identified TFBS sequences. Modern high-throughput experiments, such as ChIP-Seq, provide enough data for careful training of more advanced models with more parameters. Nevertheless, the majority of existing tools for TFBS prediction in ChIP-Seq data still rely on PWMs with independent positions. This is partly explained by the only marginal improvement in specificity and sensitivity of TFBS recognition that advanced models have shown over traditional PWMs when trained on ChIP-Seq data. Here we present a novel computational tool, diChIPMunk (http://autosome.ru/dichipmunk/), which constructs dinucleotide PWMs that account for correlations between neighboring nucleotides in input sequences. diChIPMunk retains the advantages of the published ChIPMunk algorithm, including the use of ChIP-Seq peak shape and overall computational efficiency. Using public ChIP-Seq data for several TFs, we show that carefully trained dinucleotide PWMs perform significantly better than PWMs based on mononucleotide frequencies.
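
To illustrate the difference between the two model types named in the abstract, the following is a minimal Python sketch that builds a mononucleotide PWM and a dinucleotide PWM from a gapless alignment of equal-length sites and scores a candidate word with each. It is not the diChIPMunk algorithm: motif discovery, ChIP-Seq peak shape weighting, and all function names here (mono_pwm, di_pwm, score_mono, score_di) are hypothetical, and the uniform background and pseudocount of 1 are assumptions chosen only to keep the example short.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def mono_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM with independent positions (assumes a uniform background)."""
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        pwm.append({b: math.log((counts[b] + pseudocount) / total / background)
                    for b in ALPHABET})
    return pwm


def di_pwm(sites, pseudocount=1.0, background=1.0 / 16):
    """Log-odds matrix over overlapping dinucleotides (one column per adjacent pair)."""
    dinucleotides = [a + b for a in ALPHABET for b in ALPHABET]
    total = len(sites) + pseudocount * len(dinucleotides)
    pwm = []
    for i in range(len(sites[0]) - 1):
        counts = Counter(site[i:i + 2] for site in sites)
        pwm.append({d: math.log((counts[d] + pseudocount) / total / background)
                    for d in dinucleotides})
    return pwm


def score_mono(pwm, word):
    """Sum of per-position weights for a word as long as the matrix."""
    return sum(column[base] for column, base in zip(pwm, word))


def score_di(pwm, word):
    """Sum of weights of the overlapping dinucleotides of the word."""
    return sum(pwm[i][word[i:i + 2]] for i in range(len(word) - 1))


# Toy alignment in which neighboring positions are perfectly correlated.
toy_sites = ["AA", "TT", "AA", "TT"]
toy_mono = mono_pwm(toy_sites)
toy_di = di_pwm(toy_sites)
```

On this toy alignment the mononucleotide matrix cannot separate the observed word "AA" from the never-observed word "AT", because each position sees A and T equally often, while the dinucleotide matrix scores "AA" higher. Real data would also need a background model estimated from the genome rather than the uniform one assumed here.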

CC BY-NC-ND 4.0


Paper citation in several formats:
Kulakovskiy, I. V.; Levitsky, V. G.; Oschepkov, D. G.; Vorontsov, I. E. and Makeev, V. J. (2013). Learning Advanced TFBS Models from Chip-Seq Data - diChIPMunk: Effective Construction of Dinucleotide Positional Weight Matrices. In Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013) ISBN 978-989-8565-35-8, pages 146-150. DOI: 10.5220/0004238201460150

@conference{bioinformatics13,
author={Ivan V. Kulakovskiy and Victor G. Levitsky and Dmitry G. Oschepkov and Ilya E. Vorontsov and Vsevolod J. Makeev},
title={Learning Advanced TFBS Models from Chip-Seq Data - diChIPMunk: Effective Construction of Dinucleotide Positional Weight Matrices},
booktitle={Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)},
year={2013},
pages={146-150},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004238201460150},
isbn={978-989-8565-35-8},
}

TY - CONF

JO - Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms - Volume 1: BIOINFORMATICS, (BIOSTEC 2013)
TI - Learning Advanced TFBS Models from Chip-Seq Data - diChIPMunk: Effective Construction of Dinucleotide Positional Weight Matrices
SN - 978-989-8565-35-8
AU - Kulakovskiy, Ivan V.
AU - Levitsky, Victor G.
AU - Oschepkov, Dmitry G.
AU - Vorontsov, Ilya E.
AU - Makeev, Vsevolod J.
PY - 2013
SP - 146
EP - 150
DO - 10.5220/0004238201460150
