Authors:
Matej Lexa and Stanislav Stefanic
Affiliation:
Masaryk University, Czech Republic
Keyword(s):
GWAS, SNPs, Biological Knowledge, Databases, Genotyping, Filtering
Related Ontology Subjects/Areas/Topics:
Bioinformatics; Biomedical Engineering; Data Mining and Machine Learning; Databases and Data Management; Genomics and Proteomics; Next Generation Sequencing; Structural Variations
Abstract:
Genome-wide association studies have become a standard way of discovering novel causative alleles by looking
for statistically significant associations in patient genotyping data. The present challenge for these methods
is to discover associations involving multiple interacting loci, a phenomenon that is common in diseases and often
related to epistasis. The main problem is the exponential increase in the computational power required for every
additional interacting locus considered in association tests. Several approaches have been proposed to manage
this problem, including limiting analysis to interacting pairs and filtering SNPs according to external biological
knowledge. Here we explore the possibilities of using public protein interaction data and pathway maps
to retain only those pairs of SNPs that are likely to interact, perhaps because of epistatic mechanisms working
at the protein level. After filtering all possible pairs of SNPs by their presence in common protein-protein
interactions or proteins sharing a metabolic or signalling pathway, we calculate the possible reduction in computational
requirements under different scenarios. We discuss these exploratory results in the context of the
so-called "lost heredity" and the usefulness of this approach for similar scenarios.
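
The filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `count_tests` and `filter_pairs`, the SNP-to-gene mapping, and the toy interaction list are all hypothetical, and the sketch assumes each SNP maps to exactly one gene and that two SNPs in the same gene are not counted as an interacting pair. It shows how the number of association tests grows with the number of loci considered jointly, and how keeping only SNP pairs whose genes appear in a known interaction reduces the number of pairwise tests.

```python
from itertools import combinations
from math import comb


def count_tests(n_snps: int, order: int) -> int:
    """Number of distinct SNP combinations when `order` loci are tested jointly."""
    return comb(n_snps, order)


def filter_pairs(snp_to_gene, interacting_genes):
    """Keep only SNP pairs whose genes form a known interacting pair.

    snp_to_gene: dict mapping a SNP id to a single gene id (an assumption).
    interacting_genes: iterable of (gene, gene) pairs, treated as unordered.
    """
    known = {frozenset(pair) for pair in interacting_genes}
    kept = []
    for a, b in combinations(sorted(snp_to_gene), 2):
        if frozenset((snp_to_gene[a], snp_to_gene[b])) in known:
            kept.append((a, b))
    return kept


# Toy, made-up data used only to exercise the functions.
snp_to_gene = {"rs1": "G1", "rs2": "G2", "rs3": "G3", "rs4": "G1"}
interactions = [("G1", "G2"), ("G2", "G3")]

kept = filter_pairs(snp_to_gene, interactions)
total = count_tests(len(snp_to_gene), 2)
reduction = total / len(kept)  # factor by which the pairwise test count shrinks
```

With real data, the mapping and the interaction list would come from public protein interaction and pathway resources, and the reduction factor would depend on how many genotyped SNPs fall in genes covered by those resources.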