Gene-gene Interaction Analysis by IAC (Interaction Analysis by Chi-Square) - A Novel Biological Constraint-based Interaction Analysis Framework

Sidney K. Chu, Samuel Guanglin Xu, Feng Xu, Nelson L. S. Tang

Abstract

In the recent years of the GWAS era, large-scale genotyping of millions of polymorphisms (SNPs) among thousands of patients has identified new disease predisposition loci. However, conventional GWAS statistical models analyse SNPs only one at a time and cannot detect significant SNP-SNP (gene-gene) interaction. Studies of interacting genetic variants (SNPs) are useful for elucidating a disease’s underlying biological pathway. Therefore, a powerful and efficient statistical model to detect SNP-SNP interaction is urgently needed. We hypothesize that, among the exhaustive set of possible interaction patterns (>100), only a limited number are plausible based on the principle of protein-protein interaction (in the context of GWAS data analysis). Because proteins are produced from genes through transcription and translation, gene-gene interaction resulting in a phenotype should occur only in classical genetic epistasis models, such as dominant-dominant and recessive-recessive models. We developed a statistical analysis model, IAC (Interaction Analysis by Chi-Square), to examine such interactions. We then varied population and statistical parameters across a total of 532 simulated case-control experiments to study their effects on the statistical power and type I error of interaction analysis versus single-SNP analysis. Our method also detected potential pairwise interactions associated with Parkinson's disease that had not been detected by conventional methods. We showed that detecting SNP-SNP interaction is feasible using sample sizes typical of GWAS. This approach may be applied as a complement to other models in two-stage association tests to efficiently detect candidate SNPs for further study.
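
The abstract does not spell out the exact test statistic, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes genotypes are coded as minor-allele counts (0, 1, 2), collapses each SNP pair into a single joint indicator under a dominant or recessive model, and applies a Pearson chi-square test (1 degree of freedom, no continuity correction) to the resulting 2x2 table against case-control status. The names `carrier` and `interaction_chi_square` are hypothetical.

```python
import math


def carrier(genotype, model):
    """Collapse a 0/1/2 minor-allele count into a binary indicator."""
    if model == "dominant":
        return genotype >= 1   # at least one copy of the minor allele
    if model == "recessive":
        return genotype == 2   # two copies of the minor allele
    raise ValueError(f"unknown model: {model}")


def interaction_chi_square(snp_a, snp_b, is_case,
                           model_a="dominant", model_b="dominant"):
    """Pearson chi-square for joint exposure at two SNPs vs. case status.

    Assumption: a subject is 'exposed' only when BOTH SNPs satisfy
    their genetic model (e.g. dominant-dominant).
    Returns (chi2, p_value); (0.0, 1.0) if the table is degenerate.
    """
    # table[exposed][case], with indices 0 = no, 1 = yes
    table = [[0, 0], [0, 0]]
    for a, b, case in zip(snp_a, snp_b, is_case):
        exposed = carrier(a, model_a) and carrier(b, model_b)
        table[int(exposed)][int(bool(case))] += 1

    n = sum(sum(r) for r in table)
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    if n == 0 or 0 in row or 0 in col:
        return 0.0, 1.0  # no variation in exposure or in status

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x/2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Restricting the search to a few biologically motivated models, as the abstract proposes, means each SNP pair needs only a handful of 1-df tests rather than a test over all nine genotype combinations. In a genome-wide scan the resulting p-values would still need correction for multiple testing.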



Paper Citation


in Harvard Style

Chu S., Xu S., Xu F. and Tang N. (2016). Gene-gene Interaction Analysis by IAC (Interaction Analysis by Chi-Square) - A Novel Biological Constraint-based Interaction Analysis Framework. In Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016), ISBN 978-989-758-170-0, pages 142-150. DOI: 10.5220/0005654601420150


in Bibtex Style

@conference{bioinformatics16,
author={Sidney K. Chu and Samuel Guanglin Xu and Feng Xu and Nelson L. S. Tang},
title={Gene-gene Interaction Analysis by IAC (Interaction Analysis by Chi-Square) - A Novel Biological Constraint-based Interaction Analysis Framework},
booktitle={Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016)},
year={2016},
pages={142-150},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005654601420150},
isbn={978-989-758-170-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies - Volume 3: BIOINFORMATICS, (BIOSTEC 2016)
TI - Gene-gene Interaction Analysis by IAC (Interaction Analysis by Chi-Square) - A Novel Biological Constraint-based Interaction Analysis Framework
SN - 978-989-758-170-0
AU - Chu S.
AU - Xu S.
AU - Xu F.
AU - Tang N.
PY - 2016
SP - 142
EP - 150
DO - 10.5220/0005654601420150