Gene-gene interaction

1. Problem setup

It is widely recognized that statistical modeling can only play a limited role in helping to understand biological interaction. However, statistical modeling of interactions involving genes may be helpful in identifying genes influencing disease susceptibility which otherwise would remain unidentified.

2. Statistical methods

Marchini et al. (2005) have demonstrated, for a variety of plausible qualitative disease models, that simultaneously considering two unlinked SNPs and their 9 two-marker genotypes is a powerful strategy. The main obstacle to considering all marker pairs in a GWAS remains running time: powerful machines are needed to compute the genotype contingency tables for all marker pairs. Two-stage approaches can reduce the computational burden. Note that the goal of Marchini et al. is to improve power by considering markers simultaneously, not to test for interaction per se.
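The two-marker analysis described above can be sketched as follows. This is a minimal illustration, not the implementation of Marchini et al.: the function names, the 0/1/2 genotype coding, and the choice of a plain Pearson chi-square statistic on the 2 x 9 case/control table are assumptions made for the example.

```python
from collections import Counter
from itertools import product

def two_marker_table(snp_a, snp_b, status):
    """Count cases and controls in each of the 9 two-marker genotype cells.

    snp_a, snp_b: genotypes coded 0/1/2 (assumed coding: minor-allele count).
    status: 1 for case, 0 for control.
    Returns a dict mapping (status, geno_a, geno_b) to a count.
    """
    counts = Counter(zip(status, snp_a, snp_b))
    return {(s, a, b): counts.get((s, a, b), 0)
            for s, a, b in product((0, 1), (0, 1, 2), (0, 1, 2))}

def pearson_chi2(table):
    """Pearson chi-square for the 2 x 9 table; empty genotype cells are skipped.

    Returns (statistic, degrees_of_freedom).
    """
    cells = list(product((0, 1, 2), repeat=2))
    row_tot = {s: sum(table[(s, a, b)] for a, b in cells) for s in (0, 1)}
    if min(row_tot.values()) == 0:
        raise ValueError("need at least one case and one control")
    col_tot = {(a, b): table[(0, a, b)] + table[(1, a, b)] for a, b in cells}
    n = row_tot[0] + row_tot[1]
    used = [c for c in cells if col_tot[c] > 0]
    stat = 0.0
    for s in (0, 1):
        for c in used:
            expected = row_tot[s] * col_tot[c] / n
            stat += (table[(s, *c)] - expected) ** 2 / expected
    # (2 - 1) rows times (number of non-empty genotype cells - 1)
    return stat, len(used) - 1
```

In a genome-wide scan this pair of functions would be called once per marker pair, which is where the running time goes: the number of pairs grows quadratically with the number of markers.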

For quantitative traits, gene-gene interaction is understood as departure from additivity on the scale in focus. The additive structure is motivated by independence. For qualitative traits, Cordell et al. (2001) consider various models that differ in the (link) function relating a linear function of covariates, ß'x (where x is a vector including gene and environmental effects and their interactions), to the probability of disease. The models considered are:

1. Additive model: pr(y = 1) = ß'x
2. Multiplicative model: log(pr(y = 1)) = ß'x
3. Heterogeneity model: -log(1-pr(y = 1)) = ß'x
4. Additive model for liability: probit(pr(y = 1)) = ß'x
5. Additive model for log odds: logit(pr(y = 1)) = ß'x

Cordell et al. (2001) compare the fit of the saturated model to the main effects model based on each of the above link functions, and show that whether the interaction is deemed significant depends on which link function is assumed.
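The scale dependence can be illustrated numerically. The sketch below is not the analysis of Cordell et al.; it assumes the simplest possible setting of two binary factors, uses hypothetical penetrances, and measures departure from additivity on each link scale by the contrast link(p11) - link(p10) - link(p01) + link(p00). All names are chosen for the example.

```python
import math
from statistics import NormalDist

# The five link functions listed above, mapping a disease probability p to the
# scale on which effects are assumed to add.
LINKS = {
    "additive": lambda p: p,
    "multiplicative": lambda p: math.log(p),
    "heterogeneity": lambda p: -math.log(1.0 - p),
    "liability (probit)": lambda p: NormalDist().inv_cdf(p),
    "log odds (logit)": lambda p: math.log(p / (1.0 - p)),
}

def interaction_contrast(p00, p10, p01, p11, link):
    """Departure from additivity on the link scale for a 2 x 2 risk table.

    p_ij is the penetrance with factor 1 at level i and factor 2 at level j.
    A value of zero means the main effects model fits exactly on that scale.
    """
    return link(p11) - link(p10) - link(p01) + link(p00)

# Hypothetical penetrances constructed to be exactly multiplicative:
# baseline risk 0.02, relative risks 2 and 3, joint relative risk 6.
penetrances = (0.02, 0.04, 0.06, 0.12)
contrasts = {name: interaction_contrast(*penetrances, link)
             for name, link in LINKS.items()}
for name, value in contrasts.items():
    print(f"{name:20s} {value:+.4f}")
```

With these penetrances the contrast vanishes (up to rounding) only under the multiplicative link; under the other four links the same data show a non-zero interaction term.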

That an adequate fit to a statistical main effects model (without interactions) does not necessarily imply biological independence can be seen by considering model 5 above, which is multiplicative on the odds scale, with the covariates in x including only the main effects of genotype (ßg) and environment (ße):

logit[p(y = 1|g = Aa, e = 1)] = ß0 + ßg + ße
logit[p(y = 1|g = Aa, e = 0)] = ß0 + ßg
logit[p(y = 1|g = AA, e = 1)] = ß0 + ße

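In this case ORAa = exp(ßg), OR1 = exp(ße) and ORAa,1 = exp(ßg + ße), i.e. ORAa,1 = ORAa * OR1, where each odds ratio is taken relative to the baseline g = AA, e = 0. Hence, in a multiplicative model, although there is considered to be no statistical interaction, the two factors still combine multiplicatively on the odds scale, so there may still be biological interaction.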
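The product relationship among the odds ratios can be checked numerically. The coefficient values below are hypothetical and chosen only for illustration; the helper names are likewise assumptions of this sketch.

```python
import math

def expit(z):
    """Inverse logit: map a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients, chosen only for illustration.
b0, bg, be = -3.0, 0.7, 0.4

# Disease probabilities under the main effects logit model.
p_ref = expit(b0)             # g = AA, e = 0 (baseline)
p_g = expit(b0 + bg)          # g = Aa, e = 0
p_e = expit(b0 + be)          # g = AA, e = 1
p_ge = expit(b0 + bg + be)    # g = Aa, e = 1

or_g = odds(p_g) / odds(p_ref)
or_e = odds(p_e) / odds(p_ref)
or_ge = odds(p_ge) / odds(p_ref)

# No interaction term, yet the joint odds ratio is the product of the two.
print(or_ge, or_g * or_e, math.exp(bg + be))

# On the risk-difference scale the same model is not additive.
excess_joint = p_ge - p_ref
excess_sum = (p_g - p_ref) + (p_e - p_ref)
print(excess_joint, excess_sum)
```

The three printed odds ratio values agree up to rounding, while the joint excess risk differs from the sum of the two single-factor excess risks: the same fitted model is free of interaction on one scale and not on another.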

Closely related to the interaction concept is the concept of epistasis, for which varying and conflicting definitions are discussed by Cordell (2002). Synergistic and antagonistic epistasis are easily understood, while pure epistasis, i.e. when no marginal effects can be detected, requires a perfect balance between penetrances and frequencies for which there is no reasonable genetic force (Wang et al 2005).

Musani et al (2007), in a review paper, classify methods for detecting epistasis into regression-based (e.g. Focused Interaction Testing Framework (FITF) (Millstein et al 2006)), data reduction-based (e.g. Multifactor Dimensionality Reduction (MDR) (Hahn et al 2003)) and pattern recognition methods (e.g. Parameter Decreasing Method (PDM) (Motsinger et al 2006)).

References

  • Cordell HJ, Todd JA, Hill NJ, Lord CJ, Lyons PA, Peterson LB, Wicker LS, Clayton DG (2001) Statistical modeling of interlocus interactions in a complex disease: rejection of the multiplicative model of epistasis in type 1 diabetes. Genetics 158:357-367
  • Cordell HJ (2002) Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. Hum Mol Genet 11:2463-2468
  • Hahn LW, Ritchie MD, Moore JH (2003) Multifactor dimensionality reduction software for detecting gene-gene and gene-environment interactions. Bioinformatics 19:376-382
  • Marchini J, Donnelly P, Cardon LR (2005) Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet 37:413-417
  • Millstein J, Conti DV, Gilliland FD, Gauderman WJ (2006) A testing framework for identifying susceptibility genes in the presence of epistasis. Am J Hum Genet 78:15-27
  • Motsinger AA, Dudek SM, Hahn LW, Ritchie MD (2006) Comparison of neural network optimization for studies of human genetics. Lect Notes Comput Sci 3907:103-114
  • Musani SK, Shriner D, Liu N, Feng R, Coffey CS, Yi N, Tiwari HK, Allison DB (2007) Detection of gene x gene interactions in genome-wide association studies of human population data. Hum Hered 63:67-84
