Tools GxE interactions
Curation of GxE interactions — listings under the GWAS Tools tab.
Summary Table
| NAME | CATEGORY | Main citation | YEAR |
|---|---|---|---|
| GEE-adaptive / GEE-joint | MISC | Chen Y et al., Genet Epidemiol, 2019 | 2019 |
| GPLEMMA | MISC | Kerin M et al., Bioinformatics, 2021 | 2021 |
| G×Escan | MISC | Gauderman WJ et al., Genet Epidemiol, 2013 | 2013 |
| IMRP-GxE | MISC | Zhu X et al., Nat Commun, 2024 | 2024 |
| LEMMA | MISC | Kerin M et al., Am J Hum Genet, 2020 | 2020 |
| SPAGxECCT | MISC | Ma Y et al., Nat Commun, 2025 | 2025 |
| StructLMM | MISC | Moore R et al., Nat Genet, 2019 | 2019 |
| Review | Review | Herrera-Luis E et al., Nat Rev Genet, 2024 | 2024 |
| Review | Review | Thomas D, Nat Rev Genet, 2010 | 2010 |
| Review | Review | Ottman R, Prev Med, 1996 | 1996 |
| Review | Review | Hunter DJ, Nat Rev Genet, 2005 | 2005 |
| Review | Review | Manuck SB et al., Annu Rev Psychol, 2014 | 2014 |
| Review | Review | Zhang X et al., Dev Psychopathol, 2022 | 2022 |
| Review | Review | Boyce WT et al., Proc Natl Acad Sci U S A, 2020 | 2020 |
MISC
GEE-adaptive / GEE-joint (GEE-GEWIS)
PUBMED_LINK
FULL NAME
Generalized estimating equation GEWIS with kinship and ancestry adjustment
DESCRIPTION
GEE-based gene-environment-wide interaction study (GEWIS) methods that adjust for phenotypic correlation due to kinship and for admixture via covariates, extending case-control G×E scans to related and admixed cohorts.
URL
KEYWORDS
GEWIS, GEE, kinship, admixture, population stratification, related individuals, G×E
TITLE
Extended methods for gene-environment-wide interaction scans in studies of admixed individuals with varying degrees of relationships.
Main citation
Chen Y, Adrianto I, Iannuzzi MC, Garman L, Montgomery CG, Rybicki BA, Levin AM, Li J. (2019) Extended methods for gene-environment-wide interaction scans in studies of admixed individuals with varying degrees of relationships. Genet Epidemiol, 43 (4) 414-426. doi:10.1002/gepi.22196. PMID 30793815
ABSTRACT
The etiology of many complex diseases involves both environmental exposures and inherited genetic predisposition as well as interactions between them. Gene-environment-wide interaction studies (GEWIS) provide a means to identify the interactions between genetic variation and environmental exposures that underlie disease risk. However, current GEWIS methods lack the capability to adjust for the potentially complex correlations in studies with varying degrees of relationships (both known and unknown) among individuals in admixed populations. We developed novel generalized estimating equation (GEE) based methods—GEE-adaptive and GEE-joint—to account for phenotypic correlations due to kinship while accounting for covariates, including measures of genome-wide ancestry. In simulation studies of admixed individuals, both methods controlled family-wise error rates, an advantage over the case-only approach. They demonstrated higher power than traditional case-control methods across a wide range of underlying alternative hypotheses, especially where both marginal and interaction effects were present. We applied the proposed method to conduct a GEWIS of a known sarcoidosis risk factor (insecticide exposure) and risk of sarcoidosis in African Americans and identified two novel loci with suggestive evidence of G×E interaction.
DOI
10.1002/gepi.22196
ARROW_SUMMARY
Inputs: individual genotypes; binary outcome; E; ancestry PCs/covariates; kinship or pedigree → GEE → GEE-adaptive or GEE-joint GEWIS
GPLEMMA
PUBMED_LINK
FULL NAME
Gaussian Prior Linear Environment Mixed Model Analysis
DESCRIPTION
GPLEMMA (Gaussian Prior Linear Environment Mixed Model Analysis) is a non-linear randomized Haseman-Elston regression method for flexible modeling of gene-environment interactions in large datasets such as the UK Biobank.
URL
TITLE
A non-linear regression method for estimation of gene-environment heritability.
Main citation
Kerin M, Marchini J. (2021) A non-linear regression method for estimation of gene-environment heritability. Bioinformatics, 36 (24) 5632-5639. doi:10.1093/bioinformatics/btaa1079. PMID 33367483
ABSTRACT
MOTIVATION: Gene-environment (GxE) interactions are one of the least studied aspects of the genetic architecture of human traits and diseases. The environment of an individual is inherently high dimensional, evolves through time and can be expensive and time consuming to measure. The UK Biobank study, with all 500 000 participants having undergone an extensive baseline questionnaire, represents a unique opportunity to assess GxE heritability for many traits and diseases in a well powered setting. RESULTS: We have developed a randomized Haseman-Elston non-linear regression method applicable when many environmental variables have been measured on each individual. The method (GPLEMMA) simultaneously estimates a linear environmental score (ES) and its GxE heritability. We compare the method via simulation to a whole-genome regression approach (LEMMA) for estimating GxE heritability. We show that GPLEMMA is more computationally efficient than LEMMA on large datasets, and produces results highly correlated with those from LEMMA when applied to simulated data and real data from the UK Biobank. AVAILABILITY AND IMPLEMENTATION: Software implementing the GPLEMMA method is available from https://jmarchini.org/gplemma/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
DOI
10.1093/bioinformatics/btaa1079
ARROW_SUMMARY
Inputs: individual genotypes + phenotypes; kinship/relatedness; env matrix → randomized Haseman–Elston + priors → ES + G×E heritability
G×Escan
PUBMED_LINK
FULL NAME
Genomewide interaction scan with EDG×E two-step screening and testing
DESCRIPTION
Two-step genome-wide interaction scan (EDG×E) and software implementing it plus other case-control GWIS methods; targets SNPs with weak marginal association that may show G×E in environment-defined subgroups.
URL
KEYWORDS
GWIS, case-control, two-step, EDG×E, weak marginal effect, G×E
TITLE
Finding novel genes by testing G×E interactions in a genome wide association study.
Main citation
Gauderman WJ, Zhang P, Morrison JL, Lewinger JP. (2013) Finding novel genes by testing G×E interactions in a genome wide association study. Genet Epidemiol, 37 (6) 603-613. doi:10.1002/gepi.21748. PMID 23873611
ABSTRACT
In a genome-wide association study (GWAS), investigators typically focus their primary analysis on the direct (marginal) associations of each single nucleotide polymorphism (SNP) with the trait. Some SNPs that are truly associated with the trait may not be identified in this scan if they have a weak marginal effect and thus low power to be detected. However, these SNPs may be quite important in subgroups of the population defined by an environmental or personal factor, and may be detectable if such a factor is carefully considered in a gene-environment (G×E) interaction analysis. We address the question "Using a genome wide interaction scan (GWIS), can we find new genes that were not found in the primary GWAS scan?" We review commonly used approaches for conducting a GWIS in case-control studies, and propose a new two-step screening and testing method (EDG×E) that is optimized to find genes with a weak marginal effect. We simulate several scenarios in which our two-step method provides 70-80% power to detect a disease locus while a marginal scan provides less than 5% power. We also provide simulations demonstrating that the EDG×E method outperforms other GWIS approaches (including case-only and previously proposed two-step methods) for finding genes with a weak marginal effect. G×Escan software implements this method as well as several other GWIS approaches.
DOI
10.1002/gepi.21748
ARROW_SUMMARY
Inputs: individual case-control genotypes; binary E [+ covariates] → EDG×E screen + test → G×Escan
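The screen-then-test logic described in the abstract can be sketched as follows. This is a minimal illustration, not the G×Escan implementation. It assumes that per-SNP 1-df chi-squared statistics for the marginal disease-gene (D-G) and exposure-gene (E-G) associations are already available, that the screening statistic is their sum referred to a 2-df chi-squared distribution, and that step 2 applies a Bonferroni correction over only the SNPs that pass the screen. The function name, thresholds, and toy numbers are hypothetical.

```python
import math


def two_step_gxe(chi2_dg, chi2_eg, p_gxe, alpha_screen=0.05, alpha=0.05):
    """Two-step G×E scan sketch: screen on D-G + E-G, then test G×E.

    chi2_dg, chi2_eg: per-SNP 1-df chi-squared screening statistics (assumed inputs).
    p_gxe: per-SNP p-values from a standard case-control G×E test (assumed inputs).
    Returns (indices passing the screen, indices significant in step 2).
    """
    # Step 1: sum the two screening statistics. For a 2-df chi-squared
    # variable the survival function is exactly exp(-x / 2).
    passed = [
        i
        for i, (dg, eg) in enumerate(zip(chi2_dg, chi2_eg))
        if math.exp(-(dg + eg) / 2.0) < alpha_screen
    ]
    if not passed:
        return passed, []

    # Step 2: test G×E only among SNPs that passed, correcting for that
    # smaller number of tests instead of the genome-wide total.
    threshold = alpha / len(passed)
    hits = [i for i in passed if p_gxe[i] < threshold]
    return passed, hits


# Toy data for four SNPs (hypothetical values).
chi2_dg = [0.2, 6.0, 1.0, 9.5]
chi2_eg = [0.1, 4.0, 0.5, 0.3]
p_gxe = [0.001, 0.01, 0.0001, 0.2]

passed, hits = two_step_gxe(chi2_dg, chi2_eg, p_gxe)
print("passed screen:", passed)
print("significant G×E:", hits)
```

In this toy run the third SNP has the smallest interaction p-value but is never tested because it fails the screen, which illustrates the trade-off any two-step design makes: a reduced multiple-testing burden in exchange for dependence on the screening step.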
IMRP-GxE
PUBMED_LINK
FULL NAME
Mendelian randomization-based genome-wide screening for gene–environment interactions
DESCRIPTION
Screens for combined gene–environment interaction and environmental mediation by testing departure of marginal GWAS effects from GWIS main effects using an MR-style statistic (IMRP), applicable to summary statistics from separate GWAS and interaction meta-analyses.
URL
KEYWORDS
GWAS, GWIS, G×E, Mendelian randomization, IMRP, summary statistics
TITLE
An approach to identify gene-environment interactions and reveal new biological insight in complex traits.
Main citation
Zhu X, Yang Y, Lorincz-Comi N, Li G, Bentley AR, de Vries PS, ..., Aschard H. (2024) An approach to identify gene-environment interactions and reveal new biological insight in complex traits. Nat Commun, 15 (1) 3385. doi:10.1038/s41467-024-47806-3. PMID 38649715
ABSTRACT
There is a long-standing debate about the magnitude of the contribution of gene-environment interactions to phenotypic variations of complex traits owing to the low statistical power and few reported interactions to date. To address this issue, the Gene-Lifestyle Interactions Working Group within the Cohorts for Heart and Aging Research in Genetic Epidemiology Consortium has been spearheading efforts to investigate G×E in large and diverse samples through meta-analysis. Here, we present a powerful new approach to screen for interactions across the genome, an approach that shares substantial similarity to the Mendelian randomization framework. We identify and confirm 5 loci (6 independent signals) interacted with either cigarette smoking or alcohol consumption for serum lipids, and empirically demonstrate that interaction and mediation are the major contributors to genetic effect size heterogeneity across populations. The estimated lower bound of the interaction and environmentally mediated heritability is significant (P < 0.02) for low-density lipoprotein cholesterol and triglycerides in Cross-Population data. Our study improves the understanding of the genetic architecture and environmental contributions to complex traits.
DOI
10.1038/s41467-024-47806-3
ARROW_SUMMARY
Inputs: GWAS + GWIS summary stats (per-SNP β/SE; LD-pruned instruments; sample-overlap ρ if needed) → IMRP θ → T_MR-GxE (G×E + mediation)
LEMMA
PUBMED_LINK
FULL NAME
Linear Environment Mixed Model Analysis
DESCRIPTION
LEMMA (Linear Environment Mixed Model Analysis) is a whole-genome regression method for flexible modeling of gene-environment interactions in large datasets such as the UK Biobank.
URL
TITLE
Inferring Gene-by-Environment Interactions with a Bayesian Whole-Genome Regression Model.
Main citation
Kerin M, Marchini J. (2020) Inferring Gene-by-Environment Interactions with a Bayesian Whole-Genome Regression Model. Am J Hum Genet, 107 (4) 698-713. doi:10.1016/j.ajhg.2020.08.009. PMID 32888427
ABSTRACT
The contribution of gene-by-environment (GxE) interactions for many human traits and diseases is poorly characterized. We propose a Bayesian whole-genome regression model for joint modeling of main genetic effects and GxE interactions in large-scale datasets, such as the UK Biobank, where many environmental variables have been measured. The method is called LEMMA (Linear Environment Mixed Model Analysis) and estimates a linear combination of environmental variables, called an environmental score (ES), that interacts with genetic markers throughout the genome. The ES provides a readily interpretable way to examine the combined effect of many environmental variables. The ES can be used both to estimate the proportion of phenotypic variance attributable to GxE effects and to test for GxE effects at genetic variants across the genome. GxE effects can induce heteroskedasticity in quantitative traits, and LEMMA accounts for this by using robust standard error estimates when testing for GxE effects. When applied to body mass index, systolic blood pressure, diastolic blood pressure, and pulse pressure in the UK Biobank, we estimate that 9.3%, 3.9%, 1.6%, and 12.5%, respectively, of phenotypic variance is explained by GxE interactions and that low-frequency variants explain most of this variance. We also identify three loci that interact with the estimated environmental scores (-log10p>7.3).
DOI
10.1016/j.ajhg.2020.08.009
ARROW_SUMMARY
Inputs: individual genotypes; quantitative Y; matrix of environmental measures → Bayesian WGR + ES → G×E variance + per-SNP tests
SPAGxECCT
PUBMED_LINK
FULL NAME
Retrospective genome-wide G×E with saddlepoint approximation and Cauchy combination test
DESCRIPTION
A scalable and accurate framework for large-scale genome-wide gene-environment interaction (G×E) analysis.
URL
KEYWORDS
G×E, genome-wide, saddlepoint approximation, Cauchy combination test, time-to-event, ordinal traits
TITLE
Efficient and accurate framework for genome-wide gene-environment interaction analysis in large-scale biobanks.
Main citation
Ma Y, Zhao Y, Zhang JF, Bi W. (2025) Efficient and accurate framework for genome-wide gene-environment interaction analysis in large-scale biobanks. Nat Commun, 16 (1) 3064. doi:10.1038/s41467-025-57887-3. PMID 40157913
ABSTRACT
Gene-environment interaction (G×E) analysis elucidates the interplay between genetic and environmental factors. Genome-wide association studies (GWAS) have expanded to encompass complex traits like time-to-event and ordinal traits, which provide richer phenotypic information. However, most existing scalable approaches focus only on quantitative or binary traits. Here we propose SPAGxECCT, a scalable and accurate framework for diverse trait types. SPAGxECCT fits a genotype-independent model and employs a hybrid strategy including saddlepoint approximation (SPA) for accurate p value calculation, especially for low-frequency variants and unbalanced phenotypic distributions. We extend SPAGxECCT to SPAGxEmixCCT, which accounts for population stratification and is applicable to multi-ancestry or admixed populations. SPAGxEmixCCT can further be extended to SPAGxEmixCCT-local, which identifies ancestry-specific G×E effects using local ancestry. Through extensive simulations and real data analyses of UK Biobank data, we demonstrate that SPAGxECCT and SPAGxEmixCCT are scalable to analyze large-scale study cohort, control type I error rates effectively, and maintain power.
DOI
10.1038/s41467-025-57887-3
ARROW_SUMMARY
Inputs: individual genotypes; phenotype (quant/binary/ordinal/survival); E; covariates [GRM for SPAGxE+; PCs/local ancestry for SPAGxEmixCCT] → residual + projection → SPA (± CCT) G×E
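The Cauchy combination test (CCT) named in this entry merges several p-values through a weighted sum of Cauchy-transformed values. Below is a minimal sketch of the generic CCT, assuming equal weights unless others are supplied. Which p-values SPAGxECCT combines, and with what weights, is not stated in this listing, so the function name and the example inputs are hypothetical.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the Cauchy combination test (generic form)."""
    if weights is None:
        weights = [1.0] * len(pvalues)
    total = sum(weights)
    # Normalize so the weights sum to 1.
    weights = [w / total for w in weights]

    # Each p-value is mapped to a standard Cauchy quantile; the weighted sum
    # is again approximately standard Cauchy under the null.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues)
    )
    # Upper-tail probability of a standard Cauchy variable.
    return 0.5 - math.atan(statistic) / math.pi


# Hypothetical example: combine two p-values for the same variant.
combined = cauchy_combination([0.02, 0.4])
print(combined)
```

A useful sanity check on the transform is that a single p-value, or several identical ones, is returned unchanged.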
StructLMM
PUBMED_LINK
FULL NAME
Structured Linear Mixed Model
DESCRIPTION
Structured Linear Mixed Model (StructLMM) is a computationally efficient method to test for and characterize loci that interact with multiple environments.
URL
TITLE
A linear mixed-model approach to study multivariate gene-environment interactions.
Main citation
Moore R, Casale FP, Jan Bonder M, Horta D, ..., Stegle O. (2019) A linear mixed-model approach to study multivariate gene-environment interactions. Nat Genet, 51 (1) 180-186. doi:10.1038/s41588-018-0271-0. PMID 30478441
ABSTRACT
Different exposures, including diet, physical activity, or external conditions can contribute to genotype-environment interactions (G×E). Although high-dimensional environmental data are increasingly available and multiple exposures have been implicated with G×E at the same loci, multi-environment tests for G×E are not established. Here, we propose the structured linear mixed model (StructLMM), a computationally efficient method to identify and characterize loci that interact with one or more environments. After validating our model using simulations, we applied StructLMM to body mass index in the UK Biobank, where our model yields previously known and novel G×E signals. Finally, in an application to a large blood eQTL dataset, we demonstrate that StructLMM can be used to study interactions with hundreds of environmental variables.
DOI
10.1038/s41588-018-0271-0
ARROW_SUMMARY
Inputs: individual genotypes; quantitative Y; env feature matrix + env–env covariance → StructLMM → multivariate G×E
Review
Review
PUBMED_LINK
URL
TITLE
Gene-environment interactions in human health.
Main citation
Herrera-Luis E, Benke K, Volk H, Ladd-Acosta C, Wojcik GL. (2024) Gene-environment interactions in human health. Nat Rev Genet, 25 (11) 768-784. doi:10.1038/s41576-024-00731-z. PMID 38806721
ABSTRACT
Gene-environment interactions (G×E), the interplay of genetic variation with environmental factors, have a pivotal impact on human complex traits and diseases. Statistically, G×E can be assessed by determining the deviation from expectation of predictive models based solely on the phenotypic effects of genetics or environmental exposures. Despite the unprecedented, widespread and diverse use of G×E analytical frameworks, heterogeneity in their application and reporting hinders their applicability in public health. In this Review, we discuss study design considerations as well as G×E analytical frameworks to assess polygenic liability dependent on the environment, to identify specific genetic variants exhibiting G×E, and to characterize environmental context for these dynamics. We conclude with recommendations to address the most common challenges and pitfalls in the conceptualization, methodology and reporting of G×E studies, as well as future directions.
DOI
10.1038/s41576-024-00731-z
Review
PUBMED_LINK
URL
TITLE
Gene-environment-wide association studies: emerging approaches.
Main citation
Thomas D. (2010) Gene-environment-wide association studies: emerging approaches. Nat Rev Genet, 11 (4) 259-272. doi:10.1038/nrg2764. PMID 20212493
ABSTRACT
Despite the yield of recent genome-wide association (GWA) studies, the identified variants explain only a small proportion of the heritability of most complex diseases. This unexplained heritability could be partly due to gene-environment (G×E) interactions or more complex pathways involving multiple genes and exposures. This Review provides a tutorial on the available epidemiological designs and statistical analysis approaches for studying specific G×E interactions and choosing the most appropriate methods. I discuss the approaches that are being developed for studying entire pathways and available techniques for mining interactions in GWA data. I also explore methods for marrying hypothesis-driven pathway-based approaches with "agnostic" GWA studies.
DOI
10.1038/nrg2764
Review
PUBMED_LINK
TITLE
Gene-environment interaction: definitions and study designs.
Main citation
Ottman R. (1996) Gene-environment interaction: definitions and study designs. Prev Med, 25 (6) 764-70. doi:10.1006/pmed.1996.0117. PMID 8936580
ABSTRACT
Study of gene-environment interaction is important for improving accuracy and precision in the assessment of both genetic and environmental influences. This overview presents a simple definition of gene-environment interaction and suggests study designs for detecting it. Gene-environment interaction is defined as "a different effect of an environmental exposure on disease risk in persons with different genotypes," or, alternatively, "a different effect of a genotype on disease risk in persons with different environmental exposures." Under this strictly statistical definition, the presence or absence of interaction depends upon the scale of measurement (additive or multiplicative). The decision of which scale is appropriate will be governed by many factors, including the main objective of an investigation (discovery of etiology, public health prediction, etc.) and the hypothesized pathophysiologic model. Five biologically plausible models are described for the relations between genotypes and environmental exposures, in terms of their effects on disease risk. Each of these models leads to a different set of predictions about disease risk in individuals classified by presence or absence of a high-risk genotype and environmental exposure. Classification according to the exposure is relatively easy, using conventional epidemiologic methods. Classification according to the high-risk genotype is more difficult, but several alternative strategies are suggested.
DOI
10.1006/pmed.1996.0117
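The scale dependence noted in this abstract can be checked with simple arithmetic. The sketch below takes hypothetical disease risks for the four genotype-by-exposure cells and compares the risk in the doubly exposed group with what is expected under no interaction on the additive and on the multiplicative scale. The function name and the numbers are illustrative only and do not come from the paper.

```python
def interaction_contrasts(r00, r10, r01, r11):
    """Return (additive contrast, multiplicative ratio) for a 2x2 risk table.

    r00: risk with neither high-risk genotype nor exposure
    r10: risk with the genotype only
    r01: risk with the exposure only
    r11: risk with both
    """
    # Additive scale: no interaction means r11 - r00 = (r10 - r00) + (r01 - r00).
    additive = r11 - (r10 + r01 - r00)
    # Multiplicative scale: no interaction means RR11 = RR10 * RR01.
    multiplicative = (r11 / r00) / ((r10 / r00) * (r01 / r00))
    return additive, multiplicative


# Hypothetical risks: genotype doubles risk, exposure triples it, both give 6x.
additive, multiplicative = interaction_contrasts(0.01, 0.02, 0.03, 0.06)
print(additive, multiplicative)
```

With these numbers the multiplicative ratio is about 1 (no interaction on that scale), while the additive contrast is about 0.02 above zero (interaction on the additive scale), so the same data support opposite conclusions depending on the scale chosen.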
Review
PUBMED_LINK
TITLE
Gene-environment interactions in human diseases.
Main citation
Hunter DJ. (2005) Gene-environment interactions in human diseases. Nat Rev Genet, 6 (4) 287-98. doi:10.1038/nrg1578. PMID 15803198
ABSTRACT
Studies of gene-environment interactions aim to describe how genetic and environmental factors jointly influence the risk of developing a human disease. Gene-environment interactions can be described by using several models, which take into account the various ways in which genetic effects can be modified by environmental exposures, the number of levels of these exposures and the model on which the genetic effects are based. Choice of study design, sample size and genotyping technology influence the analysis and interpretation of observed gene-environment interactions. Current systems for reporting epidemiological studies make it difficult to assess whether the observed interactions are reproducible, so suggestions are made for improvements in this area.
DOI
10.1038/nrg1578
Review
PUBMED_LINK
TITLE
Gene-environment interaction.
Main citation
Manuck SB, McCaffery JM. (2014) Gene-environment interaction. Annu Rev Psychol, 65, 41-70. doi:10.1146/annurev-psych-010213-115100. PMID 24405358
ABSTRACT
With the advent of increasingly accessible technologies for typing genetic variation, studies of gene-environment (G×E) interactions have proliferated in psychological research. Among the aims of such studies are testing developmental hypotheses and models of the etiology of behavioral disorders, defining boundaries of genetic and environmental influences, and identifying individuals most susceptible to risk exposures or most amenable to preventive and therapeutic interventions. This research also coincides with the emergence of unanticipated difficulties in detecting genetic variants of direct association with behavioral traits and disorders, which may be obscured if genetic effects are expressed only in predisposing environments. In this essay we consider these and other rationales for positing G×E interactions, review conceptual models meant to inform G×E interpretations from a psychological perspective, discuss points of common critique to which G×E research is vulnerable, and address the role of the environment in G×E interactions.
DOI
10.1146/annurev-psych-010213-115100
Review
PUBMED_LINK
TITLE
Three phases of Gene × Environment interaction research: Theoretical assumptions underlying gene selection.
Main citation
Zhang X, Belsky J. (2022) Three phases of Gene × Environment interaction research: Theoretical assumptions underlying gene selection. Dev Psychopathol, 34 (1) 295-306. doi:10.1017/S0954579420000966. PMID 32880244
ABSTRACT
Some Gene × Environment interaction (G×E) research has focused upon single candidate genes, whereas other related work has targeted multiple genes (e.g., polygenic scores). Each approach has informed efforts to identify individuals who are either especially vulnerable to the negative effects of contextual adversity (diathesis stress) or especially susceptible to both positive and negative contextual conditions (differential susceptibility). A critical step in all such molecular G×E research is the selection of genetic variants thought to moderate environmental influences, a subject that has not received a great deal of attention in critiques of G×E research (beyond the observation of small effects of individual genes). Here we conceptually distinguish three phases of G×E work based on the selection of genes presumed to moderate environmental effects and the theoretical basis of such decisions: (a) single candidate genes, (b) composited (multiple) candidate genes, and (c) GWAS-derived polygenic scores. This illustrative, not exhaustive, review makes it clear that implicit or explicit theoretical assumptions inform gene selection in ways that have not been clearly articulated or fully appreciated.
DOI
10.1017/S0954579420000966
Review
PUBMED_LINK
TITLE
Genes and environments, development and time.
Main citation
Boyce WT, Sokolowski MB, Robinson GE. (2020) Genes and environments, development and time. Proc Natl Acad Sci U S A, 117 (38) 23235-23241. doi:10.1073/pnas.2016710117. PMID 32967067
ABSTRACT
A now substantial body of science implicates a dynamic interplay between genetic and environmental variation in the development of individual differences in behavior and health. Such outcomes are affected by molecular, often epigenetic, processes involving gene-environment (G-E) interplay that can influence gene expression. Early environments with exposures to poverty, chronic adversities, and acutely stressful events have been linked to maladaptive development and compromised health and behavior. Genetic differences can impart either enhanced or blunted susceptibility to the effects of such pathogenic environments. However, largely missing from present discourse regarding G-E interplay is the role of time, a "third factor" guiding the emergence of complex developmental endpoints across different scales of time. Trajectories of development increasingly appear best accounted for by a complex, dynamic interchange among the highly linked elements of genes, contexts, and time at multiple scales, including neurobiological (minutes to milliseconds), genomic (hours to minutes), developmental (years and months), and evolutionary (centuries and millennia) time. This special issue of PNAS thus explores time and timing among G-E transactions: The importance of timing and timescales in plasticity and critical periods of brain development; epigenetics and the molecular underpinnings of biologically embedded experience; the encoding of experience across time and biological levels of organization; and gene-regulatory networks in behavior and development and their linkages to neuronal networks. Taken together, the collection of papers offers perspectives on how G-E interplay operates contingently within and against a backdrop of time and timescales.
DOI
10.1073/pnas.2016710117