Imaging GWAS

Curated list of imaging GWAS studies, filed under GWAS in the AI tab.

Image-based Phenotyping for GWAS

Methods using computer vision to extract quantitative traits from medical images for genetic association:

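  • Supervised models (2021-2023): U-Net-based segmentation of organs and myocardium (Liu PMID 34128465, eLife 2021; Khurshid PMID 36944631, Nat Commun 2023), CNN segmentation for regional wall thickness (Ning PMID 38036550, Nat Commun 2023), a CNN pretrained on natural images for aortic measurement (Pirruccello PMID 34837083, Nat Genet 2022), and machine-learning regression of liver fat from abdominal MRI (Haas PMID 34957434, Cell Genomics 2021).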
  • Self-supervised (2024): Contrastive learning (SimCLR-style CNN encoder) enabling GWAS directly from image embeddings without manual trait definition (iGWAS, Xie et al. PMID 38728357, PLoS Genet 2024).

Trend: supervised organ-specific CNNs → self-supervised whole-image representation learning.

Summary Table

NAME                    Main citation                            YEAR
DL Thoracic Aorta GWAS  Pirruccello JP et al., Nat Genet, 2022   2022
Haas ME                 Haas ME et al., Cell Genom, 2021         2021
Khurshid S              Khurshid S et al., Nat Commun, 2023      2023
Liu Y                   Liu Y et al., Elife, 2021                2021
Ning C                  Ning C et al., Nat Commun, 2023          2023
iGWAS                   Xie Z et al., PLoS Genet, 2024           2024
transferGWAS            Kirchler M et al., Bioinformatics, 2022  2022

DL Thoracic Aorta GWAS (DL Aorta GWAS)

AI GWAS Deep Learning Medical Imaging UK Biobank Nat Genet
PUBMED_LINK
34837083
FULL NAME
Deep Learning Enables Genetic Analysis of the Human Thoracic Aorta
DESCRIPTION
Fine-tuned a CNN pretrained on natural image recognition (e.g. a ResNet/Inception-like architecture) on only 116 manually annotated samples, then applied it to 4.6 million cardiac MRI images from UK Biobank to measure ascending and descending thoracic aorta dimensions. GWAS in 39,688 individuals identified 82 loci for ascending and 47 loci for descending thoracic aortic diameter. The study demonstrates transfer learning from natural images to medical imaging for rapid biobank-scale phenotyping.
KEYWORDS
deep learning, CNN, ImageNet transfer learning, cardiac MRI, thoracic aorta, image regression, UK Biobank
TITLE
Deep learning enables genetic analysis of the human thoracic aorta.
ABSTRACT
Enlargement or aneurysm of the aorta predisposes to dissection, an important cause of sudden death. We trained a deep learning model to evaluate the dimensions of the ascending and descending thoracic aorta in 4.6 million cardiac magnetic resonance images from the UK Biobank. We then conducted genome-wide association studies in 39,688 individuals, identifying 82 loci associated with ascending and 47 with descending thoracic aortic diameter. Transcriptome-wide analyses, rare-variant burden tests and human aortic single nucleus RNA sequencing prioritized genes including FBN1 and MFAP5.
DOI
10.1038/s41588-021-00962-4

Haas ME (ML Liver Fat GWAS)

AI GWAS Imaging Machine Learning Liver Fat Abdominal MRI UK Biobank
PUBMED_LINK
34957434
FULL NAME
Machine Learning Enables New Insights into Genetic Contributions to Liver Fat Accumulation
DESCRIPTION
Developed a machine-learning regression model (gradient-boosted regression on raw MRI signal intensities) to estimate liver fat from UK Biobank abdominal MRI scans (correlation 0.97-0.99 with ground truth). The model was trained on 4,511 participants with gold-standard MRI biomarker measurements and applied to 32,192 additional individuals. GWAS identified 8 associated variants (3 known, 5 novel: MTARC1, ADH1B, TRIB1, GPAM, MAST3) and a polygenic score strongly associated with future chronic liver disease risk (HR>1.32 per SD, p<9e-17).
KEYWORDS
MRI signal regression, liver fat quantification, abdominal MRI, hepatic steatosis, gradient boosting, UK Biobank
TITLE
Machine learning enables new insights into genetic contributions to liver fat accumulation.
Main citation
Haas ME, Pirruccello JP, Friedman SN, Wang M, ...&, Khera AV. (2021) Machine learning enables new insights into genetic contributions to liver fat accumulation. Cell Genom, 1 (3). doi:10.1016/j.xgen.2021.100066. PMID 34957434
ABSTRACT
Excess liver fat, called hepatic steatosis, is a leading risk factor for end-stage liver disease and cardiometabolic diseases but often remains undiagnosed in clinical practice because of the need for direct imaging assessments. We developed an abdominal MRI-based machine-learning algorithm to accurately estimate liver fat from a truth dataset of 4,511 middle-aged UK Biobank participants, enabling quantification in 32,192 additional individuals. A genome-wide association study of common genetic variants and liver fat replicated three known associations and identified five newly associated variants.
DOI
10.1016/j.xgen.2021.100066
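
The polygenic score in this entry is reported "per SD". A minimal sketch of how such a score is commonly built, as a weighted sum of effect-allele dosages that is then standardized across the cohort, is shown here. The weights, dosages, and function names are hypothetical toy values for illustration and do not come from the paper.

```python
from statistics import mean, pstdev


def polygenic_scores(dosages, weights):
    """Raw score per person: sum over variants of weight * effect-allele dosage."""
    return [sum(w * d for w, d in zip(weights, person)) for person in dosages]


def standardize(scores):
    """Rescale scores to mean 0 and SD 1 so effects can be quoted 'per SD'."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical toy data: 4 people x 3 variants, dosages in {0, 1, 2}.
weights = [0.10, -0.05, 0.20]
dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 0]]

raw = polygenic_scores(dosages, weights)   # one raw score per person
z = standardize(raw)                       # same scores in SD units
```

In a real analysis the weights would come from GWAS effect estimates, and the standardized score would then enter a survival model to obtain a hazard ratio per SD.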

Khurshid S (DL LV Mass GWAS)

AI GWAS Imaging Deep Learning Cardiac MRI Left Ventricular Mass UK Biobank
PUBMED_LINK
36944631
FULL NAME
Clinical and Genetic Associations of Deep Learning-Derived Cardiac Magnetic Resonance-Based Left Ventricular Mass
DESCRIPTION
Applied a CNN-based segmentation model (U-Net style architecture) to automatically segment left ventricular myocardium from cardiac MRI in 43,230 UK Biobank participants. The segmented contours were used to compute left ventricular mass indexed to body surface area (LVMI), enabling a GWAS that identified 12 associations (11 novel) implicating genes associated with cardiac contractility and cardiomyopathy. The LVMI polygenic risk score was validated in an independent Mass General Brigham cohort.
KEYWORDS
deep learning, cardiac MRI segmentation, U-Net, left ventricular mass, CNN, cardiomyopathy, UK Biobank
TITLE
Clinical and genetic associations of deep learning-derived cardiac magnetic resonance-based left ventricular mass.
Main citation
Khurshid S, Lazarte J, Pirruccello JP, ...&, Lubitz SA. (2023) Clinical and genetic associations of deep learning-derived cardiac magnetic resonance-based left ventricular mass. Nat Commun, 14 (1) 1558. doi:10.1038/s41467-023-37173-w. PMID 36944631
ABSTRACT
Left ventricular mass is a risk marker for cardiovascular events, and may indicate an underlying cardiomyopathy. Cardiac magnetic resonance is the gold-standard for left ventricular mass estimation, but is challenging to obtain at scale. Here, we use deep learning to enable genome-wide association study of cardiac magnetic resonance-derived left ventricular mass indexed to body surface area within 43,230 UK Biobank participants. We identify 12 genome-wide associations (1 known at TTN and 11 novel for left ventricular mass).
DOI
10.1038/s41467-023-37173-w
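
To make the "mass indexed to body surface area" step in this entry concrete, here is a rough sketch. It assumes the segmentation yields a myocardial volume in mL, uses the conventional myocardial density of 1.05 g/mL, and uses the Mosteller formula for body surface area. Which BSA formula the authors used is not stated here, so that choice and all function names are assumptions.

```python
import math

# Conventional myocardial tissue density (assumption for this sketch).
MYOCARDIAL_DENSITY_G_PER_ML = 1.05


def bsa_mosteller(height_cm, weight_kg):
    """Body surface area in m^2 (Mosteller formula)."""
    return math.sqrt(height_cm * weight_kg / 3600.0)


def lv_mass_index(myocardial_volume_ml, height_cm, weight_kg):
    """Left ventricular mass (g) divided by body surface area (m^2)."""
    mass_g = myocardial_volume_ml * MYOCARDIAL_DENSITY_G_PER_ML
    return mass_g / bsa_mosteller(height_cm, weight_kg)


# Hypothetical participant: 100 mL myocardium, 180 cm, 80 kg (BSA = 2.0 m^2).
lvmi = lv_mass_index(myocardial_volume_ml=100.0, height_cm=180.0, weight_kg=80.0)
print(f"LVMI = {lvmi:.1f} g/m^2")  # about 52.5 g/m^2
```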

Liu Y (DL Organ MRI GWAS)

AI GWAS Imaging Deep Learning Abdominal MRI Organ Traits UK Biobank
PUBMED_LINK
34128465
FULL NAME
Genetic Architecture of 11 Organ Traits Derived from Abdominal MRI Using Deep Learning
DESCRIPTION
Applied a U-Net-based CNN segmentation pipeline to over 38,000 abdominal MRI scans from UK Biobank. The deep learning model automatically segmented 7 organs/tissues (liver, pancreas, kidneys, spleen, lungs, visceral adipose tissue, subcutaneous adipose tissue) and quantified their volume, fat content (via signal intensity), and iron content (via T2* mapping). GWAS on these 11 DL-derived traits identified 93 independent genome-wide significant associations (heritability 8-44%), including 4 novel liver trait associations.
KEYWORDS
deep learning, abdominal MRI segmentation, U-Net, organ volume quantification, liver fat, pancreas iron, UK Biobank
TITLE
Genetic architecture of 11 organ traits derived from abdominal MRI using deep learning.
Main citation
Liu Y, Basty N, Whitcher B, Bell JD, ...&, Cule M. (2021) Genetic architecture of 11 organ traits derived from abdominal MRI using deep learning. Elife, 10. doi:10.7554/eLife.65554. PMID 34128465
ABSTRACT
Cardiometabolic diseases are an increasing global health burden. While socioeconomic, environmental, behavioural, and genetic risk factors have been identified, a better understanding of the underlying mechanisms is required to develop more effective interventions. Magnetic resonance imaging (MRI) has been used to assess organ health, but biobank-scale studies are still in their infancy. Using over 38,000 abdominal MRI scans in the UK Biobank, we used deep learning to quantify volume, fat, and iron in seven organs and tissues, and demonstrate that imaging-derived phenotypes reflect health status. We identify 93 independent genome-wide significant associations.
DOI
10.7554/eLife.65554
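
The volume traits in this entry come from segmentation masks. One generic way to turn a mask into a volume is to count labeled voxels and multiply by the voxel size, as sketched here. The mask layout, label value, and voxel spacing are hypothetical, and this is not the authors' pipeline.

```python
def organ_volume_ml(mask, voxel_size_mm, label=1):
    """Volume of one labeled structure.

    mask: 3D nested list indexed [slice][row][col] holding integer labels.
    voxel_size_mm: (dx, dy, dz) spacing of one voxel in millimetres.
    """
    dx, dy, dz = voxel_size_mm
    n_voxels = sum(v == label for sl in mask for row in sl for v in row)
    return n_voxels * dx * dy * dz / 1000.0  # mm^3 -> mL


# Hypothetical toy mask: 2 slices of 2 x 3 voxels, five voxels labeled 1.
toy_mask = [
    [[0, 1, 1],
     [0, 1, 0]],
    [[0, 1, 1],
     [0, 0, 0]],
]
volume = organ_volume_ml(toy_mask, voxel_size_mm=(2.0, 2.0, 5.0))
```

Each toy voxel is 20 mm^3, so five labeled voxels give 100 mm^3, or 0.1 mL.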

Ning C (DL LVRWT GWAS)

AI GWAS Imaging Deep Learning Cardiac MRI Left Ventricular Wall Hypertrophic Cardiomyopathy UK Biobank
PUBMED_LINK
38036550
FULL NAME
Genome-Wide Association Analysis of Left Ventricular Imaging-Derived Phenotypes Identifies 72 Risk Loci
DESCRIPTION
Built a CNN-based deep learning algorithm for automated segmentation of left ventricular myocardium from cardiac MRI, enabling precise calculation of 12 regional wall thickness (LVRWT) measurements in 42,194 UK Biobank participants. GWAS of these 12 CNN-derived LVRWT traits identified 72 significant genetic loci involved in heart development and contraction pathways. Mendelian randomization confirmed causal relationships with hypertrophic cardiomyopathy. The PRS of inferoseptal LVRWT enabled identification of high-risk individuals.
KEYWORDS
deep learning, cardiac MRI, CNN, left ventricular wall thickness segmentation, hypertrophic cardiomyopathy, UK Biobank
TITLE
Genome-wide association analysis of left ventricular imaging-derived phenotypes identifies 72 risk loci and yields genetic insights into hypertrophic cardiomyopathy.
Main citation
Ning C, Fan L, Jin M, ...&, Miao X. (2023) Genome-wide association analysis of left ventricular imaging-derived phenotypes identifies 72 risk loci and yields genetic insights into hypertrophic cardiomyopathy. Nat Commun, 14 (1) 7900. doi:10.1038/s41467-023-43771-5. PMID 38036550
ABSTRACT
Left ventricular regional wall thickness (LVRWT) is an independent predictor of morbidity and mortality in cardiovascular diseases (CVDs). To identify specific genetic influences on individual LVRWT, we established a novel deep learning algorithm to calculate 12 LVRWTs accurately in 42,194 individuals from the UK Biobank with cardiac magnetic resonance (CMR) imaging. Genome-wide association studies of CMR-derived 12 LVRWTs identified 72 significant genetic loci associated with at least one LVRWT phenotype.
DOI
10.1038/s41467-023-43771-5
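
This entry mentions Mendelian randomization. As an illustration only, here is the fixed-effect inverse-variance weighted (IVW) estimator, a common choice for two-sample MR. The estimator the authors used is not stated here, and the per-variant effect sizes in the example are made up.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each list holds one value per genetic instrument:
    variant-exposure effect, variant-outcome effect, and the SE of the latter.
    """
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical instruments whose outcome effect is exactly half the exposure effect.
bx = [0.10, 0.20, 0.30]
by = [0.05, 0.10, 0.15]
se = [0.01, 0.01, 0.01]
estimate, std_err = ivw_estimate(bx, by, se)
```

With these toy inputs the estimate is 0.5, meaning a one-unit increase in the exposure shifts the outcome by half a unit under the usual MR assumptions.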

iGWAS

AI GWAS Imaging Deep Learning Self-Supervised Learning Retinal Fundus Phenotyping Contrastive Learning
PUBMED_LINK
38728357
FULL NAME
Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images
DESCRIPTION
iGWAS uses self-supervised contrastive learning (SimCLR-style framework with a CNN encoder backbone) to extract a 128-dimensional phenotype vector directly from retinal fundus images without any manual labels. The encoder was trained on 40,000 EyePACS images via instance discrimination, then applied to 130,329 UK Biobank fundus images. GWAS on these 128 learned phenotypes identified 14 genome-wide significant loci. The work is presented as the first demonstration of unsupervised deep phenotyping for image-based GWAS, discovering genetic associations without predefined human annotations.
KEYWORDS
self-supervised contrastive learning, SimCLR, CNN, retinal fundus, deep phenotyping, image-based GWAS, UK Biobank
TITLE
iGWAS: Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images.
Main citation
Xie Z, Zhang T, Kim S, Sun J, Forouzandeh P, Chen R, Zhi D. (2024) iGWAS: Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images. PLOS Genetics, 20(5):e1011273. doi:10.1371/journal.pgen.1011273. PMID 38728357
ABSTRACT
Existing imaging genetics studies have been mostly limited in scope by using imaging-derived phenotypes defined by human experts. Here, leveraging new breakthroughs in self-supervised deep representation learning, we propose a new approach, image-based genome-wide association study (iGWAS), for identifying genetic factors associated with phenotypes discovered from medical images using contrastive learning. Using retinal fundus photos, our model extracts a 128-dimensional vector representing features of the retina as phenotypes. We identified 14 loci with genome-wide significance.
DOI
10.1371/journal.pgen.1011273
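
For readers unfamiliar with SimCLR-style training, here is a small sketch of the NT-Xent contrastive loss that such frameworks typically optimize. Each image contributes two augmented views, and the loss pulls the two views together while pushing them away from the other images. The exact loss, temperature, and encoder used by iGWAS are not given here, so the default temperature and the toy 2-dimensional embeddings are assumptions.

```python
import math


def _normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss; views_a[i] and views_b[i] embed two augmentations of image i."""
    z = [_normalize(v) for v in views_a + views_b]
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        sims = [sum(a * b for a, b in zip(z[i], z[k])) / temperature
                for k in range(2 * n)]
        # The denominator runs over every other sample, excluding i itself.
        denom = sum(math.exp(s) for k, s in enumerate(sims) if k != i)
        total += math.log(denom) - sims[pos]
    return total / (2 * n)


# Hypothetical embeddings: matching views agree in `aligned`, not in `shuffled`.
a = [[1.0, 0.0], [0.0, 1.0]]
aligned = nt_xent_loss(a, [[1.0, 0.1], [0.1, 1.0]])
shuffled = nt_xent_loss(a, [[0.1, 1.0], [1.0, 0.1]])
```

The loss is lower when the two views of each image land close together, which is the behaviour the encoder is trained to produce before its outputs are used as phenotypes.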

transferGWAS

AI GWAS Imaging Transfer Learning Deep Learning Retinal Fundus Representation Learning
PUBMED_LINK
35640976
FULL NAME
transferGWAS: GWAS of Images Using Deep Transfer Learning
DESCRIPTION
transferGWAS performs GWAS directly on full medical images using deep transfer learning: (1) a pretrained CNN (ResNet-based architecture, pretrained on ImageNet) extracts feature embeddings from raw images; (2) these learned representations are used as quantitative phenotypes for genetic association testing. Applied to UK Biobank retinal fundus images, identified 60 genomic regions including 7 novel candidate loci for eye-related traits. First demonstration of direct GWAS on whole images without predefined phenotype engineering.
URL
https://github.com/mkirchler/transferGWAS/
KEYWORDS
deep transfer learning, pretrained CNN, ResNet, retinal fundus, whole-image GWAS, representation learning, UK Biobank
TITLE
transferGWAS: GWAS of images using deep transfer learning.
Main citation
Kirchler M, Konigorski S, Norden M, Meltendorf C, Kloft M, Schurmann C, Lippert C. (2022) transferGWAS: GWAS of images using deep transfer learning. Bioinformatics, 38(14):3621-3628. doi:10.1093/bioinformatics/btac369. PMID 35640976
ABSTRACT
MOTIVATION: Medical images can provide rich information about diseases and their biology. However, investigating their association with genetic variation requires non-standard methods. We propose transferGWAS, a novel approach to perform genome-wide association studies directly on full medical images. First, we learn semantically meaningful representations of the images based on a transfer learning task, during which a deep neural network is trained on independent but similar data. Then, we perform genetic association tests with these representations. RESULTS: We validate the type I error rates and power of transferGWAS in simulation studies of synthetic images. Then we apply transferGWAS in a genome-wide association study of retinal fundus images from the UK Biobank. This first-of-a-kind GWAS of full imaging data yielded 60 genomic regions associated with retinal fundus images, of which 7 are novel candidate loci for eye-related traits and diseases.
DOI
10.1093/bioinformatics/btac369
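
Step (2) of this entry treats embedding dimensions as quantitative phenotypes. A bare-bones sketch of that idea is a simple linear regression of each embedding dimension on the allele dosage of one variant. It assumes the embeddings were already extracted, omits the covariates a real GWAS would include (age, sex, ancestry components), and uses a normal approximation for the p-value. The data and function names are hypothetical, and this is not the transferGWAS code.

```python
import math


def snp_association(genotypes, phenotype):
    """Regress one phenotype on allele dosage; return (beta, se, p_value)."""
    n = len(genotypes)
    g_mean = sum(genotypes) / n
    p_mean = sum(phenotype) / n
    sxx = sum((g - g_mean) ** 2 for g in genotypes)
    sxy = sum((g - g_mean) * (p - p_mean) for g, p in zip(genotypes, phenotype))
    beta = sxy / sxx
    intercept = p_mean - beta * g_mean
    rss = sum((p - intercept - beta * g) ** 2 for g, p in zip(genotypes, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    # Two-sided p-value from a normal approximation to the t statistic.
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value


def scan_embedding(genotypes, embeddings):
    """Test every embedding dimension separately against one variant."""
    n_dims = len(embeddings[0])
    return [snp_association(genotypes, [e[d] for e in embeddings])
            for d in range(n_dims)]


# Hypothetical toy cohort: dimension 0 tracks the genotype, dimension 1 does not.
genotypes = [0, 1, 2, 0, 1, 2, 1, 0]
embeddings = [[0.10, 0.3], [0.40, -0.2], [1.05, 0.1], [-0.05, -0.1],
              [0.60, 0.2], [0.90, 0.0], [0.50, -0.3], [0.05, 0.1]]
results = scan_embedding(genotypes, embeddings)
```

In practice this scan would be repeated for every variant with dedicated GWAS software, and the significance threshold would need to account for the number of embedding dimensions tested.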