Phenotyping
Catalog entries using this tag:
Entries
iGWAS
FULL NAME
Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images
DESCRIPTION
iGWAS uses self-supervised contrastive learning (a SimCLR-style framework with a CNN encoder backbone) to extract a 128-dimensional phenotype vector directly from retinal fundus images, without any manual labels. The encoder was trained on 40,000 EyePACS images via instance discrimination and then applied to 130,329 UK Biobank fundus images. GWAS on these 128 learned phenotypes identified 14 genome-wide significant loci. The work is a first demonstration of unsupervised deep phenotyping for image-based GWAS: it discovers genetic associations without predefined human annotations.
KEYWORDS
self-supervised contrastive learning, SimCLR, CNN, retinal fundus, deep phenotyping, image-based GWAS, UK Biobank
TITLE
iGWAS: Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images.
Main citation
Xie Z, Zhang T, Kim S, Sun J, Forouzandeh P, Chen R, Zhi D. (2024) iGWAS: Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images. PLOS Genetics, 20(5):e1011273. doi:10.1371/journal.pgen.1011273. PMID 38728357
ABSTRACT
Existing imaging genetics studies have been mostly limited in scope by using imaging-derived phenotypes defined by human experts. Here, leveraging new breakthroughs in self-supervised deep representation learning, we propose a new approach, image-based genome-wide association study (iGWAS), for identifying genetic factors associated with phenotypes discovered from medical images using contrastive learning. Using retinal fundus photos, our model extracts a 128-dimensional vector representing features of the retina as phenotypes. We identified 14 loci with genome-wide significance.
DOI
10.1371/journal.pgen.1011273
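The contrastive objective behind the iGWAS entry above can be sketched in a few lines. The example below is a minimal NumPy version of the NT-Xent loss used by SimCLR, applied to 128-dimensional embeddings of two augmented views per image. It is an illustration only: the entry says "SimCLR-style", so the exact loss, the temperature value, and the function name `nt_xent_loss` are assumptions and not taken from the paper.

```python
import numpy as np

def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss for N image pairs; z_a[i] and z_b[i] embed two views of image i.

    Hypothetical helper for illustration; temperature=0.5 is an assumed default.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)             # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit norm, so dot = cosine
    sim = (z @ z.T) / temperature                      # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                     # a view is never its own negative

    # Row i's positive is the other view of the same image.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable row-wise log-softmax.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_norm
    return -log_prob[np.arange(2 * n), positives].mean()

# Toy check with random 128-dimensional embeddings (not real fundus features).
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 128))
view_b_matched = view_a + 0.01 * rng.normal(size=(8, 128))  # nearly identical views
view_b_random = rng.normal(size=(8, 128))                   # unrelated views

print("matched views:", nt_xent_loss(view_a, view_b_matched))
print("random views: ", nt_xent_loss(view_a, view_b_random))
```

The loss is lower when the two views of each image land close together in embedding space, which is the signal that lets an encoder learn image features without labels. In a setup like the one described, each of the 128 embedding coordinates would then be treated as a quantitative phenotype for GWAS.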
MixEHR-SAGE
FULL NAME
MixEHR-SAGE - Multi-modal Topic Modeling for PheWAS and GWAS
DESCRIPTION
MixEHR-SAGE is a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications from electronic health records (EHR) to enhance phenotyping for GWAS. By combining expert-informed priors with probabilistic inference, it identifies more than 1,000 interpretable phenotype topics from UK Biobank data and improves disease incidence prediction and GWAS discovery. Published in Briefings in Bioinformatics.
TITLE
PheCode-guided multi-modal topic modeling of electronic health records improves disease incidence prediction and GWAS discovery from UK Biobank.
ABSTRACT
Phenome-wide association studies rely on disease definitions derived from diagnostic codes, often failing to leverage the full richness of electronic health records (EHR). We present MixEHR-SAGE, a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications to enhance phenotyping from large-scale EHRs. Applied to 350,000 individuals with high-quality genetic data, MixEHR-SAGE-derived risk scores accurately predicted disease incidence and improved GWAS discovery.
DOI
10.1093/bib/bbag030
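One way to picture the "expert-informed priors" mentioned in the MixEHR-SAGE entry is as a per-patient Dirichlet prior over phenotype topics, where topics tied to PheCodes seen in the patient's record receive more prior mass. The sketch below illustrates that general idea under stated assumptions and is not the published model: the code-to-PheCode map uses placeholder identifiers, and the function name `guided_topic_prior` and the concentration values are hypothetical.

```python
import numpy as np

def guided_topic_prior(patient_codes, code_to_phecode, phecodes,
                       seeded=1.0, unseeded=0.01):
    """Return a Dirichlet concentration vector with one entry per PheCode topic.

    Topics whose PheCode is supported by at least one of the patient's codes
    get the larger `seeded` value; all other topics get `unseeded`.
    Both values are assumed for illustration.
    """
    observed = {code_to_phecode[c] for c in patient_codes if c in code_to_phecode}
    return np.array([seeded if p in observed else unseeded for p in phecodes])

# Placeholder identifiers, not real ICD codes or PheCodes.
phecodes = ["PHE_1", "PHE_2", "PHE_3"]
code_to_phecode = {"ICD_A": "PHE_1", "ICD_B": "PHE_1", "ICD_C": "PHE_3"}

alpha = guided_topic_prior(["ICD_A", "ICD_X"], code_to_phecode, phecodes)
expected_mixture = alpha / alpha.sum()  # prior mean of the patient's topic mixture
print(alpha, expected_mixture)
```

Anchoring each topic to a PheCode in this way is what would keep the inferred topics interpretable, while the small nonzero mass on unseeded topics leaves room for evidence from procedures and medications to shift the posterior.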