PheWAS

Catalog entries using this tag (links open the entry card on its page):

Entries

MixEHR-SAGE

AI GWAS Topic Modeling PheWAS EHR Phenotyping UK Biobank Brief Bioinform
PUBMED_LINK
41627341
FULL NAME
MixEHR-SAGE - Multi-modal Topic Modeling for PheWAS and GWAS
DESCRIPTION
MixEHR-SAGE is a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications from EHRs to enhance phenotyping for GWAS. By combining expert-informed priors with probabilistic inference, it identifies over 1,000 interpretable phenotype topics from UK Biobank data and improves disease incidence prediction and GWAS discovery. Published in Briefings in Bioinformatics.
TITLE
PheCode-guided multi-modal topic modeling of electronic health records improves disease incidence prediction and GWAS discovery from UK Biobank.
ABSTRACT
Phenome-wide association studies rely on disease definitions derived from diagnostic codes, often failing to leverage the full richness of electronic health records (EHR). We present MixEHR-SAGE, a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications to enhance phenotyping from large-scale EHRs. Applied to 350,000 individuals with high-quality genetic data, MixEHR-SAGE-derived risk scores accurately predicted disease incidence and improved GWAS discovery.
DOI
10.1093/bib/bbag030

FinnGen — Nature flagship paper (R7)

Nature Flagship PheWAS Low-frequency Variant
PUBMED_LINK
36653562
STAGE_PERIOD
2022–2023
DESCRIPTION
FinnGen R7 data freeze of 224,737 participants, analyzed across 1,932 disease endpoints. Identified 30 new associations involving low-frequency variants enriched in Finland, and 2,733 genome-wide significant associations through phenome-wide scanning. Published in Nature (Kurki et al., 2023).
URL
https://www.finngen.fi/en
TITLE
FinnGen provides genetic insights from a well-phenotyped isolated population

MVP — Large-scale disease-specific GWAS & WGS

WGS PheWAS Multi-biobank EHR
STAGE_PERIOD
2023–2025
DESCRIPTION
Expanded disease-specific GWAS across hundreds of traits leveraging the deep EHR phenotyping in the VA system. Whole-genome sequencing of a subset of participants for comprehensive variant discovery. MVP data contributed to multi-biobank meta-analyses with FinnGen and UK Biobank spanning thousands of phenotypes.
URL
https://www.mvp.va.gov/