PheWAS
Catalog entries using this tag:
- MixEHR-SAGE — AI
- FinnGen — Nature flagship paper (R7) — Projects
- MVP — Large-scale disease-specific GWAS & WGS — Projects
- UK Biobank — Proteome PheWAS & MR — Projects
Entries
MixEHR-SAGE
FULL NAME
MixEHR-SAGE - Multi-modal Topic Modeling for PheWAS and GWAS
DESCRIPTION
MixEHR-SAGE is a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications from electronic health records (EHRs) to enhance phenotyping for GWAS. By combining expert-informed priors with probabilistic inference, it identifies over 1,000 interpretable phenotype topics from UK Biobank data and improves disease incidence prediction and GWAS discovery. Published in Briefings in Bioinformatics.
TITLE
PheCode-guided multi-modal topic modeling of electronic health records improves disease incidence prediction and GWAS discovery from UK Biobank.
ABSTRACT
Phenome-wide association studies rely on disease definitions derived from diagnostic codes, often failing to leverage the full richness of electronic health records (EHR). We present MixEHR-SAGE, a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications to enhance phenotyping from large-scale EHRs. Applied to 350,000 individuals with high-quality genetic data, MixEHR-SAGE-derived risk scores accurately predicted disease incidence and improved GWAS discovery.
DOI
10.1093/bib/bbag030
FinnGen — Nature flagship paper (R7)
STAGE_PERIOD
2022–2023
DESCRIPTION
The FinnGen R7 data freeze covers 224,737 participants analyzed across 1,932 disease endpoints. The study identified 30 new associations at low-frequency variants enriched in Finland and 2,733 genome-wide significant associations through phenome-wide scanning. Published in Nature (Kurki et al., 2023).
TITLE
FinnGen provides genetic insights from a well-phenotyped isolated population
MVP — Large-scale disease-specific GWAS & WGS
STAGE_PERIOD
2023–2025
DESCRIPTION
Expanded disease-specific GWAS across hundreds of traits, leveraging the deep EHR phenotyping available in the VA system. Whole-genome sequencing of a subset of participants supports comprehensive variant discovery. MVP data contributed to multi-biobank meta-analyses with FinnGen and UK Biobank spanning thousands of phenotypes.
UK Biobank — Proteome PheWAS & MR
STAGE_PERIOD
2023.10
DESCRIPTION
Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases, using pQTL data from UKB-PPP and other cohorts.
TITLE
Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases