Imaging

Catalog entries using this tag (links open the entry card on its page):

Entries

CHIEF

AI Imaging Pathology Foundation Model Weakly Supervised Cancer Diagnosis Histopathology
PUBMED_LINK
39232164
FULL NAME
CHIEF — Clinical Histopathology Imaging Evaluation Foundation Model
DESCRIPTION
CHIEF (Clinical Histopathology Imaging Evaluation Foundation) is a general-purpose weakly supervised machine learning framework from Harvard Medical School. Trained on 60,530 WSIs spanning 19 anatomical sites (44 TB of data), CHIEF combines two complementary pretraining methods: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. It was validated on 19,491 WSIs from 32 independent slide sets across 24 hospitals internationally and outperformed state-of-the-art deep learning methods by up to 36.1%, demonstrating strong generalization across diverse populations and slide preparation methods.
URL
https://github.com/hms-dbmi/CHIEF
TITLE
A pathology foundation model for cancer diagnosis and prognosis prediction.
Main citation
Wang X, Zhao J, Marostica E, Yuan W, Jin J, Zhang Y, Wang F, Li Y, Yu KH, Baris T, Anand D, Hughes K, Rosemon J, Bower T, Lee S, Weerasinghe R, Wright BJ, Robicsek A, Piening B, Bifulco C, Wang S, Poon H. (2024) A pathology foundation model for cancer diagnosis and prognosis prediction. Nature, 634(8035):970-978. doi:10.1038/s41586-024-07894-z. PMID 39232164
ABSTRACT
Histopathology image evaluation is indispensable for cancer diagnoses and subtype classification. Standard AI methods for histopathology image analyses have focused on optimizing specialized models for each diagnostic task, often with limited generalizability. To address this challenge, we devised CHIEF, a general-purpose weakly supervised machine learning framework to extract pathology imaging features for systematic cancer evaluation. CHIEF leverages two complementary pretraining methods to extract diverse pathology representations: unsupervised pretraining for tile-level feature identification and weakly supervised pretraining for whole-slide pattern recognition. Developed using 60,530 whole-slide images spanning 19 anatomical sites, CHIEF outperformed SOTA deep learning methods by up to 36.1%, showing its ability to address domain shifts observed in samples from diverse populations.
DOI
10.1038/s41586-024-07894-z

CONCH

AI Imaging Pathology Foundation Model Vision-Language Histopathology Mahmood Lab Zero-Shot
PUBMED_LINK
38504017
FULL NAME
CONCH — Contrastive learning from Captions for Histopathology (Vision-Language Foundation Model)
DESCRIPTION
CONCH (CONtrastive learning from Captions for Histopathology) is a vision-language foundation model from Mahmood Lab (Harvard/BWH), pretrained on 1.17M histopathology image-text pairs from diverse sources (PubMed, educational resources, textbooks). It was evaluated across 14 clinically relevant tasks, including zero-shot cancer classification, text-to-image retrieval, image-to-text retrieval, caption generation, and tissue segmentation, and outperforms standard models including CLIP and PLIP. CONCH also works on non-H&E stains (IHC, special stains), demonstrating broad applicability. It is available as an open-source model for academic use.
URL
https://github.com/mahmoodlab/CONCH
TITLE
A visual-language foundation model for computational pathology.
Main citation
Lu MY, Chen B, Williamson DFK, Chen RJ, Liang I, Ding T, Jaume G, Odintsov I, Le LP, Gerber G, Parwani AV, Zhang A, Mahmood F. (2024) A visual-language foundation model for computational pathology. Nature Medicine, 30(3):863-874. doi:10.1038/s41591-024-02856-4. PMID 38504017
ABSTRACT
We introduce CONCH, a visual-language foundation model developed using diverse sources of histopathology images and text. Trained on 1.17 million pathology image-text pairs, CONCH achieves state-of-the-art performance across 14 clinically relevant tasks, including zero-shot cancer classification, text-to-image and image-to-text retrieval, caption generation, and tissue segmentation. CONCH outperforms standard models like CLIP and PLIP, and generalizes to non-H&E stains including immunohistochemistry and special stains, demonstrating its versatility as a foundation model for computational pathology.
DOI
10.1038/s41591-024-02856-4
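
Zero-shot classification with a contrastive vision-language model, as described in the CONCH entry above, can be sketched roughly as follows. This is a toy illustration only: the 3-dimensional vectors are hand-written stand-ins for image and text embeddings, the helper names `cosine` and `zero_shot_classify` are hypothetical, and nothing here calls the actual CONCH code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_classify(image_emb, prompt_embs):
    """Pick the label whose text-prompt embedding is closest to the image embedding."""
    scores = {label: cosine(image_emb, emb) for label, emb in prompt_embs.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy embeddings (assumed values, not real model outputs).
prompt_embs = {
    "invasive ductal carcinoma": [0.9, 0.1, 0.0],
    "invasive lobular carcinoma": [0.1, 0.9, 0.0],
}
image_emb = [0.8, 0.3, 0.1]

label, scores = zero_shot_classify(image_emb, prompt_embs)
print(label, scores)
```

No task-specific training is involved: the class set is defined entirely by the text prompts, so adding a class only requires adding a prompt embedding.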

ENLIGHT-DeepPT

AI Imaging Histopathology Transcriptomics Treatment Response Precision Oncology
PUBMED_LINK
38961276
FULL NAME
ENLIGHT-DeepPT — Deep-Learning Framework for Cancer Treatment Response from Histopathology Images
DESCRIPTION
ENLIGHT-DeepPT (Deep Phenotyping of Tumors) is a deep-learning framework (ResNet50 + MLP) that predicts genome-wide tumor mRNA expression from routine H&E histopathology images across 16 TCGA cancer types. The imputed transcriptomics then drive treatment response prediction, achieving an odds ratio of 2.28 across 5 independent treatment cohorts. The framework directly links medical imaging (histopathology) with genomics/transcriptomics via AI, enabling precision oncology from standard pathology slides.
TITLE
A deep-learning framework to predict cancer treatment response from histopathology images through imputed transcriptomics.
Main citation
Hoang DT, Shulman ED, Shuaib M, Nguyen JD, Maqbool HH, Nguyen Q, Iyer P, Liu S, Ruppin E, Stone EA. (2024) A deep-learning framework to predict cancer treatment response from histopathology images through imputed transcriptomics. Nature Cancer, 5(9):1305-1317. doi:10.1038/s43018-024-00793-2. PMID 38961276
ABSTRACT
Predicting cancer treatment response from routinely collected clinical material is a central challenge in precision oncology. Here we present ENLIGHT-DeepPT, a deep-learning framework that predicts genome-wide tumor mRNA expression from routine H&E histopathology images. Using a two-stage approach (image-to-transcriptomics via ResNet50 + MLP, then transcriptomics-to-treatment response), ENLIGHT-DeepPT achieves an odds ratio of 2.28 across 5 independent treatment cohorts spanning multiple cancer types and drug classes.
DOI
10.1038/s43018-024-00793-2
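
For readers unfamiliar with the odds ratio reported in the ENLIGHT-DeepPT entry above, the sketch below shows how such a figure is computed from a 2x2 table of predicted versus observed response. The counts are invented purely for illustration and are not taken from the paper; the function name `odds_ratio` is hypothetical.

```python
def odds_ratio(pred_pos_resp, pred_pos_nonresp, pred_neg_resp, pred_neg_nonresp):
    """Odds of response among predicted responders divided by the odds
    of response among predicted non-responders."""
    if pred_pos_nonresp == 0 or pred_neg_resp == 0:
        raise ValueError("odds ratio is undefined when an off-diagonal cell is zero")
    return (pred_pos_resp * pred_neg_nonresp) / (pred_pos_nonresp * pred_neg_resp)


# Hypothetical cohort: 50 patients predicted to respond, 50 predicted not to.
example_or = odds_ratio(
    pred_pos_resp=30,      # predicted responders who responded
    pred_pos_nonresp=20,   # predicted responders who did not
    pred_neg_resp=15,      # predicted non-responders who responded
    pred_neg_nonresp=35,   # predicted non-responders who did not
)
print(example_or)
```

An odds ratio above 1 means patients flagged as likely responders have higher odds of responding than those not flagged.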

Haas ME (ML Liver Fat GWAS)

AI GWAS Imaging Machine Learning Liver Fat Abdominal MRI UK Biobank
PUBMED_LINK
34957434
FULL NAME
Machine Learning Enables New Insights into Genetic Contributions to Liver Fat Accumulation
DESCRIPTION
Developed an abdominal MRI-based machine-learning regression model (gradient-boosted regression on raw MRI signal intensities) to accurately estimate liver fat from UK Biobank abdominal MRI scans (correlation 0.97-0.99 with ground truth). Trained on 4,511 participants with gold-standard MRI biomarker measurements and applied to 32,192 additional individuals. GWAS identified 8 associated variants (5 novel: MTARC1, ADH1B, TRIB1, GPAM, MAST3) and a polygenic score strongly associated with future chronic liver disease risk (HR>1.32 per SD, p<9e-17).
KEYWORDS
MRI signal regression, liver fat quantification, abdominal MRI, hepatic steatosis, gradient boosting, UK Biobank
TITLE
Machine learning enables new insights into genetic contributions to liver fat accumulation.
Main citation
Haas ME, Pirruccello JP, Friedman SN, Wang M, ..., Khera AV. (2021) Machine learning enables new insights into genetic contributions to liver fat accumulation. Cell Genomics, 1(3):100066. doi:10.1016/j.xgen.2021.100066. PMID 34957434
ABSTRACT
Excess liver fat, called hepatic steatosis, is a leading risk factor for end-stage liver disease and cardiometabolic diseases but often remains undiagnosed in clinical practice because of the need for direct imaging assessments. We developed an abdominal MRI-based machine-learning algorithm to accurately estimate liver fat from a truth dataset of 4,511 middle-aged UK Biobank participants, enabling quantification in 32,192 additional individuals. A genome-wide association study of common genetic variants and liver fat replicated three known associations and identified five newly associated variants.
DOI
10.1016/j.xgen.2021.100066

iGWAS

AI GWAS Imaging Deep Learning Self-Supervised Learning Retinal Fundus Phenotyping Contrastive Learning
PUBMED_LINK
38728357
FULL NAME
Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images
DESCRIPTION
iGWAS uses self-supervised contrastive learning (SimCLR-style framework with a CNN encoder backbone) to extract a 128-dimensional phenotype vector directly from retinal fundus images without any manual labels. The encoder was trained on 40,000 EyePACS images via instance discrimination and then applied to 130,329 UK Biobank fundus images. GWAS on these 128 learned phenotypes identified 14 genome-wide significant loci. This was the first demonstration of unsupervised deep phenotyping for image-based GWAS, discovering genetic associations without predefined human annotations.
KEYWORDS
self-supervised contrastive learning, SimCLR, CNN, retinal fundus, deep phenotyping, image-based GWAS, UK Biobank
TITLE
iGWAS: Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images.
Main citation
Xie Z, Zhang T, Kim S, Sun J, Forouzandeh P, Chen R, Zhi D. (2024) iGWAS: Image-Based Genome-Wide Association of Self-Supervised Deep Phenotyping of Retina Fundus Images. PLOS Genetics, 20(5):e1011273. doi:10.1371/journal.pgen.1011273. PMID 38728357
ABSTRACT
Existing imaging genetics studies have been mostly limited in scope by using imaging-derived phenotypes defined by human experts. Here, leveraging new breakthroughs in self-supervised deep representation learning, we propose a new approach, image-based genome-wide association study (iGWAS), for identifying genetic factors associated with phenotypes discovered from medical images using contrastive learning. Using retinal fundus photos, our model extracts a 128-dimensional vector representing features of the retina as phenotypes. We identified 14 loci with genome-wide significance.
DOI
10.1371/journal.pgen.1011273
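
The SimCLR-style contrastive objective mentioned in the iGWAS entry above can be sketched in plain Python. This is a minimal illustration of the NT-Xent loss used by SimCLR, under assumptions: the temperature value, the 2-dimensional toy embeddings, and the function name `nt_xent_loss` are all illustrative and not taken from the iGWAS code.

```python
import math


def _normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(views, temperature=0.5):
    """NT-Xent loss over 2N embeddings.

    views[2k] and views[2k + 1] are assumed to be two augmentations of image k;
    every other embedding in the batch acts as a negative.
    """
    z = [_normalize(v) for v in views]
    n = len(z)
    total = 0.0
    for i in range(n):
        pos = i + 1 if i % 2 == 0 else i - 1
        sims = [sum(a * b for a, b in zip(z[i], z[k])) / temperature for k in range(n)]
        # Softmax denominator excludes the anchor itself.
        denom = sum(math.exp(sims[k]) for k in range(n) if k != i)
        total += math.log(denom) - sims[pos]
    return total / n


# Two toy images, two views each: paired views point in similar directions.
aligned = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
# Same vectors, but paired so that "positives" are dissimilar.
mismatched = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1], [0.1, 1.0]]

print(nt_xent_loss(aligned), nt_xent_loss(mismatched))
```

The loss is lower when the two views of the same image are embedded close together, which is what pushes the encoder to learn label-free features that can later serve as phenotypes.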

KEEP

AI Imaging Pathology Foundation Model Vision-Language Knowledge Graph Rare Cancer Cancer Cell
PUBMED_LINK
41720085
FULL NAME
KEEP — Knowledge-Enhanced Pathology Vision-Language Foundation Model
DESCRIPTION
KEEP (KnowledgE-Enhanced Pathology) is a vision-language foundation model from Shanghai AI Lab / SJTU that systematically integrates disease knowledge into pretraining for cancer diagnosis. Uses a comprehensive disease knowledge graph with 11,454 diseases and 139,143 attributes from DO and UMLS to reorganize millions of pathology image-text pairs into 143,000 semantically structured groups aligned with disease ontology hierarchies. Across 18 public benchmarks (14,000+ WSIs) and 4 institutional rare cancer datasets (926 cases), KEEP consistently outperforms existing foundation models (CHIEF, CONCH, UNI), with substantial gains for rare subtypes (+8.5 pts balanced accuracy vs CONCH on 30 rare brain cancers). Published in Cancer Cell, Feb 2026.
URL
https://github.com/MAGIC-AI4Med/KEEP
TITLE
Knowledge-enhanced pretraining for vision-language pathology foundation model on cancer diagnosis.
Main citation
Zhou X, Sun L, He D, Guan W, Wang G, Wang R, Wang L, Yuan X, Sun X, Zhang Y, Sun K, Wang Y, Xie W. (2026) Knowledge-enhanced pretraining for vision-language pathology foundation model on cancer diagnosis. Cancer Cell, 44(4):777-791. doi:10.1016/j.ccell.2026.01.019. PMID 41720085
ABSTRACT
Vision-language foundation models have shown great promise in computational pathology but remain primarily data-driven, lacking explicit integration of medical knowledge. We introduce KEEP, a foundation model that systematically incorporates disease knowledge into pretraining for cancer diagnosis. KEEP leverages a comprehensive disease knowledge graph encompassing 11,454 diseases and 139,143 attributes to reorganize millions of pathology image-text pairs into 143,000 semantically structured groups aligned with disease ontology hierarchies. Across 18 public benchmarks (over 14,000 WSIs) and 4 institutional rare cancer datasets (926 cases), KEEP consistently outperformed existing foundation models, showing substantial gains for rare subtypes.
DOI
10.1016/j.ccell.2026.01.019

Khurshid S (DL LV Mass GWAS)

AI GWAS Imaging Deep Learning Cardiac MRI Left Ventricular Mass UK Biobank
PUBMED_LINK
36944631
FULL NAME
Clinical and Genetic Associations of Deep Learning-Derived Cardiac Magnetic Resonance-Based Left Ventricular Mass
DESCRIPTION
Applied a CNN-based segmentation model (U-Net style architecture) to automatically segment left ventricular myocardium from cardiac MRI in 43,230 UK Biobank participants. The segmented contours were used to compute left ventricular mass indexed to body surface area (LVMI), enabling a GWAS that identified 12 associations (11 novel) implicating genes associated with cardiac contractility and cardiomyopathy. The LVMI polygenic risk score was validated in an independent Mass General Brigham cohort.
KEYWORDS
deep learning, cardiac MRI segmentation, U-Net, left ventricular mass, CNN, cardiomyopathy, UK Biobank
TITLE
Clinical and genetic associations of deep learning-derived cardiac magnetic resonance-based left ventricular mass.
Main citation
Khurshid S, Lazarte J, Pirruccello JP, ..., Lubitz SA. (2023) Clinical and genetic associations of deep learning-derived cardiac magnetic resonance-based left ventricular mass. Nature Communications, 14(1):1558. doi:10.1038/s41467-023-37173-w. PMID 36944631
ABSTRACT
Left ventricular mass is a risk marker for cardiovascular events, and may indicate an underlying cardiomyopathy. Cardiac magnetic resonance is the gold-standard for left ventricular mass estimation, but is challenging to obtain at scale. Here, we use deep learning to enable genome-wide association study of cardiac magnetic resonance-derived left ventricular mass indexed to body surface area within 43,230 UK Biobank participants. We identify 12 genome-wide associations (1 known at TTN and 11 novel for left ventricular mass).
DOI
10.1038/s41467-023-37173-w
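
The step from a segmented myocardium to an indexed left ventricular mass (LVMI), as described in the entry above, can be sketched as below. Two things are assumptions rather than details from the paper: the myocardial density of 1.05 g/mL (a commonly used value) and the Du Bois formula for body surface area. The function names are hypothetical.

```python
# Commonly used myocardial tissue density; assumed here, not taken from the paper.
MYOCARDIAL_DENSITY_G_PER_ML = 1.05


def bsa_du_bois(weight_kg, height_cm):
    """Body surface area in m^2 using the Du Bois formula (an assumed choice)."""
    return 0.007184 * weight_kg ** 0.425 * height_cm ** 0.725


def lv_mass_index(myocardial_volume_ml, weight_kg, height_cm):
    """Left ventricular mass (g) divided by body surface area (m^2)."""
    mass_g = myocardial_volume_ml * MYOCARDIAL_DENSITY_G_PER_ML
    return mass_g / bsa_du_bois(weight_kg, height_cm)


# Hypothetical participant: 100 mL of segmented myocardium, 70 kg, 170 cm.
lvmi = lv_mass_index(myocardial_volume_ml=100.0, weight_kg=70.0, height_cm=170.0)
print(round(lvmi, 1), "g/m^2")
```

Indexing to body surface area makes the trait comparable across participants of different body size before it is used as a GWAS phenotype.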

Liu Y (DL Organ MRI GWAS)

AI GWAS Imaging Deep Learning Abdominal MRI Organ Traits UK Biobank
PUBMED_LINK
34128465
FULL NAME
Genetic Architecture of 11 Organ Traits Derived from Abdominal MRI Using Deep Learning
DESCRIPTION
Applied a U-Net-based CNN segmentation pipeline to over 38,000 abdominal MRI scans from UK Biobank. The deep learning model automatically segmented 7 organs/tissues (liver, pancreas, kidneys, spleen, lungs, visceral adipose tissue, subcutaneous adipose tissue) and quantified their volume, fat content (via signal intensity), and iron content (via T2* mapping). GWAS on these 11 DL-derived traits identified 93 independent genome-wide significant associations (heritability 8-44%), including 4 novel liver trait associations.
KEYWORDS
deep learning, abdominal MRI segmentation, U-Net, organ volume quantification, liver fat, pancreas iron, UK Biobank
TITLE
Genetic architecture of 11 organ traits derived from abdominal MRI using deep learning.
Main citation
Liu Y, Basty N, Whitcher B, Bell JD, ..., Cule M. (2021) Genetic architecture of 11 organ traits derived from abdominal MRI using deep learning. eLife, 10:e65554. doi:10.7554/eLife.65554. PMID 34128465
ABSTRACT
Cardiometabolic diseases are an increasing global health burden. While socioeconomic, environmental, behavioural, and genetic risk factors have been identified, a better understanding of the underlying mechanisms is required to develop more effective interventions. Magnetic resonance imaging (MRI) has been used to assess organ health, but biobank-scale studies are still in their infancy. Using over 38,000 abdominal MRI scans in the UK Biobank, we used deep learning to quantify volume, fat, and iron in seven organs and tissues, and demonstrate that imaging-derived phenotypes reflect health status. We identify 93 independent genome-wide significant associations.
DOI
10.7554/eLife.65554
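
Once an organ has been segmented, its volume follows from counting labelled voxels, which is the kind of quantification the entry above describes. The sketch below assumes a binary mask stored as nested lists and a hypothetical voxel spacing; neither the data layout nor the spacing comes from the paper.

```python
def organ_volume_ml(mask, voxel_size_mm):
    """Volume of a binary segmentation mask in millilitres.

    mask: nested lists indexed [z][y][x] holding 0 or 1.
    voxel_size_mm: (dz, dy, dx) voxel spacing in millimetres.
    """
    dz, dy, dx = voxel_size_mm
    n_voxels = sum(value for plane in mask for row in plane for value in row)
    volume_mm3 = n_voxels * dz * dy * dx
    return volume_mm3 / 1000.0  # 1 mL = 1000 mm^3


# Toy 2 x 2 x 2 mask with every voxel labelled, 2 mm isotropic spacing (assumed).
toy_mask = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
print(organ_volume_ml(toy_mask, (2.0, 2.0, 2.0)))
```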

mSTAR

AI Imaging Pathology Foundation Model Multimodal Gene Expression Whole-Slide HKUST
PUBMED_LINK
41387679
FULL NAME
mSTAR — Multimodal Self-TAught PRetraining (WSI + Reports + Gene Expression)
DESCRIPTION
mSTAR (Multimodal Self-TAught PRetraining) is a pathology foundation model from HKUST/SJTU that integrates three modalities: pathology slides (WSIs), expert pathology reports, and gene expression (RNA-Seq) data. It curates the largest multimodal dataset, comprising 26,169 slide-level modality pairs across 32 cancer types from 10,275 TCGA patients (>116M patch images). Pretraining follows a two-stage paradigm: (1) slide-level contrastive learning across WSI-report-gene modalities, and (2) self-taught training that propagates multimodal knowledge from the slide aggregator (teacher) to the patch extractor (student). mSTAR was evaluated on 97 tasks across 15 application types, outperforming UNI, CONCH, CHIEF, and GigaPath. Key finding: multimodal integration yields greater improvements than simply expanding vision-only datasets (53x data efficiency vs Virchow). Published in Nat Commun, Dec 2025.
URL
https://github.com/Innse/mSTAR
TITLE
A multimodal knowledge-enhanced whole-slide pathology foundation model.
Main citation
Xu Y, Wang Y, Zhou F, Ma J, Yang S, Lin H, Wang X, Wang J, Liang L, Han A, Jin C, Cheng KT, Chen H. (2025) A multimodal knowledge-enhanced whole-slide pathology foundation model. Nature Communications, 16:11406. doi:10.1038/s41467-025-66220-x. PMID 41387679
ABSTRACT
Computational pathology has advanced through foundation models, yet faces challenges in multimodal integration and capturing whole-slide context. We present mSTAR, a pathology foundation model that incorporates three modalities: pathology slides, expert-created reports, and gene expression data, within a unified framework. Our dataset includes 26,169 slide-level modality pairs across 32 cancer types, comprising over 116 million patch images. This approach injects multimodal whole-slide context into patch representations, expanding modeling from single to multiple modalities and from patch-level to slide-level analysis. Across 97 tasks, mSTAR outperforms previous SOTA models, particularly in molecular prediction, revealing that multimodal integration yields greater improvements than simply expanding vision-only datasets.
DOI
10.1038/s41467-025-66220-x

Ning C (DL LVRWT GWAS)

AI GWAS Imaging Deep Learning Cardiac MRI Left Ventricular Wall Hypertrophic Cardiomyopathy UK Biobank
PUBMED_LINK
38036550
FULL NAME
Genome-Wide Association Analysis of Left Ventricular Imaging-Derived Phenotypes Identifies 72 Risk Loci
DESCRIPTION
Built a CNN-based deep learning algorithm for automated segmentation of left ventricular myocardium from cardiac MRI, enabling precise calculation of 12 regional wall thickness (LVRWT) measurements in 42,194 UK Biobank participants. GWAS of these 12 CNN-derived LVRWT traits identified 72 significant genetic loci involved in heart development and contraction pathways. Mendelian randomization confirmed causal relationships with hypertrophic cardiomyopathy. The PRS of inferoseptal LVRWT enabled identification of high-risk individuals.
KEYWORDS
deep learning, cardiac MRI, CNN, left ventricular wall thickness segmentation, hypertrophic cardiomyopathy, UK Biobank
TITLE
Genome-wide association analysis of left ventricular imaging-derived phenotypes identifies 72 risk loci and yields genetic insights into hypertrophic cardiomyopathy.
Main citation
Ning C, Fan L, Jin M, ..., Miao X. (2023) Genome-wide association analysis of left ventricular imaging-derived phenotypes identifies 72 risk loci and yields genetic insights into hypertrophic cardiomyopathy. Nature Communications, 14(1):7900. doi:10.1038/s41467-023-43771-5. PMID 38036550
ABSTRACT
Left ventricular regional wall thickness (LVRWT) is an independent predictor of morbidity and mortality in cardiovascular diseases (CVDs). To identify specific genetic influences on individual LVRWT, we established a novel deep learning algorithm to calculate 12 LVRWTs accurately in 42,194 individuals from the UK Biobank with cardiac magnetic resonance (CMR) imaging. Genome-wide association studies of CMR-derived 12 LVRWTs identified 72 significant genetic loci associated with at least one LVRWT phenotype.
DOI
10.1038/s41467-023-43771-5

PathOrchestra

AI Imaging Pathology Foundation Model Self-Supervised Clinical-Grade Structured Report
PUBMED_LINK
41258399
FULL NAME
PathOrchestra — Comprehensive Pathology Foundation Model with 100+ Clinical-Grade Tasks
DESCRIPTION
PathOrchestra is a versatile pathology foundation model from Shanghai AI Lab and multiple Chinese institutions, trained via self-supervised learning on 287,424 H&E-stained WSIs from 21 tissue types across 3 independent clinical centers. Evaluated on the largest known clinical task benchmark (112 tasks: 61 private + 51 public) spanning digital slide preprocessing, pan-cancer classification (17 cancer types), lesion identification, multi-cancer subtype classification (36 tasks), biomarker assessment (36 tasks), gene expression prediction, and structured report generation. Achieves over 0.950 accuracy in 47 tasks. First model to generate structured pathology reports for colorectal cancer and lymphoma. Apache 2.0 open-source license.
URL
https://github.com/yanfang-research/PathOrchestra
TITLE
PathOrchestra: a comprehensive foundation model for computational pathology with over 100 diverse clinical-grade tasks.
Main citation
Yan F, et al. (2025) PathOrchestra: a comprehensive foundation model for computational pathology with over 100 diverse clinical-grade tasks. npj Digital Medicine, 8(1):695. doi:10.1038/s41746-025-02027-w. PMID 41258399
ABSTRACT
The complexity and variability of high-resolution pathological images present significant challenges in computational pathology. We present PathOrchestra, a versatile pathology foundation model trained via self-supervised learning on 287,424 slides from 21 tissue types across three centers. It was evaluated on 112 tasks drawn from 61 private and 51 public datasets, covering digital slide preprocessing, pan-cancer classification, lesion identification, multi-cancer subtype classification, biomarker assessment, gene expression prediction, and structured report generation. Across 27,755 WSIs and 9,415,729 ROI images, it achieved over 0.950 accuracy in 47 tasks. It is the first to generate structured reports for colorectal cancer and lymphoma.
DOI
10.1038/s41746-025-02027-w

Prov-GigaPath

AI Imaging Pathology Foundation Model Whole-Slide Microsoft Real-World Data
PUBMED_LINK
38778098
FULL NAME
Prov-GigaPath — Whole-Slide Foundation Model for Digital Pathology
DESCRIPTION
Prov-GigaPath by Microsoft Research, Providence, and UW is a whole-slide pathology foundation model pretrained on 1.3 billion 256x256 image tiles from 171,189 whole slides across 28 cancer centers (>30,000 patients, 31 tissue types). Uses a novel GigaPath vision transformer with dilated self-attention (LongNet) for gigapixel-level context. Achieves SOTA on 25/26 benchmark tasks including cancer subtyping, mutation prediction, and TMB classification. The first large-scale whole-slide foundation model trained on real-world clinical data.
URL
https://github.com/prov-gigapath/prov-gigapath
TITLE
A whole-slide foundation model for digital pathology from real-world data.
Main citation
Xu H, Usuyama N, Bagal V, Bredell M, Chamby A, Chen Z, Ding J, Fuhlbrück T, Géro Z, Gonzalez J, Gu Y, Xu Y, Wei MH, Wang W, Ma S, Wei F, Yang J, Li C, Gao J, Rosemon J, Bower T, Lee S, Weerasinghe R, Wright B, Robicsek A, Piening B, Bifulco C, Wang S, Poon H. (2024) A whole-slide foundation model for digital pathology from real-world data. Nature, 630(8015):181-188. doi:10.1038/s41586-024-07441-w. PMID 38778098
ABSTRACT
Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing important slide-level context. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer for pretraining gigapixel pathology slides using dilated self-attention. Prov-GigaPath attains state-of-the-art performance on 25 out of 26 benchmark tasks.
DOI
10.1038/s41586-024-07441-w
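
The scale figures in the Prov-GigaPath entry above can be put in perspective with a little arithmetic. The sketch below counts 256x256 tiles on a full grid for a hypothetical slide size (the 100,000 x 80,000 pixel dimensions are an assumption) and derives the approximate average number of tiles per slide from the reported totals. Because "1.3 billion" is a rounded figure, the average is approximate too.

```python
import math

TILE_SIZE_PX = 256  # tile edge length stated in the entry


def tiles_in_grid(width_px, height_px, tile=TILE_SIZE_PX):
    """Number of tiles needed to cover a slide on a regular grid,
    before any background or tissue filtering."""
    return math.ceil(width_px / tile) * math.ceil(height_px / tile)


# Hypothetical gigapixel slide dimensions (assumed for illustration).
grid_tiles = tiles_in_grid(100_000, 80_000)

# Approximate average implied by the reported pretraining totals.
avg_tiles_per_slide = 1_300_000_000 / 171_189

print(grid_tiles, round(avg_tiles_per_slide))
```

The gap between the full-grid count and the per-slide average is plausible if much of each slide is background that never becomes a tile, though the entry itself does not state the filtering procedure.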

TITAN

AI Imaging Pathology Foundation Model Vision-Language Whole-Slide Mahmood Lab
PUBMED_LINK
41193692
FULL NAME
TITAN — Transformer-based pathology Image and Text Alignment Network
DESCRIPTION
TITAN (Transformer-based pathology Image and Text Alignment Network) is a multimodal whole-slide foundation model from Mahmood Lab (Harvard/BWH). It was pretrained on 335,645 WSIs via visual self-supervised learning and vision-language alignment with 423K synthetic captions from PathChat and 183K pathology reports. Without any fine-tuning, TITAN produces general-purpose slide representations for zero-shot classification, rare cancer retrieval, cross-modal retrieval, and pathology report generation. It outperforms both ROI and slide foundation models across diverse clinical tasks.
URL
https://github.com/mahmoodlab/TITAN
TITLE
A multimodal whole-slide foundation model for pathology.
Main citation
Ding T, Wagner SJ, Song AH, Chen RJ, Lu MY, Zhang A, Vaidya AJ, Jaume G, Shaban M, Kim A, Williamson DFK, Oldenburg L, Chen B, Alajaji A, Noor G, Sang Y, Peng T, Le LP, Mahmood F. (2025) A multimodal whole-slide foundation model for pathology. Nature Medicine, 31:3749-3761. doi:10.1038/s41591-025-03982-3. PMID 41193692
ABSTRACT
The field of computational pathology has been transformed with recent advances in foundation models that encode histopathology region-of-interests into versatile feature representations. However, translating these advancements to address complex clinical challenges at the patient and slide level remains constrained by limited clinical data. We propose TITAN, a multimodal whole-slide foundation model pretrained using 335,645 whole-slide images via visual self-supervised learning and vision-language alignment with pathology reports and 423,122 synthetic captions. Without any fine-tuning, TITAN can extract general-purpose slide representations and generate pathology reports that generalize to resource-limited clinical scenarios such as rare disease retrieval and cancer prognosis.
DOI
10.1038/s41591-025-03982-3

transferGWAS

AI GWAS Imaging Transfer Learning Deep Learning Retinal Fundus Representation Learning
PUBMED_LINK
35640976
FULL NAME
transferGWAS: GWAS of Images Using Deep Transfer Learning
DESCRIPTION
transferGWAS performs GWAS directly on full medical images using deep transfer learning: (1) a pretrained CNN (ResNet-based architecture, pretrained on ImageNet) extracts feature embeddings from raw images; (2) these learned representations are used as quantitative phenotypes for genetic association testing. Applied to UK Biobank retinal fundus images, identified 60 genomic regions including 7 novel candidate loci for eye-related traits. First demonstration of direct GWAS on whole images without predefined phenotype engineering.
URL
https://github.com/mkirchler/transferGWAS/
KEYWORDS
deep transfer learning, pretrained CNN, ResNet, retinal fundus, whole-image GWAS, representation learning, UK Biobank
TITLE
transferGWAS: GWAS of images using deep transfer learning.
Main citation
Kirchler M, Konigorski S, Norden M, Meltendorf C, Kloft M, Schurmann C, Lippert C. (2022) transferGWAS: GWAS of images using deep transfer learning. Bioinformatics, 38(14):3621-3628. doi:10.1093/bioinformatics/btac369. PMID 35640976
ABSTRACT
MOTIVATION: Medical images can provide rich information about diseases and their biology. However, investigating their association with genetic variation requires non-standard methods. We propose transferGWAS, a novel approach to perform genome-wide association studies directly on full medical images. First, we learn semantically meaningful representations of the images based on a transfer learning task, during which a deep neural network is trained on independent but similar data. Then, we perform genetic association tests with these representations. RESULTS: We validate the type I error rates and power of transferGWAS in simulation studies of synthetic images. Then we apply transferGWAS in a genome-wide association study of retinal fundus images from the UK Biobank. This first-of-a-kind GWAS of full imaging data yielded 60 genomic regions associated with retinal fundus images, of which 7 are novel candidate loci for eye-related traits and diseases.
DOI
10.1093/bioinformatics/btac369
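
The second stage described in the transferGWAS entry above, testing one learned image feature against one genetic variant, can be sketched with a simple linear regression. This is a deliberately reduced illustration: the data are invented, the function name `ols_association` is hypothetical, and the real tool may additionally use covariates, dimensionality reduction, or mixed models that are not shown here.

```python
import math


def ols_association(dosages, phenotype):
    """Regress a quantitative phenotype on genotype dosage (0, 1, 2).

    Returns (beta, standard_error, t_statistic) for the slope.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotype))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((y - alpha - beta * x) ** 2 for x, y in zip(dosages, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se


# Toy data: six individuals, one variant, one embedding dimension (assumed values).
dosages = [0, 0, 1, 1, 2, 2]
embedding_dim = [0.1, -0.1, 0.6, 0.4, 1.1, 0.9]

beta, se, t_stat = ols_association(dosages, embedding_dim)
print(beta, se, t_stat)
```

In a genome-wide setting this test would be repeated for every variant and every feature dimension, with the significance threshold corrected for the number of tests.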

UNI

AI Imaging Pathology Foundation Model Self-Supervised Computational Pathology
PUBMED_LINK
38504018
FULL NAME
UNI — General-Purpose Foundation Model for Computational Pathology
DESCRIPTION
UNI is a general-purpose self-supervised foundation model for computational pathology from Mahmood Lab (Harvard/BWH), pretrained on >100 million images from >100,000 H&E-stained WSIs (>77 TB) across 20 tissue types. Evaluated on 34 representative CPath tasks — outperforming prior models across cancer classification, organ transplant assessment, and rare disease analysis. Demonstrates resolution-agnostic classification, few-shot slide classification, and generalization to 108 cancer types in the OncoTree system. 1,300+ citations.
URL
https://github.com/mahmoodlab/UNI
TITLE
Towards a general-purpose foundation model for computational pathology.
Main citation
Chen RJ, Ding T, Lu MY, Williamson DFK, Jaume G, Chen B, Zhang A, Shao D, Song AH, Shaban M, Williams M, Oldenburg L, Weishaupt LL, Wang JJ, Vaidya A, Le LP, Gerber G, Sahai S, Williams W, Mahmood F. (2024) Towards a general-purpose foundation model for computational pathology. Nature Medicine, 30(3):850-862. doi:10.1038/s41591-024-02857-3. PMID 38504018
ABSTRACT
Quantitative evaluation of tissue images is crucial for computational pathology tasks. The high resolution of WSIs and the variability of morphological features present significant challenges. We introduce UNI, a general-purpose self-supervised model for pathology, pretrained using more than 100 million images from over 100,000 diagnostic H&E-stained WSIs across 20 major tissue types. The model was evaluated on 34 representative CPath tasks. UNI outperforms previous state-of-the-art models and demonstrates new capabilities including resolution-agnostic tissue classification, few-shot slide classification, and disease subtyping generalization to 108 cancer types.
DOI
10.1038/s41591-024-02857-3

Virchow

AI Imaging Pathology Foundation Model Paige Microsoft Rare Cancer Self-Supervised
PUBMED_LINK
39080966
FULL NAME
Virchow — Million-Scale Digital Pathology Foundation Model (Paige/Microsoft)
DESCRIPTION
Virchow is the first million-slide foundation model for computational pathology, developed by Paige in collaboration with Microsoft. A 632M-parameter ViT-H model trained using DINOv2 on 1.5 million H&E-stained WSIs from MSKCC (17 tissue types). Demonstrates clinical-grade pan-cancer detection with 0.95 AUC across nine common and seven rare cancers. With less training data, the pan-cancer detector built on Virchow achieves similar performance to tissue-specific clinical-grade models in production, outperforming them on rare cancer variants. Serves as the foundation for Paige's Virchow2 (3M WSIs, multimodal) and Virchow2G (1.8B parameters) models.
URL
https://huggingface.co/paige-ai/Virchow
TITLE
A foundation model for clinical-grade computational pathology and rare cancers detection.
Main citation
Vorontsov E, Bozkurt A, Casson A, Shaikovski G, Zelechowski M, Severson K, Zimmermann E, Hall J, Tenenholtz N, Fusi N, Yang E, Mathieu P, van Eck A, Lee D, Viret J, Robert E, Wang YK, Kunz JD, Lee MCH, Bernhard JH, Godrich RA, Oakley G, Millar E, Hanna M, Wen H, Retamero JA, Moye WA, Yousfi R, Kanan C, Klimstra DS, Rothrock B, Liu S, Fuchs TJ. (2024) A foundation model for clinical-grade computational pathology and rare cancers detection. Nature Medicine, 30(10):2924-2935. doi:10.1038/s41591-024-03141-0. PMID 39080966
ABSTRACT
The analysis of histopathology images with artificial intelligence aims to enable clinical decision support systems and precision medicine. We present Virchow, the largest foundation model for computational pathology to date. In addition to the evaluation of biomarker prediction and cell identification, we demonstrate that a large foundation model enables pan-cancer detection, achieving 0.95 specimen-level AUC across nine common and seven rare cancers. With less training data, the pan-cancer detector built on Virchow achieved similar performance to tissue-specific clinical-grade models in production and outperformed them on some rare variants of cancer.
DOI
10.1038/s41591-024-03141-0

UK Biobank — Brain MRI protocol

Imaging MRI
PUBMED_LINK
27643430
STAGE_PERIOD
2014–2022
DESCRIPTION
Multi-modal imaging including brain MRI, cardiac MRI, abdominal MRI, whole-body DXA, carotid ultrasound, and resting ECG. The target of 100k participants was achieved by 2022.
URL
https://www.ukbiobank.ac.uk/
TITLE
Multimodal population brain imaging in the UK Biobank prospective epidemiological study