Machine Learning

Catalog entries using this tag:

Entries

Causal ML for scGenomics (Causal ML sc)

AI, GWAS, Causal ML, Single Cell, Machine Learning, Nat Genet

PUBMED_LINK

40164735

FULL NAME

Causal Machine Learning for Single-Cell Genomics

DESCRIPTION

A Perspective from Nature Genetics delineating the application of causal machine learning to single-cell genomics. Discusses causal models, challenges in inferring causative roles of genes from single-cell omics data combined with perturbation screens, and the potential for integrating causal ML with GWAS to understand disease mechanisms at single-cell resolution.

TITLE

Causal machine learning for single-cell genomics.

ABSTRACT

Advances in single-cell '-omics' allow unprecedented insights into the transcriptional profiles of individual cells and, when combined with large-scale perturbation screens, enable measuring of the effect of targeted perturbations on the whole transcriptome. In this Perspective, we delineate the application of causal machine learning to single-cell genomics and its associated challenges, presenting the causal model most commonly applied to single-cell biology.

DOI

10.1038/s41588-025-02124-2

Haas ME (ML Liver Fat GWAS)

AI, GWAS, Imaging, Machine Learning, Liver Fat, Abdominal MRI, UK Biobank

PUBMED_LINK

34957434

FULL NAME

Machine Learning Enables New Insights into Genetic Contributions to Liver Fat Accumulation

DESCRIPTION

Developed an abdominal MRI-based machine-learning regression model (gradient-boosted regression on raw MRI signal intensities) to accurately estimate liver fat from UK Biobank abdominal MRI scans (correlation 0.97-0.99 with ground truth). Trained on 4,511 participants with gold-standard MRI biomarker measurements and applied to 32,192 additional individuals. GWAS identified 8 associated variants (5 novel: MTARC1, ADH1B, TRIB1, GPAM, MAST3) and a polygenic score strongly associated with future chronic liver disease risk (HR>1.32 per SD, p<9e-17).

KEYWORDS

MRI signal regression, liver fat quantification, abdominal MRI, hepatic steatosis, gradient boosting, UK Biobank

TITLE

Machine learning enables new insights into genetic contributions to liver fat accumulation.

Main citation

Haas ME, Pirruccello JP, Friedman SN, Wang M, ..., Khera AV. (2021) Machine learning enables new insights into genetic contributions to liver fat accumulation. Cell Genom, 1(3). doi:10.1016/j.xgen.2021.100066. PMID 34957434

ABSTRACT

Excess liver fat, called hepatic steatosis, is a leading risk factor for end-stage liver disease and cardiometabolic diseases but often remains undiagnosed in clinical practice because of the need for direct imaging assessments. We developed an abdominal MRI-based machine-learning algorithm to accurately estimate liver fat from a truth dataset of 4,511 middle-aged UK Biobank participants, enabling quantification in 32,192 additional individuals. A genome-wide association study of common genetic variants and liver fat replicated three known associations and identified five newly associated variants.

DOI

10.1016/j.xgen.2021.100066

MILTON

AI, GWAS, Machine Learning, Disease Prediction, UK Biobank, Multi-omics, Nat Genet

PUBMED_LINK

39261665

FULL NAME

MILTON - Machine Learning with Phenotype Associations for Disease Prediction

DESCRIPTION

MILTON is an ensemble machine learning framework that utilizes biomarkers and multi-omics data to predict 3,213 diseases in the UK Biobank. It predicts incident disease cases undiagnosed at time of recruitment and demonstrates utility in augmenting genetic association discovery by empowering case-control GWAS with predicted phenotypes. Published in Nature Genetics.

TITLE

Disease prediction with multi-omics and biomarkers empowers case-control genetic discoveries in the UK Biobank.

ABSTRACT

The emergence of biobank-level datasets offers new opportunities to discover novel biomarkers and develop predictive algorithms for human disease. Here, we present an ensemble machine-learning framework (machine learning with phenotype associations, MILTON) utilizing a range of biomarkers to predict 3,213 diseases in the UK Biobank. MILTON predicts incident disease cases undiagnosed at time of recruitment, largely outperforming available polygenic risk scores, and augments genetic association discovery.

DOI

10.1038/s41588-024-01898-1

PoPS

AI, GWAS, Gene Prioritization, Machine Learning, Polygenic, Nat Genet

PUBMED_LINK

37443254

FULL NAME

PoPS - Polygenic Priority Score for Gene Prioritization

DESCRIPTION

PoPS (Polygenic Priority Score) is a method that learns trait-relevant gene features, such as cell-type-specific expression, to prioritize genes at GWAS loci. It leverages polygenic enrichments across multiple gene features to predict causal genes underlying complex traits and diseases. Published in Nature Genetics.

URL

https://github.com/FinucaneLab/pops

TITLE

Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases.

ABSTRACT

Genome-wide association studies (GWASs) are a valuable tool for understanding the biology of complex human traits and diseases, but associated variants rarely point directly to causal genes. In the present study, we introduce a new method, polygenic priority score (PoPS), that learns trait-relevant gene features, such as cell-type-specific expression, to prioritize genes at GWAS loci. PoPS and the closest gene individually outperform other gene prioritization methods.

DOI

10.1038/s41588-023-01443-6

SynSurr

AI, GWAS, Machine Learning, Phenotype Imputation, Synthetic Surrogates, Nat Genet

PUBMED_LINK

38872030

FULL NAME

SynSurr - Synthetic Surrogates for GWAS of Missing Phenotypes

DESCRIPTION

SynSurr (Synthetic Surrogate analysis) is a method that makes GWAS on imputed phenotypes robust to imputation errors. Rather than replacing missing values, SynSurr jointly analyzes the observed and imputed data to provide calibrated association statistics, improving power for genome-wide association studies of partially missing phenotypes in population biobanks. Published in Nature Genetics.

TITLE

Synthetic surrogates improve power for genome-wide association studies of partially missing phenotypes in population biobanks.

ABSTRACT

Within population biobanks, incomplete measurement of certain traits limits the power for genetic discovery. Machine learning is increasingly used to impute the missing values from the available data. However, performing GWAS on imputed traits can introduce spurious associations. Here we introduce SynSurr analysis, which makes GWAS on imputed phenotypes robust to imputation errors by jointly analyzing observed and imputed data.

DOI

10.1038/s41588-024-01793-9