Tools Functional prediction

Curation of Functional prediction — listings under the GWAS Tools tab.

Summary Table

NAME | Main citation | YEAR
MENTR | Koido M et al., Nat Biomed Eng, 2023 | 2023
popEVE | Orenbuch R et al., Nat Genet, 2025 | 2025

MENTR

Tool
PUBMED_LINK
36411359
FULL NAME
mutation effect prediction on ncRNA transcription
DESCRIPTION
MENTR is a machine-learning model that reliably links genome sequence to ncRNA expression at the cell-type level.
URL
https://github.com/koido/MENTR
TITLE
Prediction of the cell-type-specific transcription of non-coding RNAs from genome sequences via machine learning.
Main citation
Koido M, Hon CC, Koyama S, Kawaji H, ...&, Terao C. (2023) Prediction of the cell-type-specific transcription of non-coding RNAs from genome sequences via machine learning. Nat Biomed Eng, 7 (6) 830-844. doi:10.1038/s41551-022-00961-8. PMID 36411359
ABSTRACT
Gene transcription is regulated through complex mechanisms involving non-coding RNAs (ncRNAs). As the transcription of ncRNAs, especially of enhancer RNAs, is often low and cell type specific, how the levels of RNA transcription depend on genotype remains largely unexplored. Here we report the development and utility of a machine-learning model (MENTR) that reliably links genome sequence and ncRNA expression at the cell type level. Effects on ncRNA transcription predicted by the model were concordant with estimates from published studies in a cell-type-dependent manner, regardless of allele frequency and genetic linkage. Among 41,223 variants from genome-wide association studies, the model identified 7,775 enhancer RNAs and 3,548 long ncRNAs causally associated with complex traits across 348 major human primary cells and tissues, such as rare variants plausibly altering the transcription of enhancer RNAs to influence the risks of Crohn's disease and asthma. The model may aid the discovery of causal variants and the generation of testable hypotheses for biological mechanisms driving complex traits.
DOI
10.1038/s41551-022-00961-8

popEVE

Tool
PUBMED_LINK
41286104
DESCRIPTION
popEVE is a proteome-wide deep generative model that scores missense variant pathogenicity by combining cross-species evolutionary predictors with human population cohort data, aiming for well-calibrated, human-specific deleteriousness estimates.
URL
https://github.com/debbiemarkslab/popEVE
KEYWORDS
missense, pathogenicity, deep generative model, proteome-wide, UK Biobank
TITLE
Proteome-wide model for human disease genetics.
Main citation
Orenbuch R, Shearer CA, Kollasch AW, Spinner AD, ...&, Marks DS. (2025) Proteome-wide model for human disease genetics. Nat Genet, 57 (12) 3165-3174. doi:10.1038/s41588-025-02400-1. PMID 41286104
ABSTRACT
Missense variants remain a challenge in genetic interpretation owing to their subtle and context-dependent effects. Although current prediction models perform well in known disease genes, their scores are not calibrated across the proteome, limiting generalizability. To address this knowledge gap, we developed popEVE, a deep generative model combining evolutionary and human population data to estimate variant deleteriousness on a proteome-wide scale. popEVE achieves state-of-the-art performance without overestimating the burden of deleterious variants and identifies variants in 442 genes in a severe developmental disorder cohort, including 123 novel candidates. These genes are functionally similar to known disease genes, and their variants often localize to critical regions. Remarkably, popEVE can prioritize likely causal variants using only child exomes, enabling diagnosis even without parental sequencing. This work provides a generalizable framework for rare disease variant interpretation, especially in singleton cases, and demonstrates the utility of calibrated, evolution-informed scoring models for clinical genomics.
DOI
10.1038/s41588-025-02400-1