Tools Tissue and single cell

Curation of Tissue and single cell — listings under the GWAS Tools tab.

Summary Table

NAME	Main citation	YEAR
CoCoNet	Shang L et al., PLoS Genet, 2020	2020
EPIC	Wang R et al., PLoS Genet, 2022	2022
LDSC-SEG	Finucane HK et al., Nat Genet, 2018	2018
MAGMA	de Leeuw CA et al., PLoS Comput Biol, 2015	2015
RolyPoly	Calderon D et al., Am J Hum Genet, 2017	2017
SCARlink	Mitra S et al., Nat Genet, 2024	2024
SCAVENGE	Yu F et al., Nat Biotechnol, 2022	2022
SCENT	Sakaue S et al., Nat Genet, 2024	2024
TCSC	Amariuta T et al., Nat Genet, 2023	2023
cellAdmix	Mitchel J et al., Nat Genet, 2026	2026
gsMap	Song L et al., Nature, 2025	2025
pgBoost	Dorans, E. R., Jagadeesh, K., Dey, K., & Price, A. L. (2024). Linking regulatory variants to target genes by…	NA
sc-linker	Jagadeesh KA et al., Nat Genet, 2022	2022
scDRS	Zhang MJ et al., Nat Genet, 2022	2022
scEPS	Zou L et al., medRxiv, 2026	2026
scGWAS	Jia P et al., Genome Biol, 2022	2022
scPagwas	Ma Y et al., Cell Genom, 2023	2023
seismic	Lai Q et al., Nat Commun, 2025	2025

CoCoNet

Tool

PUBMED_LINK

32310941

DESCRIPTION

CoCoNet is a composite likelihood-based covariance regression network model for identifying trait-relevant tissues or cell types.

URL

https://xiangzhou.github.io/software/

KEYWORDS

composite likelihood-based inference algorithm

TITLE

Leveraging gene co-expression patterns to infer trait-relevant tissues in genome-wide association studies.

Main citation

Shang L, Smith JA, Zhou X. (2020) Leveraging gene co-expression patterns to infer trait-relevant tissues in genome-wide association studies. PLoS Genet, 16 (4) e1008734. doi:10.1371/journal.pgen.1008734. PMID 32310941

ABSTRACT

Genome-wide association studies (GWASs) have identified many SNPs associated with various common diseases. Understanding the biological functions of these identified SNP associations requires identifying disease/trait relevant tissues or cell types. Here, we develop a network method, CoCoNet, to facilitate the identification of trait-relevant tissues or cell types. Different from existing approaches, CoCoNet incorporates tissue-specific gene co-expression networks constructed from either bulk or single cell RNA sequencing (RNAseq) studies with GWAS data for trait-tissue inference. In particular, CoCoNet relies on a covariance regression network model to express gene-level effect measurements for the given GWAS trait as a function of the tissue-specific co-expression adjacency matrix. With a composite likelihood-based inference algorithm, CoCoNet is scalable to tens of thousands of genes. We validate the performance of CoCoNet through extensive simulations. We apply CoCoNet for an in-depth analysis of four neurological disorders and four autoimmune diseases, where we integrate the corresponding GWASs with bulk RNAseq data from 38 tissues and single cell RNAseq data from 10 cell types. In the real data applications, we show how CoCoNet can help identify specific glial cell types relevant for neurological disorders and identify disease-targeted colon tissues as relevant for autoimmune diseases.

DOI

10.1371/journal.pgen.1008734

EPIC

Tool

PUBMED_LINK

35709291

FULL NAME

cEll tyPe enrIChment

DESCRIPTION

Inferring relevant tissues and cell types for complex traits in genome-wide association studies

URL

https://github.com/rujinwang/EPIC

KEYWORDS

GWAS, scRNA-seq

TITLE

EPIC: Inferring relevant cell types for complex traits by integrating genome-wide association studies and single-cell RNA sequencing.

Main citation

Wang R, Lin DY, Jiang Y. (2022) EPIC: Inferring relevant cell types for complex traits by integrating genome-wide association studies and single-cell RNA sequencing. PLoS Genet, 18 (6) e1010251. doi:10.1371/journal.pgen.1010251. PMID 35709291

ABSTRACT

More than a decade of genome-wide association studies (GWASs) have identified genetic risk variants that are significantly associated with complex traits. Emerging evidence suggests that the function of trait-associated variants likely acts in a tissue- or cell-type-specific fashion. Yet, it remains challenging to prioritize trait-relevant tissues or cell types to elucidate disease etiology. Here, we present EPIC (cEll tyPe enrIChment), a statistical framework that relates large-scale GWAS summary statistics to cell-type-specific gene expression measurements from single-cell RNA sequencing (scRNA-seq). We derive powerful gene-level test statistics for common and rare variants, separately and jointly, and adopt generalized least squares to prioritize trait-relevant cell types while accounting for the correlation structures both within and between genes. Using enrichment of loci associated with four lipid traits in the liver and enrichment of loci associated with three neurological disorders in the brain as ground truths, we show that EPIC outperforms existing methods. We apply our framework to multiple scRNA-seq datasets from different platforms and identify cell types underlying type 2 diabetes and schizophrenia. The enrichment is replicated using independent GWAS and scRNA-seq datasets and further validated using PubMed search and existing bulk case-control testing results.

DOI

10.1371/journal.pgen.1010251

LDSC-SEG

Tool

PUBMED_LINK

29632380

FULL NAME

LD score regression applied to specifically expressed genes

URL

https://github.com/bulik/ldsc

KEYWORDS

LDSC, tissue, cell type

TITLE

Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types.

Main citation

Finucane HK, Reshef YA, Anttila V, Slowikowski K, ...&, Price AL. (2018) Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat Genet, 50 (4) 621-629. doi:10.1038/s41588-018-0081-4. PMID 29632380

ABSTRACT

We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.

DOI

10.1038/s41588-018-0081-4

MAGMA

Tool

PUBMED_LINK

25885710

FULL NAME

Multi-marker Analysis of GenoMic Annotation

URL

https://ctg.cncr.nl/software/magma

TITLE

MAGMA: generalized gene-set analysis of GWAS data.

Main citation

de Leeuw CA, Mooij JM, Heskes T, Posthuma D. (2015) MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput Biol, 11 (4) e1004219. doi:10.1371/journal.pcbi.1004219. PMID 25885710

ABSTRACT

By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.

DOI

10.1371/journal.pcbi.1004219

RolyPoly

Tool

PUBMED_LINK

29106824

DESCRIPTION

RolyPoly is a regression-based polygenic model that can prioritize trait-relevant cell types and genes from GWAS summary statistics and gene expression data.

URL

https://github.com/dcalderon/rolypoly

TITLE

Inferring Relevant Cell Types for Complex Traits by Using Single-Cell Gene Expression.

Main citation

Calderon D, Bhaskar A, Knowles DA, Golan D, ...&, Pritchard JK. (2017) Inferring Relevant Cell Types for Complex Traits by Using Single-Cell Gene Expression. Am J Hum Genet, 101 (5) 686-699. doi:10.1016/j.ajhg.2017.09.009. PMID 29106824

ABSTRACT

Previous studies have prioritized trait-relevant cell types by looking for an enrichment of genome-wide association study (GWAS) signal within functional regions. However, these studies are limited in cell resolution by the lack of functional annotations from difficult-to-characterize or rare cell populations. Measurement of single-cell gene expression has become a popular method for characterizing novel cell types, and yet limited work has linked single-cell RNA sequencing (RNA-seq) to phenotypes of interest. To address this deficiency, we present RolyPoly, a regression-based polygenic model that can prioritize trait-relevant cell types and genes from GWAS summary statistics and gene expression data. RolyPoly is designed to use expression data from either bulk tissue or single-cell RNA-seq. In this study, we demonstrated RolyPoly's accuracy through simulation and validated previously known tissue-trait associations. We discovered a significant association between microglia and late-onset Alzheimer disease and an association between schizophrenia and oligodendrocytes and replicating fetal cortical cells. Additionally, RolyPoly computes a trait-relevance score for each gene to reflect the importance of expression specific to a cell type. We found that differentially expressed genes in the prefrontal cortex of individuals with Alzheimer disease were significantly enriched with genes ranked highly by RolyPoly gene scores. Overall, our method represents a powerful framework for understanding the effect of common variants on cell types contributing to complex traits.

DOI

10.1016/j.ajhg.2017.09.009

SCARlink

Tool

PUBMED_LINK

38514783

FULL NAME

single-cell ATAC + RNA linking

DESCRIPTION

Single-cell ATAC+RNA linking (SCARlink) uses multiomic single-cell ATAC and RNA to predict gene expression from chromatin accessibility and predict regulatory regions.

URL

https://github.com/snehamitra/SCARlink/

KEYWORDS

Poisson regression, scATAC, scRNA, tile-level accessibility

TITLE

Single-cell multi-ome regression models identify functional and disease-associated enhancers and enable chromatin potential analysis.

Main citation

Mitra S, Malik R, Wong W, Rahman A, ...&, Leslie CS. (2024) Single-cell multi-ome regression models identify functional and disease-associated enhancers and enable chromatin potential analysis. Nat Genet, 56 (4) 627-636. doi:10.1038/s41588-024-01689-8. PMID 38514783

ABSTRACT

We present a gene-level regulatory model, single-cell ATAC + RNA linking (SCARlink), which predicts single-cell gene expression and links enhancers to target genes using multi-ome (scRNA-seq and scATAC-seq co-assay) sequencing data. The approach uses regularized Poisson regression on tile-level accessibility data to jointly model all regulatory effects at a gene locus, avoiding the limitations of pairwise gene-peak correlations and dependence on peak calling. SCARlink outperformed existing gene scoring methods for imputing gene expression from chromatin accessibility across high-coverage multi-ome datasets while giving comparable to improved performance on low-coverage datasets. Shapley value analysis on trained models identified cell-type-specific gene enhancers that are validated by promoter capture Hi-C and are 11× to 15× and 5× to 12× enriched in fine-mapped eQTLs and fine-mapped genome-wide association study (GWAS) variants, respectively. We further show that SCARlink-predicted and observed gene expression vectors provide a robust way to compute a chromatin potential vector field to enable developmental trajectory analysis.

DOI

10.1038/s41588-024-01689-8

ARROW_SUMMARY

scRNA-seq + scATAC-seq → Tile-level chromatin accessibility modeling → Regularized Poisson regression (SCARlink) → Predict gene expression & link enhancers to genes → Identify functional and disease-associated enhancers
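
The regularized Poisson regression step in the summary above can be illustrated with a toy example. This is a minimal sketch, not SCARlink's implementation: it assumes an L2 (ridge) penalty fitted by plain gradient descent, and the function name `fit_poisson_ridge`, the parameters `lam`, `lr`, and `n_iter`, and the toy data are all hypothetical. Each row of `X` stands in for one cell's accessibility over the tiles at a gene locus, and `y` for that cell's expression count of the gene.

```python
import math

def fit_poisson_ridge(X, y, lam=0.1, lr=0.05, n_iter=2000):
    """Fit y ~ Poisson(exp(b + X.w)) with an L2 penalty on w by gradient descent.

    Illustrative only: lam, lr, and n_iter are arbitrary choices for the toy data.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Predicted mean count for this cell under the current model
            mu = math.exp(b + sum(wj * xj for wj, xj in zip(w, xi)))
            resid = mu - yi
            grad_b += resid / n
            for j in range(d):
                grad_w[j] += resid * xi[j] / n
        # The intercept is not penalized; the weights shrink toward zero
        b -= lr * grad_b
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
    return w, b

# Toy data (hypothetical): two tiles, six cells; tile 0 tracks expression
X = [[0, 0], [1, 0], [0, 1], [1, 1], [1, 0], [0, 0]]
y = [1, 4, 1, 5, 3, 1]
w, b = fit_poisson_ridge(X, y)
print("tile weights:", w, "intercept:", b)
```

In this sketch a larger weight marks a tile whose accessibility better predicts expression. Fitting all tiles jointly, rather than testing gene-peak pairs one at a time, is the design choice the abstract emphasizes.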

SCAVENGE

Tool

PUBMED_LINK

35668323

FULL NAME

Single Cell Analysis of Variant Enrichment through Network propagation of GEnomic data

URL

https://github.com/sankaranlab/SCAVENGE

KEYWORDS

GWAS, scATAC, network propagation

TITLE

Variant to function mapping at single-cell resolution through network propagation.

Main citation

Yu F, Cato LD, Weng C, Liggett LA, ...&, Sankaran VG. (2022) Variant to function mapping at single-cell resolution through network propagation. Nat Biotechnol, 40 (11) 1644-1653. doi:10.1038/s41587-022-01341-y. PMID 35668323

ABSTRACT

Genome-wide association studies in combination with single-cell genomic atlases can provide insights into the mechanisms of disease-causal genetic variation. However, identification of disease-relevant or trait-relevant cell types, states and trajectories is often hampered by sparsity and noise, particularly in the analysis of single-cell epigenomic data. To overcome these challenges, we present SCAVENGE, a computational algorithm that uses network propagation to map causal variants to their relevant cellular context at single-cell resolution. We demonstrate how SCAVENGE can help identify key biological mechanisms underlying human genetic variation, applying the method to blood traits at distinct stages of human hematopoiesis, to monocyte subsets that increase the risk for severe Coronavirus Disease 2019 (COVID-19) and to intermediate lymphocyte developmental states that predispose to acute leukemia. Our approach not only provides a framework for enabling variant-to-function insights at single-cell resolution but also suggests a more general strategy for maximizing the inferences that can be made using single-cell genomic data.

DOI

10.1038/s41587-022-01341-y

SCENT

Tool

PUBMED_LINK

38594305

FULL NAME

single-cell enhancer target gene mapping

DESCRIPTION

SCENT uses single-cell multimodal data (e.g., 10X Multiome RNA/ATAC) and links ATAC-seq peaks (putative enhancers) to their target genes by modeling association between chromatin accessibility and gene expression across individual single cells.

URL

https://github.com/immunogenomics/SCENT

KEYWORDS

Poisson regression, scATAC-seq, scRNA-seq

TITLE

Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles.

Main citation

Sakaue S, Weinand K, Isaac S, Dey KK, ...&, Raychaudhuri S. (2024) Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles. Nat Genet, 56 (4) 615-626. doi:10.1038/s41588-024-01682-1. PMID 38594305

ABSTRACT

Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.

DOI

10.1038/s41588-024-01682-1

ARROW_SUMMARY

Extract chromatin accessibility (ATAC-seq) & gene expression (RNA-seq) from single cells → Group cells by type → For each gene, define candidate enhancers within 1 Mb → Use distance-weighted non-parametric regression to model enhancer–gene associations → Assess significance via permutation testing → Build enhancer–gene links per cell type
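
The significance step in the summary above can be illustrated with a generic permutation test. This is a simplified sketch, not SCENT's procedure: it assumes a Pearson correlation between a peak's accessibility and a gene's expression across cells as the test statistic, and the names `pearson` and `permutation_p_value` and the toy data are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(accessibility, expression, n_perm=1000, seed=0):
    """Two-sided empirical p-value obtained by shuffling expression across cells."""
    rng = random.Random(seed)
    observed = abs(pearson(accessibility, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(accessibility, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 20 cells, peak open in half of them
acc = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1] * 2
expr = [1, 0, 2, 1, 0, 5, 6, 4, 7, 5] * 2
print(permutation_p_value(acc, expr))
```

Shuffling breaks any link between accessibility and expression while preserving each vector's distribution, so the test needs no parametric assumption about count data.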

TCSC

Tool

PUBMED_LINK

37580597

FULL NAME

Tissue co-regulation score regression

DESCRIPTION

TCSC is a statistical genetics method to identify causal tissues in diseases and complex traits. We leverage TWAS and GWAS summary statistics while explicitly modeling the genetic co-regulation of genes across tissues.

URL

https://github.com/TiffanyAmariuta/TCSC/

TITLE

Modeling tissue co-regulation estimates tissue-specific contributions to disease.

Main citation

Amariuta T, Siewert-Rocks K, Price AL. (2023) Modeling tissue co-regulation estimates tissue-specific contributions to disease. Nat Genet, 55 (9) 1503-1511. doi:10.1038/s41588-023-01474-z. PMID 37580597

ABSTRACT

Integrative analyses of genome-wide association studies and gene expression data have implicated many disease-critical tissues. However, co-regulation of genetic effects on gene expression across tissues impedes distinguishing biologically causal tissues from tagging tissues. In the present study, we introduce tissue co-regulation score regression (TCSC), which disentangles causal tissues from tagging tissues by regressing gene-disease association statistics (from transcriptome-wide association studies) on tissue co-regulation scores, reflecting correlations of predicted gene expression across genes and tissues. We applied TCSC to 78 diseases/traits (average n = 302,000) and gene expression prediction models for 48 GTEx tissues. TCSC identified 21 causal tissue-trait pairs at a 5% false discovery rate (FDR), including well-established findings, biologically plausible new findings (for example, aorta artery and glaucoma) and increased specificity of known tissue-trait associations (for example, subcutaneous adipose, but not visceral adipose, and high-density lipoprotein). TCSC also identified 17 causal tissue-trait covariance pairs at 5% FDR. In conclusion, TCSC is a precise method for distinguishing causal tissues from tagging tissues.

DOI

10.1038/s41588-023-01474-z

cellAdmix

Tool

PUBMED_LINK

41559218

DESCRIPTION

cellAdmix detects and corrects segmentation errors in imaging-based spatial transcriptomics by factorizing local molecular neighborhoods—analogous to doublet removal in scRNA-seq—to reassign transcripts that spill across cell boundaries.

URL

https://github.com/kharchenkolab/cellAdmix, http://pklab.org/peterk/cellAdmix/

KEYWORDS

spatial transcriptomics, segmentation, matrix factorization, imaging-based ST

TITLE

Impact and correction of segmentation errors in spatial transcriptomics.

Main citation

Mitchel J, Gao T, Petukhov V, Cole E, ...&, Kharchenko PV. (2026) Impact and correction of segmentation errors in spatial transcriptomics. Nat Genet, 58 (2) 434-444. doi:10.1038/s41588-025-02497-4. PMID 41559218

ABSTRACT

Spatial transcriptomics aims to elucidate how cells coordinate within tissues by connecting cellular states to their native microenvironments. Imaging-based assays are especially promising, capturing molecular and cellular features at subcellular resolution in three dimensions. Interpretation of such data, however, hinges on accurate cell segmentation. Assigning individual molecules to the correct cells remains challenging. Here we re-analyze data from multiple tissues and platforms to find that segmentation errors currently confound most downstream analysis of cellular state, including differential expression, neighbor influence and ligand-receptor interactions. The extent to which misassigned molecules impact the results can be striking, frequently dominating the results. Thus, we show that matrix factorization of local molecular neighborhoods can effectively identify and isolate such molecular admixtures, thereby reducing their impact on downstream analyses, in a manner analogous to doublet filtering in single-cell RNA sequencing. As the applications of spatial transcriptomics assays become more widespread, accounting for segmentation errors will be important for resolving molecular mechanisms of tissue biology.

DOI

10.1038/s41588-025-02497-4

gsMap

Tool

PUBMED_LINK

40108460

FULL NAME

genetically informed spatial mapping of cells for complex traits

DESCRIPTION

gsMap (genetically informed spatial mapping of cells for complex traits) integrates spatial transcriptomics (ST) data with genome-wide association study (GWAS) summary statistics to map cells to human complex traits, including diseases, in a spatially resolved manner.

URL

https://github.com/JianYang-Lab/gsMap

KEYWORDS

spatial transcriptomics

TITLE

Spatially resolved mapping of cells associated with human complex traits.

Main citation

Song L, Chen W, Hou J, Guo M, ...&, Yang J. (2025) Spatially resolved mapping of cells associated with human complex traits. Nature, 641 (8064) 932-941. doi:10.1038/s41586-025-08757-x. PMID 40108460

ABSTRACT

Depicting spatial distributions of disease-relevant cells is crucial for understanding disease pathology. Here we present genetically informed spatial mapping of cells for complex traits (gsMap), a method that integrates spatial transcriptomics data with summary statistics from genome-wide association studies to map cells to human complex traits, including diseases, in a spatially resolved manner. Using embryonic spatial transcriptomics datasets covering 25 organs, we benchmarked gsMap through simulation and by corroborating known trait-associated cells or regions in various organs. Applying gsMap to brain spatial transcriptomics data, we reveal that the spatial distribution of glutamatergic neurons associated with schizophrenia more closely resembles that for cognitive traits than that for mood traits such as depression. The schizophrenia-associated glutamatergic neurons were distributed near the dorsal hippocampus, with upregulated expression of calcium signalling and regulation genes, whereas depression-associated glutamatergic neurons were distributed near the deep medial prefrontal cortex, with upregulated expression of neuroplasticity and psychiatric drug target genes. Our study provides a method for spatially resolved mapping of trait-associated cells and demonstrates the gain of biological insights (such as the spatial distribution of trait-relevant cells and related signature genes) through these maps.

DOI

10.1038/s41586-025-08757-x

ARROW_SUMMARY

Spatial transcriptomics data + GWAS summary statistics → Graph Neural Network identifies homogeneous spatial domains → Compute Gene Specificity Scores (GSS) for each spot → Map GSS to nearby SNPs → Perform Stratified LD Score Regression (S-LDSC) to assess trait heritability enrichment → Aggregate spot-level p-values using the Cauchy Combination Test to identify trait-associated spatial regions
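
The final aggregation step in the summary above relies on the Cauchy combination test, which can be sketched in a few lines. This is an illustration of the general test, not gsMap's code: the function name `cauchy_combination`, the equal default weights, and the example p-values are assumptions. Each p-value is transformed to a standard Cauchy variate, the weighted sum is taken, and the combined p-value is read from the Cauchy tail.

```python
import math

def cauchy_combination(p_values, weights=None):
    """Combine p-values with the Cauchy combination test.

    Weights are assumed non-negative and summing to 1 (equal by default).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    n = len(p_values)
    if weights is None:
        weights = [1.0 / n] * n
    # tan((0.5 - p) * pi) is standard Cauchy when p is uniform on (0, 1)
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values))
    # Upper-tail probability of a standard Cauchy at t
    return 0.5 - math.atan(t) / math.pi

# Hypothetical spot-level p-values for one spatial region
spot_p = [0.001, 0.8, 0.9]
print(cauchy_combination(spot_p))
```

A practical attraction of this test is that its null distribution is approximately valid under dependence between the inputs, which suits neighboring spots whose p-values are correlated.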

pgBoost

Tool

DESCRIPTION

pgBoost is an integrative modeling framework that trains a non-linear combination of existing linking strategies (including genomic distance) on fine-mapped eQTL data to assign a probabilistic score to each candidate SNP-gene link.

URL

https://github.com/elizabethdorans/pgBoost

KEYWORDS

eQTL-informed gradient boosting

PREPRINT_DOI

10.1101/2024.05.24.24307813

Main citation

Dorans, E. R., Jagadeesh, K., Dey, K., & Price, A. L. (2024). Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance. medRxiv, 2024-05.

sc-linker

Tool

PUBMED_LINK

36175791

DESCRIPTION

sc-linker is a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease.

URL

https://alkesgroup.broadinstitute.org/LDSCORE/Jagadeesh_Dey_sclinker

KEYWORDS

GWAS, scRNA-seq

TITLE

Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics.

Main citation

Jagadeesh KA, Dey KK, Montoro DT, Mohan R, ...&, Regev A. (2022) Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics. Nat Genet, 54 (10) 1479-1492. doi:10.1038/s41588-022-01187-9. PMID 36175791

ABSTRACT

Genome-wide association studies provide a powerful means of identifying loci and genes contributing to disease, but in many cases, the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. In the present study, we introduce sc-linker, a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. The inferred disease enrichments recapitulated known biology and highlighted notable cell-disease relationships, including γ-aminobutyric acid-ergic neurons in major depressive disorder, a disease-dependent M-cell program in ulcerative colitis and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease-dependent immune cell-type programs were associated, whereas only disease-dependent epithelial cell programs were prominent, suggesting a role in disease response rather than initiation. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.

DOI

10.1038/s41588-022-01187-9

ARROW_SUMMARY

scRNA-seq data → Derive cell-type-specific gene programs → Map SNPs to genes using epigenomic data → Integrate with GWAS summary statistics → Identify disease-critical cell types and processes

scDRS

Tool

PUBMED_LINK

36050550

FULL NAME

single-cell Disease Relevance Score

DESCRIPTION

scDRS is an approach that links scRNA-seq with polygenic disease risk at single-cell resolution, independent of annotated cell types.

URL

https://github.com/martinjzhang/scDRS

KEYWORDS

GWAS, scRNA-seq

TITLE

Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data.

Main citation

Zhang MJ, Hou K, Dey KK, Sakaue S, ...&, Price AL. (2022) Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nat Genet, 54 (10) 1572-1580. doi:10.1038/s41588-022-01167-z. PMID 36050550

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) provides unique insights into the pathology and cellular origin of disease. We introduce single-cell disease relevance score (scDRS), an approach that links scRNA-seq with polygenic disease risk at single-cell resolution, independent of annotated cell types. scDRS identifies cells exhibiting excess expression across disease-associated genes implicated by genome-wide association studies (GWASs). We applied scDRS to 74 diseases/traits and 1.3 million single-cell gene-expression profiles across 31 tissues/organs. Cell-type-level results broadly recapitulated known cell-type-disease associations. Individual-cell-level results identified subpopulations of disease-associated cells not captured by existing cell-type labels, including T cell subpopulations associated with inflammatory bowel disease, partially characterized by their effector-like states; neuron subpopulations associated with schizophrenia, partially characterized by their spatial locations; and hepatocyte subpopulations associated with triglyceride levels, partially characterized by their higher ploidy levels. Genes whose expression was correlated with the scDRS score across cells (reflecting coexpression with GWAS disease-associated genes) were strongly enriched for gold-standard drug target and Mendelian disease genes.

DOI

10.1038/s41588-022-01167-z

ARROW_SUMMARY

GWAS summary statistics → Select putative disease genes via MAGMA → Compute scDRS using Monte Carlo-based score aggregation → Normalize with control gene sets → Rank cells by disease relevance → Identify enriched subpopulations and co-expressed gene networks
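
The control-gene-set normalization in the summary above can be sketched for a single cell. This is a deliberately simplified illustration, not the scDRS package: it draws control sets uniformly at random rather than matching them to the disease genes, uses an unweighted mean as the raw score, and the name `cell_disease_score`, its parameters, and the toy cell are hypothetical.

```python
import random
import statistics

def cell_disease_score(cell_expr, disease_genes, n_ctrl=200, seed=0):
    """Return (normalized score, Monte Carlo p-value) for one cell.

    cell_expr maps gene name -> expression value in this cell.
    """
    rng = random.Random(seed)
    genes = sorted(cell_expr)
    k = len(disease_genes)
    # Raw score: mean expression over the putative disease genes
    raw = statistics.fmean(cell_expr[g] for g in disease_genes)
    # Control scores: same statistic on random gene sets of the same size
    ctrl = []
    for _ in range(n_ctrl):
        sample = rng.sample(genes, k)
        ctrl.append(statistics.fmean(cell_expr[g] for g in sample))
    sd = statistics.pstdev(ctrl)
    z = (raw - statistics.fmean(ctrl)) / sd if sd > 0 else 0.0
    # Empirical p-value with add-one correction
    p = (1 + sum(c >= raw for c in ctrl)) / (1 + n_ctrl)
    return z, p

# Toy cell (hypothetical): 50 genes, the 5 disease genes are highly expressed
cell = {f"g{i}": (5.0 if i < 5 else 0.0) for i in range(50)}
disease = [f"g{i}" for i in range(5)]
print(cell_disease_score(cell, disease))
```

Comparing against control sets within the same cell calibrates the score for that cell's overall expression level, which is why the result does not depend on cell-type annotations.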

scEPS

Tool

FULL NAME

single-cell Expression exPlainability Statistics

DESCRIPTION

scEPS integrates GWAS and single-cell disease atlas data to identify disease-associated cell neighborhoods by testing whether the expression of GWAS-prioritized genes explains more phenotypic variance than mean-expression-matched control genes. When applied to PRSs of healthy donors, it captures the genetic covariance between gene expression and disease, mitigating reverse causation. In the authors' evaluation, it identified 1.77× and 5.13× more significant associations than a CNA-based approach and scDRS, respectively.

URL

https://github.com/Genentech/sceps

KEYWORDS

GWAS, scRNA-seq, single-cell, disease association, PRS, cell neighborhood

TITLE

scEPS integrates genetic and single-cell disease atlas data to provide granular mechanistic insights into complex human diseases.

Main citation

Zou L, Whitley O, Tseng HW, Simopoulos C, ...&, Shi H. (2026) scEPS integrates genetic and single-cell disease atlas data to provide granular mechanistic insights into complex human diseases. medRxiv. doi:10.64898/2026.06.26.26356714

ABSTRACT

Integrating GWAS and single-cell data holds great potential for prioritizing causal disease biology at cellular resolution. Recent integrative approaches typically assess the enrichment of disease genetic signals in cell types or individual cells, without directly modeling disease phenotypes. We develop a new method, single-cell Expression exPlainability Statistics (scEPS), for identifying disease-associated cell neighborhoods, by explicitly testing whether the expression of GWAS-prioritized genes explains more variance in a disease than randomly selected, mean-expression-matched control genes. Crucially, when applied to PRSs of healthy donors, scEPS captures the genetic covariance between gene expression and diseases, mitigating the effect of reverse causation and prioritizing cell populations mediating the effects of GWAS genes. We applied scEPS to clinical diagnoses and PRSs of 4 neurological and 4 respiratory disorders, integrating brain and lung cell atlas data, respectively, with respective GWAS summary statistics data. scEPS recapitulated known and uncovered novel disease-associated cell populations, identifying 1.77x (s.e. 1.21) and 5.13x (s.e. 3.08) more significant associations than a CNA-based approach and scDRS, respectively. Furthermore, scEPS detected different cell populations, contrasting clinical diagnoses vs. their PRSs, revealing distinct biology for the active/symptomatic vs. preclinical/asymptomatic states of the disease.

DOI

10.64898/2026.06.26.26356714

ARROW_SUMMARY

GWAS summary statistics → Select putative disease genes → scEPS tests variance explained by GWAS genes vs. mean-expression-matched control genes at each cell neighborhood → Rank cell neighborhoods by disease association → Apply to clinical diagnoses and PRSs → Reveal active/symptomatic vs. preclinical/asymptomatic biology
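
The comparison at the heart of the summary above (GWAS genes vs. mean-expression-matched control genes) can be illustrated with a small permutation-style sketch. This is not the scEPS implementation: the function names (`r_squared`, `matched_controls`, `explainability_pvalue`), the use of ordinary least-squares R² as the "variance explained" measure, the decile-based matching, and the simulated donor-by-gene matrix are all assumptions made for illustration.

```python
import numpy as np

def r_squared(X, y):
    """In-sample fraction of variance in y explained by an OLS fit on X."""
    X1 = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def matched_controls(mean_expr, target_idx, rng, n_bins=10):
    """Draw one control gene per target gene from the same mean-expression bin."""
    edges = np.quantile(mean_expr, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(mean_expr, edges)             # bin label 0..n_bins-1 per gene
    taken = {int(g) for g in target_idx}             # never reuse targets or earlier picks
    picks = []
    for g in target_idx:
        pool = [j for j in np.flatnonzero(bins == bins[g]) if int(j) not in taken]
        choice = int(rng.choice(pool))
        taken.add(choice)
        picks.append(choice)
    return np.array(picks)

def explainability_pvalue(expr, y, target_idx, n_draws=200, seed=0):
    """Compare R^2 of the target gene set against matched random gene sets."""
    rng = np.random.default_rng(seed)
    mean_expr = expr.mean(axis=0)
    observed = r_squared(expr[:, target_idx], y)
    null = np.array([
        r_squared(expr[:, matched_controls(mean_expr, target_idx, rng)], y)
        for _ in range(n_draws)
    ])
    p = (1 + np.sum(null >= observed)) / (1 + n_draws)   # empirical one-sided p-value
    return observed, p

# Toy data (simulated): 100 donors x 200 genes for one hypothetical cell neighborhood.
rng = np.random.default_rng(1)
expr = rng.normal(size=(100, 200)) + rng.uniform(0, 5, size=200)  # gene-specific means
gwas_genes = np.array([3, 50, 90, 120, 180])                      # hypothetical GWAS gene set
phenotype = expr[:, gwas_genes].sum(axis=1) + rng.normal(size=100)

observed, p = explainability_pvalue(expr, phenotype, gwas_genes)
print(f"R^2 of GWAS genes: {observed:.2f}, empirical p: {p:.4f}")
```

Replacing `phenotype` with a per-donor PRS instead of a diagnosis would mirror the "Apply to clinical diagnoses and PRSs" step, under the same assumptions.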

scGWAS

Tool

PUBMED_LINK

36253801

FULL NAME

scRNA-seq assisted GWAS analysis

DESCRIPTION

scGWAS leverages scRNA-seq data to identify the genetically mediated associations between traits and cell types.

URL

https://github.com/bsml320/scGWAS

TITLE

scGWAS: landscape of trait-cell type associations by integrating single-cell transcriptomics-wide and genome-wide association studies.

Main citation

Jia P, Hu R, Yan F, Dai Y, ... & Zhao Z. (2022) scGWAS: landscape of trait-cell type associations by integrating single-cell transcriptomics-wide and genome-wide association studies. Genome Biol, 23 (1) 220. doi:10.1186/s13059-022-02785-w. PMID 36253801

ABSTRACT

BACKGROUND: The rapid accumulation of single-cell RNA sequencing (scRNA-seq) data presents unique opportunities to decode the genetically mediated cell-type specificity in complex diseases. Here, we develop a new method, scGWAS, which effectively leverages scRNA-seq data to achieve two goals: (1) to infer the cell types in which the disease-associated genes manifest and (2) to construct cellular modules which imply disease-specific activation of different processes. RESULTS: scGWAS only utilizes the average gene expression for each cell type followed by virtual search processes to construct the null distributions of module scores, making it scalable to large scRNA-seq datasets. We demonstrated scGWAS in 40 genome-wide association studies (GWAS) datasets (average sample size N ≈ 154,000) using 18 scRNA-seq datasets from nine major human/mouse tissues (totaling 1.08 million cells) and identified 2533 trait and cell-type associations, each with significant modules for further investigation. The module genes were validated using disease or clinically annotated references from ClinVar, OMIM, and pLI variants. CONCLUSIONS: We showed that the trait-cell type associations identified by scGWAS, while generally constrained to trait-tissue associations, could recapitulate many well-studied relationships and also reveal novel relationships, providing insights into the unsolved trait-tissue associations. Moreover, in each specific cell type, the associations with different traits were often mediated by different sets of risk genes, implying disease-specific activation of driving processes. In summary, scGWAS is a powerful tool for exploring the genetic basis of complex diseases at the cell type level using single-cell expression data.

DOI

10.1186/s13059-022-02785-w

scPagwas

Tool

PUBMED_LINK

37719150

FULL NAME

single-cell Pathway-guided GWAS analysis

DESCRIPTION

Obtain trait-relevant cell subpopulations by incorporating pathway activity transformed scRNA-seq data with GWAS data.

URL

https://github.com/sulab-wmu/scPagwas

KEYWORDS

GWAS, scRNA-seq, pathway activation

TITLE

Polygenic regression uncovers trait-relevant cellular contexts through pathway activation transformation of single-cell RNA sequencing data.

Main citation

Ma Y, Deng C, Zhou Y, Zhang Y, ... & Su J. (2023) Polygenic regression uncovers trait-relevant cellular contexts through pathway activation transformation of single-cell RNA sequencing data. Cell Genom, 3 (9) 100383. doi:10.1016/j.xgen.2023.100383. PMID 37719150

ABSTRACT

Advances in single-cell RNA sequencing (scRNA-seq) techniques have accelerated functional interpretation of disease-associated variants discovered from genome-wide association studies (GWASs). However, identification of trait-relevant cell populations is often impeded by inherent technical noise and high sparsity in scRNA-seq data. Here, we developed scPagwas, a computational approach that uncovers trait-relevant cellular context by integrating pathway activation transformation of scRNA-seq data and GWAS summary statistics. scPagwas effectively prioritizes trait-relevant genes, which facilitates identification of trait-relevant cell types/populations with high accuracy in extensive simulated and real datasets. Cellular-level association results identified a novel subpopulation of naive CD8+ T cells related to COVID-19 severity and oligodendrocyte progenitor cell and microglia subsets with critical pathways by which genetic variants influence Alzheimer's disease. Overall, our approach provides new insights for the discovery of trait-relevant cell types and improves the mechanistic understanding of disease variants from a pathway perspective.

DOI

10.1016/j.xgen.2023.100383

ARROW_SUMMARY

GWAS summary statistics → Pathway activation transformation of scRNA-seq data → Polygenic regression → Prioritize trait-relevant genes → Identify trait-relevant cell types/populations → Discover novel cellular subpopulations and pathways
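
To make the "pathway activation transformation" step above concrete, here is a minimal sketch that turns a cells-by-genes matrix into a cells-by-pathways matrix. It assumes a first-principal-component summary per pathway, which is one common way to score pathway activity; whether this matches the exact transformation in scPagwas should be checked against the paper and package. The function name `pathway_activity`, the pathway names, and the simulated data are hypothetical.

```python
import numpy as np

def pathway_activity(expr, pathways):
    """Return a cells x pathways matrix of first-principal-component scores.

    expr     : cells x genes array of (normalized) expression
    pathways : dict mapping pathway name -> list of gene column indices
    """
    scores = np.empty((expr.shape[0], len(pathways)))
    for k, genes in enumerate(pathways.values()):
        sub = expr[:, genes]
        centered = sub - sub.mean(axis=0)                # center each gene
        u, s, _ = np.linalg.svd(centered, full_matrices=False)
        pc1 = u[:, 0] * s[0]                             # projection onto the first PC
        # The sign of a principal component is arbitrary; orient it so that a
        # higher score means higher average expression of the pathway genes.
        if np.corrcoef(pc1, sub.mean(axis=1))[0, 1] < 0:
            pc1 = -pc1
        scores[:, k] = pc1
    return scores

# Toy data (simulated): 60 cells x 12 genes; the first 30 cells activate pathway "A".
rng = np.random.default_rng(0)
expr = rng.normal(size=(60, 12))
expr[:30, :4] += 3.0
pathways = {"A": [0, 1, 2, 3], "B": [4, 5, 6, 7]}        # hypothetical gene sets

scores = pathway_activity(expr, pathways)
print(scores.shape)
print(scores[:30, 0].mean() > scores[30:, 0].mean())
```

Working at the pathway level pools signal across genes, which is one way to soften the sparsity and technical noise the abstract mentions.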

seismic

GWAS Single cell scRNA-seq Gene prioritization Tool

PUBMED_LINK

41034207

FULL NAME

Single-cell Expression Integration System for Mapping genetically Implicated Cell types

DESCRIPTION

R framework that links GWAS signals to single-cell-defined cell types via a cell-type gene specificity score (expression magnitude and consistency) and regression on gene-level association statistics, with influential-gene follow-up for interpretability.

URL

https://github.com/ylaboratory/seismic, https://ylaboratory.github.io/seismic/, https://doi.org/10.1038/s41467-025-63753-z

KEYWORDS

GWAS, scRNA-seq, cell type, MAGMA, post-GWAS interpretation

TITLE

Disentangling associations between complex traits and cell types with seismic.

Main citation

Lai Q, Dannenfelser R, Roussarie JP, Yao V. (2025) Disentangling associations between complex traits and cell types with seismic. Nat Commun, 16 (1) 8744. doi:10.1038/s41467-025-63753-z. PMID 41034207

ABSTRACT

Integrating single-cell RNA sequencing with Genome-Wide Association Studies (GWAS) can uncover cell types involved in complex traits and disease. However, current methods often lack scalability, interpretability, and robustness. We present seismic, a framework that computes a novel specificity score capturing both expression magnitude and consistency across cell types and introduces influential gene analysis, an approach to identify genes driving each cell type-trait association. Across over 1000 cell-type characterizations at different granularities and 28 polygenic traits, seismic corroborates known associations and uncovers trait-relevant cell groups not apparent through other methodologies. In Parkinson's and Alzheimer's, seismic unveils both cell- and brain-region-specific differences in pathology. Analyzing a pathology-based Alzheimer's GWAS with seismic enables the identification of vulnerable neuron populations and molecular pathways implicated in their neurodegeneration. In general, seismic is a computationally efficient, powerful, and interpretable approach for mapping the relationships between polygenic traits and cell-type-specific expression, offering new insights into disease mechanisms.

DOI

10.1038/s41467-025-63753-z
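
The seismic DESCRIPTION above mentions two ingredients: a cell-type specificity score that reflects both expression magnitude and consistency, and a regression on gene-level association statistics. The sketch below shows the general shape of such an analysis in Python. It does not reproduce the published seismic score or its R API: the toy score (share of mean expression times share of detection rate), the plain OLS regression, the function names, and the simulated counts and z-scores are all assumptions for illustration.

```python
import numpy as np

def specificity(expr, labels):
    """Toy specificity per (cell type, gene): magnitude share x detection share."""
    types = np.unique(labels)
    mean = np.stack([expr[labels == t].mean(axis=0) for t in types])          # magnitude
    detect = np.stack([(expr[labels == t] > 0).mean(axis=0) for t in types])  # consistency
    eps = 1e-12                                   # guards against all-zero genes
    magnitude = mean / (mean.sum(axis=0) + eps)
    consistency = detect / (detect.sum(axis=0) + eps)
    return types, magnitude * consistency         # shape: cell types x genes

def slope_and_t(x, z):
    """OLS slope of gene-level statistics z on specificity x, with its t statistic."""
    x1 = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(x1, z, rcond=None)
    resid = z - x1 @ beta
    sigma2 = resid @ resid / (len(z) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(x1.T @ x1)[1, 1])
    return beta[1], beta[1] / se

# Toy data (simulated): 150 cells x 300 genes, three hypothetical cell types.
rng = np.random.default_rng(0)
labels = np.repeat(["astro", "micro", "neuron"], 50)
expr = rng.poisson(1.0, size=(150, 300)).astype(float)
expr[labels == "neuron", :50] += rng.poisson(4.0, size=(50, 50))  # neuron-specific genes
z = rng.normal(size=300)                                          # stand-in gene-level z-scores
z[:50] += 2.0                                                     # trait signal in those genes

types, spec = specificity(expr, labels)
tstats = np.array([slope_and_t(spec[i], z)[1] for i in range(len(types))])
for name, t in zip(types, tstats):
    print(f"{name}: t = {t:.2f}")
```

In this toy setup the cell type whose specific genes carry the trait signal receives the largest t statistic; in practice the gene-level statistics would come from a tool such as MAGMA, which is listed among the keywords.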