Tissue_and_single_cell

Summary Table

NAME | CITATION | YEAR
CoCoNet | Shang L, Smith JA, Zhou X. (2020) Leveraging gene co-expression patterns to infer trait-relevant tissues in genome-wide association studies PLoS Genet., 16 (4) e1008734. doi:10.1371/journal.pgen.1008734. PMID 32310941 | 2020
EPIC | Wang R, Lin DY, Jiang Y. (2022) EPIC: Inferring relevant cell types for complex traits by integrating genome-wide association studies and single-cell RNA sequencing PLoS Genet., 18 (6) e1010251. doi:10.1371/journal.pgen.1010251. PMID 35709291 | 2022
LDSC-SEG | Finucane HK, Reshef YA, Anttila V, Slowikowski K, ...&, Price AL. (2018) Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types Nat. Genet., 50 (4) 621-629. doi:10.1038/s41588-018-0081-4. PMID 29632380 | 2018
MAGMA | de Leeuw CA, Mooij JM, Heskes T, Posthuma D. (2015) MAGMA: generalized gene-set analysis of GWAS data PLoS Comput. Biol., 11 (4) e1004219. doi:10.1371/journal.pcbi.1004219. PMID 25885710 | 2015
RolyPoly | Calderon D, Bhaskar A, Knowles DA, Golan D, ...&, Pritchard JK. (2017) Inferring Relevant Cell Types for Complex Traits by Using Single-Cell Gene Expression Am. J. Hum. Genet., 101 (5) 686-699. doi:10.1016/j.ajhg.2017.09.009. PMID 29106824 | 2017
SCARlink | Mitra S, Malik R, Wong W, Rahman A, ...&, Leslie CS. (2024) Single-cell multi-ome regression models identify functional and disease-associated enhancers and enable chromatin potential analysis Nat. Genet., () . doi:10.1038/s41588-024-01689-8. PMID 38514783 | 2024
SCAVENGE | Yu F, Cato LD, Weng C, Liggett LA, ...&, Sankaran VG. (2022) Variant to function mapping at single-cell resolution through network propagation Nat. Biotechnol., 40 (11) 1644-1653. doi:10.1038/s41587-022-01341-y. PMID 35668323 | 2022
SCENT | Sakaue S, Weinand K, Isaac S, Dey KK, ...&, Raychaudhuri S. (2024) Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles Nat. Genet., 56 (4) 615-626. doi:10.1038/s41588-024-01682-1. PMID 38594305 | 2024
TCSC | Amariuta T, Siewert-Rocks K, Price AL. (2023) Modeling tissue co-regulation estimates tissue-specific contributions to disease Nat. Genet., 55 (9) 1503-1511. doi:10.1038/s41588-023-01474-z. PMID 37580597 | 2023
pgBoost | Dorans, E. R., Jagadeesh, K., Dey, K., & Price, A. L. (2024). Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance. medRxiv, 2024-05. | NA
sc-linker | Jagadeesh KA, Dey KK, Montoro DT, Mohan R, ...&, Regev A. (2022) Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics Nat. Genet., 54 (10) 1479-1492. doi:10.1038/s41588-022-01187-9. PMID 36175791 | 2022
scDRS | Zhang MJ, Hou K, Dey KK, Sakaue S, ...&, Price AL. (2022) Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data Nat. Genet., 54 (10) 1572-1580. doi:10.1038/s41588-022-01167-z. PMID 36050550 | 2022
scGWAS | Jia P, Hu R, Yan F, Dai Y, ...&, Zhao Z. (2022) scGWAS: landscape of trait-cell type associations by integrating single-cell transcriptomics-wide and genome-wide association studies Genome Biol., 23 (1) 220. doi:10.1186/s13059-022-02785-w. PMID 36253801 | 2022

CoCoNet

  • NAME : CoCoNet
  • SHORT NAME : CoCoNet
  • FULL NAME : CoCoNet
  • DESCRIPTION : CoCoNet is a composite likelihood-based covariance regression network model for identifying trait-relevant tissues or cell types.
  • URL : https://xiangzhou.github.io/software/
  • KEYWORDS : composite likelihood-based inference algorithm
  • TITLE : Leveraging gene co-expression patterns to infer trait-relevant tissues in genome-wide association studies
  • DOI : 10.1371/journal.pgen.1008734
  • ABSTRACT : Genome-wide association studies (GWASs) have identified many SNPs associated with various common diseases. Understanding the biological functions of these identified SNP associations requires identifying disease/trait relevant tissues or cell types. Here, we develop a network method, CoCoNet, to facilitate the identification of trait-relevant tissues or cell types. Different from existing approaches, CoCoNet incorporates tissue-specific gene co-expression networks constructed from either bulk or single cell RNA sequencing (RNAseq) studies with GWAS data for trait-tissue inference. In particular, CoCoNet relies on a covariance regression network model to express gene-level effect measurements for the given GWAS trait as a function of the tissue-specific co-expression adjacency matrix. With a composite likelihood-based inference algorithm, CoCoNet is scalable to tens of thousands of genes. We validate the performance of CoCoNet through extensive simulations. We apply CoCoNet for an in-depth analysis of four neurological disorders and four autoimmune diseases, where we integrate the corresponding GWASs with bulk RNAseq data from 38 tissues and single cell RNAseq data from 10 cell types. In the real data applications, we show how CoCoNet can help identify specific glial cell types relevant for neurological disorders and identify disease-targeted colon tissues as relevant for autoimmune diseases.
  • CITATION : Shang L, Smith JA, Zhou X. (2020) Leveraging gene co-expression patterns to infer trait-relevant tissues in genome-wide association studies PLoS Genet., 16 (4) e1008734. doi:10.1371/journal.pgen.1008734. PMID 32310941
  • JOURNAL_INFO : PLoS genetics ; PLoS Genet. ; 2020 ; 16 ; 4 ; e1008734
  • PUBMED_LINK : 32310941

EPIC

  • NAME : EPIC
  • SHORT NAME : EPIC
  • FULL NAME : cEll tyPe enrIChment
  • DESCRIPTION : Inferring relevant tissues and cell types for complex traits in genome-wide association studies
  • URL : https://github.com/rujinwang/EPIC
  • KEYWORDS : GWAS, scRNA-seq
  • TITLE : EPIC: Inferring relevant cell types for complex traits by integrating genome-wide association studies and single-cell RNA sequencing
  • DOI : 10.1371/journal.pgen.1010251
  • ABSTRACT : More than a decade of genome-wide association studies (GWASs) have identified genetic risk variants that are significantly associated with complex traits. Emerging evidence suggests that the function of trait-associated variants likely acts in a tissue- or cell-type-specific fashion. Yet, it remains challenging to prioritize trait-relevant tissues or cell types to elucidate disease etiology. Here, we present EPIC (cEll tyPe enrIChment), a statistical framework that relates large-scale GWAS summary statistics to cell-type-specific gene expression measurements from single-cell RNA sequencing (scRNA-seq). We derive powerful gene-level test statistics for common and rare variants, separately and jointly, and adopt generalized least squares to prioritize trait-relevant cell types while accounting for the correlation structures both within and between genes. Using enrichment of loci associated with four lipid traits in the liver and enrichment of loci associated with three neurological disorders in the brain as ground truths, we show that EPIC outperforms existing methods. We apply our framework to multiple scRNA-seq datasets from different platforms and identify cell types underlying type 2 diabetes and schizophrenia. The enrichment is replicated using independent GWAS and scRNA-seq datasets and further validated using PubMed search and existing bulk case-control testing results.
  • COPYRIGHT : http://creativecommons.org/licenses/by/4.0/
  • CITATION : Wang R, Lin DY, Jiang Y. (2022) EPIC: Inferring relevant cell types for complex traits by integrating genome-wide association studies and single-cell RNA sequencing PLoS Genet., 18 (6) e1010251. doi:10.1371/journal.pgen.1010251. PMID 35709291
  • JOURNAL_INFO : PLoS genetics ; PLoS Genet. ; 2022 ; 18 ; 6 ; e1010251
  • PUBMED_LINK : 35709291

LDSC-SEG

  • NAME : LDSC-SEG
  • SHORT NAME : LDSC-SEG
  • FULL NAME : LD score regression applied to specifically expressed genes
  • URL : https://github.com/bulik/ldsc
  • KEYWORDS : LDSC, tissue, cell type
  • TITLE : Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types
  • DOI : 10.1038/s41588-018-0081-4
  • ABSTRACT : We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.
  • CITATION : Finucane HK, Reshef YA, Anttila V, Slowikowski K, ...&, Price AL. (2018) Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types Nat. Genet., 50 (4) 621-629. doi:10.1038/s41588-018-0081-4. PMID 29632380
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2018 ; 50 ; 4 ; 621-629
  • PUBMED_LINK : 29632380

MAGMA

  • NAME : MAGMA
  • SHORT NAME : MAGMA
  • FULL NAME : Multi-marker Analysis of GenoMic Annotation
  • URL : https://ctg.cncr.nl/software/magma
  • TITLE : MAGMA: generalized gene-set analysis of GWAS data
  • DOI : 10.1371/journal.pcbi.1004219
  • ABSTRACT : By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
  • COPYRIGHT : http://creativecommons.org/licenses/by/4.0/
  • CITATION : de Leeuw CA, Mooij JM, Heskes T, Posthuma D. (2015) MAGMA: generalized gene-set analysis of GWAS data PLoS Comput. Biol., 11 (4) e1004219. doi:10.1371/journal.pcbi.1004219. PMID 25885710
  • JOURNAL_INFO : PLoS computational biology ; PLoS Comput. Biol. ; 2015 ; 11 ; 4 ; e1004219
  • PUBMED_LINK : 25885710
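
The abstract above notes that MAGMA's gene-set analysis uses a regression structure. As a rough illustration of that idea only, the sketch below regresses per-gene Z-scores on a 0/1 gene-set indicator with ordinary least squares. The function name `gene_set_slope` and the toy inputs are hypothetical, and the sketch ignores the correlation between genes and any gene-level covariates, so it is not a substitute for the MAGMA software.

```python
def gene_set_slope(z_scores, gene_set):
    """Regress gene-level Z-scores on a 0/1 gene-set indicator.

    z_scores: dict mapping gene name -> association Z-score (hypothetical input).
    gene_set: set of gene names forming the set under test.
    Returns the OLS slope, which for a binary indicator equals
    mean(Z inside the set) - mean(Z outside the set).
    """
    genes = list(z_scores)
    x = [1.0 if g in gene_set else 0.0 for g in genes]
    y = [z_scores[g] for g in genes]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0.0:
        raise ValueError("gene set must contain some, but not all, genes")
    return sxy / sxx
```

A positive slope means genes in the set tend to carry stronger association signal than genes outside it; a real analysis would also need a standard error for the slope.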

RolyPoly

  • NAME : RolyPoly
  • SHORT NAME : RolyPoly
  • FULL NAME : RolyPoly
  • DESCRIPTION : RolyPoly is a regression-based polygenic model that can prioritize trait-relevant cell types and genes from GWAS summary statistics and gene expression data.
  • URL : https://github.com/dcalderon/rolypoly
  • TITLE : Inferring Relevant Cell Types for Complex Traits by Using Single-Cell Gene Expression
  • DOI : 10.1016/j.ajhg.2017.09.009
  • ABSTRACT : Previous studies have prioritized trait-relevant cell types by looking for an enrichment of genome-wide association study (GWAS) signal within functional regions. However, these studies are limited in cell resolution by the lack of functional annotations from difficult-to-characterize or rare cell populations. Measurement of single-cell gene expression has become a popular method for characterizing novel cell types, and yet limited work has linked single-cell RNA sequencing (RNA-seq) to phenotypes of interest. To address this deficiency, we present RolyPoly, a regression-based polygenic model that can prioritize trait-relevant cell types and genes from GWAS summary statistics and gene expression data. RolyPoly is designed to use expression data from either bulk tissue or single-cell RNA-seq. In this study, we demonstrated RolyPoly's accuracy through simulation and validated previously known tissue-trait associations. We discovered a significant association between microglia and late-onset Alzheimer disease and an association between schizophrenia and oligodendrocytes and replicating fetal cortical cells. Additionally, RolyPoly computes a trait-relevance score for each gene to reflect the importance of expression specific to a cell type. We found that differentially expressed genes in the prefrontal cortex of individuals with Alzheimer disease were significantly enriched with genes ranked highly by RolyPoly gene scores. Overall, our method represents a powerful framework for understanding the effect of common variants on cell types contributing to complex traits.
  • CITATION : Calderon D, Bhaskar A, Knowles DA, Golan D, ...&, Pritchard JK. (2017) Inferring Relevant Cell Types for Complex Traits by Using Single-Cell Gene Expression Am. J. Hum. Genet., 101 (5) 686-699. doi:10.1016/j.ajhg.2017.09.009. PMID 29106824
  • JOURNAL_INFO : American journal of human genetics ; Am. J. Hum. Genet. ; 2017 ; 101 ; 5 ; 686-699
  • PUBMED_LINK : 29106824

SCARlink

  • NAME : SCARlink
  • SHORT NAME : SCARlink
  • FULL NAME : single-cell ATAC + RNA linking
  • URL : https://github.com/snehamitra/SCARlink/
  • KEYWORDS : Poisson regression, scATAC-seq, scRNA-seq, tile-level accessibility
  • TITLE : Single-cell multi-ome regression models identify functional and disease-associated enhancers and enable chromatin potential analysis
  • DOI : 10.1038/s41588-024-01689-8
  • ABSTRACT : We present a gene-level regulatory model, single-cell ATAC + RNA linking (SCARlink), which predicts single-cell gene expression and links enhancers to target genes using multi-ome (scRNA-seq and scATAC-seq co-assay) sequencing data. The approach uses regularized Poisson regression on tile-level accessibility data to jointly model all regulatory effects at a gene locus, avoiding the limitations of pairwise gene-peak correlations and dependence on peak calling. SCARlink outperformed existing gene scoring methods for imputing gene expression from chromatin accessibility across high-coverage multi-ome datasets while giving comparable to improved performance on low-coverage datasets. Shapley value analysis on trained models identified cell-type-specific gene enhancers that are validated by promoter capture Hi-C and are 11× to 15× and 5× to 12× enriched in fine-mapped eQTLs and fine-mapped genome-wide association study (GWAS) variants, respectively. We further show that SCARlink-predicted and observed gene expression vectors provide a robust way to compute a chromatin potential vector field to enable developmental trajectory analysis.
  • CITATION : Mitra S, Malik R, Wong W, Rahman A, ...&, Leslie CS. (2024) Single-cell multi-ome regression models identify functional and disease-associated enhancers and enable chromatin potential analysis Nat. Genet., () . doi:10.1038/s41588-024-01689-8. PMID 38514783
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2024 ; ; ;
  • PUBMED_LINK : 38514783

SCAVENGE

  • NAME : SCAVENGE
  • SHORT NAME : SCAVENGE
  • FULL NAME : Single Cell Analysis of Variant Enrichment through Network propagation of GEnomic data
  • URL : https://github.com/sankaranlab/SCAVENGE
  • KEYWORDS : GWAS, scATAC-seq, network propagation
  • TITLE : Variant to function mapping at single-cell resolution through network propagation
  • DOI : 10.1038/s41587-022-01341-y
  • ABSTRACT : Genome-wide association studies in combination with single-cell genomic atlases can provide insights into the mechanisms of disease-causal genetic variation. However, identification of disease-relevant or trait-relevant cell types, states and trajectories is often hampered by sparsity and noise, particularly in the analysis of single-cell epigenomic data. To overcome these challenges, we present SCAVENGE, a computational algorithm that uses network propagation to map causal variants to their relevant cellular context at single-cell resolution. We demonstrate how SCAVENGE can help identify key biological mechanisms underlying human genetic variation, applying the method to blood traits at distinct stages of human hematopoiesis, to monocyte subsets that increase the risk for severe Coronavirus Disease 2019 (COVID-19) and to intermediate lymphocyte developmental states that predispose to acute leukemia. Our approach not only provides a framework for enabling variant-to-function insights at single-cell resolution but also suggests a more general strategy for maximizing the inferences that can be made using single-cell genomic data.
  • CITATION : Yu F, Cato LD, Weng C, Liggett LA, ...&, Sankaran VG. (2022) Variant to function mapping at single-cell resolution through network propagation Nat. Biotechnol., 40 (11) 1644-1653. doi:10.1038/s41587-022-01341-y. PMID 35668323
  • JOURNAL_INFO : Nature biotechnology ; Nat. Biotechnol. ; 2022 ; 40 ; 11 ; 1644-1653
  • PUBMED_LINK : 35668323
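
Network propagation, which the abstract above names as the core of SCAVENGE, is commonly implemented as a random walk with restart over a cell-cell graph. The sketch below shows that general technique only. The graph representation, the choice of seed cells, the restart probability, and the function name `propagate` are illustrative assumptions rather than SCAVENGE's actual code.

```python
def propagate(neighbors, seeds, restart=0.05, n_iter=200):
    """Random walk with restart on an undirected cell-cell graph.

    neighbors: dict mapping cell -> list of neighbouring cells
               (every cell is assumed to have at least one neighbour).
    seeds:     set of cells that receive the initial (restart) mass.
    Returns a dict cell -> propagated score; the scores sum to 1.
    """
    cells = list(neighbors)
    p0 = {c: (1.0 / len(seeds) if c in seeds else 0.0) for c in cells}
    p = dict(p0)
    for _ in range(n_iter):
        # Each step keeps `restart` of the seed mass in place ...
        nxt = {c: restart * p0[c] for c in cells}
        # ... and spreads the rest evenly to each cell's neighbours.
        for c in cells:
            share = (1.0 - restart) * p[c] / len(neighbors[c])
            for nb in neighbors[c]:
                nxt[nb] += share
        p = nxt
    return p
```

Cells that are well connected to the seed cells end up with high scores even if their own signal is sparse, which is how propagation can mitigate the sparsity and noise mentioned in the abstract.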

SCENT

  • NAME : SCENT
  • SHORT NAME : SCENT
  • FULL NAME : single-cell enhancer target gene mapping
  • URL : https://github.com/immunogenomics/SCENT
  • KEYWORDS : Poisson regression, scATAC-seq, scRNA-seq
  • TITLE : Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles
  • DOI : 10.1038/s41588-024-01682-1
  • ABSTRACT : Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.
  • CITATION : Sakaue S, Weinand K, Isaac S, Dey KK, ...&, Raychaudhuri S. (2024) Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles Nat. Genet., 56 (4) 615-626. doi:10.1038/s41588-024-01682-1. PMID 38594305
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2024 ; 56 ; 4 ; 615-626
  • PUBMED_LINK : 38594305
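
The keywords above list Poisson regression, and the abstract describes modeling the association between enhancer accessibility and gene expression. A minimal sketch of that building block follows: a one-predictor Poisson regression of a gene's counts on a peak's accessibility, fitted by Newton's method. The function name `poisson_fit` and the toy data are assumptions for illustration. Covariates and the significance testing that a real analysis needs are left out, so this is not the SCENT implementation.

```python
import math

def poisson_fit(x, y, n_iter=50):
    """Fit log E[y] = b0 + b1 * x by Newton's method.

    x: per-cell accessibility of one peak (for example 0/1), not all equal.
    y: per-cell expression counts of one gene.
    Returns (b0, b1); b1 > 0 means accessibility goes with higher expression.
    """
    b0 = math.log(max(sum(y) / len(y), 1e-8))  # start at the overall mean
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Entries of the 2x2 information matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton update: beta += H^{-1} g, written out for the 2x2 case.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

With a binary accessibility indicator, the fitted `b1` is the log ratio of mean expression in accessible versus inaccessible cells, which gives a quick sanity check on any fit.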

TCSC

  • NAME : TCSC
  • SHORT NAME : TCSC
  • FULL NAME : Tissue co-regulation score regression
  • DESCRIPTION : TCSC is a statistical genetics method to identify causal tissues in diseases and complex traits. We leverage TWAS and GWAS summary statistics while explicitly modeling the genetic co-regulation of genes across tissues.
  • URL : https://github.com/TiffanyAmariuta/TCSC/
  • TITLE : Modeling tissue co-regulation estimates tissue-specific contributions to disease
  • DOI : 10.1038/s41588-023-01474-z
  • ABSTRACT : Integrative analyses of genome-wide association studies and gene expression data have implicated many disease-critical tissues. However, co-regulation of genetic effects on gene expression across tissues impedes distinguishing biologically causal tissues from tagging tissues. In the present study, we introduce tissue co-regulation score regression (TCSC), which disentangles causal tissues from tagging tissues by regressing gene-disease association statistics (from transcriptome-wide association studies) on tissue co-regulation scores, reflecting correlations of predicted gene expression across genes and tissues. We applied TCSC to 78 diseases/traits (average n = 302,000) and gene expression prediction models for 48 GTEx tissues. TCSC identified 21 causal tissue-trait pairs at a 5% false discovery rate (FDR), including well-established findings, biologically plausible new findings (for example, aorta artery and glaucoma) and increased specificity of known tissue-trait associations (for example, subcutaneous adipose, but not visceral adipose, and high-density lipoprotein). TCSC also identified 17 causal tissue-trait covariance pairs at 5% FDR. In conclusion, TCSC is a precise method for distinguishing causal tissues from tagging tissues.
  • CITATION : Amariuta T, Siewert-Rocks K, Price AL. (2023) Modeling tissue co-regulation estimates tissue-specific contributions to disease Nat. Genet., 55 (9) 1503-1511. doi:10.1038/s41588-023-01474-z. PMID 37580597
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2023 ; 55 ; 9 ; 1503-1511
  • PUBMED_LINK : 37580597
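
The abstract above describes regressing gene-disease association statistics on tissue co-regulation scores. The sketch below illustrates only the shape of such a regression, using unweighted ordinary least squares with an intercept and one slope per candidate tissue. The function names `solve` and `fit_tissue_slopes` and the input layout are hypothetical. Weighting, bias correction, and standard errors are omitted, so this is a toy stand-in and not the TCSC estimator.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_tissue_slopes(chi2, scores):
    """OLS of association statistics on co-regulation scores.

    chi2:   one association statistic per gene-tissue pair.
    scores: one row per gene-tissue pair, one column per candidate tissue.
    Returns [intercept, slope_tissue_1, slope_tissue_2, ...].
    """
    rows = [[1.0] + list(s) for s in scores]  # prepend intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, chi2)) for i in range(p)]
    return solve(xtx, xty)
```

In this toy setting a tissue whose slope is near zero once the other tissues are in the model behaves like a tagging tissue, while a clearly positive slope points to a tissue that explains the association signal on its own.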

pgBoost

  • NAME : pgBoost
  • SHORT NAME : pgBoost
  • FULL NAME : pgBoost
  • DESCRIPTION : pgBoost is an integrative modeling framework that trains a non-linear combination of existing linking strategies (including genomic distance) on fine-mapped eQTL data to assign a probabilistic score to each candidate SNP-gene link.
  • URL : https://github.com/elizabethdorans/pgBoost
  • KEYWORDS : eQTL-informed gradient boosting
  • PREPRINT_DOI : 10.1101/2024.05.24.24307813
  • SERVER : medrxiv
  • CITATION : Dorans, E. R., Jagadeesh, K., Dey, K., & Price, A. L. (2024). Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance. medRxiv, 2024-05.

sc-linker

  • NAME : sc-linker
  • SHORT NAME : sc-linker
  • FULL NAME : sc-linker
  • DESCRIPTION : sc-linker is a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease.
  • URL : https://alkesgroup.broadinstitute.org/LDSCORE/Jagadeesh_Dey_sclinker
  • KEYWORDS : GWAS, scRNA-seq
  • TITLE : Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics
  • DOI : 10.1038/s41588-022-01187-9
  • ABSTRACT : Genome-wide association studies provide a powerful means of identifying loci and genes contributing to disease, but in many cases, the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important for identifying pathogenic processes and developing therapeutics. In the present study, we introduce sc-linker, a framework for integrating single-cell RNA-sequencing, epigenomic SNP-to-gene maps and genome-wide association study summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. The inferred disease enrichments recapitulated known biology and highlighted notable cell-disease relationships, including γ-aminobutyric acid-ergic neurons in major depressive disorder, a disease-dependent M-cell program in ulcerative colitis and a disease-specific complement cascade process in multiple sclerosis. In autoimmune disease, both healthy and disease-dependent immune cell-type programs were associated, whereas only disease-dependent epithelial cell programs were prominent, suggesting a role in disease response rather than initiation. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.
  • CITATION : Jagadeesh KA, Dey KK, Montoro DT, Mohan R, ...&, Regev A. (2022) Identifying disease-critical cell types and cellular processes by integrating single-cell RNA-sequencing and human genetics Nat. Genet., 54 (10) 1479-1492. doi:10.1038/s41588-022-01187-9. PMID 36175791
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2022 ; 54 ; 10 ; 1479-1492
  • PUBMED_LINK : 36175791

scDRS

  • NAME : scDRS
  • SHORT NAME : scDRS
  • FULL NAME : single-cell Disease Relevance Score
  • URL : https://github.com/martinjzhang/scDRS
  • KEYWORDS : GWAS, scRNA-seq
  • TITLE : Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data
  • DOI : 10.1038/s41588-022-01167-z
  • ABSTRACT : Single-cell RNA sequencing (scRNA-seq) provides unique insights into the pathology and cellular origin of disease. We introduce single-cell disease relevance score (scDRS), an approach that links scRNA-seq with polygenic disease risk at single-cell resolution, independent of annotated cell types. scDRS identifies cells exhibiting excess expression across disease-associated genes implicated by genome-wide association studies (GWASs). We applied scDRS to 74 diseases/traits and 1.3 million single-cell gene-expression profiles across 31 tissues/organs. Cell-type-level results broadly recapitulated known cell-type-disease associations. Individual-cell-level results identified subpopulations of disease-associated cells not captured by existing cell-type labels, including T cell subpopulations associated with inflammatory bowel disease, partially characterized by their effector-like states; neuron subpopulations associated with schizophrenia, partially characterized by their spatial locations; and hepatocyte subpopulations associated with triglyceride levels, partially characterized by their higher ploidy levels. Genes whose expression was correlated with the scDRS score across cells (reflecting coexpression with GWAS disease-associated genes) were strongly enriched for gold-standard drug target and Mendelian disease genes.
  • CITATION : Zhang MJ, Hou K, Dey KK, Sakaue S, ...&, Price AL. (2022) Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data Nat. Genet., 54 (10) 1572-1580. doi:10.1038/s41588-022-01167-z. PMID 36050550
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2022 ; 54 ; 10 ; 1572-1580
  • PUBMED_LINK : 36050550
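
The abstract above says scDRS identifies cells with excess expression across disease-associated genes. One simple way to picture a per-cell score of that kind is to average a cell's expression over the disease genes and compare it with the same average over randomly drawn control gene sets. The sketch below does exactly that and nothing more. The function name `disease_scores`, the input layout, and the uniform sampling of control genes are assumptions for illustration; the published method is more involved than this.

```python
import random

def disease_scores(expr, disease_genes, n_controls=200, seed=0):
    """Per-cell disease score with an empirical p-value.

    expr:          dict cell -> dict gene -> expression (same genes in every cell).
    disease_genes: list of putative disease genes (hypothetical input).
    Returns dict cell -> (raw_score, empirical_p).
    """
    rng = random.Random(seed)
    all_genes = sorted(next(iter(expr.values())))
    k = len(disease_genes)
    # The same control gene sets are reused for every cell.
    control_sets = [rng.sample(all_genes, k) for _ in range(n_controls)]
    out = {}
    for cell, profile in expr.items():
        raw = sum(profile[g] for g in disease_genes) / k
        ctrl = [sum(profile[g] for g in s) / k for s in control_sets]
        # Fraction of control scores at least as large as the raw score.
        exceed = sum(c >= raw for c in ctrl)
        out[cell] = (raw, (1 + exceed) / (1 + n_controls))
    return out
```

Because the score is computed cell by cell, it does not depend on cell-type annotations; cells can be grouped afterwards to summarise results at the cell-type level.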

scGWAS

  • NAME : scGWAS
  • SHORT NAME : scGWAS
  • FULL NAME : scRNA-seq assisted GWAS analysis
  • DESCRIPTION : scGWAS leverages scRNA-seq data to identify the genetically mediated associations between traits and cell types.
  • URL : https://github.com/bsml320/scGWAS
  • TITLE : scGWAS: landscape of trait-cell type associations by integrating single-cell transcriptomics-wide and genome-wide association studies
  • DOI : 10.1186/s13059-022-02785-w
  • ABSTRACT : BACKGROUND: The rapid accumulation of single-cell RNA sequencing (scRNA-seq) data presents unique opportunities to decode the genetically mediated cell-type specificity in complex diseases. Here, we develop a new method, scGWAS, which effectively leverages scRNA-seq data to achieve two goals: (1) to infer the cell types in which the disease-associated genes manifest and (2) to construct cellular modules which imply disease-specific activation of different processes. RESULTS: scGWAS only utilizes the average gene expression for each cell type followed by virtual search processes to construct the null distributions of module scores, making it scalable to large scRNA-seq datasets. We demonstrated scGWAS in 40 genome-wide association studies (GWAS) datasets (average sample size N ≈ 154,000) using 18 scRNA-seq datasets from nine major human/mouse tissues (totaling 1.08 million cells) and identified 2533 trait and cell-type associations, each with significant modules for further investigation. The module genes were validated using disease or clinically annotated references from ClinVar, OMIM, and pLI variants. CONCLUSIONS: We showed that the trait-cell type associations identified by scGWAS, while generally constrained to trait-tissue associations, could recapitulate many well-studied relationships and also reveal novel relationships, providing insights into the unsolved trait-tissue associations. Moreover, in each specific cell type, the associations with different traits were often mediated by different sets of risk genes, implying disease-specific activation of driving processes. In summary, scGWAS is a powerful tool for exploring the genetic basis of complex diseases at the cell type level using single-cell expression data.
  • CITATION : Jia P, Hu R, Yan F, Dai Y, ...&, Zhao Z. (2022) scGWAS: landscape of trait-cell type associations by integrating single-cell transcriptomics-wide and genome-wide association studies Genome Biol., 23 (1) 220. doi:10.1186/s13059-022-02785-w. PMID 36253801
  • JOURNAL_INFO : Genome biology ; Genome Biol. ; 2022 ; 23 ; 1 ; 220
  • PUBMED_LINK : 36253801