Single cell
Catalog entries using this tag:
- scTWAS — GWAS Tools
- seismic — GWAS Tools
- CellRank — Single cell
- CytoTRACE — Single cell
- Giotto — Single cell
- Harmony — Single cell
- pySCENIC — Single cell
- Scanpy — Single cell
- scds — Single cell
- SCENIC+ — Single cell
- Seurat — Single cell
- SingleR — Single cell
- SoupX — Single cell
- Squidpy — Single cell
Entries
scTWAS
PUBMED_LINK
DESCRIPTION
Statistical framework for cell-type-resolved transcriptome-wide association using single-cell RNA-seq: models sparsity and technical noise via latent variables and moment-based estimation to improve genetically regulated expression prediction and gene–trait discovery.
URL
KEYWORDS
TWAS, single-cell, cell-type-specific, latent variable, GReX
TITLE
scTWAS: a powerful statistical framework for single-cell transcriptome-wide association studies.
Main citation
Lin Z, Su C. (2026) scTWAS: a powerful statistical framework for single-cell transcriptome-wide association studies. Nat Commun, () . doi:10.1038/s41467-026-70374-7. PMID 41820391
ABSTRACT
Transcriptome-wide association studies (TWAS) have successfully identified genes associated with complex traits and diseases, but most have been performed using bulk gene expression data, which aggregate signals across heterogeneous cell types. Population-scale single-cell RNA sequencing data now make it possible to perform TWAS at the cell-type resolution, but present unique challenges due to strong noises, technical variations, and high sparsity. Here, we propose scTWAS, a statistical method to conduct cell-type-specific TWAS using single-cell data. Leveraging a latent-variable model and moment-based estimation to address the challenges of single-cell data, scTWAS consistently improves the prediction of genetically regulated gene expression across cell types in both blood and brain tissues. Compared to existing methods, scTWAS identifies substantially more gene-trait associations across 29 hematological traits and three immune-related diseases in immune cell types. An application to Alzheimer's disease also reveals cell-subtype-specific associations, including MS4A6A in the disease-associated microglial subtype and PPP1R37 in the inflammatory microglial subtype.
DOI
10.1038/s41467-026-70374-7
seismic
PUBMED_LINK
FULL NAME
Single-cell Expression Integration System for Mapping genetically Implicated Cell types
DESCRIPTION
R framework that links GWAS signals to single-cell-defined cell types via a cell-type gene specificity score (expression magnitude and consistency) and regression on gene-level association statistics, with influential-gene follow-up for interpretability.
URL
KEYWORDS
GWAS, scRNA-seq, cell type, MAGMA, post-GWAS interpretation
TITLE
Disentangling associations between complex traits and cell types with seismic.
Main citation
Lai Q, Dannenfelser R, Roussarie JP, Yao V. (2025) Disentangling associations between complex traits and cell types with seismic. Nat Commun, 16 (1) 8744. doi:10.1038/s41467-025-63753-z. PMID 41034207
ABSTRACT
Integrating single-cell RNA sequencing with Genome-Wide Association Studies (GWAS) can uncover cell types involved in complex traits and disease. However, current methods often lack scalability, interpretability, and robustness. We present seismic, a framework that computes a novel specificity score capturing both expression magnitude and consistency across cell types and introduces influential gene analysis, an approach to identify genes driving each cell type-trait association. Across over 1000 cell-type characterizations at different granularities and 28 polygenic traits, seismic corroborates known associations and uncovers trait-relevant cell groups not apparent through other methodologies. In Parkinson's and Alzheimer's, seismic unveils both cell- and brain-region-specific differences in pathology. Analyzing a pathology-based Alzheimer's GWAS with seismic enables the identification of vulnerable neuron populations and molecular pathways implicated in their neurodegeneration. In general, seismic is a computationally efficient, powerful, and interpretable approach for mapping the relationships between polygenic traits and cell-type-specific expression, offering new insights into disease mechanisms.
DOI
10.1038/s41467-025-63753-z
CellRank
PUBMED_LINK
TITLE
CellRank for directed single-cell fate mapping.
Main citation
Lange M, Bergen V, Klein M, Setty M, ...&, Theis FJ. (2022) CellRank for directed single-cell fate mapping. Nat Methods, 19 (2) 159-170. doi:10.1038/s41592-021-01346-6. PMID 35027767
ABSTRACT
Computational trajectory inference enables the reconstruction of cell state dynamics from single-cell RNA sequencing experiments. However, trajectory inference requires that the direction of a biological process is known, largely limiting its application to differentiating systems in normal development. Here, we present CellRank ( https://cellrank.org ) for single-cell fate mapping in diverse scenarios, including regeneration, reprogramming and disease, for which direction is unknown. Our approach combines the robustness of trajectory inference with directional information from RNA velocity, taking into account the gradual and stochastic nature of cellular fate decisions, as well as uncertainty in velocity vectors. On pancreas development data, CellRank automatically detects initial, intermediate and terminal populations, predicts fate potentials and visualizes continuous gene expression trends along individual lineages. Applied to lineage-traced cellular reprogramming data, predicted fate probabilities correctly recover reprogramming outcomes. CellRank also predicts a new dedifferentiation trajectory during postinjury lung regeneration, including previously unknown intermediate cell states, which we confirm experimentally.
DOI
10.1038/s41592-021-01346-6
CytoTRACE
PUBMED_LINK
DESCRIPTION
CytoTRACE (Cellular (Cyto) Trajectory Reconstruction Analysis using gene Counts and Expression) is a computational method that predicts the differentiation state of cells from single-cell RNA-sequencing data. CytoTRACE leverages a simple, yet robust, determinant of developmental potential—the number of detectably expressed genes per cell, or gene counts. We have validated CytoTRACE on ~150K single-cell transcriptomes spanning 315 cell phenotypes, 52 lineages, 14 tissue types, 9 scRNA-seq platforms, and 5 species.
URL
TITLE
Single-cell transcriptional diversity is a hallmark of developmental potential.
Main citation
Gulati GS, Sikandar SS, Wesche DJ, Manjunath A, ...&, Newman AM. (2020) Single-cell transcriptional diversity is a hallmark of developmental potential. Science, 367 (6476) 405-411. doi:10.1126/science.aax0249. PMID 31974247
ABSTRACT
Single-cell RNA sequencing (scRNA-seq) is a powerful approach for reconstructing cellular differentiation trajectories. However, inferring both the state and direction of differentiation is challenging. Here, we demonstrate a simple, yet robust, determinant of developmental potential (the number of expressed genes per cell) and leverage this measure of transcriptional diversity to develop a computational framework (CytoTRACE) for predicting differentiation states from scRNA-seq data. When applied to diverse tissue types and organisms, CytoTRACE outperformed previous methods and nearly 19,000 annotated gene sets for resolving 52 experimentally determined developmental trajectories. Additionally, it facilitated the identification of quiescent stem cells and revealed genes that contribute to breast tumorigenesis. This study thus establishes a key RNA-based feature of developmental potential and a platform for delineation of cellular hierarchies.
DOI
10.1126/science.aax0249
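The CytoTRACE entry above rests on one raw feature: the number of detectably expressed genes per cell ("gene counts"). The sketch below computes only that feature on a toy matrix. It is not the CytoTRACE method itself; the matrix, the `min_count` threshold, and the function name `gene_counts` are illustrative assumptions.

```python
# Hypothetical toy count matrix: each key is a cell, each list holds
# per-gene read/UMI counts for that cell (same gene order in every list).
counts = {
    "cell_a": [5, 0, 2, 1, 0, 3],
    "cell_b": [0, 0, 7, 0, 0, 1],
    "cell_c": [1, 1, 1, 0, 4, 2],
}


def gene_counts(matrix, min_count=1):
    """Return, per cell, how many genes have at least `min_count` counts.

    `min_count` is an assumed detection threshold, not a value taken
    from the CytoTRACE publication.
    """
    return {
        cell: sum(1 for value in row if value >= min_count)
        for cell, row in matrix.items()
    }


detected = gene_counts(counts)

# Cells ordered from most to fewest detected genes. Under the entry's
# premise, more detected genes suggests higher developmental potential.
ranked = sorted(detected, key=detected.get, reverse=True)

print(detected)
print(ranked)
```

On real data this raw count is sensitive to sequencing depth and capture efficiency, so it is best read as a starting feature rather than a finished differentiation score.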
Giotto
PUBMED_LINK
DESCRIPTION
The Giotto package consists of two modules, Giotto Analyzer and Viewer (see www.spatialgiotto.com), which provide tools to process, analyze and visualize single-cell spatial expression data.
URL
TITLE
Giotto: a toolbox for integrative analysis and visualization of spatial expression data.
Main citation
Dries R, Zhu Q, Dong R, Eng CL, ...&, Yuan GC. (2021) Giotto: a toolbox for integrative analysis and visualization of spatial expression data. Genome Biol, 22 (1) 78. doi:10.1186/s13059-021-02286-2. PMID 33685491
ABSTRACT
Spatial transcriptomic and proteomic technologies have provided new opportunities to investigate cells in their native microenvironment. Here we present Giotto, a comprehensive and open-source toolbox for spatial data analysis and visualization. The analysis module provides end-to-end analysis by implementing a wide range of algorithms for characterizing tissue composition, spatial expression patterns, and cellular interactions. Furthermore, single-cell RNAseq data can be integrated for spatial cell-type enrichment analysis. The visualization module allows users to interactively visualize analysis outputs and imaging features. To demonstrate its general applicability, we apply Giotto to a wide range of datasets encompassing diverse technologies and platforms.
DOI
10.1186/s13059-021-02286-2
Harmony
PUBMED_LINK
DESCRIPTION
Fast, sensitive and accurate integration of single-cell data with Harmony
URL
TITLE
Fast, sensitive and accurate integration of single-cell data with Harmony.
Main citation
Korsunsky I, Millard N, Fan J, Slowikowski K, ...&, Raychaudhuri S. (2019) Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods, 16 (12) 1289-1296. doi:10.1038/s41592-019-0619-0. PMID 31740819
ABSTRACT
The emerging diversity of single-cell RNA-seq datasets allows for the full transcriptional characterization of cell types across a wide variety of biological and clinical conditions. However, it is challenging to analyze them together, particularly when datasets are assayed with different technologies, because biological and technical differences are interspersed. We present Harmony (https://github.com/immunogenomics/harmony), an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Harmony simultaneously accounts for multiple experimental and biological factors. In six analyses, we demonstrate the superior performance of Harmony to previously published algorithms while requiring fewer computational resources. Harmony enables the integration of ~10^6 cells on a personal computer. We apply Harmony to peripheral blood mononuclear cells from datasets with large experimental differences, five studies of pancreatic islet cells, mouse embryogenesis datasets and the integration of scRNA-seq with spatial transcriptomics data.
DOI
10.1038/s41592-019-0619-0
pySCENIC
DESCRIPTION
pySCENIC is a lightning-fast Python implementation of the SCENIC pipeline (Single-Cell rEgulatory Network Inference and Clustering), which enables biologists to infer transcription factors, gene regulatory networks and cell types from single-cell RNA-seq data.
URL
Scanpy
PUBMED_LINK
URL
TITLE
SCANPY: large-scale single-cell gene expression data analysis.
Main citation
Wolf FA, Angerer P, Theis FJ. (2018) SCANPY: large-scale single-cell gene expression data analysis. Genome Biol, 19 (1) 15. doi:10.1186/s13059-017-1382-0. PMID 29409532
ABSTRACT
SCANPY is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells ( https://github.com/theislab/Scanpy ). Along with SCANPY, we present ANNDATA, a generic class for handling annotated data matrices ( https://github.com/theislab/anndata ).
DOI
10.1186/s13059-017-1382-0
scds
PUBMED_LINK
DESCRIPTION
The scds package provides methods to annotate doublets in scRNA-seq data computationally.
URL
TITLE
scds: computational annotation of doublets in single-cell RNA sequencing data.
Main citation
Bais AS, Kostka D. (2020) scds: computational annotation of doublets in single-cell RNA sequencing data. Bioinformatics, 36 (4) 1150-1158. doi:10.1093/bioinformatics/btz698. PMID 31501871
ABSTRACT
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) technologies enable the study of transcriptional heterogeneity at the resolution of individual cells and have an increasing impact on biomedical research. However, it is known that these methods sometimes wrongly consider two or more cells as single cells, and that a number of so-called doublets is present in the output of such experiments. Treating doublets as single cells in downstream analyses can severely bias a study's conclusions, and therefore computational strategies for the identification of doublets are needed. RESULTS: With scds, we propose two new approaches for in silico doublet identification: Co-expression based doublet scoring (cxds) and binary classification based doublet scoring (bcds). The co-expression based approach, cxds, utilizes binarized (absence/presence) gene expression data and, employing a binomial model for the co-expression of pairs of genes, yields interpretable doublet annotations. bcds, on the other hand, uses a binary classification approach to discriminate artificial doublets from original data. We apply our methods and existing computational doublet identification approaches to four datasets with experimental doublet annotations and find that our methods perform at least as well as the state of the art, at comparably little computational cost. We observe appreciable differences between methods and across datasets and that no approach dominates all others. In summary, scds presents a scalable, competitive approach that allows for doublet annotation of datasets with thousands of cells in a matter of seconds. AVAILABILITY AND IMPLEMENTATION: scds is implemented as a Bioconductor R package (doi: 10.18129/B9.bioc.scds). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
DOI
10.1093/bioinformatics/btz698
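The scds entry above describes bcds as a classifier trained to tell artificial doublets from the original data. The sketch below shows only how such a labeled training set could be assembled. It assumes an artificial doublet is the element-wise sum of two randomly chosen cells, which is a common convention but is not stated in the entry. The function name `simulate_doublets` and the toy counts are hypothetical, and no classifier is fitted here.

```python
import random


def simulate_doublets(cells, n_doublets, seed=0):
    """Build artificial doublets by summing the counts of two distinct cells.

    Assumption: summing raw counts approximates two cells captured together.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    doublets = []
    for _ in range(n_doublets):
        i, j = rng.sample(range(len(cells)), 2)  # two different cells
        doublets.append([a + b for a, b in zip(cells[i], cells[j])])
    return doublets


# Hypothetical toy data: three cells, four genes.
cells = [
    [3, 0, 1, 0],
    [0, 2, 0, 4],
    [1, 1, 0, 0],
]

doublets = simulate_doublets(cells, n_doublets=5)

# Training set for a binary classifier: 0 = original cell, 1 = artificial doublet.
features = cells + doublets
labels = [0] * len(cells) + [1] * len(doublets)

print(len(features), labels)
```

A classifier trained on `features` and `labels` could then score each original cell, with high scores flagging cells that resemble the simulated doublets.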
SCENIC+
PUBMED_LINK
URL
TITLE
SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks.
Main citation
Bravo González-Blas C, De Winter S, Hulselmans G, Hecker N, ...&, Aerts S. (2023) SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat Methods, 20 (9) 1355-1367. doi:10.1038/s41592-023-01938-4. PMID 37443338
ABSTRACT
Joint profiling of chromatin accessibility and gene expression in individual cells provides an opportunity to decipher enhancer-driven gene regulatory networks (GRNs). Here we present a method for the inference of enhancer-driven GRNs, called SCENIC+. SCENIC+ predicts genomic enhancers along with candidate upstream transcription factors (TFs) and links these enhancers to candidate target genes. To improve both recall and precision of TF identification, we curated and clustered a motif collection with more than 30,000 motifs. We benchmarked SCENIC+ on diverse datasets from different species, including human peripheral blood mononuclear cells, ENCODE cell lines, melanoma cell states and Drosophila retinal development. Next, we exploit SCENIC+ predictions to study conserved TFs, enhancers and GRNs between human and mouse cell types in the cerebral cortex. Finally, we use SCENIC+ to study the dynamics of gene regulation along differentiation trajectories and the effect of TF perturbations on cell state. SCENIC+ is available at scenicplus.readthedocs.io .
DOI
10.1038/s41592-023-01938-4
Seurat
SingleR
PUBMED_LINK
DESCRIPTION
A computational method for unbiased cell type recognition in scRNA-seq data.
URL
TITLE
Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage.
Main citation
Aran D, Looney AP, Liu L, Wu E, ...&, Bhattacharya M. (2019) Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol, 20 (2) 163-172. doi:10.1038/s41590-018-0276-y. PMID 30643263
ABSTRACT
Tissue fibrosis is a major cause of mortality that results from the deposition of matrix proteins by an activated mesenchyme. Macrophages accumulate in fibrosis, but the role of specific subgroups in supporting fibrogenesis has not been investigated in vivo. Here, we used single-cell RNA sequencing (scRNA-seq) to characterize the heterogeneity of macrophages in bleomycin-induced lung fibrosis in mice. A novel computational framework for the annotation of scRNA-seq by reference to bulk transcriptomes (SingleR) enabled the subclustering of macrophages and revealed a disease-associated subgroup with a transitional gene expression profile intermediate between monocyte-derived and alveolar macrophages. These CX3CR1+SiglecF+ transitional macrophages localized to the fibrotic niche and had a profibrotic effect in vivo. Human orthologs of genes expressed by the transitional macrophages were upregulated in samples from patients with idiopathic pulmonary fibrosis. Thus, we have identified a pathological subgroup of transitional macrophages that are required for the fibrotic response to injury.
DOI
10.1038/s41590-018-0276-y
SoupX
PUBMED_LINK
URL
TITLE
SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data.
Main citation
Young MD, Behjati S. (2020) SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data. Gigascience, 9 (12) . doi:10.1093/gigascience/giaa151. PMID 33367645
ABSTRACT
BACKGROUND: Droplet-based single-cell RNA sequence analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs contained within the input solution are also captured by these assays. This sequencing of cell-free RNA constitutes a background contamination that confounds the biological interpretation of single-cell transcriptomic data. RESULTS: We demonstrate that contamination from this "soup" of cell-free RNAs is ubiquitous, with experiment-specific variations in composition and magnitude. We present a method, SoupX, for quantifying the extent of the contamination and estimating "background-corrected" cell expression profiles that seamlessly integrate with existing downstream analysis tools. Applying this method to several datasets using multiple droplet sequencing technologies, we demonstrate that its application improves biological interpretation of otherwise misleading data, as well as improving quality control metrics. CONCLUSIONS: We present SoupX, a tool for removing ambient RNA contamination from droplet-based single-cell RNA sequencing experiments. This tool has broad applicability, and its application can improve the biological utility of existing and future datasets.
DOI
10.1093/gigascience/giaa151
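The SoupX entry above describes quantifying ambient contamination and estimating background-corrected expression profiles. The sketch below illustrates the general idea with a simple subtraction: given an assumed contamination fraction `rho` and an assumed ambient ("soup") expression profile, it removes the expected ambient counts from one cell. This is a simplified illustration, not the SoupX implementation; the function name `correct_cell` and all numbers are hypothetical, and estimating `rho` and the soup profile from data is not shown.

```python
def correct_cell(observed, soup_profile, rho):
    """Subtract expected ambient counts from one cell's observed counts.

    observed     -- per-gene counts for the cell
    soup_profile -- per-gene fraction of the ambient RNA (sums to 1)
    rho          -- assumed fraction of the cell's counts that is ambient
    """
    total = sum(observed)
    corrected = []
    for count, fraction in zip(observed, soup_profile):
        expected_ambient = rho * total * fraction
        # Clip at zero so a gene never gets a negative corrected count.
        corrected.append(max(0.0, count - expected_ambient))
    return corrected


# Hypothetical example: 100 counts in the cell, 10% assumed to be ambient,
# and the ambient RNA split evenly between the first two genes.
observed = [40, 10, 50]
soup_profile = [0.5, 0.5, 0.0]
corrected = correct_cell(observed, soup_profile, rho=0.1)

print(corrected)
```

Genes absent from the soup profile (the third gene here) are left unchanged, while genes that dominate the ambient RNA lose the most counts.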
Squidpy
PUBMED_LINK
URL
TITLE
Squidpy: a scalable framework for spatial omics analysis.
Main citation
Palla G, Spitzer H, Klein M, Fischer D, ...&, Theis FJ. (2022) Squidpy: a scalable framework for spatial omics analysis. Nat Methods, 19 (2) 171-178. doi:10.1038/s41592-021-01358-2. PMID 35102346
ABSTRACT
Spatial omics data are advancing the study of tissue organization and cellular communication at an unprecedented scale. Flexible tools are required to store, integrate and visualize the large diversity of spatial omics data. Here, we present Squidpy, a Python framework that brings together tools from omics and image analysis to enable scalable description of spatial molecular data, such as transcriptome or multivariate proteins. Squidpy provides efficient infrastructure and numerous analysis methods that allow to efficiently store, manipulate and interactively visualize spatial omics data. Squidpy is extensible and can be interfaced with a variety of already existing libraries for the scalable analysis of spatial omics data.
DOI
10.1038/s41592-021-01358-2