
Single Cell QC

Curated QC listings under the Single cell tab.

Summary Table


NAME | CATEGORY | Main citation | YEAR
scds | Doublets | Bais AS et al., Bioinformatics, 2020 | 2020
SoupX | QC | Young MD et al., Gigascience, 2020 | 2020

Doublets

scds

Single cell
PUBMED_LINK
31501871
DESCRIPTION
The scds package provides methods to annotate doublets in scRNA-seq data computationally.
URL
https://www.bioconductor.org/packages/release/bioc/html/scds.html
TITLE
scds: computational annotation of doublets in single-cell RNA sequencing data.
Main citation
Bais AS, Kostka D. (2020) scds: computational annotation of doublets in single-cell RNA sequencing data. Bioinformatics, 36 (4) 1150-1158. doi:10.1093/bioinformatics/btz698. PMID 31501871
ABSTRACT
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) technologies enable the study of transcriptional heterogeneity at the resolution of individual cells and have an increasing impact on biomedical research. However, it is known that these methods sometimes wrongly consider two or more cells as single cells, and that a number of so-called doublets is present in the output of such experiments. Treating doublets as single cells in downstream analyses can severely bias a study's conclusions, and therefore computational strategies for the identification of doublets are needed. RESULTS: With scds, we propose two new approaches for in silico doublet identification: Co-expression based doublet scoring (cxds) and binary classification based doublet scoring (bcds). The co-expression based approach, cxds, utilizes binarized (absence/presence) gene expression data and, employing a binomial model for the co-expression of pairs of genes, yields interpretable doublet annotations. bcds, on the other hand, uses a binary classification approach to discriminate artificial doublets from original data. We apply our methods and existing computational doublet identification approaches to four datasets with experimental doublet annotations and find that our methods perform at least as well as the state of the art, at comparably little computational cost. We observe appreciable differences between methods and across datasets and that no approach dominates all others. In summary, scds presents a scalable, competitive approach that allows for doublet annotation of datasets with thousands of cells in a matter of seconds. AVAILABILITY AND IMPLEMENTATION: scds is implemented as a Bioconductor R package (doi: 10.18129/B9.bioc.scds). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
DOI
10.1093/bioinformatics/btz698
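
The abstract above describes two ingredients: binarized (absence/presence) expression data for cxds, and artificial doublets that a classifier learns to separate from the original data for bcds. The Python sketch below illustrates only those two data-preparation steps on a toy cells-by-genes count matrix. It is not the scds implementation (scds is an R package). The function names, and the choice to simulate a doublet by summing the counts of two randomly chosen cells, are assumptions made for illustration.

```python
import random


def binarize(counts):
    """Convert a cells x genes count matrix to presence/absence (1/0)."""
    return [[1 if c > 0 else 0 for c in cell] for cell in counts]


def make_artificial_doublets(counts, n_doublets, seed=0):
    """Simulate doublets by summing the counts of two distinct random cells.

    Assumption: summing raw counts is one simple way to mimic a doublet;
    the actual package may construct them differently.
    Requires at least two cells in `counts`.
    """
    rng = random.Random(seed)
    doublets = []
    for _ in range(n_doublets):
        i, j = rng.sample(range(len(counts)), 2)
        doublets.append([a + b for a, b in zip(counts[i], counts[j])])
    return doublets


def labelled_training_set(counts, n_doublets, seed=0):
    """Stack original cells (label 0) and artificial doublets (label 1)."""
    doublets = make_artificial_doublets(counts, n_doublets, seed)
    features = counts + doublets
    labels = [0] * len(counts) + [1] * len(doublets)
    return features, labels


if __name__ == "__main__":
    toy = [[1, 0, 2], [0, 3, 0], [4, 0, 0]]
    X, y = labelled_training_set(toy, n_doublets=2)
    print(binarize(toy))
    print(X, y)
```

In a classification-based approach, a binary classifier would then be trained on such a labelled set, and its predicted doublet probability for each original cell would serve as that cell's doublet score.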

QC

SoupX

Single cell
PUBMED_LINK
33367645
URL
https://github.com/constantAmateur/SoupX
TITLE
SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data.
Main citation
Young MD, Behjati S. (2020) SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data. Gigascience, 9 (12). doi:10.1093/gigascience/giaa151. PMID 33367645
ABSTRACT
BACKGROUND: Droplet-based single-cell RNA sequence analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs contained within the input solution are also captured by these assays. This sequencing of cell-free RNA constitutes a background contamination that confounds the biological interpretation of single-cell transcriptomic data. RESULTS: We demonstrate that contamination from this "soup" of cell-free RNAs is ubiquitous, with experiment-specific variations in composition and magnitude. We present a method, SoupX, for quantifying the extent of the contamination and estimating "background-corrected" cell expression profiles that seamlessly integrate with existing downstream analysis tools. Applying this method to several datasets using multiple droplet sequencing technologies, we demonstrate that its application improves biological interpretation of otherwise misleading data, as well as improving quality control metrics. CONCLUSIONS: We present SoupX, a tool for removing ambient RNA contamination from droplet-based single-cell RNA sequencing experiments. This tool has broad applicability, and its application can improve the biological utility of existing and future datasets.
DOI
10.1093/gigascience/giaa151
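
The abstract above describes quantifying ambient contamination and estimating background-corrected expression profiles. The Python sketch below shows the basic arithmetic of that idea under simplifying assumptions: the ambient ("soup") profile is taken as the pooled gene fractions of empty droplets, the contamination fraction `rho` is supplied by the user, and the expected ambient counts are subtracted from each cell and clipped at zero. This is not the SoupX implementation (SoupX is an R package with its own estimation and adjustment procedures), and all function and parameter names here are hypothetical.

```python
def soup_profile(empty_droplets):
    """Fraction of ambient counts per gene, pooled over empty droplets."""
    n_genes = len(empty_droplets[0])
    totals = [sum(d[g] for d in empty_droplets) for g in range(n_genes)]
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("empty droplets contain no counts")
    return [t / grand_total for t in totals]


def subtract_soup(cell, profile, rho):
    """Remove the expected ambient counts from one cell.

    Assumption: a fraction `rho` of the cell's total counts is ambient and
    is distributed across genes according to `profile`. Results are clipped
    at zero so corrected counts are never negative.
    """
    n_umis = sum(cell)
    return [max(0.0, c - rho * n_umis * p) for c, p in zip(cell, profile)]


if __name__ == "__main__":
    empties = [[1, 1, 0], [3, 3, 0]]
    profile = soup_profile(empties)
    print(profile)
    print(subtract_soup([10, 0, 10], profile, rho=0.1))
```

Clipping at zero keeps the corrected matrix non-negative, at the cost of slightly under-removing contamination for genes where the cell has fewer counts than the expected ambient share.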