UCell: Robust and scalable single-cell gene signature scoring
UCell is an R package for scoring gene signatures in single-cell datasets. UCell scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and their calculation demands relatively less computing time and memory than other robust methods, enabling the processing of large datasets (>10^5 cells). UCell can be applied to any cell vs. gene data matrix, and includes functions to directly interact with Seurat and Bioconductor’s SingleCellExperiment objects.
Find the installation instructions for the package and usage vignettes below.
See also pyUCell for a Python implementation of UCell.
New in version >= 2.7.6
Calculation of UCell scores has been updated as follows:
$$\text{UCell} = 1 - \frac{U}{U_{max}}$$
where
$$U = \sum_{i=1}^{n} r_i - \frac{n(n+1)}{2}$$
and the normalization factor $U_{max}$ is:
$$U_{max} = n \cdot maxRank - \frac{n(n+1)}{2}$$
Here $n$ is the number of genes in the signature, $r_i$ are the cell-wise ranks for each of the $n$ genes, $maxRank$ is a parameter capping the ranking to the top genes (1500 by default), and $U$ is the Mann-Whitney U statistic (bounded by 0 and $U_{max}$). Earlier implementations used $U_{max} = n \cdot maxRank$ to normalize the U statistic. While for typical applications results should be similar between the two implementations, the new normalization provides more homogeneous UCell score distributions for large gene sets.
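The arithmetic above can be traced on a toy example. The following is a minimal base-R sketch for a single cell, not the package's implementation: the function name `ucell_score_sketch` is hypothetical, and capping ranks beyond maxRank at maxRank is an assumption chosen so that U stays within the stated bounds of 0 and Umax.

```r
# Minimal base-R sketch of the score for ONE cell (not the package's code).
ucell_score_sketch <- function(expr, signature, maxRank = 1500) {
  # Rank genes by decreasing expression: the most expressed gene gets rank 1.
  ranks <- rank(-expr, ties.method = "average")
  # Assumption: ranks beyond maxRank are capped at maxRank, so that U <= Umax.
  ranks[ranks > maxRank] <- maxRank
  r <- ranks[signature]                    # ranks of the signature genes
  n <- length(r)
  U <- sum(r) - n * (n + 1) / 2            # Mann-Whitney U statistic
  Umax <- n * maxRank - n * (n + 1) / 2    # updated normalization factor
  1 - U / Umax
}

# Toy cell with 10 genes; g1 is the most expressed, g10 the least.
expr <- setNames(10:1, paste0("g", 1:10))
top    <- ucell_score_sketch(expr, c("g1", "g2", "g3"), maxRank = 5)
mixed  <- ucell_score_sketch(expr, c("g1", "g3", "g5"), maxRank = 5)
bottom <- ucell_score_sketch(expr, c("g8", "g9", "g10"), maxRank = 5)
```

In this toy cell, a signature made of the three most expressed genes reaches the maximum score of 1, while a signature whose genes all rank beyond maxRank scores 0.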
New in version >= 2.1.2
Single-cell data are sparse. It can be useful to ‘impute’ scores from neighboring cells to partially correct for this sparsity. The function SmoothKNN smooths single-cell signature scores by taking a weighted average over the k-nearest neighbors in a given dimensionality reduction. It can be applied directly to SingleCellExperiment or Seurat objects to smooth UCell scores.
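To illustrate the idea of neighbor-based smoothing, here is a minimal base-R sketch on a toy one-dimensional embedding. It is not the SmoothKNN implementation: the function name `smooth_knn_sketch`, its arguments, and the weighting scheme (weights that decay geometrically with neighbor rank) are assumptions made for illustration only.

```r
# Conceptual sketch of kNN smoothing of per-cell scores (not SmoothKNN itself).
smooth_knn_sketch <- function(scores, embedding, k = 1, decay = 0.5) {
  d <- as.matrix(dist(embedding))            # pairwise Euclidean distances
  vapply(seq_along(scores), function(i) {
    nn <- order(d[i, ])[seq_len(k + 1)]      # the cell itself plus its k neighbors
    w <- (1 - decay)^(seq_along(nn) - 1)     # assumed weights: 1, (1-decay), ...
    sum(w * scores[nn]) / sum(w)             # weighted average of the scores
  }, numeric(1))
}

# Four cells placed along a single reduced dimension.
embedding <- matrix(c(0, 1, 2.5, 10), ncol = 1)
scores <- c(0, 1, 0, 1)
smoothed <- smooth_knn_sketch(scores, embedding, k = 1, decay = 0.5)
```

Each smoothed value is pulled toward the score of the nearest cell, so isolated zeros and ones move toward intermediate values.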
Interacting with signatures
For easy retrieval and storing of signatures, check out SignatuR:
```r
remotes::install_github("carmonalab/SignatuR")
library(SignatuR)
# e.g. get a cycling signature
cycling.G1S <- GetSignature(SignatuR$Hs$Programs$cellCycle.G1S)
```
Note that UCell supports positive and negative gene sets within a signature. Simply append + or - signs to the genes to include them in positive and negative sets, respectively. For example:
```r
my_signature <- c("CD2+","CD8A+","CD4-")
```
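The sign convention can be made concrete with a small base-R sketch that separates a signature into its positive and negative gene sets. The helper `split_signature_sketch` is hypothetical and is not part of UCell; treating genes without a trailing sign as positive is an assumption of this sketch.

```r
# Hypothetical helper: split a signed signature into positive and negative sets.
split_signature_sketch <- function(signature) {
  # Only a TRAILING "-" marks a negative gene, so hyphenated gene names
  # such as "HLA-DRA" are left untouched.
  is_negative <- grepl("-$", signature)
  genes <- sub("[+-]$", "", signature)       # strip the trailing sign, if any
  # Assumption: genes without a sign are counted in the positive set.
  list(positive = genes[!is_negative], negative = genes[is_negative])
}

my_signature <- c("CD2+", "CD8A+", "CD4-")
sets <- split_signature_sketch(my_signature)
```

Here `sets$positive` holds CD2 and CD8A, and `sets$negative` holds CD4.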
Get help
See more information about UCell and its functions by typing ?UCell within R. Please address your questions and bug reports at: UCell issues.
Package Installation
UCell is available on Bioconductor. To install the package from Bioconductor, run BiocManager::install("UCell") in R (this requires the BiocManager package).
For previous releases of UCell, you may download a tagged version from GitHub.
Test the package
Load sample data and test your installation:
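A minimal sketch of such a test is shown below, assuming UCell is already installed. It uses the sample.matrix dataset shipped with the package and the ScoreSignatures_UCell function; the two gene sets and their names are illustrative choices, not a recommended panel.

```r
library(UCell)

# Small expression matrix (genes x cells) distributed with the package.
data(sample.matrix)

# Illustrative signatures; the names and gene choices are examples only.
gene.sets <- list(
  Tcell_signature = c("CD2", "CD3E", "CD3D"),
  Myeloid_signature = c("SPI1", "FCER1G", "CSF1R")
)

# One row per cell, one column per signature.
scores <- ScoreSignatures_UCell(sample.matrix, features = gene.sets)
head(scores)
```

If the call completes and prints a table of scores between 0 and 1, the installation is working.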
Vignettes and examples
Vignettes to run UCell on matrices, SingleCellExperiment or Seurat objects can be found at the UCell Bioc page.
Additional tutorials are also available at:
Citation
UCell and pyUCell: single-cell gene signature scoring for R and Python. Massimo Andreatta & Santiago J Carmona (2026) Bioinformatics https://doi.org/10.1093/bioinformatics/btag055
UCell: robust and scalable single-cell gene signature scoring. Massimo Andreatta & Santiago J Carmona (2021) CSBJ https://doi.org/10.1016/j.csbj.2021.06.043