Find DEG between two groups. Two methods are supported:
"wilcoxon"
and "pseudoBulk"
. Wilcoxon rank sum test is
performed on single-cell level, while pseudo-bulk method aggregates cells
basing on biological replicates and calls bulk RNAseq DE methods, DESeq2 wald
test. When real biological replicates are not available, pseudo replicates
can be generated. Please see below for detailed scenario usage.
Usage
runPairwiseDEG(
object,
groupTest,
groupCtrl,
variable1 = NULL,
variable2 = NULL,
method = c("wilcoxon", "pseudoBulk"),
usePeak = FALSE,
useReplicate = NULL,
nPsdRep = 5,
minCellPerRep = 10,
seed = 1,
verbose = getOption("ligerVerbose", TRUE)
)
runMarkerDEG(
object,
conditionBy = NULL,
splitBy = NULL,
method = c("wilcoxon", "pseudoBulk"),
useDatasets = NULL,
usePeak = FALSE,
useReplicate = NULL,
nPsdRep = 5,
minCellPerRep = 10,
seed = 1,
verbose = getOption("ligerVerbose", TRUE)
)
runWilcoxon(
object,
data.use = NULL,
compare.method = c("clusters", "datasets")
)
Arguments
- object
A liger object, with normalized data available
- groupTest, groupCtrl, variable1, variable2
Condition specification. See
?runPairwiseDEG
section Pairwise DEG Scenarios for detail.- method
DEG test method to use. Choose from
"wilcoxon"
or"pseudoBulk"
. Default"wilcoxon"
- usePeak
Logical. Whether to use peak count instead of gene count. Only supported when ATAC datasets are involved. Default
FALSE
.- useReplicate
cellMeta
variable of biological replicate annotation. Only used withmethod = "pseudoBulk"
. DefaultNULL
will createnPsdRep
pseudo replicates per group.- nPsdRep
Number of pseudo replicates to create. Only used when
method = "pseudoBulk", useReplicate = NULL
. Default5
.- minCellPerRep
Numeric, will not make pseudo-bulk for replicate with less than this number of cells. Default
10
.- seed
Random seed to use for pseudo-replicate generation. Default
1
.- verbose
Logical. Whether to show information of the progress. Default
getOption("ligerVerbose")
orTRUE
if users have not set.- conditionBy
cellMeta
variable(s). Marker detection will be performed for each level of this variable. Multiple variables will be combined. DefaultNULL
uses default cluster.- splitBy
Split data by
cellMeta
variable(s) here and identify markers forconditionBy
within each chunk. DefaultNULL
.- useDatasets
Datasets to perform marker detection within. Default
NULL
will use all datasets.- data.use
Same as
useDatasets
.- compare.method
Choose from
"clusters"
(default) or"datasets"
."clusters"
compares each cluster against all other cells, while"datasets"
run within each cluster and compare each dataset against all other datasets.
Pairwise DEG Scenarios
Users can select classes of cells from a variable in cellMeta
.
variable1
and variable2
are used to specify a column in
cellMeta
, and groupTest
and groupCtrl
are used to specify
existing classes from variable1
and variable2
, respectively.
When variable2
is missing, groupCtrl
will be considered from
variable1
.
For example, when variable1 = "celltype"
and variable2 = NULL
,
groupTest
and groupCtrl
should be valid cell types in
object$celltype
.
When variable1
is "celltype" and variable2
is "gender",
groupTest
should be a valid cell type from object$celltype
and
groupCtrl
should be a valid class from object$gender
.
When both variable1
and variable2
are missing, groupTest
and groupCtrl
should be valid index of cells in object
.
Marker Detection Scenarios
Marker detection is generally performed in a one vs. rest manner. The
grouping of such condition is specified by conditionBy
, which should
be a column name in cellMeta
. When splitBy
is specified as
another variable name in cellMeta
, the marker detection will be
iteratively done for within each level of splitBy
variable.
For example, when conditionBy = "celltype"
and splitBy = NULL
,
marker detection will be performed by comparing all cells of "celltype_i"
against all other cells, and etc.
When conditionBy = "celltype"
and splitBy = "gender"
, marker
detection will be performed by comparing "celltype_i" cells from "gender_j"
against other cells from "gender_j", and etc.
Examples
# Compare between cluster "0" and cluster "1"
degStats <- runPairwiseDEG(pbmcPlot, groupTest = 0, groupCtrl = 1,
variable1 = "leiden_cluster")
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
# Compare between all cells from cluster "5" and
# all cells from dataset "stim"
degStats <- runPairwiseDEG(pbmcPlot, groupTest = "5", groupCtrl = "stim",
variable1 = "leiden_cluster",
variable2 = "dataset")
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
# Identify markers for each cluster. Equivalent to old version
# `runWilcoxon(method = "cluster")`
markerStats <- runMarkerDEG(pbmcPlot, conditionBy = "leiden_cluster")
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
# Identify dataset markers within each cluster. Equivalent to old version
# `runWilcoxon(method = "dataset")`.
markerStatsList <- runMarkerDEG(pbmcPlot, conditionBy = "dataset",
splitBy = "leiden_cluster")
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>
#> ℹ Running Wilcoxon rank-sum test
#> ✔ Running Wilcoxon rank-sum test ... done
#>