This metric quantifies how much the factorization and alignment distort the geometry of the original datasets. The greater the agreement, the less the geometry is distorted. It is calculated by performing dimensionality reduction on the original and quantile-aligned (or just factorized) datasets, and measuring the similarity between the k nearest neighbors of each cell in the original and aligned datasets. The Jaccard index is used to quantify similarity, and the final metric is the average across all cells.
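The calculation described above can be written out as follows. The symbols are notation assumed here for illustration and are not taken from the package: \(N_k^{\mathrm{orig}}(i)\) and \(N_k^{\mathrm{aligned}}(i)\) denote the sets of k nearest neighbors of cell \(i\) in the original and aligned spaces, and \(n\) is the number of cells.

\[
J_i = \frac{\left| N_k^{\mathrm{orig}}(i) \cap N_k^{\mathrm{aligned}}(i) \right|}{\left| N_k^{\mathrm{orig}}(i) \cup N_k^{\mathrm{aligned}}(i) \right|},
\qquad
\mathrm{agreement} = \frac{1}{n} \sum_{i=1}^{n} J_i
\]

If both neighbor sets contain exactly k cells and share m of them, then \(J_i = m / (2k - m)\). For example, with k = 15 and m = 5 shared neighbors, \(J_i = 5 / 25 = 0.2\), so even a one-third overlap in neighbors yields a fairly low Jaccard index.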
Note that for most datasets, the greater the chosen nNeighbors, the greater the agreement in general. Although agreement can theoretically approach 1, in practice it is usually no higher than 0.2-0.3.
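To make the neighbor comparison concrete, here is a minimal base-R sketch of the same idea. It is an illustration only, not the package's implementation: the helpers `knnIndex` and `jaccardAgreement` are hypothetical names, the simulated matrices stand in for the reduced original and aligned embeddings, and neighbors are found by brute-force Euclidean distance.

```r
# Hypothetical helper (not part of rliger): for each row of `x`, return the
# indices of its k nearest rows by Euclidean distance, excluding itself.
knnIndex <- function(x, k) {
  d <- as.matrix(dist(x))
  diag(d) <- Inf  # a cell is never its own neighbor
  nn <- apply(d, 1, function(row) order(row)[seq_len(k)])
  # Force one row per cell and k columns, also when k = 1
  matrix(nn, ncol = k, byrow = TRUE)
}

# Hypothetical helper: mean Jaccard index between the neighbor sets found in
# two embeddings of the same cells (rows must be in the same order).
jaccardAgreement <- function(embA, embB, k = 15) {
  nnA <- knnIndex(embA, k)
  nnB <- knnIndex(embB, k)
  jac <- vapply(seq_len(nrow(nnA)), function(i) {
    shared <- length(intersect(nnA[i, ], nnB[i, ]))
    total <- length(union(nnA[i, ], nnB[i, ]))
    shared / total
  }, numeric(1))
  mean(jac)
}

# Simulated stand-ins for the two embeddings (assumed data, 200 cells x 5 dims)
set.seed(1)
original <- matrix(rnorm(200 * 5), nrow = 200)
aligned <- original + matrix(rnorm(200 * 5, sd = 0.5), nrow = 200)

same <- jaccardAgreement(original, original, k = 15)  # identical inputs give 1
noisy <- jaccardAgreement(original, aligned, k = 15)  # distortion lowers the score
```

Comparing an embedding with itself gives an agreement of 1, while any perturbation of the geometry pulls the value down toward 0.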
Usage
calcAgreement(
  object,
  ndims = 40,
  nNeighbors = 15,
  useRaw = FALSE,
  byDataset = FALSE,
  seed = 1,
  dr.method = NULL,
  k = nNeighbors,
  use.aligned = NULL,
  rand.seed = seed,
  by.dataset = byDataset
)
Arguments
- object
liger object. Should call quantile_norm before calling.
- ndims
Number of factors to produce in NMF. Default 40.
- nNeighbors
Number of nearest neighbors to use in calculating the Jaccard index. Default 15.
- useRaw
Whether to evaluate only the factorized \(H\) matrices instead of the quantile-aligned \(H.norm\) matrix. Default FALSE uses the aligned matrix.
- byDataset
Whether to return the agreement calculated for each dataset instead of the average over all datasets. Default FALSE.
- seed
Random seed to allow reproducible results. Default 1.
- dr.method
[defunct] Methods other than NMF are no longer supported.
- k, rand.seed, by.dataset
[Deprecated] See Usage for replacement.
- use.aligned
[defunct] Use useRaw instead.
Value
A numeric vector of agreement metrics: a single value if byDataset = FALSE, otherwise one value per dataset.
Examples
if (requireNamespace("RcppPlanc", quietly = TRUE)) {
    # Preprocess, factorize and align before computing agreement
    pbmc <- pbmc %>%
        normalize %>%
        selectGenes %>%
        scaleNotCenter %>%
        runINMF %>%
        quantileNorm
    calcAgreement(pbmc)
}
#> ℹ Normalizing datasets "ctrl"
#> ℹ Normalizing datasets "stim"
#> ✔ Normalizing datasets "stim" ... done
#>
#> ℹ Normalizing datasets "ctrl"
#> ✔ Normalizing datasets "ctrl" ... done
#>
#> ℹ Selecting variable features for dataset "ctrl"
#> ✔ ... 168 features selected out of 249 shared features.
#> ℹ Selecting variable features for dataset "stim"
#> ✔ ... 166 features selected out of 249 shared features.
#> ✔ Finally 173 shared variable features are selected.
#> ℹ Scaling dataset "ctrl"
#> ✔ Scaling dataset "ctrl" ... done
#>
#> ℹ Scaling dataset "stim"
#> ✔ Scaling dataset "stim" ... done
#>
#> ℹ Using largest dataset of recommended type as reference: "ctrl" with 300 cells
#> [1] 0.3723238