pscrub {fMRIscrub} | R Documentation |
Projection scrubbing
Description
Projection scrubbing is a data-driven method for identifying artifact-contaminated volumes in fMRI. It works by identifying component directions in the data likely to represent patterns of burst noise, and then computing a composite measure of outlyingness based on leverage within these directions, at each volume. The projection can be PCA, ICA, or "fused PCA." Projection scrubbing can also be used for other outlier detection tasks involving high-dimensional data.
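The leverage idea can be illustrated with a minimal base R sketch. It is a simplification, not the package's implementation: it uses PCA only, keeps the PCs of above-average variance, and skips nuisance regression and kurtosis-based selection. The simulated matrix and all variable names are assumptions made for illustration.

```r
# Simulated T x P data (rows = volumes), with one artifact-like volume.
set.seed(1)
n_time <- 60
n_loc <- 200
X <- matrix(rnorm(n_time * n_loc), nrow = n_time)
X[30, ] <- X[30, ] + 6

# Center each column by its median and scale by its MAD.
Xs <- scale(X, center = apply(X, 2, median), scale = apply(X, 2, mad))

# PCA via SVD; keep the PCs with above-average variance.
sv <- svd(Xs)
keep <- sv$d^2 > mean(sv$d^2)
U <- sv$u[, keep, drop = FALSE]

# Leverage of each volume: the diagonal of U %*% t(U).
leverage <- rowSums(U^2)

# Flag volumes whose leverage exceeds 4 times the median leverage.
outlier_flag <- leverage > 4 * median(leverage)
which(outlier_flag)
```

Because leverage is computed only within the retained directions, a volume that dominates one of those directions stands out against the median leverage of the run.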
Usage
pscrub(
X,
projection = c("ICA", "PCA"),
nuisance = "DCT4",
center = TRUE,
scale = TRUE,
comps_mean_dt = FALSE,
comps_var_dt = FALSE,
PESEL = TRUE,
kurt_quantile = 0.99,
get_dirs = FALSE,
full_PCA = FALSE,
get_outliers = TRUE,
cutoff = 4,
seed = 0,
ICA_method = c("C", "R"),
verbose = FALSE
)
Arguments
X |
Wide numeric data matrix ( |
projection |
One of the following: |
nuisance |
Nuisance signals to regress from each column of X.
Detrending is highly recommended for time-series data, especially if there are many time points or evolving circumstances affecting the data. Additionally, if kurtosis is being used to select the projection directions, trends can induce positive or negative kurtosis, contaminating the connection between high kurtosis and outlier presence. Detrending should not be used with non-time-series data because the observations are not temporally related.
Additional nuisance regressors can be specified like so:
|
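One way a nuisance design matrix with extra regressors could be assembled is sketched here in base R. It assumes that "DCT4" denotes an intercept plus four discrete cosine transform (DCT) bases. The helper `make_dct_bases` and the `motion` regressors are hypothetical names introduced for illustration, not part of the package.

```r
# Hypothetical helper: n_time x n_bases matrix of DCT-II cosine bases.
make_dct_bases <- function(n_time, n_bases) {
  t_idx <- seq_len(n_time)
  sapply(seq_len(n_bases), function(k) {
    cos(pi * k * (2 * t_idx - 1) / (2 * n_time))
  })
}

set.seed(2)
n_time <- 100
# Simulated data: five columns sharing a linear drift over time.
X <- matrix(rnorm(n_time * 5), nrow = n_time) + seq_len(n_time) / 20

# Placeholder for additional nuisance regressors (e.g. motion parameters).
motion <- matrix(rnorm(n_time * 2), nrow = n_time)

# Design matrix: intercept, four DCT bases, then the additional regressors.
design <- cbind(1, make_dct_bases(n_time, 4), motion)

# Regress the design from every column of X and keep the residuals.
X_clean <- qr.resid(qr(design), X)
```

The residuals in `X_clean` are orthogonal to every column of `design`, which is the effect nuisance regression is meant to have on each column of the data.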
center , scale |
Center the columns of the data by their medians, and scale the
columns of the data by their median absolute deviations (MADs)? Default: TRUE. Note that centering and scaling occur after nuisance regression, so even if
|
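The following sketch shows median centering and MAD scaling in base R, and why these statistics are insensitive to a single extreme value. Note that R's `mad()` multiplies by the consistency constant 1.4826 by default; whether the package does the same is an assumption here.

```r
set.seed(3)
X <- matrix(rnorm(50 * 4, mean = 10, sd = 3), nrow = 50)
X[1, 1] <- 1000  # one extreme value in the first column

# Center each column by its median.
col_med <- apply(X, 2, median)
X_centered <- sweep(X, 2, col_med, "-")

# Scale each column by its MAD (mad() applies the 1.4826 constant).
col_mad <- apply(X_centered, 2, mad)
X_scaled <- sweep(X_centered, 2, col_mad, "/")

# Compare the robust scale estimates with the standard deviations.
rbind(mad = col_mad, sd = apply(X, 2, sd))
```

In the first column the standard deviation is inflated by the extreme value, while the MAD stays close to the MADs of the other columns.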
comps_mean_dt , comps_var_dt |
Stabilize the mean and variance of each
projection component's timecourse prior to computing kurtosis and leverage?
These arguments should be Slow-moving mean and variance patterns in the components will interfere with
the roles of kurtosis and leverage in identifying outliers. While
Overall, for fMRI we recommend enabling |
PESEL |
Use |
kurt_quantile |
What quantile cutoff should be used to select the
components? Default: 0.99. We model each component as a length |
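A quantile cutoff for kurtosis can be illustrated by simulation. This sketch assumes a null model in which each component is a vector of independent standard Normal draws of the same length as the time series; that null model and the function `excess_kurtosis` are assumptions for illustration, and the package may derive its cutoff differently.

```r
# Sample excess kurtosis of a numeric vector (0 in expectation for Normal data).
excess_kurtosis <- function(x) {
  z <- x - mean(x)
  mean(z^4) / mean(z^2)^2 - 3
}

set.seed(4)
n_time <- 200

# Null distribution of excess kurtosis for Normal vectors of length n_time.
null_kurt <- replicate(5000, excess_kurtosis(rnorm(n_time)))
kurt_cut <- unname(quantile(null_kurt, 0.99))

# A component containing two large spikes.
comp <- rnorm(n_time)
comp[c(50, 120)] <- c(8, -9)

# Does this component count as high kurtosis under the simulated cutoff?
excess_kurtosis(comp) > kurt_cut
```

A higher quantile makes the selection stricter, so fewer components are treated as likely carriers of burst noise.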
get_dirs |
Should the projection directions be returned? This is the
|
full_PCA |
Only applies to the PCA projection. Return the full SVD?
Default: FALSE. |
get_outliers |
Should outliers be flagged based on |
cutoff |
Median leverage cutoff value. Default: 4. |
seed |
Set a seed right before the call to |
ICA_method |
The |
verbose |
Should occasional updates be printed? Default: FALSE. |
Details
Refer to the projection scrubbing vignette for a demonstration and an
outline of the algorithm: vignette("projection_scrubbing", package="fMRIscrub")
Value
A "pscrub" object, i.e. a list with components
- measure
A numeric vector of leverage values.
- outlier_cutoff
The numeric outlier cutoff value (cutoff times the median leverage).
- outlier_flag
A logical vector where TRUE indicates where leverage exceeds the cutoff, signaling suspected outlier presence.
- mask
A length P numeric vector corresponding to the data locations in X. Each value indicates whether the location was masked:
  - 0
  The data location was not masked out.
  - -1
  The data location was masked out, because it had at least one NA or NaN value.
  - -2
  The data location was masked out, because it was constant.
- PCA
This will be a list with components:
  - U
  The T by Q PC score matrix.
  - D
  The standard deviation of each PC.
  - V
  The P by Q PC directions matrix. Included only if get_dirs is TRUE.
  - highkurt
  The length Q logical vector indicating scores of high kurtosis.
  - U_dt
  Detrended components of U. Included only if components were mean- or variance-detrended.
  - highkurt
  The length Q logical vector indicating detrended scores of high kurtosis.
  - nPCs_PESEL
  The number of PCs selected by PESEL.
  - nPCs_avgvar
  The number of above-average variance PCs.
where Q is the number of PCs selected by PESEL or of above-average variance (or the greater of the two if both were used). If PCA was not used, all entries except nPCs_PESEL and/or nPCs_avgvar will not be included, depending on which method(s) was used to select the number of PCs.
- ICA
If ICA was used, this will be a list with components:
  - S
  The P by Q source signals matrix. Included only if get_dirs is TRUE.
  - M
  The T by Q mixing matrix.
  - highkurt
  The length Q logical vector indicating mixing scores of high kurtosis.
  - M_dt
  Detrended components of M. Included only if components were mean- or variance-detrended.
  - highkurt
  The length Q logical vector indicating detrended mixing scores of high kurtosis. Included only if components were mean- or variance-detrended.
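The mask codes listed above can be reproduced on a toy matrix. This is a sketch of the coding scheme only; how the package tests for constant columns (for example, with a tolerance) is an assumption, and exact equality is used here for simplicity.

```r
# Toy data: four data locations (columns) observed at four time points.
X <- cbind(
  a = c(1.2, 0.7, 3.1, 2.2),  # usable
  b = c(NA, 1.0, 2.0, 3.0),   # contains NA
  c = c(5, 5, 5, 5),          # constant
  d = c(0.1, NaN, 0.4, 0.2)   # contains NaN
)

mask <- integer(ncol(X))  # 0: not masked out

# -1: at least one NA or NaN value (anyNA() is TRUE for both).
has_missing <- apply(X, 2, anyNA)
mask[has_missing] <- -1L

# -2: constant column, checked only where no values are missing.
is_const <- !has_missing & apply(X, 2, function(v) all(v == v[1]))
mask[is_const] <- -2L

# Keep only the locations that were not masked out.
X_kept <- X[, mask == 0, drop = FALSE]
mask
```

Subsetting with `mask == 0` recovers the columns that would enter the projection, which is useful when mapping results back to the original data locations.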
References
Mejia, A. F., Nebel, M. B., Eloyan, A., Caffo, B. & Lindquist, M. A. PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data. Biostatistics 18, 521-536 (2017).
Pham, D., McDonald, D., Ding, L., Nebel, M. B. & Mejia, A. Less is more: balancing noise reduction and data retention in fMRI with projection scrubbing. (2022).
Examples
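library(fMRIscrub)
library(fastICA)
# Run projection scrubbing on a 70 by 151 subset of Dat1, using the R implementation of ICA.
psx <- pscrub(Dat1[seq(70), seq(800, 950)], ICA_method = "R")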