rapidsplit {rapidsplithalf} | R Documentation |
rapidsplit
Description
A very fast algorithm for computing stratified permuted split-half reliability.
Usage
rapidsplit(
data,
subjvar,
diffvars = NULL,
stratvars = NULL,
subscorevar = NULL,
aggvar,
splits,
aggfunc = c("means", "medians"),
errorhandling = list(type = c("none", "fixedpenalty"), errorvar = NULL, fixedpenalty =
600, blockvar = NULL),
standardize = FALSE,
include.scores = TRUE,
verbose = TRUE,
check = TRUE
)
## S3 method for class 'rapidsplit'
print(x, ...)
## S3 method for class 'rapidsplit'
plot(
x,
type = c("average", "minimum", "maximum", "random", "all"),
show.labels = TRUE,
...
)
rapidsplit.chunks(
data,
subjvar,
diffvars = NULL,
stratvars = NULL,
subscorevar = NULL,
aggvar,
splits,
aggfunc = c("means", "medians", "custom"),
errorhandling = list(type = c("none", "fixedpenalty"), errorvar = NULL, fixedpenalty =
600, blockvar = NULL),
standardize = FALSE,
include.scores = TRUE,
verbose = TRUE,
check = TRUE,
chunks = 4,
cluster = NULL
)
Arguments
data |
Dataset, a |
subjvar |
Subject ID variable name, a |
diffvars |
Names of variables that determine which conditions
need to be subtracted from each other, |
stratvars |
Additional variables that the splits should
be stratified by; a |
subscorevar |
A |
aggvar |
Name of variable whose values to aggregate, a |
splits |
Number of split-halves to average, an |
aggfunc |
The function by which to aggregate the variable
defined in |
errorhandling |
A list with 4 named items, to be used to replace error trials
with the block mean of correct responses plus a fixed penalty, as in the IAT D-score.
The 4 items are |
standardize |
Whether to divide scores by the subject's SD; a |
include.scores |
Include all individual split-half scores? |
verbose |
Display progress bars? Defaults to |
check |
Check input for possible problems? |
x |
|
... |
Ignored. |
type |
Character argument indicating what should be plotted.
By default, this plots the random split whose correlation is closest to the average.
However, this can also plot the random split with
the |
show.labels |
Should participant IDs be shown above their points in the scatterplot?
Defaults to |
chunks |
Number of chunks to divide the splits into, for more memory-efficient computation and for distribution over multiple cores if requested. |
cluster |
Chunks will be run on separate cores if a cluster is provided,
or an |
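The fixed-penalty replacement described under errorhandling can be illustrated with a minimal base R sketch. The data frame, its column names, and the variable RT_adj are hypothetical and chosen only for illustration; the sketch also ignores splitting, whereas the function itself performs this step within each block within each split.

```r
# Toy data (hypothetical): one participant, two blocks, RTs in ms;
# error = 1 marks an error trial
trials <- data.frame(
  block = c(1, 1, 1, 2, 2, 2),
  RT    = c(500, 700, 900, 400, 600, 1000),
  error = c(0, 0, 1, 0, 1, 0)
)
fixedpenalty <- 600

# Mean RT of the correct responses, computed separately per block
correct <- trials$error == 0
blockmeans <- tapply(trials$RT[correct], trials$block[correct], mean)

# Replace the RT of each error trial with the mean of the correct
# responses in its block plus the fixed penalty
is_error <- trials$error == 1
trials$RT_adj <- trials$RT
trials$RT_adj[is_error] <-
  blockmeans[as.character(trials$block[is_error])] + fixedpenalty

trials
```

Correct trials keep their original RT; only the two error trials are changed.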
Details
The order of operations (with optional steps between brackets) is:
Splitting
(Replacing error trials within block within split)
Computing aggregates per condition (per subscore) per person
Subtracting conditions from each other
(Dividing the resulting (sub)score by the SD of the data used to compute that (sub)score)
(Averaging subscores together into a single score per person)
Correlating scores from one half with scores from the other half
Applying the Spearman-Brown correction using spearmanBrown()
Computing the average split-half reliability using cormean()
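The mandatory steps above can be sketched in base R as follows. This is a slow illustration under stated assumptions, not the package's implementation: the data are simulated, the names d, one_split, and rels are hypothetical, the standard Spearman-Brown formula 2r / (1 + r) stands in for spearmanBrown(), and a plain arithmetic mean stands in for cormean(), which may average differently.

```r
set.seed(1)

# Simulated data (hypothetical): 30 subjects, 40 trials each, 2 conditions
n_subj <- 30
n_trials <- 40
d <- data.frame(
  subj = rep(seq_len(n_subj), each = n_trials),
  cond = rep(c(0, 1), times = n_subj * n_trials / 2)
)
# Each subject has their own true condition effect, plus trial-level noise
effect <- rnorm(n_subj, mean = 50, sd = 30)
d$RT <- 600 + d$cond * effect[d$subj] + rnorm(nrow(d), sd = 100)

one_split <- function(d) {
  # Splitting: assign trials to half 1 or 2 at random,
  # stratified by subject and condition
  grp <- interaction(d$subj, d$cond, drop = TRUE)
  half <- ave(seq_len(nrow(d)), grp,
              FUN = function(i) sample(rep_len(1:2, length(i))))
  # Computing aggregates per condition per person, within each half
  m <- tapply(d$RT, list(d$subj, d$cond, half), mean)
  # Subtracting conditions from each other
  score1 <- m[, "1", "1"] - m[, "0", "1"]
  score2 <- m[, "1", "2"] - m[, "0", "2"]
  # Correlating scores from one half with scores from the other half
  r <- cor(score1, score2)
  # Standard Spearman-Brown correction (assumed stand-in for spearmanBrown())
  2 * r / (1 + r)
}

# Repeat over many random splits and average
# (plain mean used as a stand-in for cormean())
rels <- replicate(100, one_split(d))
mean(rels)
```

The optional steps (error replacement, standardization, averaging of subscores) are left out to keep the sketch short.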
Value
A list containing
- r: the averaged reliability.
- allcors: a vector with the reliability of each iteration.
- nobs: the number of participants.
- scores: the individual participants' scores in each split-half, contained in a list with two matrices (only included if requested with include.scores).
Note
This function can use a lot of memory in one go. If you're computing the reliability of a large dataset or you have little RAM, it may pay off to use the sequential version of this function instead: rapidsplit.chunks()
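The general idea behind chunked computation can be sketched as follows. This only illustrates how a number of splits might be divided over chunks; how rapidsplit.chunks() does this internally is not documented here, and run_chunk is a hypothetical placeholder for the per-chunk work.

```r
splits <- 100
chunks <- 8

# Divide the requested splits as evenly as possible over the chunks
chunk_sizes <- rep(splits %/% chunks, chunks) +
  (seq_len(chunks) <= splits %% chunks)

# Hypothetical per-chunk worker: returns one value per split in the chunk
run_chunk <- function(n) {
  vapply(seq_len(n), function(i) mean(rnorm(10)), numeric(1))
}

# Sequential processing: the intermediate objects of only one chunk
# need to exist at any one time
results <- unlist(lapply(chunk_sizes, run_chunk))

# With a cluster from the parallel package, the same chunks could be
# distributed over cores instead, for example:
# cl <- parallel::makeCluster(2)
# results <- unlist(parallel::parLapply(cl, chunk_sizes, run_chunk))
# parallel::stopCluster(cl)
```

Smaller chunks reduce peak memory use at the cost of some overhead per chunk.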
It is currently unclear whether it is better to pre-process your data before or after splitting it. If you are computing the IAT D-score, you can therefore use errorhandling and standardize to perform these two actions after splitting, or you can process your data before splitting and forgo these two options.
Examples
data(foodAAT)
# Reliability of the double difference score:
# [RT(push food)-RT(pull food)] - [RT(push object)-RT(pull object)]
frel <- rapidsplit(data=foodAAT,
                   subjvar="subjectid",
                   diffvars=c("is_pull","is_target"),
                   stratvars="stimid",
                   aggvar="RT",
                   splits=100)
print(frel)
plot(frel, type="all")
# Compute a single random split-half reliability of the error rate
rapidsplit(data=foodAAT,
           subjvar="subjectid",
           aggvar="error",
           splits=1,
           aggfunc="means")
# Compute the reliability of an IAT D-score
data(raceIAT)
rapidsplit(data=raceIAT,
           subjvar="session_id",
           diffvars="congruent",
           subscorevar="blocktype",
           aggvar="latency",
           errorhandling=list(type="fixedpenalty", errorvar="error",
                              fixedpenalty=600, blockvar="block_number"),
           splits=100,
           standardize=TRUE)
# Unstratified reliability of the median RT
rapidsplit.chunks(data=foodAAT,
                  subjvar="subjectid",
                  aggvar="RT",
                  splits=100,
                  aggfunc="medians",
                  chunks=8)
# Compute the reliability of Tukey's trimean of the RT
# on 2 CPU cores
trimean <- function(x){
  # Weighted average of the quartiles: (Q1 + 2*median + Q3) / 4
  sum(quantile(x, c(.25,.5,.75)) * c(1,2,1)) / 4
}
rapidsplit.chunks(data=foodAAT,
                  subjvar="subjectid",
                  aggvar="RT",
                  splits=200,
                  aggfunc=trimean,
                  cluster=2)