setup {specr} | R Documentation |
Specifying analytical decisions in a specification setup
Description
Creates all possible specifications as a combination of
different dependent and independent variables, model types, control
variables, potential subset analyses, as well as potentially other
analytic choices. This function represents the first step in the
analytic framework implemented in the package specr. The resulting
object of class specr.setup then needs to be passed to the core function
of the package, specr(), which fits the specified models across
all specifications.
Usage
setup(
data,
x,
y,
model,
controls = NULL,
subsets = NULL,
add_to_formula = NULL,
fun1 = function(x) broom::tidy(x, conf.int = TRUE),
fun2 = function(x) broom::glance(x),
simplify = FALSE
)
Arguments
data |
The data set that should be used for the analysis |
x |
A vector denoting independent variables |
y |
A vector denoting the dependent variables |
model |
A vector denoting the model(s) that should be estimated. |
controls |
A vector of the control variables that should be included. Defaults to NULL. |
subsets |
Specification of potential subsets/groups as list. There are two ways
in which these can be specified that both start from the assumption that the
"grouping" variable is in the data set. The simplest way is to provide a named
vector within the list, whose name is the variable that should be used for
subsetting and whose values are the values that reflect the subsets (e.g.,
|
add_to_formula |
A string specifying aspects that should always be included in the formula (e.g. a constant covariate, random effect structures...) |
fun1 |
A function that extracts the parameters of interest from the fitted models. Defaults to tidy, which works with a large range of different models. |
fun2 |
A function that extracts fit indices of interest from the models.
Defaults to glance, which works with a large range of
different models. Note: Different models result in different fit indices. Thus,
if you use different models within one specification curve analysis, this may not
work. In this case, you can simply set |
simplify |
Logical value indicating which combinations of control variables should be included in the specifications. If FALSE (default), all combinations of the provided variables are created (none, each individually, every combination of two or more, all variables together). If TRUE, only no covariates, each individually, and all covariates are included as specifications (akin to the default in specr version 0.2.1). |
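The difference between the two settings of simplify can be illustrated with a short base R sketch. It does not call specr and is not the package's internal code; the vector names full_set and simple_set and the label "no covariates" are chosen here for illustration only.

```r
controls <- c("c1", "c2", "c3")

# simplify = FALSE: every subset of the controls, from one variable up to all
all_combos <- unlist(
  lapply(seq_along(controls), function(k) {
    combn(controls, k, FUN = paste, collapse = " + ")
  })
)
full_set <- c("no covariates", all_combos)

# simplify = TRUE: none, each control on its own, and all controls together
simple_set <- c("no covariates", controls, paste(controls, collapse = " + "))

length(full_set)    # 8 control sets
length(simple_set)  # 5 control sets
```

With three controls the full set is still small, but it doubles with every additional control, which is the main reason to consider simplify = TRUE for long lists of covariates.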
Details
Empirical results are often contingent on analytical decisions that are equally defensible, often arbitrary, and motivated by different reasons. These decisions may introduce bias or at least variability. To address this, specification curve analyses (Simonsohn et al., 2020) or multiverse analyses (Steegen et al., 2016) involve identifying the set of theoretically justified, statistically valid (and potentially also non-redundant) specifications, fitting the "multiverse" of models represented by these specifications, and extracting the relevant parameters, often to display the results graphically as a so-called specification curve. This allows readers to identify consequential specification decisions and see how they affect the results or parameter of interest.
Use of this function
A general overview is provided in the vignettes; see vignette("specr").
It is assumed that you want to estimate the relationship between two variables
(x and y). What may vary is which variables are used for x and y, which
model is used to estimate the relationship, whether the relationship is
estimated for certain subsets, and whether different combinations of control
variables are included. This allows you to (re)produce almost any analytical
decision imaginable. See the examples below for how a number of typical
analytical decisions can be implemented.
Afterwards, you pass the resulting object of class specr.setup to the
function specr() to run the specification curve analysis.
Note that the class specr.setup supports generic functions.
Use methods(class = "specr.setup") for an overview of available methods and,
e.g., ?summary.specr.setup to view the dedicated help page.
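To make concrete how the analytic choices described in this section multiply into individual specifications, here is a minimal base R sketch. It only mimics the idea with expand.grid() and is not how setup() is implemented; the object name grid and the use of "1" to stand for "no covariates" are assumptions made for this illustration, and subsets are left out for brevity.

```r
# Cross every choice of x, y, model, and control set
grid <- expand.grid(
  x = c("x1", "x2"),
  y = c("y1", "y2"),
  model = "lm",
  controls = c("1", "c1", "c2", "c1 + c2"),  # "1" = intercept only, no covariates
  stringsAsFactors = FALSE
)

# Build one model formula (as a string) per row
grid$formula <- paste(grid$y, "~", grid$x, "+", grid$controls)

nrow(grid)        # 2 x 2 x 1 x 4 = 16 specifications
grid$formula[1]   # "y1 ~ x1 + 1"
```

Each additional choice (another model type, another subset) multiplies the number of rows, so it is worth checking the size of the setup before fitting.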
Value
An object of class specr.setup, which includes all possible
specifications based on combinations of the analytic choices. The
resulting list includes a specification tibble, the data set, and additional
information about the universe of specifications. Use
methods(class = "specr.setup") for an overview of available methods.
References
Simonsohn, U., Simmons, J.P. & Nelson, L.D. (2020). Specification curve analysis. Nature Human Behaviour, 4, 1208–1214. https://doi.org/10.1038/s41562-020-0912-z
Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing Transparency Through a Multiverse Analysis. Perspectives on Psychological Science, 11(5), 702–712. https://doi.org/10.1177/1745691616658637
See Also
specr() for the second step of running the actual specification curve analysis.
summary.specr.setup() for how to summarize and inspect the resulting specifications.
plot.specr.setup() for creating a visual summary of the specification setup.
Examples
## Example 1 ----
# Setting up typical specifications
specs <- setup(data = example_data,
               x = c("x1", "x2"),
               y = c("y1", "y2"),
               model = "lm",
               controls = c("c1", "c2", "c3"),
               subsets = list(group1 = c("young", "middle", "old"),
                              group2 = c("female", "male")),
               simplify = TRUE)
# Check specifications
summary(specs, rows = 18)
## Example 2 ----
# Setting up specifications for multilevel models
specs <- setup(data = example_data,
               x = c("x1", "x2"),
               y = c("y1", "y2"),
               model = c("lmer"),                                 # multilevel model
               subsets = list(group1 = c("young", "old"),         # only young and old!
                              group2 = unique(example_data$group2)), # alternative specification
               controls = c("c1", "c2"),
               add_to_formula = "(1|group2)")                     # random effect in all models
# Check specifications
summary(specs)
## Example 3 ----
# Setting up specifications with a different parameter extraction function
# Create a custom extraction function that returns the parameters and the model
tidy_99 <- function(x) {
  fit <- broom::tidy(x,
                     conf.int = TRUE,
                     conf.level = .99)  # 99% confidence intervals instead of the default 95%
  fit$full_model <- list(x)             # include entire model fit object as list
  return(fit)
}
# Setup specs
specs <- setup(data = example_data,
               x = c("x1", "x2"),
               y = c("y1", "y2"),
               model = "lm",
               fun1 = tidy_99,              # pass new function to setup
               add_to_formula = "c1 + c2")  # set of covariates in all models
# Check specifications
summary(specs)
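The examples above stop at the setup stage. As a hedged sketch of the second step described in this help page, the setup object can be passed on to specr(). The object name results is chosen here for illustration, and the code is wrapped in a requireNamespace() check so that it does nothing if specr is not installed.

```r
## Sketch: passing a setup on to specr() ----
if (requireNamespace("specr", quietly = TRUE)) {
  library(specr)

  # Step 1: define the specifications (as in Example 1, with fewer choices)
  specs <- setup(data = example_data,
                 x = c("x1", "x2"),
                 y = c("y1", "y2"),
                 model = "lm",
                 controls = c("c1", "c2"))

  # Step 2: fit all specified models
  results <- specr(specs)

  # Inspect the fitted specifications
  summary(results)
}
```

See ?specr for the arguments of the second step and the methods available for its result.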