get_confidence_interval {infer} | R Documentation |
Compute confidence interval
Description
Compute a confidence interval around a summary statistic. Both
simulation-based and theoretical methods are supported, though only
type = "se" is supported for theoretical methods.
Learn more in vignette("infer").
Usage
get_confidence_interval(x, level = 0.95, type = NULL, point_estimate = NULL)
get_ci(x, level = 0.95, type = NULL, point_estimate = NULL)
Arguments
x |
A distribution. For simulation-based inference, a data frame
containing a distribution of |
level |
A numerical value between 0 and 1 giving the confidence level. Default value is 0.95. |
type |
A string giving which method should be used for creating the
confidence interval. The default is |
point_estimate |
A data frame containing the observed statistic (in a
|
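To make the interval methods concrete, here is a minimal base R sketch of a percentile interval and a standard-error interval computed from a simulated distribution. It is an illustration of the general techniques, not infer's internal code; the data, `boot_stats`, and the normal-quantile multiplier are assumptions chosen for the example.

```r
# Hypothetical data standing in for a sample of weekly hours worked
set.seed(123)
sample_data <- rnorm(50, mean = 40, sd = 10)
point_est <- mean(sample_data)

# Bootstrap distribution of the sample mean (1000 resamples with replacement)
boot_stats <- replicate(1000, mean(sample(sample_data, replace = TRUE)))

level <- 0.95
alpha <- 1 - level

# Percentile interval: the middle `level` share of the simulated statistics
percentile_ci <- quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2))

# Standard-error interval: point estimate +/- multiplier * standard error,
# with the standard error estimated by the spread of the simulated statistics
# (assumption: a standard normal multiplier)
multiplier <- qnorm(1 - alpha / 2)
se_ci <- point_est + c(-1, 1) * multiplier * sd(boot_stats)

percentile_ci
se_ci
```

The standard-error interval is centered on the point estimate by construction, which is why that method needs the observed statistic while the percentile method does not.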
Details
A null hypothesis is not required to compute a confidence interval. However,
including hypothesize() in a pipeline leading to get_confidence_interval()
will not break anything. This can be useful when computing a confidence
interval using the same distribution used to compute a p-value.
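As a sketch of that reuse, the pipeline below builds one simulated distribution and passes it to both get_p_value() and get_confidence_interval(). The null value `mu = 40`, the number of replicates, and the object names are assumptions for illustration. Because a point null recenters the bootstrap distribution on the hypothesized value, the sketch uses type = "se" together with the observed point estimate rather than the default percentile method.

```r
library(infer)
set.seed(1)

# Observed mean of hours worked per week
x_bar <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

# One distribution, generated under a (hypothetical) point null of 40 hours
null_dist <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  generate(reps = 500, type = "bootstrap") %>%
  calculate(stat = "mean")

# Use the distribution for a p-value ...
p_val <- get_p_value(null_dist, obs_stat = x_bar, direction = "two-sided")

# ... and reuse its spread for a standard-error interval around x_bar
ci <- get_confidence_interval(
  null_dist,
  level = 0.95,
  type = "se",
  point_estimate = x_bar
)

p_val
ci
```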
Theoretical confidence intervals (i.e. calculated by supplying the output
of assume() to the x argument) require that the point estimate lies on
the scale of the data. The distribution defined in assume() will be
recentered and rescaled to align with the point estimate, as can be seen
in the output of visualize() when paired with shade_confidence_interval().
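The recentering can be inspected with a short plot. This is a minimal sketch assuming the gss data shipped with infer; the object names are illustrative.

```r
library(infer)

# Observed mean, on the scale of the data (hours per week)
sample_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

# Theoretical t distribution for the same specification
sampling_dist <- gss %>%
  specify(response = hours) %>%
  assume("t")

# Theoretical interval; the point estimate is required
theor_ci <- get_confidence_interval(
  sampling_dist,
  level = 0.95,
  point_estimate = sample_mean
)

# Plot the distribution and shade the interval
ci_plot <- visualize(sampling_dist) +
  shade_confidence_interval(endpoints = theor_ci)

ci_plot
```

In the resulting plot the horizontal axis should be on the scale of the point estimate rather than on the standardized t scale.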
Confidence intervals are implemented for the following distributions and
point estimates:
- distribution = "t": point_estimate should be the output of calculate() with stat = "mean" or stat = "diff in means"
- distribution = "z": point_estimate should be the output of calculate() with stat = "prop" or stat = "diff in props"
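For the "z" case, a sketch along the following lines pairs a proportion with its theoretical distribution. It assumes the `sex` column of the gss data with "female" treated as the success level; those choices and the object names are illustrative.

```r
library(infer)

# Observed proportion of respondents recorded as female
p_hat <- gss %>%
  specify(response = sex, success = "female") %>%
  calculate(stat = "prop")

# Theoretical z distribution for the same specification
z_dist <- gss %>%
  specify(response = sex, success = "female") %>%
  assume("z")

# Theoretical interval; type is left at its default
z_ci <- get_confidence_interval(
  z_dist,
  level = 0.95,
  point_estimate = p_hat
)

z_ci
```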
Value
A tibble containing the following columns:
- term: The explanatory variable (or intercept) in question. Only supplied if the input had been previously passed to fit().
- lower_ci, upper_ci: The lower and upper bounds of the confidence interval, respectively.
Aliases
get_ci() is an alias of get_confidence_interval().
conf_int() is a deprecated alias of get_confidence_interval().
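A quick way to check that the alias behaves the same is to call both names on one distribution. This is a small sketch; the seed, replicate count, and level are arbitrary choices.

```r
library(infer)
set.seed(1)

boot_dist <- gss %>%
  specify(response = hours) %>%
  generate(reps = 200, type = "bootstrap") %>%
  calculate(stat = "mean")

# Same distribution, same arguments, two function names
ci_long  <- get_confidence_interval(boot_dist, level = 0.9)
ci_short <- get_ci(boot_dist, level = 0.9)

all.equal(ci_long, ci_short)
```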
See Also
Other auxiliary functions: get_p_value()
Examples
boot_dist <- gss %>%
  # We're interested in the number of hours worked per week
  specify(response = hours) %>%
  # Generate bootstrap samples
  generate(reps = 1000, type = "bootstrap") %>%
  # Calculate mean of each bootstrap sample
  calculate(stat = "mean")

boot_dist %>%
  # Calculate the confidence interval around the point estimate
  get_confidence_interval(
    # At the 95% confidence level; percentile method
    level = 0.95
  )

# for type = "se" or type = "bias-corrected" we need a point estimate
sample_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

boot_dist %>%
  get_confidence_interval(
    point_estimate = sample_mean,
    # At the 95% confidence level
    level = 0.95,
    # Using the standard error method
    type = "se"
  )

# using a theoretical distribution -----------------------------------

# define a sampling distribution
sampling_dist <- gss %>%
  specify(response = hours) %>%
  assume("t")

# get the confidence interval; note that the
# point estimate is required here
get_confidence_interval(
  sampling_dist,
  level = .95,
  point_estimate = sample_mean
)

# using a model fitting workflow -----------------------

# fit a linear model predicting number of hours worked per
# week using respondent age and degree status.
observed_fit <- gss %>%
  specify(hours ~ age + college) %>%
  fit()

observed_fit

# fit 100 models to resamples of the gss dataset, where the response
# `hours` is permuted in each. note that this code is the same as
# the above except for the addition of the `hypothesize` and
# `generate` steps.
null_fits <- gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  fit()

null_fits

get_confidence_interval(
  null_fits,
  point_estimate = observed_fit,
  level = .95
)
# more in-depth explanation of how to use the infer package
## Not run:
vignette("infer")
## End(Not run)