ncases_desc {quest}    R Documentation

Describe Number of Cases in Data by Group

Description

ncases_desc computes descriptive statistics about the number of cases by group in a data.frame. This is often done in diary studies to obtain information about compliance for the sample. Through the ov.min, prop, and inclusive arguments, the user can specify how many missing values are allowed in a row for it to be counted as a case. ncases_desc simply combines ncases_by with psych::describe.
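The row-counting rule controlled by ov.min, prop, and inclusive can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper count_cases_by_group and the toy data are hypothetical, and the sketch assumes a single grouping variable.

```r
# Hypothetical helper (not part of quest): count qualifying rows per group.
# Assumes grp.nm names exactly one grouping column.
count_cases_by_group <- function(data, vrb.nm, grp.nm,
                                 ov.min = 1, prop = TRUE, inclusive = TRUE) {
  # TRUE where a value is observed (not missing)
  observed <- !is.na(data[vrb.nm])
  # Frequency of observed values per row, as a proportion or a count
  freq <- if (prop) rowMeans(observed) else rowSums(observed)
  # A row counts as a case if it meets the ov.min threshold
  keep <- if (inclusive) freq >= ov.min else freq > ov.min
  # Number of qualifying rows within each group
  tapply(keep, data[[grp.nm]], sum)
}

# Toy data: two groups with some missing values
toy <- data.frame(
  id = c("a", "a", "a", "b", "b"),
  x  = c(1, NA, 3, NA, 5),
  y  = c(1, 2, NA, NA, 5)
)

# Default rule: only rows with all variables observed are counted
n_complete <- count_cases_by_group(toy, c("x", "y"), "id")

# Looser rule: rows with at least one observed value are counted
n_partial <- count_cases_by_group(toy, c("x", "y"), "id",
                                  ov.min = 1, prop = FALSE)
```

Descriptive statistics such as mean(n_complete) or sd(n_complete) would then summarize compliance across groups, which is the role psych::describe plays in ncases_desc.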

Usage

ncases_desc(
  data,
  vrb.nm = str2str::pick(names(data), val = grp.nm, not = TRUE),
  grp.nm,
  ov.min = 1,
  prop = TRUE,
  inclusive = TRUE,
  interp = FALSE,
  skew = TRUE,
  ranges = TRUE,
  trim = 0.1,
  type = 3,
  quant = c(0.25, 0.75),
  IQR = FALSE
)

Arguments

data

data.frame of data.

vrb.nm

character vector of colnames from data specifying the set of variables to base the ncases on.

grp.nm

character vector of colnames from data specifying the grouping variables.

ov.min

minimum frequency of observed values required per row. If prop = TRUE, then this is a decimal between 0 and 1. If prop = FALSE, then this is an integer between 0 and length(vrb.nm).

prop

logical vector of length 1 specifying whether ov.min should refer to the proportion of observed values (TRUE) or the count of observed values (FALSE).

inclusive

logical vector of length 1 specifying whether the case should be included if the frequency of observed values in a row is exactly equal to ov.min.

interp

logical vector of length 1 specifying whether the median should be standard (FALSE) or interpolated (TRUE).

skew

logical vector of length 1 specifying whether skewness and kurtosis should be calculated (TRUE) or not (FALSE).

ranges

logical vector of length 1 specifying whether the minimum, maximum, and range (i.e., maximum - minimum) should be calculated (TRUE) or not (FALSE). Note, if ranges = FALSE, the trimmed mean and median absolute deviation are also not computed, as per the psych::describe function behavior.

trim

numeric vector of length 1 specifying the top and bottom quantiles of data that are to be excluded when calculating the trimmed mean. For example, the default value of 0.1 means that only data within the 10th - 90th quantiles are used for calculating the trimmed mean.

type

numeric vector of length 1 specifying the type of skewness and kurtosis coefficients to compute. See the details of psych::describe. The options are 1, 2, or 3.

quant

numeric vector specifying the quantiles to compute. For example, the default value of c(0.25, 0.75) computes the 25th and 75th quantiles of the group number of cases. If quant = NULL, then no quantiles are returned.

IQR

logical vector of length 1 specifying whether to compute the Interquartile Range (TRUE) or not (FALSE), which is simply the 75th quantile - 25th quantile.
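To make the trim, quant, and IQR arguments concrete, the following base R sketch computes the corresponding quantities on a made-up vector of per-group case counts. The vector n_by_group is hypothetical, and the calls shown are base R equivalents of the statistics, not necessarily the exact code path used internally.

```r
# Hypothetical number of cases for each of 10 groups
n_by_group <- c(12, 10, 12, 3, 11, 12, 9, 12, 12, 7)

# Trimmed mean: with trim = 0.1, the lowest 10% and highest 10% of
# values are dropped before averaging
trimmed <- mean(n_by_group, trim = 0.1)

# Quantiles matching the default quant = c(0.25, 0.75)
q <- quantile(n_by_group, probs = c(0.25, 0.75))

# Interquartile range: 75th quantile minus 25th quantile
iqr <- unname(q[2] - q[1])
```

With 10 groups and trim = 0.1, one value is dropped from each tail, so a single group with very low compliance has less influence on the trimmed mean than on the ordinary mean.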

Value

numeric vector containing descriptive statistics about the number of cases by group. Note, which elements are returned depends on the arguments. See each argument's description.

n

number of groups

mean

mean

sd

standard deviation

median

median (standard if interp = FALSE, interpolated if interp = TRUE)

trimmed

trimmed mean based on trim

mad

median absolute deviation

min

minimum

max

maximum

range

maximum - minimum

skew

skewness

kurtosis

kurtosis

se

standard error of the mean

IQR

75th quantile - 25th quantile

QX.XX

quantiles, which are named by quant (e.g., 0.25 = "Q0.25")

See Also

ncases_by, describe

Examples

tmp_nm <- c("outcome","case","session","trt_time")
dat <- as.data.frame(lmeInfo::Bryant2016)[tmp_nm]
stats_by <- psych::statsBy(dat, group = "case") # doesn't include everything you want
ncases_desc(data = dat, grp.nm = "case")
dat2 <- as.data.frame(ChickWeight)
ncases_desc(data = dat2, grp.nm = "Chick")
ncases_desc(data = dat2, grp.nm = "Chick", trim = .05)
ncases_desc(data = dat2, grp.nm = "Chick", ranges = FALSE)
ncases_desc(data = dat2, grp.nm = "Chick", quant = NULL)
ncases_desc(data = dat2, grp.nm = "Chick", IQR = TRUE)

[Package quest version 0.2.0 Index]