R: Multilevel Number of Cases

ncases_ml {quest}

R Documentation

Multilevel Number of Cases

Description

ncases_ml computes the number of cases and the number of groups in the data that are at least partially observed, given a specified frequency of observed values across a set of columns. The user specifies how many (or what proportion) of the columns must be observed for a row to be counted as a case. Groups are excluded when no row in the group has enough observed values to be counted as a case. The function is simply a combination of partial.cases and nrow_ml; in other words, ncases_ml is essentially a version of nrow_ml that accounts for missing data.

Usage

ncases_ml(
  data,
  vrb.nm = str2str::pick(names(data), val = grp.nm, not = TRUE),
  grp.nm,
  ov.min = 1L,
  prop = TRUE,
  inclusive = TRUE
)

Arguments

`data`	data.frame of data.
`vrb.nm`	a character vector of colnames from `data` specifying the variables which will be used to determine the partially observed cases.
`grp.nm`	character vector of colnames from `data` specifying the grouping variables.
`ov.min`	minimum frequency of observed values required per row. If `prop` = TRUE, then this is a decimal between 0 and 1. If `prop` = FALSE, then this is an integer between 0 and `length(vrb.nm)`.
`prop`	logical vector of length 1 specifying whether `ov.min` should refer to the proportion of observed values (TRUE) or the count of observed values (FALSE).
`inclusive`	logical vector of length 1 specifying whether the case should be included if the frequency of observed values in a row is exactly equal to `ov.min`.
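
The interplay of `ov.min`, `prop`, and `inclusive` can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper `keep_row` and the data frame `toy` are hypothetical names introduced only for illustration, and the rule is inferred from the argument descriptions above.

```r
# Toy data with missing values; `toy` and `keep_row` are hypothetical,
# not part of the quest package.
toy <- data.frame(
  g  = c("a", "a", "b", "b"),
  x1 = c(1, NA, NA, 4),
  x2 = c(1, 2, NA, NA),
  stringsAsFactors = FALSE
)
vrb.nm <- c("x1", "x2")

# Assumed row-level rule: compare the frequency of observed values
# (proportion or count) against ov.min.
keep_row <- function(data, vrb.nm, ov.min = 1, prop = TRUE, inclusive = TRUE) {
  n_obs <- unname(rowSums(!is.na(data[vrb.nm])))       # observed values per row
  freq <- if (prop) n_obs / length(vrb.nm) else n_obs  # proportion or count
  if (inclusive) freq >= ov.min else freq > ov.min
}

# all variables must be observed (the default settings)
keep_row(toy, vrb.nm, ov.min = 1, prop = TRUE)
# at least one observed value, expressed as a count
keep_row(toy, vrb.nm, ov.min = 1L, prop = FALSE)
# strictly more than half of the variables observed
keep_row(toy, vrb.nm, ov.min = 0.5, prop = TRUE, inclusive = FALSE)
```

With `inclusive = FALSE`, a row whose frequency of observed values equals `ov.min` exactly is dropped, which is why the last call keeps only the fully observed row.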

Value

list with two elements providing the sample sizes (accounting for missing data). The first element is named "within" and contains the number of cases in the data. The second element is named "between" and contains the number of groups in the data. Cases are counted if the frequency of observed values is greater than (or equal to, if inclusive = TRUE) ov.min.
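
As a rough sketch of how the two elements relate, the following base R code mimics the return value for the default settings only (`ov.min = 1`, `prop = TRUE`, `inclusive = TRUE`), under which a row counts as a case only when every `vrb.nm` column is observed. The function `ncases_sketch` is a hypothetical stand-in, not the package's code.

```r
# Hypothetical stand-in for the default behaviour; not quest's implementation.
ncases_sketch <- function(data, grp.nm, vrb.nm = setdiff(names(data), grp.nm)) {
  keep <- complete.cases(data[vrb.nm])   # rows with all vrb.nm columns observed
  kept <- data[keep, grp.nm, drop = FALSE]
  list(
    within  = nrow(kept),          # number of counted cases
    between = nrow(unique(kept))   # number of groups with at least one case
  )
}

aq <- airquality
aq[aq$Month == 6, "Ozone"] <- NA   # no complete rows remain for June
res <- ncases_sketch(aq, grp.nm = "Month")
res$within    # rows with no missing values
res$between   # June drops out, so 4 months remain instead of 5
```

Because the group count is taken over the retained rows only, a group with no counted cases disappears from "between" as well as from "within".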

Examples


# NO MISSING DATA

# one grouping variable
ncases_ml(data = as.data.frame(ChickWeight), grp.nm = "Chick")

# multiple grouping variables
ncases_ml(data = mtcars, grp.nm = c("vs","am"))

# YES MISSING DATA

# missing data reduce only the within (case) count
nrow_ml(data = airquality, grp.nm = "Month")
ncases_ml(data = airquality, grp.nm = "Month")

# missing data reduce both the within (case) and between (group) counts
airquality2 <- airquality
airquality2[airquality2$"Month" == 6, "Ozone"] <- NA
nrow_ml(data = airquality2, grp.nm = "Month")
ncases_ml(data = airquality2, grp.nm = "Month")