R: Number of Cases in Data by Group

ncases_by {quest}

R Documentation

Number of Cases in Data by Group

Description

ncases_by computes the ncases of a data.frame by group. Through the use of the ov.min, prop, and inclusive arguments, the user can specify how many missing values are allowed in a row for it to be counted. ncases_by is simply a wrapper for ncases + agg_dfm.

Usage

ncases_by(
  data,
  vrb.nm = str2str::pick(names(data), val = grp.nm, not = TRUE),
  grp.nm,
  sep = ".",
  ov.min = 1L,
  prop = TRUE,
  inclusive = TRUE
)

Arguments

`data`	data.frame of data.
`vrb.nm`	character vector of colnames from `data` specifying the set of variables to base the ncases on.
`grp.nm`	character vector of colnames from `data` specifying the grouping variables.
`sep`	character vector of length 1 specifying what string to use to separate the groups when naming the return object. `sep` is only used if `grp.nm` has length > 1 (aka multiple grouping variables)
`ov.min`	minimum frequency of observed values required per row. If `prop` = TRUE, then this is a decimal between 0 and 1. If `prop` = FALSE, then this is a integer between 0 and `length(vrb.nm)`.
`prop`	logical vector of length 1 specifying whether `ov.min` should refer to the proportion of observed values (TRUE) or the count of observed values (FALSE).
`inclusive`	logical vector of length 1 specifying whether the case should be included if the frequency of observed values in a row is exactly equal to `ov.min`.

Value

atomic vector with names = unique(interaction(data[grp.nm], sep = sep)) and length = length(unique(interaction(data[grp.nm], sep = sep))) providing the ncases for each group.

Examples


# one grouping variables
tmp_nm <- c("outcome","case","session","trt_time")
dat <- as.data.frame(lmeInfo::Bryant2016)[tmp_nm]
stats_by <- psych::statsBy(dat,
   group = "case") # requires you to include "case" column in dat
ncases_by(data = dat, grp.nm = "case")
dat2 <- as.data.frame(ChickWeight)
ncases_by(data = dat2, grp.nm = "Chick")

# two grouping variables
tmp <- reshape(psych::bfi[1:10, ], varying = 1:25, timevar = "item",
   ids = row.names(psych::bfi)[1:10], direction = "long", sep = "")
tmp_nm <- c("id","item","N","E","C","A","O") # Roxygen runs the whole script
dat3 <- str2str::stack2(tmp[tmp_nm], select.nm = c("N","E","C","A","O"),
   keep.nm = c("id","item"))
ncases_by(dat3, grp.nm = c("id","vrb_names"))