ncases_by {quest}    R Documentation

Number of Cases in Data by Group

Description

ncases_by computes the number of cases (ncases) in a data.frame by group. Through the ov.min, prop, and inclusive arguments, the user can specify how many observed (non-missing) values a row must have for it to be counted, which in turn determines how many missing values are allowed. ncases_by is simply a wrapper for ncases + agg_dfm.

Usage

ncases_by(
  data,
  vrb.nm = str2str::pick(names(data), val = grp.nm, not = TRUE),
  grp.nm,
  sep = ".",
  ov.min = 1L,
  prop = TRUE,
  inclusive = TRUE
)

Arguments

data

data.frame of data.

vrb.nm

character vector of colnames from data specifying the set of variables to base the ncases on.

grp.nm

character vector of colnames from data specifying the grouping variables.

sep

character vector of length 1 specifying what string to use to separate the groups when naming the return object. sep is only used if grp.nm has length > 1 (i.e., multiple grouping variables).

ov.min

minimum frequency of observed values required per row. If prop = TRUE, then this is a decimal between 0 and 1. If prop = FALSE, then this is an integer between 0 and length(vrb.nm).

prop

logical vector of length 1 specifying whether ov.min should refer to the proportion of observed values (TRUE) or the count of observed values (FALSE).

inclusive

logical vector of length 1 specifying whether the case should be included if the frequency of observed values in a row is exactly equal to ov.min.

Value

atomic vector with names = unique(interaction(data[grp.nm], sep = sep)) and length = length(unique(interaction(data[grp.nm], sep = sep))) providing the ncases for each group.

See Also

nrow_by, ncases, agg_dfm

Examples


# one grouping variable
tmp_nm <- c("outcome","case","session","trt_time")
dat <- as.data.frame(lmeInfo::Bryant2016)[tmp_nm]
stats_by <- psych::statsBy(dat,
   group = "case") # requires you to include "case" column in dat
ncases_by(data = dat, grp.nm = "case")
dat2 <- as.data.frame(ChickWeight)
ncases_by(data = dat2, grp.nm = "Chick")
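
# A base-R sketch of the counting rule described under ov.min, prop, and
# inclusive. It is an illustration under stated assumptions, not the quest
# implementation; the data.frame `toy` and the object names are hypothetical.
toy <- data.frame(
   g = c("a", "a", "a", "b", "b"),
   x = c(1, NA, 3, NA, 5),
   y = c(1, 2, NA, NA, 5)
)
vrb <- c("x", "y") # variables the count is based on (the role of vrb.nm)
n_obs <- rowSums(!is.na(toy[vrb])) # observed values per row
# proportion rule with threshold 1, inclusive: only complete rows are counted
complete_by <- tapply(n_obs / length(vrb) >= 1, toy$g, sum)
# count rule with threshold 1, inclusive: rows with at least one observed value
any_by <- tapply(n_obs >= 1, toy$g, sum)
# count rule with threshold 1, not inclusive: rows with more than one observed value
strict_by <- tapply(n_obs > 1, toy$g, sum)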

# two grouping variables
tmp <- reshape(psych::bfi[1:10, ], varying = 1:25, timevar = "item",
   ids = row.names(psych::bfi)[1:10], direction = "long", sep = "")
tmp_nm <- c("id","item","N","E","C","A","O") # Roxygen runs the whole script
dat3 <- str2str::stack2(tmp[tmp_nm], select.nm = c("N","E","C","A","O"),
   keep.nm = c("id","item"))
ncases_by(dat3, grp.nm = c("id","vrb_names"))
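
# A base-R sketch of how group names can be built from several grouping
# variables with interaction() and a separator, as described under Value.
# It is an illustration only; `toy2` and `grp` are hypothetical names, and the
# order of the groups returned by ncases_by may differ from this sketch.
toy2 <- data.frame(
   id = c(1, 1, 2, 2),
   wave = c("t1", "t2", "t1", "t1"),
   x = c(1, NA, 3, 4)
)
# one label per row, e.g. "1.t1"; drop = TRUE removes combinations not in the data
grp <- interaction(toy2[c("id", "wave")], sep = ".", drop = TRUE)
# complete cases on x within each id-by-wave group
n_by <- tapply(!is.na(toy2$x), grp, sum)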


[Package quest version 0.2.0 Index]