R: Length of Data Columns by Group

lengths_by {quest}

R Documentation

Length of Data Columns by Group

Description

lengths_by computes the length of multiple columns in a data.frame by group. The argument na.rm specifies whether missing values are included (FALSE) or excluded (TRUE). With na.rm = TRUE, the result is the number of observed (non-missing) values for each variable within each group.
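
The effect of na.rm can be illustrated for a single column with base R. This is a minimal sketch of the idea and not the quest implementation; the object names `ozone_by_month`, `n_all`, and `n_obs` are made up for the illustration.

```r
# Base R sketch of the na.rm distinction; not the quest implementation.
# Split one column of the built-in airquality data by the grouping variable.
ozone_by_month <- split(airquality$Ozone, airquality$Month)

# na.rm = FALSE analogue: count every element, missing or not
n_all <- sapply(ozone_by_month, length)

# na.rm = TRUE analogue: count only the observed (non-NA) elements
n_obs <- sapply(ozone_by_month, function(x) sum(!is.na(x)))

# Side-by-side comparison, one row per month
cbind(n_all, n_obs)
```

The difference `n_all - n_obs` is the number of missing values per group.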

Usage

lengths_by(data, vrb.nm, grp.nm, na.rm = FALSE, sep = ".")

Arguments

`data`	data.frame of data.
`vrb.nm`	character vector of colnames from `data` specifying the variables.
`grp.nm`	character vector of colnames from `data` specifying the groups.
`na.rm`	logical vector of length 1 specifying whether to include (FALSE) or exclude (TRUE) missing values.
`sep`	character vector of length 1 specifying what string should separate different group values when naming the return object. This argument is only used if `grp.nm` specifies more than one grouping variable.

Value

data.frame with colnames = vrb.nm and one row per group, such that nrow = length(levels(interaction(data[grp.nm]))), providing the number of elements (excluding missing values if na.rm = TRUE) in each column by group.
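
The shape of the return object can be approximated in base R. This is a sketch under the assumption that groups are formed the way `interaction()` forms them, with group values joined by sep. It is not the quest implementation, and the names `grp_fct` and `out` are made up for the illustration.

```r
# Base R sketch of the return shape; not the quest implementation.
vrb.nm <- c("mpg", "cyl", "disp")
grp.nm <- c("gear", "am")

# One factor level per combination of group values, joined by the separator
grp_fct <- interaction(mtcars[grp.nm], sep = ".")

# For each variable, split by group and count elements.
# split() keeps empty factor levels, so unobserved combinations get length 0.
out <- as.data.frame(
  sapply(mtcars[vrb.nm], function(x) lengths(split(x, grp_fct)))
)

out
```

Combinations of group values that never occur in the data still get a row here, with a count of zero in every column.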

Examples


lengths_by(mtcars, vrb.nm = c("mpg","cyl","disp"), grp.nm = "gear")
lengths_by(mtcars, vrb.nm = c("mpg","cyl","disp"),
   grp.nm = c("gear","vs")) # can handle multiple grouping variables
lengths_by(mtcars, vrb.nm = c("mpg","cyl","disp"),
   grp.nm = c("gear","am")) # can handle zero lengths
lengths_by(airquality, vrb.nm = c("Ozone","Solar.R","Wind"), grp.nm = "Month",
   na.rm = FALSE) # include missing values
lengths_by(airquality, vrb.nm = c("Ozone","Solar.R","Wind"), grp.nm = "Month",
   na.rm = TRUE) # exclude missing values