descrip {rigr} | R Documentation |
Descriptive Statistics
Description
Produces a table of relevant descriptive statistics for an arbitrary number of
variables of class integer, numeric, Surv, Date, or factor. Descriptive
statistics can be obtained within strata, and the user can specify that only
a subset of the data be used. Descriptive statistics include the count of
observations, the count of cases with missing values, the mean, standard
deviation, geometric mean, minimum, and maximum. The user can also specify
arbitrary quantiles to be estimated, as well as proportions of observations
within specified ranges.
Usage
descrip(
...,
strata = NULL,
subset = NULL,
probs = c(0.25, 0.5, 0.75),
geomInclude = FALSE,
replaceZeroes = FALSE,
restriction = Inf,
above = NULL,
below = NULL,
labove = NULL,
rbelow = NULL,
lbetween = NULL,
rbetween = NULL,
interval = NULL,
linterval = NULL,
rinterval = NULL,
lrinterval = NULL
)
Arguments
... |
an arbitrary number of variables for which descriptive statistics
are desired. The arguments can be vectors, matrices, or lists. Individual
columns of a matrix or elements of a list may be of class integer, numeric, Surv, Date, or factor. |
strata |
a vector, matrix, or list of stratification variables. Descriptive
statistics will be computed within strata defined by each unique combination
of the stratification variables, as well as in the combined sample.
If |
subset |
a vector indicating a subset to be used for all descriptive statistics.
If |
probs |
a vector of probabilities between 0 and 1 indicating quantile estimates to be included in the descriptive statistics. Default is to compute 25th, 50th (median) and 75th percentiles. |
geomInclude |
if not |
replaceZeroes |
if not |
restriction |
a value used for computing restricted means, standard deviations,
and geometric means with censored time-to-event data. The default value is
Inf. |
above |
a vector of values used to dichotomize variables. The descriptive
statistics will include an estimate for each variable of the proportion of
measurements with values greater than each element of above. |
below |
a vector of values used to dichotomize variables. The descriptive
statistics will include an estimate for each variable of the proportion of
measurements with values less than each element of below. |
labove |
a vector of values used to dichotomize variables. The descriptive
statistics will include an estimate for each variable of the proportion of
measurements with values greater than or equal to each element of labove. |
rbelow |
a vector of values used to dichotomize variables. The descriptive
statistics will include an estimate for each variable of the proportion of
measurements with values less than or equal to each element of rbelow. |
lbetween |
a vector of values with |
rbetween |
a vector of values with |
interval |
a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with neither endpoint included in each interval. |
linterval |
a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with the left-hand endpoint included in each interval. |
rinterval |
a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with the right-hand endpoint included in each interval. |
lrinterval |
a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with both endpoints included in each interval. |
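The endpoint conventions of the proportion arguments are easy to mix up, so a small base R sketch may help. It only illustrates the definitions given above on a toy vector; it is not how descrip computes them internally, and the names x, cutpoint, and bounds are made up for the illustration.

```r
# Toy data and cutpoints (hypothetical, for illustration only)
x <- c(1, 2, 3, 4, 5)
cutpoint <- 3
bounds <- matrix(c(2, 4), ncol = 2)  # one row = one interval, as for `interval`
lo <- bounds[1, 1]
hi <- bounds[1, 2]

# Proportions matching the definitions of each argument
props <- c(
  above      = mean(x > cutpoint),       # strictly greater
  below      = mean(x < cutpoint),       # strictly less
  labove     = mean(x >= cutpoint),      # greater than or equal
  rbelow     = mean(x <= cutpoint),      # less than or equal
  interval   = mean(x > lo & x < hi),    # neither endpoint included
  linterval  = mean(x >= lo & x < hi),   # left endpoint included
  rinterval  = mean(x > lo & x <= hi),   # right endpoint included
  lrinterval = mean(x >= lo & x <= hi)   # both endpoints included
)
print(props)
```

With discrete or heavily tied data, as here, the choice between the open and closed variants changes the estimated proportion, which is presumably why all four interval forms are offered.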
Details
This function depends on the survival R package. You should execute
library(survival) if that package has not already been loaded; if it is not
installed, install it first with install.packages("survival").
Quantiles are computed for uncensored data using the default method in
quantile(). For variables of class factor, descriptive statistics will be
computed using the integer coding for factors. For variables of class Surv,
estimated proportions and quantiles will be computed from Kaplan-Meier
estimates, as will be restricted means, restricted standard deviations, and
restricted geometric means. For variables of class Date, estimated
proportions will be labeled using the Julian date since January 1, 1970.
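The conventions described above for uncensored, factor, and Date variables can be previewed with base R alone. The sketch below uses made-up toy values and shows only the underlying base R behavior (the default quantile() method, the integer codes of a factor, and the day count behind a Date); it does not call descrip and makes no claim about its internals.

```r
# Default quantile method (type 7) on a toy numeric vector
x <- c(2, 4, 4, 5, 7, 9)
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
print(q)

# Integer coding of a factor: codes follow the order of the levels
f <- factor(c("low", "high", "low", "mid"), levels = c("low", "mid", "high"))
codes <- as.integer(f)
print(codes)

# A Date is stored as the number of days since January 1, 1970
d <- as.Date("1970-01-11")
days <- as.numeric(d)
print(days)
```

Because factor statistics are based on the integer codes, a mean of a factor is only meaningful when the level order is itself meaningful, so it may be worth setting levels explicitly before summarizing.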
Value
An object of class uDescriptives is returned. Descriptive statistics for
each variable in the entire subsetted sample, as well as within each stratum
if any is defined, are contained in a matrix with rows corresponding to
variables and strata and columns corresponding to the descriptive
statistics. Descriptive statistics include
N: the number of observations.
Msng: the number of observations with missing values.
Mean: the mean of the nonmissing observations (this is potentially a restricted mean for right-censored time-to-event data).
Std Dev: the standard deviation of the nonmissing observations (this is potentially a restricted standard deviation for right-censored time-to-event data).
Geom Mn: the geometric mean of the nonmissing observations (this is potentially a restricted geometric mean for right-censored time-to-event data). Nonpositive values in the variable will generate NA, unless replaceZeroes was specified.
Min: the minimum value of the nonmissing observations (this is potentially restricted for right-censored time-to-event data).
Quantiles: columns corresponding to the quantiles specified by probs (these are potentially restricted for right-censored time-to-event data).
Max: the maximum value of the nonmissing observations (this is potentially restricted for right-censored time-to-event data).
Proportions: columns corresponding to the proportions as specified by above, below, labove, rbelow, lbetween, rbetween, interval, linterval, rinterval, and lrinterval.
restriction: the threshold for restricted means, standard deviations, and geometric means.
firstEvent: the time of the first event for censored time-to-event variables.
lastEvent: the time of the last event for censored time-to-event variables.
isDate: an indicator that the variable is a Date object.
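To make the note on the geometric mean concrete, here is a minimal base R sketch of why nonpositive values yield NA and how replacing zeroes avoids it. The helper geom_mean and its replace_zeroes parameter are hypothetical names introduced for this illustration; they take a plain numeric replacement value and are not the signature or the replacement rule used by descrip, which is not spelled out above.

```r
# Hypothetical helper, for illustration only (not part of rigr)
geom_mean <- function(x, replace_zeroes = FALSE) {
  x <- x[!is.na(x)]                      # use nonmissing observations only
  if (!isFALSE(replace_zeroes)) {
    x[x == 0] <- replace_zeroes          # assumed: a numeric replacement value
  }
  if (any(x <= 0)) {
    return(NA_real_)                     # log() is undefined for nonpositive values
  }
  exp(mean(log(x)))                      # geometric mean via the mean of logs
}

gm_plain    <- geom_mean(c(1, 10, 100))
gm_zero     <- geom_mean(c(0, 1, 4))
gm_replaced <- geom_mean(c(0, 1, 4), replace_zeroes = 0.5)
print(c(gm_plain, gm_zero, gm_replaced))
```

The replacement value directly affects the result, so it is worth reporting whichever value was used alongside the geometric mean.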
Examples
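# Load the package that provides descrip() and the mri data
library(rigr)
# Read in the data
data(mri)
# Create the table of descriptive statistics for every variable in mri
descrip(mri)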