expandCategorical {gnm} | R Documentation
Expand Data Frame by Re-expressing Categorical Data as Counts
Description
Expands the rows of a data frame by re-expressing observations of a
categorical variable specified by catvar, such that the column(s)
corresponding to catvar are replaced by a factor specifying the possible
categories for each observation and a vector of 0/1 counts over these
categories.
Usage
expandCategorical(data, catvar, sep = ".", countvar = "count",
                  idvar = "id", as.ordered = FALSE, group = TRUE)
Arguments
data |
a data frame. |
catvar |
a character vector specifying factors in data. |
sep |
a character string used to separate the concatenated values of catvar in the name of the new interaction factor. |
countvar |
(optional) a character string to be used for the name of the new count variable. |
idvar |
(optional) a character string to be used for the name of the new factor identifying the original rows (cases). |
as.ordered |
logical: whether the new interaction factor should be of class "ordered". |
group |
logical: whether or not to group individuals with common values over all covariates. |
Details
Each row of the data frame is replicated c times, where c is the number
of levels of the interaction of the factors specified by catvar. In the
expanded data frame, the columns specified by catvar are replaced by a
factor specifying the c possible categories for each case, named by the
concatenated values of catvar separated by sep. The ordering of factor
levels is preserved in the creation of the new factor, but this factor
will not be of class "ordered" unless as.ordered = TRUE. A variable
named countvar is added to the data frame; it equals 1 for the observed
category in each case and 0 elsewhere. Finally, a factor named idvar is
added to index the cases.
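The expansion described above can be illustrated with a minimal base-R sketch for a single factor. The helper expand_sketch and the toy data frame are hypothetical names introduced only for illustration; this is not the gnm implementation, and it does not handle multiple factors, sep, as.ordered or group.

```r
# Hypothetical helper: a base-R sketch of the expansion for ONE factor.
# Not the gnm implementation; `sep`, `as.ordered` and `group` are ignored.
expand_sketch <- function(data, catvar, countvar = "count", idvar = "id") {
  stopifnot(is.factor(data[[catvar]]))
  lev <- levels(data[[catvar]])
  n <- nrow(data)
  n_lev <- length(lev)
  # Replicate each row once per level, keeping the rows of a case adjacent.
  long <- data[rep(seq_len(n), each = n_lev), , drop = FALSE]
  observed <- as.character(long[[catvar]])
  # Replace the observed category by the full set of possible categories,
  # preserving the original level order.
  long[[catvar]] <- factor(rep(lev, times = n), levels = lev)
  # 1 for the observed category of each case, 0 elsewhere.
  long[[countvar]] <- as.integer(observed == as.character(long[[catvar]]))
  # Factor indexing the original rows (cases).
  long[[idvar]] <- factor(rep(seq_len(n), each = n_lev))
  rownames(long) <- NULL
  long
}

# Toy data, made up for illustration only.
toy <- data.frame(
  x = c(0.5, 1.2),
  pain = factor(c("same", "worse"), levels = c("worse", "same", "better"))
)
toy_long <- expand_sketch(toy, "pain")
print(toy_long)
```

Each of the two toy cases becomes three rows, one per level of pain, and the counts within each case sum to 1.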
Value
The expanded data frame as described in Details.
Note
Re-expressing categorical data in this way allows a multinomial response to be modelled as a Poisson response; see Examples.
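The standard argument behind this equivalence can be sketched as follows. The notation is introduced here for illustration and does not come from the package: y_ij is the 0/1 count for case i and category j, mu_ij its Poisson mean, delta_i a nuisance intercept for case i (the role played by the id factor that is eliminated in the Examples), and eta_ij the remaining linear predictor.

```latex
% Poisson model for the expanded counts, with one intercept per case:
\log \mu_{ij} = \delta_i + \eta_{ij}, \qquad j = 1, \dots, c .

% Conditioning on the case total \sum_{j} y_{ij} = 1 gives
\Pr\Bigl( y_{ij} = 1 \,\Bigm|\, \sum_{k=1}^{c} y_{ik} = 1 \Bigr)
  = \frac{\mu_{ij}}{\sum_{k=1}^{c} \mu_{ik}}
  = \frac{\exp(\eta_{ij})}{\sum_{k=1}^{c} \exp(\eta_{ik})} ,

% so \delta_i cancels and the multinomial logit probabilities are recovered.
```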
Author(s)
Heather Turner
References
Anderson, J. A. (1984) Regression and Ordered Categorical Variables. J. R. Statist. Soc. B, 46(1), 1-30.
See Also
Examples
### Example from help(multinom, package = "nnet")
library(MASS)
example(birthwt)
library(nnet)
bwt.mu <- multinom(low ~ ., data = bwt)
## Equivalent using gnm - include unestimable main effects in model so
## that interactions with low0 are automatically set to zero; otherwise
## the 'constrain' argument could be used.
library(gnm)
bwtLong <- expandCategorical(bwt, "low", group = FALSE)
bwt.po <- gnm(count ~ low*(. - id), eliminate = id, data = bwtLong,
              family = "poisson")
summary(bwt.po) # same deviance; df reflect extra id parameters
### Example from ?backPain
set.seed(1) # for reproducibility: gnm uses random starting values for the Mult term below
summary(backPain)
backPainLong <- expandCategorical(backPain, "pain")
## Fit models described in Table 5 of Anderson (1984)
noRelationship <- gnm(count ~ pain, eliminate = id,
                      family = "poisson", data = backPainLong)
oneDimensional <- update(noRelationship,
                         ~ . + Mult(pain, x1 + x2 + x3))