| srproj {textir} | R Documentation |
Multinomial Inverse Regression (MNIR)
Description
Estimation of MNIR sufficient reduction projections. Note that mnlm is just a call to dmr from the distrom package.
Usage
srproj(obj, counts, dir=1:K, ...)
mnlm(cl, covars, counts, mu=NULL, bins=NULL, verb=0, ...)
Arguments
cl |
A |
covars |
A dense |
counts |
A dense |
mu |
Pre-specified fixed effects for each observation in the Poisson regression linear equation. See the same argument in |
bins |
Number of bins into which we will attempt to collapse each column of |
verb |
Whether to print some info. See the same argument in |
obj |
Either a |
dir |
The attribute ( |
... |
Additional arguments to |
Details
These functions provide the first two steps of multinomial inverse regression (see MNIR paper).
mnlm fits multinomial logistic regression parameters under gamma lasso penalization on a factorized Poisson likelihood. The mnlm function, which remains in this package for backwards compatibility only, is just a call to the dmr function of the distrom library (see DMR paper). For simplicity, we recommend using dmr instead of mnlm. For model selection, coefficients, prediction, and plotting see the relevant functions in help(dmr).
srproj calculates the MNIR Sufficient Reduction (SR) projection from text counts onto the attribute dimensions of interest (covars in mnlm or dmr). In particular, for an observation with count vector c (a row of the counts C) and total m (the corresponding row sum), and mnlm/dmr coefficients \phi_j corresponding to attribute j,
z_j = c'\phi_j/m is the SR projection in the direction of j. The MNIR paper explains how V=[v_1 ... v_K], your original covariates/attributes, are independent of text counts C given SR projections Z=[z_1 ... z_K].
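The projection described above can be written out in a few lines of base R. This is only a sketch that follows the formula, not the srproj source: the count matrix C and the loadings matrix Phi (one row per attribute, one column per term) are made-up toy values, not output from mnlm or dmr.

```r
# Toy counts (made up for illustration): 3 documents by 4 terms.
C <- matrix(c(2, 0, 1, 1,
              0, 3, 0, 1,
              1, 0, 1, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), paste0("w", 1:4)))

# Hypothetical loadings: one row per attribute, one column per term.
Phi <- matrix(c( 0.5, -0.5, 0.0, 1.0,
                -1.0,  0.0, 2.0, 0.0),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("v1", "v2"), colnames(C)))

# Row totals of the counts (document lengths).
m <- rowSums(C)

# Each row of C times each loading vector, divided by that row's total.
# Dividing a matrix by a vector of length nrow recycles down the columns,
# so row i is divided by m[i].
Z <- tcrossprod(C, Phi) / m

# Projections plus the row totals, one row per document.
sr <- cbind(Z, m = m)
sr
```

A document with no counts has m equal to zero, so this sketch would return NaN for that row; guard against it if your data can contain empty documents.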
The final step of MNIR is ‘forward regression’ of any element of V onto Z and the remaining elements of V. We do not provide a function for this because you are free to use whatever you want; see the MNIR and DMR papers for linear, logistic, and random forest forward regression examples.
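As one possible illustration of the forward step, the sketch below fits a linear forward regression with lm from base R. All variables are simulated stand-ins (v1, v2, z1, z2 and the coefficients used to generate them are assumptions made for this example); in practice the z columns would come from srproj and the v columns from your own covariates.

```r
set.seed(1)
n <- 100

# Simulated stand-ins, not real data: two attributes and two SR projections.
v2 <- rnorm(n)
z1 <- rnorm(n)
z2 <- rnorm(n)
v1 <- 1 + 2 * z1 - 0.5 * v2 + rnorm(n, sd = 0.1)
fwd_data <- data.frame(v1, v2, z1, z2)

# Forward regression of v1 onto Z = [z1 z2] and the remaining attribute v2.
fwd <- lm(v1 ~ z1 + z2 + v2, data = fwd_data)
coef(fwd)
```

A linear model is used here only because it is the simplest choice; glm or any other regression method could take its place.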
Note that if you were previously using textir not for inverse regression, but rather just as fast code for multinomial logistic regression, you probably want to work directly with the gamlr package (binary response) or the dmr function of the distrom package (multinomial response).
Value
srproj returns a matrix with columns corresponding to directions dir, plus an additional column m holding the row totals of counts.
mnlm returns a dmr S3 object. See help(dmr) for details.
Author(s)
Matt Taddy mataddy@gmail.com
References
Taddy (2013, JASA), Multinomial Inverse Regression for Text Analysis (MNIR).
Taddy (2015, AoAS), Distributed Multinomial Regression (DMR).
Taddy (2016, JCGS), The Gamma Lasso (GL).
See Also
congress109, we8there, dmr
Examples
### Ripley's Cushing Data; see help(Cushings) ###
library(MASS)
data(Cushings)
Cushings[,1:2] <- log(Cushings[,1:2])
train <- Cushings[Cushings$Type!="u",]
newdata <- as.matrix(Cushings[Cushings$Type == "u", 1:2])
## fit, coefficients, predict, and plot
# you could replace 'mnlm' with 'dmr' here.
fit <- mnlm(NULL,
            covars=train[,1:2],
            counts=factor(train$Type))
## dmr applies AICc (corrected AIC) selection by default
round(coef(fit),1)
round(predict(fit, newdata, type="response"),1)
par(mfrow=c(1,3))
for(j in c("a","b","c")){
  plot(fit[[j]]); mtext(j,line=2) }
## see we8there and congress109 for MNIR and srproj examples