srproj {textir} | R Documentation |
Multinomial Inverse Regression (MNIR)
Description
Estimation of MNIR sufficient reduction projections. Note that mnlm is just a call to dmr from the distrom package.
Usage
srproj(obj, counts, dir=1:K, ...)
mnlm(cl, covars, counts, mu=NULL, bins=NULL, verb=0, ...)
Arguments
cl |
A |
covars |
A dense |
counts |
A dense |
mu |
Pre-specified fixed effects for each observation in the Poisson regression linear equation. See the same argument in |
bins |
Number of bins into which we will attempt to collapse each column of |
verb |
Whether to print some info. See the same argument in |
obj |
Either a |
dir |
The attribute ( |
... |
Additional arguments to |
Details
These functions provide the first two steps of multinomial inverse regression (see the MNIR paper).
mnlm fits multinomial logistic regression parameters under gamma lasso penalization on a factorized Poisson likelihood. The mnlm function, which remains in this package for backwards compatibility only, is just a call to the dmr function of the distrom package (see the DMR paper). For simplicity, we recommend using dmr instead of mnlm. For model selection, coefficients, prediction, and plotting, see the relevant functions in help(dmr).
srproj calculates the MNIR Sufficient Reduction (SR) projection from text counts onto the attribute dimensions of interest (covars in mnlm or dmr). In particular, for counts C, with row sums m, and mnlm/dmr coefficients \phi_j corresponding to attribute j,
z_j = C'\phi_j/m
is the SR projection in the direction of j. The MNIR paper explains how V=[v_1 ... v_K], your original covariates/attributes, are independent of text counts C given SR projections Z=[z_1 ... z_K].
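As a rough illustration of the projection formula, the base R sketch below computes SR projections by hand on toy data: each document's counts are weighted by the loadings for an attribute and divided by that document's row total. The count matrix, the loadings phi, and their layout (one row per attribute, one column per term, no intercept row) are assumptions made for this example, not output from mnlm or dmr, and the code is not the internal implementation of srproj.

```r
# Toy counts (hypothetical): 4 documents by 3 terms.
C <- matrix(c(2, 0, 1,
              0, 3, 1,
              1, 1, 2,
              4, 0, 0), nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("good", "bad", "food")))

# Hypothetical loadings: one row per attribute, one column per term.
# (An intercept row, if present in fitted coefficients, is assumed dropped.)
phi <- matrix(c(1.0, -1.0, 0.0,
                0.5,  0.0, 0.5), nrow = 2, byrow = TRUE,
              dimnames = list(c("v1", "v2"), colnames(C)))

m <- rowSums(C)              # row totals (document lengths)
Z <- (C %*% t(phi)) / m      # entry [i, j] is sum_k C[i, k] * phi[j, k] / m[i]
sr <- cbind(Z, m = m)        # one column per direction, plus the totals m
print(sr)
```

The result has one column per direction plus a column m of row totals, which mirrors the layout described under Value for srproj.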
The final step of MNIR is ‘forward regression’ for any element of V onto Z and the remaining elements of V. We do not provide a function for this because you are free to use whatever you want; see the MNIR and DMR papers for linear, logistic, and random forest forward regression examples.
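One possible shape for the forward regression step is sketched below with an ordinary linear model in base R. All data are simulated stand-ins: z1 plays the role of an SR projection column, v2 a remaining attribute, and v1 the attribute being predicted. The variable names and the choice of lm are assumptions for illustration only; any regression method could take its place.

```r
set.seed(1)
n <- 200

# Simulated stand-ins (hypothetical, not produced by srproj):
v2 <- rnorm(n)                               # a remaining attribute in V
z1 <- rnorm(n)                               # SR projection in direction 1
v1 <- 1 + 2 * z1 - v2 + rnorm(n, sd = 0.5)   # attribute of interest

dat <- data.frame(v1 = v1, z1 = z1, v2 = v2)

# Forward regression: v1 onto its SR projection and the other attribute.
fwd <- lm(v1 ~ z1 + v2, data = dat)
print(round(coef(fwd), 2))
```

With real data, z1 would instead be taken from the matrix returned by srproj, and a logistic or tree-based model could replace lm when v1 is not continuous.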
Note that if you were previously using textir not for inverse regression, but rather just as fast code for multinomial logistic regression, you probably want to work directly with the gamlr package (binary response) or with dmr from the distrom package (multinomial response).
Value
srproj returns a matrix with columns corresponding to directions dir, plus an additional column m holding the row totals of counts.
mnlm returns a dmr S3 object. See help(dmr) for details.
Author(s)
Matt Taddy mataddy@gmail.com
References
Taddy (2013, JASA), Multinomial Inverse Regression for Text Analysis (MNIR).
Taddy (2015, AoAS), Distributed Multinomial Regression (DMR).
Taddy (2016, JCGS), The Gamma Lasso (GL).
See Also
congress109, we8there, dmr
Examples
### Ripley's Cushing Data; see help(Cushings) ###
library(MASS)
data(Cushings)
Cushings[,1:2] <- log(Cushings[,1:2])
train <- Cushings[Cushings$Type!="u",]
newdata <- as.matrix(Cushings[Cushings$Type == "u", 1:2])
## fit, coefficients, predict, and plot
# you could replace 'mnlm' with 'dmr' here.
fit <- mnlm(NULL,
            covars=train[,1:2],
            counts=factor(train$Type))
## dmr applies corrected AICc selection by default
round(coef(fit),1)
round(predict(fit, newdata, type="response"),1)
par(mfrow=c(1,3))
for(j in c("a","b","c")){
  plot(fit[[j]]); mtext(j,line=2) }
## see we8there and congress109 for MNIR and srproj examples