star.nominal {EffectStars}    R Documentation

Effect stars for multinomial logit models

Description

The package EffectStars2 provides a more up-to-date implementation of effect stars!

The function fits and visualizes multinomial logit models. The models are fitted with the help of the package VGAM. The visualization is based on the function stars from the package graphics.

Usage

star.nominal(formula, data, xij = NULL, conf.int = FALSE, symmetric = TRUE, 
    pred.coding = "reference", printpvalues = TRUE, test.rel = TRUE, refLevel = 1, 
    maxit = 100, scale = TRUE, nlines = NULL, select = NULL, catstar = TRUE, 
    dist.x = 1, dist.y = 1, dist.cov = 1, dist.cat = 1, xpd = TRUE, main = "", 
    lwd.stars = 1, col.fill = "gray90", col.circle = "black", lwd.circle = 1, 
    lty.circle = "longdash", lty.conf = "dotted", cex.labels = 1, cex.cat = 0.8, 
    xlim = NULL, ylim = NULL)

Arguments

formula

An object of class “formula”. Formula for the multinomial logit model to be fitted and visualized.

data

An object of class “data.frame” containing the covariates used in formula.

xij

An object of class list, used if category-specific covariates are to be included. Every element is a formula referring to one of the category-specific covariates. For details see the help for xij in vglm.control and the details below.

conf.int

If TRUE, confidence intervals are drawn.

symmetric

Which side constraint for the coefficients in the multinomial logit model is to be used for the plot? The default TRUE uses symmetric side constraints, FALSE uses the reference category specified by refLevel. If category-specific covariates are specified using xij, symmetric = FALSE is set automatically, because symmetric side constraints are not possible in the case of category-specific covariates.

pred.coding

Which coding for categorical predictors with more than two categories is to be used? Default pred.coding="reference" uses the first category as reference category, the alternative pred.coding="effect" uses effect coding equivalent to symmetric side constraints. For pred.coding="effect" a star for every category is plotted, for pred.coding="reference" no star for the reference category is plotted.

printpvalues

If TRUE, p-values for the respective coefficients are printed beside the category labels. The p-values are obtained from Wald tests.

test.rel

If TRUE, a likelihood ratio test is performed for each explanatory covariate to test its relevance. The corresponding p-values are printed after the covariate labels. Setting test.rel=FALSE can save a lot of computing time.

refLevel

Reference category for the multinomial logit model. Ignored if symmetric=TRUE. See also multinomial.

maxit

Maximum number of iterations used to fit the multinomial logit model. See also vglm.control.

scale

If TRUE, the stars are scaled to equal maximal ray length.

nlines

If specified, nlines gives the number of lines in which the effect stars are plotted.

select

Numeric vector to choose only a subset of the stars to be plotted. Default is to plot all stars. The numbers refer to the total number of predictors, including intercept and dummy variables.

catstar

A logical argument to specify if all category-specific effects in the model should be visualized with an additional star. Ignored if xij=NULL.

dist.x

Optional factor to increase/decrease distances between the centers of the stars on the x-axis. Values greater than 1 increase, values smaller than 1 decrease the distances.

dist.y

Optional factor to increase/decrease distances between the centers of the stars on the y-axis. Values greater than 1 increase, values smaller than 1 decrease the distances.

dist.cov

Optional factor to increase/decrease distances between the stars and the covariate labels above the stars. Values greater than 1 increase, values smaller than 1 decrease the distances.

dist.cat

Optional factor to increase/decrease distances between the stars and the category labels around the stars. Values greater than 1 increase, values smaller than 1 decrease the distances.

xpd

If FALSE, all plotting is clipped to the plot region, if TRUE, all plotting is clipped to the figure region, and if NA, all plotting is clipped to the device region. See also par.

main

An overall title for the plot. See also plot.

lwd.stars

Line width of the stars. See also lwd in par.

col.fill

Color of background of the circle. See also col in par.

col.circle

Color of margin of the circle. See also col in par.

lwd.circle

Line width of the circle. See also lwd in par.

lty.circle

Line type of the circle. See also lty in par.

lty.conf

Line type of confidence intervals. Ignored if conf.int=FALSE. See also lty in par.

cex.labels

Size of labels for covariates placed above the corresponding star. See also cex in par.

cex.cat

Size of labels for categories placed around the corresponding star. See also cex in par.

xlim

Optional specification of the range of the x coordinates. See also xlim in plot.window.

ylim

Optional specification of the range of the y coordinates. See also ylim in plot.window.
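
The two values of pred.coding correspond to the usual reference (treatment) and effect (sum-to-zero) codings of a categorical predictor. As a rough illustration only, the base R contrast functions contr.treatment and contr.sum produce coding matrices of these two types. How star.nominal builds its design matrix internally is not shown here, and which category plays the special role in its effect coding may differ from contr.sum.

```r
# Coding matrices for a hypothetical predictor with three categories
# (rows = categories, columns = dummy variables entering the model)

# Reference coding: the first category is the baseline (row of zeros),
# so it receives no coefficient of its own
C_ref <- contr.treatment(3)

# Effect coding: each column sums to zero over the categories, so the
# effect of the remaining category is minus the sum of the others
C_eff <- contr.sum(3)

print(C_ref)
print(C_eff)
```

With effect coding, a coefficient can be recovered for every category, which is why a star is drawn for every category in that case.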

Details

The underlying models are fitted with the function vglm from the package VGAM. The family argument for vglm is multinomial(parallel=FALSE).

The stars show the exponentials of the estimated coefficients. In multinomial logit models the exponentiated coefficients can be interpreted as multiplicative effects on odds. More precisely, for the model with symmetric side constraints, the exponential e^{\gamma_{rj}}, r=1,\ldots,k represents the multiplicative effect of the covariate j on the odds \frac{P(Y=r|x)}{GM(x)} if x_j increases by one unit, where GM(x) is the geometric mean of the response probabilities P(Y=1|x),\ldots,P(Y=k|x). For the model with reference category k, the exponential e^{\gamma_{rj}}, r=1,\ldots,k-1 represents the multiplicative effect of the covariate j on the odds \frac{P(Y=r|x)}{P(Y=k|x)} if x_j increases by one unit.
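
The interpretation under symmetric side constraints can be checked numerically. The following base R sketch uses made-up coefficients for k = 3 response categories and one covariate x1 (all names and numbers are hypothetical and unrelated to any data set in the package). It shows that increasing x1 by one unit multiplies the odds relative to the geometric mean of the probabilities by the exponentiated coefficient.

```r
# Hypothetical coefficients: rows = response categories,
# columns = (intercept, x1); each column sums to zero (symmetric constraint)
gamma <- rbind(c( 0.5,  0.8),
               c(-0.2, -0.3),
               c(-0.3, -0.5))

# Response probabilities of the multinomial logit model at a given x1
probs <- function(x1) {
  eta <- gamma[, 1] + gamma[, 2] * x1
  exp(eta) / sum(exp(eta))
}

# Odds of each category relative to the geometric mean of all probabilities
gm_odds <- function(x1) {
  p <- probs(x1)
  p / exp(mean(log(p)))
}

# Ratio of the odds after and before a one-unit increase in x1
ratio <- gm_odds(1.7) / gm_odds(0.7)

# Compare with the exponentiated slope coefficients
print(cbind(ratio = ratio, exp_gamma = exp(gamma[, 2])))
```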

In addition to the stars, we plot a circle that refers to the case where the coefficients of the corresponding star are zero. Therefore, the radii of these circles are always exp(0)=1. If scale=TRUE, the stars are scaled so that they all have the same maximal ray length. In this case, the circles are drawn with different sizes, but they still refer to the no-effects case where all the coefficients are zero. The circles can then be used to compare different stars based on the radii of their respective circles. The distances between the rays of a star and the circle correspond to the p-values that are printed beneath the category labels if printpvalues=TRUE. The closer a star ray lies to the no-effects circle, the larger the p-value.
The p-values beneath the covariate labels, which are given if test.rel=TRUE, correspond to the distance between the circle and the star as a whole. They refer to a likelihood ratio test of the hypothesis that all the coefficients of one covariate are zero (i.e. the variable is left out completely), in which case all rays would lie exactly on the circle.
The appearance of the circles can be modified by col.circle, lwd.circle and lty.circle.
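
To make the role of the circle concrete, the following base R sketch computes ray lengths and the matching circle radius for two made-up stars. The rescaling rule used here (dividing every star by its longest ray) is an assumption chosen for illustration; the exact scaling inside star.nominal may differ.

```r
# Hypothetical coefficients: rows = covariates (stars), columns = categories
coefs <- rbind(Age    = c(0.10, -0.40,  0.30),
               Gender = c(1.20, -0.90, -0.30))

rays <- exp(coefs)                    # ray lengths before scaling
circle <- c(Age = 1, Gender = 1)      # exp(0) = 1: the no-effects circle

# Assumed scaling: the longest ray of every star becomes 1.
# Dividing a matrix by a vector of length nrow() divides row by row.
max_ray <- apply(rays, 1, max)
rays_scaled <- rays / max_ray
circle_scaled <- circle / max_ray

print(round(rays_scaled, 3))
print(round(circle_scaled, 3))
```

Under this assumed rule, a star with stronger effects ends up with a smaller circle, which is what makes the circle radii comparable across stars.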

The argument xij is important because it has to be used to include category-specific covariates. If its default xij=NULL is kept, an ordinary multinomial logit model without category-specific covariates is fitted. If category-specific covariates are to be included, attention has to be paid to the exact usage of xij. Our xij argument is identical to the xij argument used in the embedded vglm function. For details see also vglm.control.

The data are expected to be in wide format, i.e. a category-specific covariate consists of k columns. Before calling star.nominal, the values for the reference category (defined by refLevel) have to be subtracted from the values of the other categories. Additionally, the resulting variable for the first response category (other than the reference category) has to be duplicated. This duplicate should be given an appropriate name for the category-specific variable, independent of the individual response categories. It is used as an assignment variable for the corresponding coefficient of the covariate and has to be included in the formula. For every category-specific covariate, a formula has to be specified in the xij argument. The assignment variable is placed on the left hand side of that formula. The variables containing the differences from the values for the reference category are written on the right hand side, so the right hand side of the formula contains k-1 terms. The order of these terms has to follow the order of the response categories, ignoring the reference category. Examples of effect stars for models with category-specific covariates can be obtained by typing vignette("election") or vignette("plebiscite").
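
The data preparation steps described above can be sketched in base R as follows. The data frame, the covariate name price and the response categories A, B and C are hypothetical; A is taken to be the reference category. The call to star.nominal is only indicated in a comment and is not run here.

```r
# Hypothetical wide-format data: response with k = 3 categories and one
# category-specific covariate "price" stored in k columns
d <- data.frame(
  choice  = factor(c("A", "B", "C", "B")),
  price_A = c(1.0, 2.0, 1.5, 3.0),
  price_B = c(1.5, 1.0, 2.5, 2.0),
  price_C = c(2.0, 2.5, 1.0, 1.0)
)

# Step 1: subtract the reference-category values from the other categories
d[, c("price_B", "price_C")] <- d[, c("price_B", "price_C")] - d$price_A

# Step 2: duplicate the first non-reference column as assignment variable
d$price <- d$price_B

# Step 3: one formula per category-specific covariate; the right hand side
# has k - 1 = 2 terms, ordered like the non-reference response categories
xij <- list(price ~ price_B + price_C)

# The model call would then look like this (not run):
# star.nominal(choice ~ price, data = d, xij = xij, symmetric = FALSE)
```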

It is strongly recommended to standardize metric covariates. The display of effect stars can benefit greatly from this because, in general, the differences between the coefficients are increased.
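
One way to standardize metric covariates before fitting is the base R function scale. The sketch below uses a made-up data frame; the column names are hypothetical, and factors are left untouched.

```r
# Hypothetical data with two metric covariates and one factor
d <- data.frame(age    = c(23, 45, 31, 60),
                income = c(1800, 3200, 2500, 4100),
                gender = factor(c("f", "m", "f", "m")))

# Standardize numeric columns only: mean 0 and standard deviation 1
num <- vapply(d, is.numeric, logical(1))
d[num] <- lapply(d[num], function(v) as.numeric(scale(v)))

print(d)
```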

Value

P-values are only available if the corresponding option is set to TRUE.
catspec and catspecse are only available if xij is specified.

odds

Odds or exponential coefficients of the multinomial logit model

coefficients

Coefficients of the multinomial logit model

se

Standard errors of the coefficients

pvalues

P-values of Wald tests for the respective coefficients

catspec

Coefficients for the category-specific covariates

catspecse

Standard errors for the coefficients for the category-specific covariates

p_rel

P-values of Likelihood-Ratio-Tests for the relevance of the explanatory covariates

xlim

xlim values that were automatically produced. May be helpful if you want to specify your own xlim.

ylim

ylim values that were automatically produced. May be helpful if you want to specify your own ylim.
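
How the components odds, coefficients, se and pvalues relate to each other can be sketched in base R. The numbers below are made up and are not output of star.nominal, and the two-sided Wald test with a standard normal reference distribution is an assumption about the usual form of such a test, not a statement about the package internals.

```r
# Hypothetical estimates and standard errors
coefficients <- c(0.40, -1.10, 0.05)
se           <- c(0.20,  0.35, 0.25)

odds    <- exp(coefficients)      # exponentiated coefficients
z       <- coefficients / se      # Wald statistics
pvalues <- 2 * pnorm(-abs(z))     # two-sided p-values (assumed form)

print(round(cbind(coefficients, se, odds, pvalues), 4))
```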

Author(s)

Gunther Schauberger
gunther.schauberger@tum.de
https://www.sg.tum.de/epidemiologie/team/schauberger/

References

Tutz, G. and Schauberger, G. (2012): Visualization of Categorical Response Models - from Data Glyphs to Parameter Glyphs, Journal of Computational and Graphical Statistics 22(1), 156-177.

Tutz, G. (2012): Regression for Categorical Data, Cambridge University Press.

See Also

star.sequential, star.cumulative

Examples

## Not run: 
data(election)

# simple multinomial logit model
star.nominal(Partychoice ~ Age + Religion + Democracy + Pol.Interest + 
                 Unemployment + Highschool + Union + West + Gender, election)

# Use effect coding for the categorical predictor religion
star.nominal(Partychoice ~ Age + Religion + Democracy + Pol.Interest + 
                 Unemployment + Highschool + Union + West + Gender, election,
                 pred.coding = "effect")                 

# Use reference category "FDP" instead of symmetric side constraints
star.nominal(Partychoice ~ Age + Religion + Democracy + Pol.Interest + 
                 Unemployment + Highschool + Union + West + Gender, election,
                 refLevel = 3, symmetric = FALSE)
                 
# Use category-specific covariates, subtract values for reference 
# category CDU
election[,13:16] <- election[,13:16] - election[,12]
election[,18:21] <- election[,18:21] - election[,17]
election[,23:26] <- election[,23:26] - election[,22]
election[,28:31] <- election[,28:31] - election[,27]

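# Duplicate the variable of the first non-reference category (SPD);
# the duplicates serve as assignment variables in formula and xij
election$Social <- election$Social_SPD
election$Immigration <- election$Immigration_SPD
election$Nuclear <- election$Nuclear_SPD
election$Left_Right <- election$Left_Right_SPD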

star.nominal(Partychoice ~ Social + Immigration + Nuclear + Left_Right + Age + 
                 Religion + Democracy + Pol.Interest + Unemployment + Highschool + 
                 Union + West + Gender, data = election, 
             xij = list(
                 Social ~ Social_SPD + Social_FDP + Social_Greens + Social_Left,
                 Immigration ~ Immigration_SPD + Immigration_FDP + 
                     Immigration_Greens + Immigration_Left,
                 Nuclear ~ Nuclear_SPD + Nuclear_FDP + Nuclear_Greens + Nuclear_Left,
                 Left_Right ~ Left_Right_SPD + Left_Right_FDP + 
                     Left_Right_Greens + Left_Right_Left),
             symmetric = FALSE)

## End(Not run)

[Package EffectStars version 1.9-1 Index]