cj {cregg} | R Documentation |
Simple Conjoint Analyses and Visualization
Description
Simple analyses of conjoint (factorial) experiments and visualization of results.
Usage
cj(
data,
formula,
id = ~0,
weights = NULL,
estimate = c("amce", "frequencies", "mm", "amce_differences", "mm_differences"),
feature_order = NULL,
feature_labels = NULL,
level_order = c("ascending", "descending"),
by = NULL,
...
)
Arguments
data |
A data frame containing variables specified in |
formula |
A formula specifying a model to be estimated. ; all levels across features should be unique. For |
id |
An RHS formula specifying a variable holding respondent identifiers, to be used for clustering standard errors. |
weights |
An (optional) RHS formula specifying a variable holding survey weights. |
estimate |
A character string specifying an estimate type. Current options are average marginal component effects (or AMCEs, “amce”, estimated via |
feature_order |
An (optional) character vector specifying the names of feature (RHS) variables in the order they should be encoded in the resulting data frame. |
feature_labels |
A named list of “fancy” feature labels to be used in output. By default, the function looks for a “label” attribute on each variable in |
level_order |
A character string specifying whether levels (within each feature) should be ordered increasing or decreasing in the final output. This is mostly only consequential for plotting via |
by |
A formula containing only RHS variables, specifying grouping factors over which to perform estimation. |
... |
Additional arguments to |
Details
The main function cj is a convenience wrapper around the underlying estimation functions, which provide average marginal component effects (AMCEs, the default) via the amce function, marginal means (MMs) via the mm function, and display frequencies via cj_freqs and cj_props. Additional estimands may be supported in the future through their own functions and through the cj interface. Plotting is provided via ggplot2 for all types of estimates.
The only additional functionality provided by cj over the underlying functions is the by argument, which performs the requested estimation on subsets of data and returns a single data frame. This can be useful, for example, for evaluating profile spillover effects and subgroup results, or in any situation where one might be inclined to use a for loop or lapply, calling cj repeatedly on subgroups.
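The split-apply-combine pattern that by is described as replacing can be sketched in base R. This is only an illustration under stated assumptions, not cregg's implementation: the toy data frame, the helper level_means, and the group column name are all hypothetical, and the helper computes raw level means without clustering or weights.

```r
# Synthetic stand-in for conjoint data (hypothetical; not a cregg dataset)
set.seed(1)
toy <- data.frame(
  chosen = rbinom(200, 1, 0.5),
  Gender = factor(sample(c("Female", "Male"), 200, replace = TRUE)),
  wave   = factor(sample(c("w1", "w2"), 200, replace = TRUE))
)

# Hypothetical helper: raw mean of `chosen` for each level of Gender
level_means <- function(d) {
  out <- aggregate(chosen ~ Gender, data = d, FUN = mean)
  names(out) <- c("level", "estimate")
  out
}

# Manual version of "estimate within each subgroup, then stack the results"
pieces <- lapply(split(toy, toy$wave), level_means)
stacked <- do.call(
  rbind,
  Map(function(p, g) cbind(group = g, p), pieces, names(pieces))
)
rownames(stacked) <- NULL
stacked
```

Passing a grouping formula to by is meant to spare the analyst this bookkeeping and return the stacked data frame directly.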
Note: Some features of cregg (namely, the amce_diffs function, or estimate = "amce_differences" here) only work with full factorial conjoint experiments. Designs involving two-way constraints between features are supported simply by expressing interactions between constrained terms in formula (again, except for amce_diffs). Higher-order constraints may be supported in the future.
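As a rough sketch of how a two-way constraint might be detected and then written into a formula, the following base R code cross-tabulates two features and interacts them when a cell is empty. The design data frame, its level names, and the outcome name chosen are hypothetical, and the use of the standard R `*` operator to express the interaction is an assumption rather than something this page specifies.

```r
# Hypothetical constrained design: "Doctor" never appears with "No Degree"
design <- expand.grid(
  Job = c("Janitor", "Doctor"),
  Education = c("No Degree", "College Degree"),
  stringsAsFactors = TRUE
)
design <- design[!(design$Job == "Doctor" & design$Education == "No Degree"), ]

# Cross-tabulate the two features; a zero cell reveals a two-way constraint
cells <- table(design$Job, design$Education)
constrained <- any(cells == 0)

# Interact the constrained features; otherwise keep the terms additive
f <- if (constrained) chosen ~ Job * Education else chosen ~ Job + Education
f
```

A formula built this way would then be passed as the formula argument in place of a purely additive specification.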
Value
A data frame with special class to facilitate plotting (e.g., “cj_amce”, “cj_mm”, etc.)
Author(s)
Thomas J. Leeper <thosjleeper@gmail.com>
See Also
Functions: amce, mm, cj_freqs, mm_diffs, plot.cj_amce, cj_tidy
Data: immigration, taxes
Examples
# load data
requireNamespace("ggplot2")
data("immigration")
immigration$contest_no <- factor(immigration$contest_no)
data("taxes")
# calculate MMs
f1 <- ChosenImmigrant ~ Gender + Education +
  LanguageSkills + CountryOfOrigin + Job + JobExperience +
  JobPlans + ReasonForApplication + PriorEntry
d1 <- cj(immigration, f1, id = ~ CaseID, estimate = "mm", h0 = 0.5)
# plot MMs
plot(d1, vline = 0.5)
# calculate MMs for survey-weighted data
d1 <- cj(taxes, chose_plan ~ taxrate1 + taxrate2 + taxrate3 +
         taxrate4 + taxrate5 + taxrate6 + taxrev, id = ~ ID,
         weights = ~ weight, estimate = "mm", h0 = 0.5)
# plot MMs
plot(d1, vline = 0.5)
# MMs split by profile number
stacked <- cj(immigration, f1, id = ~ CaseID,
              estimate = "mm", by = ~ contest_no)
## plot with grouping
plot(stacked, group = "contest_no", vline = 0.5, feature_headers = FALSE)
## plot with faceting
plot(stacked) + ggplot2::facet_wrap(~ contest_no, nrow = 1L)
# estimate AMCEs
d2 <- cj(immigration, f1, id = ~ CaseID)
# plot AMCEs
plot(d2)
## subgroup analysis
immigration$ethnosplit <- cut(immigration$ethnocentrism, 2)
x <- cj(na.omit(immigration), ChosenImmigrant ~ Gender + Education + LanguageSkills,
        id = ~ CaseID, estimate = "mm", h0 = 0.5, by = ~ ethnosplit)
plot(x, group = "ethnosplit", vline = 0.5)
# combinations of/interactions between features
immigration$language_entry <-
  interaction(immigration$LanguageSkills, immigration$PriorEntry, sep = "_")
## higher-order MMs for feature combinations
cj(immigration, ChosenImmigrant ~ language_entry,
   id = ~CaseID, estimate = "mm", h0 = 0.5)
## constrained designs
## in a constrained design, some cells are unobserved:
subset(cj_props(immigration, ~ Job + Education), Proportion == 0)
## MMs and AMCEs only use data from observed cells
## In `immigration`, this means while the MM for `Job == "Janitor"` is an average
## across all levels of Education:
mm(subset(immigration, Job == "Janitor"), ChosenImmigrant ~ Education)
## the MM for `Job == "Doctor"` is an average across only 3 levels of education:
mm(subset(immigration, Job == "Doctor"), ChosenImmigrant ~ Education)
## Use `cj_props()` to see constraints:
subset(cj_props(immigration, ~ Job + Education), Job == "Doctor" & Proportion != 0)
## Substantively, the MM of "Doctor" might be higher than other levels of `Job`;
## this could be due to the feature itself or because it is paired with a
## different subset of other feature levels than the alternative levels of `Job`.
## Analysts may therefore want to report MMs (or AMCEs) only for the unconstrained levels:
elev <- c("Two-Year College", "College Degree", "Graduate Degree")
jlev <- c("Financial Analyst", "Computer Programmer", "Research Scientist", "Doctor")
mm(subset(immigration, Education %in% elev), ChosenImmigrant ~ Job)
mm(subset(immigration, Job %in% jlev), ChosenImmigrant ~ Education)
## or, present estimates excluding constrained levels:
mm(subset(immigration, !Education %in% elev), ChosenImmigrant ~ Job)
mm(subset(immigration, !Job %in% jlev), ChosenImmigrant ~ Education)