bws2.dataset {support.BWS2} | R Documentation |
Creating a dataset suitable for Case 2 best–worst scaling analysis using counting and modeling approaches
Description
This function creates a dataset used for bws2.count() in support.BWS2
and for functions for discrete choice models, such as clogit() in survival.
Usage
bws2.dataset(data, id, response, choice.sets, attribute.levels,
base.attribute = NULL, base.level = NULL,
reverse = TRUE, model = "paired",
attribute.variables = NULL, effect = NULL, delete.best = FALSE,
type = c("paired", "marginal", "sequential"),
...)
Arguments
data |
A data frame containing a respondent dataset. |
id |
A character showing the name of the respondent identification number variable used in the respondent dataset. |
response |
A vector containing the names of response variables in the respondent dataset, showing the best and worst attribute levels selected in each Case 2 BWS question. |
choice.sets |
A data frame or matrix containing an orthogonal main-effect design. |
attribute.levels |
A list containing the names of the attributes and their levels. |
base.attribute |
A character showing the base attribute: the argument is used
when attribute variables are created as effect-coded ones for a marginal (sequential) model. |
base.level |
A list containing the base level in each attribute: the argument is used
when attribute-level variables are created as effect-coded ones, and
is set as NULL when they are created as dummy-coded ones. |
reverse |
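A logical value denoted by TRUE when the signs of the attribute variables are reversed for the possible worst, or FALSE when they are not. |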
model |
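A character showing a type of dataset created by this function:
"paired" for the paired model, "marginal" for the marginal model, or "sequential" for the marginal sequential model. |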
attribute.variables |
A character showing a type of attribute variables, denoted by
|
effect |
A list containing the base level in each attribute: the argument is used
when attribute level variables are created as effect coded ones and
while |
delete.best |
A logical value denoted by |
type |
A character showing a type of dataset created by this function:
"paired", "marginal", or "sequential". This argument is deprecated; see Details. |
... |
Optional arguments; currently not in use. |
Details
The respondent dataset, in which each row corresponds to a respondent,
must be organized by users and then assigned to the argument data.
The dataset must include the respondent's identification number (id)
variable in the first column and the response variables in the subsequent
columns, each indicating which attribute levels are selected as
the best and worst for each question. Other variables in the respondent
dataset are treated as the respondents' characteristics such as gender
and age. Respondents' characteristic variables are also stored in
the resultant dataset created by the function bws2.dataset().
The names of the id and response variables are left to
the discretion of the user, but those names must be
assigned to the arguments id and response.
The response variables must be constructed such that the best attribute
levels alternate with the worst by question. For example, when there are
nine BWS questions, the variables are B1, W1, B2, W2, ..., B9, and W9.
Here, Bi and Wi show the attribute levels selected
as the best and worst in the i-th question.
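The alternating order can be generated rather than typed by hand. The following is a minimal sketch in base R, assuming the B1, W1, ... naming scheme used above (the names themselves are left to the user); n.questions is a name introduced here only for illustration.

```r
# Number of BWS questions (nine in the example above)
n.questions <- 9

# rbind() stacks the B names over the W names; as.vector() then reads the
# matrix column by column, which interleaves them: B1, W1, B2, W2, ...
response.vars <- as.vector(rbind(paste0("B", seq_len(n.questions)),
                                 paste0("W", seq_len(n.questions))))
response.vars
```

The resulting vector can be passed to the argument response, provided the respondent dataset uses the same variable names.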
The row numbers of the attribute levels
selected as the best and worst are stored in the response variables.
For example, suppose that a respondent was asked to answer the following
BWS question, which is the same as that shown on the help page of
this package, and then selected A1 (attribute level in the first row) as
the best and C2 (attribute level in the third row) as the worst.
Please select your best and worst attribute levels from the following four: |
Best | Attribute | Worst |
[_] | A1 | [_] |
[_] | B3 | [_] |
[_] | C2 | [_] |
[_] | D3 | [_] |
The response variables B1 and W1, corresponding to the respondent's answer
to this question, take the values of 1 (= the attribute level in
the first row) and 3 (= the attribute level in the third row), respectively.
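To make the coding concrete, the sketch below maps the stored row numbers back to attribute levels for the question shown above. It uses base R only and is not the package's internal code; the objects profile and shown are illustrative names, and the profile (level 1 of A, 3 of B, 2 of C, 3 of D) is read off the question table.

```r
# Attribute levels, in the same form as the attribute.levels argument
attr.lev <- list(
  A = c("A1", "A2", "A3"), B = c("B1", "B2", "B3"),
  C = c("C1", "C2", "C3"), D = c("D1", "D2", "D3"))

# Profile behind the question above: level 1 of A, 3 of B, 2 of C, 3 of D
profile <- c(1, 3, 2, 3)

# Attribute levels shown in rows 1 to 4 of the question
shown <- mapply(function(levels, k) levels[k], attr.lev, profile)
shown

# Stored responses are row numbers, not level names
B1 <- 1
W1 <- 3
unname(shown[B1])  # level selected as the best
unname(shown[W1])  # level selected as the worst
```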
The arguments choice.sets and attribute.levels are
the same as those in bws2.questionnaire().
The order of questions in the respondent dataset has to be
the same as that in choice.sets.
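Some of these requirements can be checked before the dataset is created. The following is a hedged sketch in base R using made-up toy data; the objects design, resp, and checks are hypothetical and not part of support.BWS2. It cannot verify that the question order matches choice.sets, which has to be ensured by the user.

```r
# Toy inputs: a design with 2 questions and 4 attributes, and 3 respondents.
# All values are made up for illustration.
design <- matrix(c(1, 3, 2, 3,
                   3, 1, 2, 2),
                 nrow = 2, byrow = TRUE)
resp <- data.frame(id = 1:3,
                   B1 = c(2, 1, 4), W1 = c(1, 4, 3),
                   B2 = c(1, 3, 2), W2 = c(4, 4, 1))
response.vars <- c("B1", "W1", "B2", "W2")

r <- as.matrix(resp[, response.vars])
best  <- r[, seq(1, ncol(r), by = 2), drop = FALSE]
worst <- r[, seq(2, ncol(r), by = 2), drop = FALSE]

checks <- c(
  # One best and one worst variable per question (row) of the design
  n.vars   = length(response.vars) == 2 * nrow(design),
  # Responses are row numbers, so they must lie in 1..(number of attributes)
  in.range = all(r %in% seq_len(ncol(design))),
  # The same row cannot be both best and worst within a question
  distinct = all(best != worst)
)
checks
```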
The arguments model, reverse, base.attribute,
and base.level are set according to the model you will use:
the argument model is set as "paired" for the paired model,
"marginal" for the marginal model, or "sequential" for
the marginal sequential model;
the argument reverse is set as TRUE for a model
in which the signs of the attribute variables are reversed
for the possible worst (Flynn et al. 2007 and 2008),
or FALSE when not doing so (Hensher et al. 2015, Appendix 6B);
the argument base.attribute is set as a character vector showing
the base attribute for a marginal (sequential) model with effect-coded
attribute variables;
and the argument base.level is set as a list containing the base level
in each attribute for a model with effect-coded level variables
(Flynn et al. 2007 and 2008), while it is set as NULL for a model
with dummy-coded attribute level variables (Hensher et al. 2015, Appendix 6B).
Note that the arguments attribute.variables, effect,
delete.best, and type are deprecated and will be removed
in the future.
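As an illustration of what effect coding with a base level means, the sketch below codes a single three-level attribute by hand. It assumes the standard definition of effect coding and is not the package's internal code; levels.A, base, and effect.code() are hypothetical names introduced here.

```r
# Levels of a hypothetical attribute A; A3 is the base level
levels.A <- c("A1", "A2", "A3")
base <- "A3"

# Effect coding: 1 for the variable's own level, -1 for the base level,
# 0 otherwise. The base level gets no variable of its own.
effect.code <- function(level) {
  non.base <- setdiff(levels.A, base)
  code <- as.integer(non.base == level)
  if (level == base) code <- rep(-1L, length(non.base))
  setNames(code, non.base)
}

# Rows are the levels shown; columns are the variables A1 and A2
coded <- t(sapply(levels.A, effect.code))
coded

# With reverse = TRUE the signs are flipped when a level is treated as the
# possible worst rather than the possible best
coded.worst <- -coded
coded.worst
```

Under effect coding the level coefficients of an attribute sum to zero, so the coefficient of the base level is the negative sum of the others; this is what the Examples compute with expressions such as -sum(b.pr[4:5]).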
Value
The function returns a dataset in data frame format for the paired model or one for the marginal (sequential) model. The dataset for the paired model contains the following variables and attribute and/or attribute-level variables explained above:
id |
A respondent's identification number; the actual name and values of this variable are set according to the id variable in the respondent dataset. |
Q |
A serial number of BWS questions. |
PAIR |
A serial number for the possible pairs of the best and worst attribute levels for each question. |
BEST |
An attribute-level number treated as the best in the possible pairs of the best and worst attribute levels for each question. |
WORST |
An attribute-level number treated as the worst in the possible pairs of the best and worst attribute levels for each question. |
BEST.AT |
A character showing the attribute corresponding to the attribute level treated as the best in the possible pairs of the best and worst attribute levels for each question. |
WORST.AT |
A character showing the attribute corresponding to the attribute level treated as the worst in the possible pairs of the best and worst attribute levels for each question. |
BEST.LV |
A character showing the attribute level treated as the best in the possible pairs of the best and worst attribute levels for each question. |
WORST.LV |
A character showing the attribute level treated as the worst in the possible pairs of the best and worst attribute levels for each question. |
RES.B |
A row number in the profile corresponding to the attribute level selected as the best by respondents. |
RES.W |
A row number in the profile corresponding to the attribute level selected as the worst by respondents. |
RES |
Responses to BWS questions that takes the value of |
STR |
A stratification variable identifying each combination of
respondent and question; the variable is also used in the model formula
of clogit(). |
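The structure of the paired dataset follows from enumerating every possible pair of a best and a worst attribute level within a question. The sketch below counts those pairs for a question with four rows, using base R only. The column names best.row, worst.row, and selected are hypothetical, and the ordering and coding of pairs in the actual output of bws2.dataset() may differ.

```r
# Rows (attribute levels) shown in one question
rows <- 1:4

# All ordered (best, worst) combinations, dropping those where the same
# row would be both best and worst
pairs <- expand.grid(worst.row = rows, best.row = rows)
pairs <- pairs[pairs$best.row != pairs$worst.row, c("best.row", "worst.row")]
rownames(pairs) <- NULL
pairs$PAIR <- seq_len(nrow(pairs))
nrow(pairs)  # 4 * 3 = 12 possible pairs

# Flag the pair matching a respondent's answer (best = row 1, worst = row 3)
RES.B <- 1
RES.W <- 3
pairs$selected <- pairs$best.row == RES.B & pairs$worst.row == RES.W
pairs
```

This is why the Examples display the first 12 rows of the paired dataset: they correspond to one respondent's first question.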
The dataset for the marginal (sequential) model contains the variables
id, Q, RES.B, RES.W, and STR
mentioned above and the following variables:
ALT |
A serial number of alternatives (attribute levels) for each question. |
BW |
A state variable that takes the value of |
ATT.cha |
A character showing the attribute corresponding to the attribute level treated as the possible best or worst for each question. |
ATT |
An attribute number showing the attribute corresponding to the attribute level treated as the possible best or worst for each question. |
LEV.cha |
A character showing the attribute levels treated as the possible best or worst for each question. |
LEV |
An attribute level number showing the attribute level treated as the possible best or worst for each question. |
RES |
Responses to BWS questions that takes the value of |
The output has attributes that consist of the arguments assigned to
this function (i.e., id, response, choice.sets,
attribute.levels, reverse, base.attribute,
base.level, attribute.variables, effect,
delete.best, and type) and the following:
design.matrix |
Design matrix. |
lev.var.wo.ref |
Names of attribute-level variables excluding base levels. |
freq.levels |
Frequency of attribute levels shown in all the questions. |
respondent.characteristics |
Names of variables corresponding to the respondents' characteristics: variables, except for the respondents' id and response variables, are considered the respondents' characteristics. |
Author(s)
Hideo Aizaki
See Also
support.BWS2-package, oa.design, clogit
Examples
# Load package survival used for a conditional logit model analysis of
# the responses
require(survival)
# Set a three-level orthogonal main-effect design (OMED) with
# four columns
omed <- matrix(
  c(1,3,2,3,
    3,1,2,2,
    3,3,3,1,
    2,3,1,2,
    2,2,2,1,
    1,1,1,1,
    1,2,3,2,
    3,2,1,3,
    2,1,3,3),
  nrow = 9, ncol = 4, byrow = TRUE)
omed
## The OMED is generated by executing the following lines of code:
## require(DoE.base)
## set.seed(123)
## omed <- data.matrix(oa.design(nl = c(3, 3, 3, 3)))
# Set the names of the attributes and attribute levels
attr.lev <- list(
  A = c("A1","A2","A3"), B = c("B1","B2","B3"),
  C = c("C1","C2","C3"), D = c("D1","D2","D3"))
# Convert the OMED into Case 2 BWS questions using three formats:
## Attribute column is located on the left-hand side
bws2.questionnaire(omed, attribute.levels = attr.lev,
                   position = "left")
## Attribute column is located in the center
bws2.questionnaire(omed, attribute.levels = attr.lev,
                   position = "center")
## Attribute column is located on the right-hand side
bws2.questionnaire(omed, attribute.levels = attr.lev,
                   position = "right")
# Set respondent dataset containing 20 respondents who answered
# nine BWS questions
resp.data <- data.frame(
  id = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20),
  B1 = c(2,2,2,1,2,4,2,2,2,2,1,2,2,4,2,3,2,3,2,2),
  W1 = c(1,1,1,4,1,3,3,1,4,1,4,4,1,1,1,4,1,1,4,4),
  B2 = c(1,1,2,1,1,3,1,1,1,1,2,1,1,2,1,3,1,3,1,1),
  W2 = c(2,4,4,4,4,2,4,2,4,2,4,4,4,4,2,4,4,1,4,4),
  B3 = c(1,1,2,1,2,1,1,1,1,2,1,1,1,2,1,1,1,1,3,1),
  W3 = c(4,4,4,2,4,4,4,3,4,3,4,4,3,1,4,4,3,4,4,4),
  B4 = c(1,2,2,1,2,1,2,2,2,1,2,4,2,2,2,4,2,2,1,2),
  W4 = c(3,4,3,2,3,3,3,1,4,3,3,3,4,3,3,1,4,3,4,4),
  B5 = c(1,2,2,1,2,1,2,1,3,1,1,1,3,1,1,1,3,1,1,1),
  W5 = c(4,1,3,4,4,4,3,4,4,4,2,4,4,2,4,2,1,4,3,4),
  B6 = c(2,4,2,1,2,1,4,3,1,1,1,1,3,2,1,2,3,4,1,4),
  W6 = c(4,1,4,4,4,3,3,4,4,2,4,2,4,4,3,4,4,1,4,1),
  B7 = c(3,3,2,3,4,1,2,3,3,3,2,1,3,2,1,2,3,1,3,2),
  W7 = c(1,4,1,4,1,4,4,4,4,2,4,4,4,4,4,4,4,4,4,4),
  B8 = c(1,1,2,1,2,2,1,1,1,2,1,2,1,1,1,3,1,1,1,1),
  W8 = c(3,3,3,3,3,3,3,3,4,3,3,3,4,3,3,4,4,3,4,3),
  B9 = c(3,3,3,1,3,1,1,3,1,1,1,1,3,1,1,1,3,1,1,1),
  W9 = c(2,1,2,2,2,2,4,2,4,2,4,2,2,2,2,4,1,2,2,2))
# Create a dataset and conduct a conditional logit model analysis
## Set response variables
response.vars <- names(resp.data)[2:19]
## Set a base level in each attribute
base.lev <- list(
  A = c("A3"), B = c("B3"), C = c("C3"), D = c("D3"))
## Paired model with attribute and attribute-level variables
pr.data <- bws2.dataset(
  data = resp.data,
  id = "id",
  response = response.vars,
  choice.sets = omed,
  attribute.levels = attr.lev,
  reverse = TRUE,
  base.level = base.lev,
  model = "paired")
attributes(pr.data)$design.matrix
head(pr.data, 12)
### Attribute variable D is omitted from the model
pr <- clogit(RES ~ A + B + C +
               A1 + A2 + B1 + B2 + C1 + C2 + D1 + D2 + strata(STR),
             data = pr.data)
pr
### Calculate coefficients of base level variables
b.pr <- coef(pr)
-sum(b.pr[4:5]) # attribute level A3
-sum(b.pr[6:7]) # attribute level B3
-sum(b.pr[8:9]) # attribute level C3
-sum(b.pr[10:11]) # attribute level D3
## Marginal model with attribute and attribute-level variables
mr.data <- bws2.dataset(
  data = resp.data,
  id = "id",
  response = response.vars,
  choice.sets = omed,
  attribute.levels = attr.lev,
  reverse = TRUE,
  base.level = base.lev,
  model = "marginal")
attributes(mr.data)$design.matrix
head(mr.data, 8)
### Attribute variable D is omitted from the model
mr <- clogit(RES ~ A + B + C +
               A1 + A2 + B1 + B2 + C1 + C2 + D1 + D2 + strata(STR),
             data = mr.data)
mr
### Calculate coefficients of base level variables
b.mr <- coef(mr)
-sum(b.mr[4:5]) # attribute level A3
-sum(b.mr[6:7]) # attribute level B3
-sum(b.mr[8:9]) # attribute level C3
-sum(b.mr[10:11]) # attribute level D3
# Calculate BWS scores
bwscores <- bws2.count(mr.data)
sum(bwscores, "level")
barplot(bwscores, "bw", "level")