support.BWS2-package {support.BWS2} | R Documentation |
Tools for Case 2 best-worst scaling
Description
The package has three basic functions that support an implementation of Case 2 (profile case) best–worst scaling. The first is to convert an orthogonal main-effect design into questions, the second is to create a dataset suitable for analysis, and the third is to calculate count-based scores. For details, see Aizaki and Fogarty (2019).
Details
The package is under development and thus may be changed substantially in the future.
1) Outline of Case 2 best–worst scaling
Case 2 (profile case) best–worst scaling (BWS) is a question-based survey method to elicit preferences for attribute levels (See Flynn 2010, Flynn et al. 2007 and 2008, Louviere et al. 2015, and Marley et al. 2008 for details of the subsection). A profile (choice set) has three or more attributes and each attribute has two or more levels. The profile is expressed as a combination of attribute levels. Numerous profiles are constructed using experimental designs. Attributes shown in each profile are fixed in all the profiles and a combination of attribute levels in each profile is changed according to the profiles. A profile selected from all the constructed profiles is presented to respondents, who are then asked to choose the best and worst attribute levels in the profile. This question is repeated until all profiles are evaluated. Analyzing the responses enables us to elicit preferences for the attribute levels.
A basic approach to constructing profiles is using an orthogonal
main-effect design (OMED). Assume that the profiles have K
attributes and each attribute has L_{k}
levels. If all the attributes
have the same number of levels, L
, a L^{K}
OMED is used to
construct the profiles. Columns of the OMED correspond to attributes,
while the rows to profiles. For example, profiles have four attributes
and they have three levels: attribute A with levels A1, A2, and A3;
attribute B with levels B1, B2, and B3; attribute C with levels C1, C2,
and C3; and attribute D with levels D1, D2, and D3. A 3^{4}
OMED corresponding to the assumptions is as follows (see the section
Examples of the function bws2.dataset()
for code to generate
the OMED):
1 | 3 | 2 | 3 |
3 | 1 | 2 | 2 |
3 | 3 | 3 | 1 |
2 | 3 | 1 | 2 |
2 | 2 | 2 | 1 |
1 | 1 | 1 | 1 |
1 | 2 | 3 | 2 |
3 | 2 | 1 | 3 |
2 | 1 | 3 | 3 |
Suppose that attributes A, B, C, and D are assigned to the first, second, third, and fourth column of the OMED, respectively, and the values 1, 2, and 3 used in the OMED correspond to the attribute-level values in each attribute: 1 = A1, 2 = A2, and 3 = A3 for attribute A; 1 = B1, 2 = B2, and 3 = B3 for attribute B; 1 = C1, 2 = C2, and 3 = C3 for attribute C; and 1 = D1, 2 = D2, and 3 = D3 for attribute D. Accordingly, the above-mentioned OMED can be transformed into the following:
A1 | B3 | C2 | D3 |
A3 | B1 | C2 | D2 |
A3 | B3 | C3 | D1 |
A2 | B3 | C1 | D2 |
A2 | B2 | C2 | D1 |
A1 | B1 | C1 | D1 |
A1 | B2 | C3 | D2 |
A3 | B2 | C1 | D3 |
A2 | B1 | C3 | D3 |
The resultant OMED consists of nine rows: nine profiles, that is, nine Case 2 BWS questions, are constructed. For example, a profile corresponding to the first row of the OMED comprises A1, B3, C2, and D3. This means that respondents who face the question created from the first row of the OMED are asked to select their best and worst attribute levels from attribute levels A1, B3, C2, and D3, as follows:
Please select your best and worst attribute levels from the following four: |
Best | Attribute | Worst |
[_] | A1 | [_] |
[_] | B3 | [_] |
[_] | C2 | [_] |
[_] | D3 | [_] |
There are two approaches for analyzing responses to Case 2 BWS questions:
a counting approach and modeling approach. The counting approach
calculates scores on the basis of the number of times attribute level
i
is selected as the best (B_{in}
: B score for attribute
level i
) and the worst (W_{in}
: W score for attribute
i
) among all the questions for respondent n
.
A (disaggregated) best-minus-worst (BW) score and its standardized
variant are defined as
BW_{in} = B_{in} - W_{in},
std.BW_{in} = \frac{BW_{in}}{f_{i}},
where f_{i}
is the frequency with which attribute level i
appears across all questions.
The modeling approach uses discrete choice models to analyze responses. When using the modeling approach, a model type must be selected according to the assumption for respondents' choice behavior in Case 2 BWS questions and then a dataset must be formatted as per the selected model. There are three standard models: paired, marginal, and marginal sequential models. Although the three models commonly assume that the respondents derive utility for each attribute level shown in the profile, the assumption for how they select the best and worst attribute levels from the set of attribute levels in the profile differs among the three models.
The number of possible pairs in which attribute level i
is selected
as the best and attribute level j
is selected as
the worst (i \neq j
) from K
attribute levels is
K \times (K - 1)
. The paired model assumes that respondents select
attribute level i
as the best and attribute level j
as
the worst because the difference in utility between i
and
j
represents the greatest utility difference
among K \times (K - 1)
utility differences.
Consider the example profile mentioned above. It contains four
attribute levels: A1, B3, C2, and D3. The number of possible pairs
is 12
(= 4 \times (4 - 1))
. There are 12 possible pairs
of the best and worst attribute levels (in each pair,
the former is the best and the latter is the worst): (A1, B3), (A1, C2),
(A1, D3), (B3, A1), (B3, C2), (B3, D3), (C2, A1), (C2, B3), (C2, D3),
(D3, A1), (D3, B3), and (D3, C2).
If a respondent selects A1 as the best attribute level and C2 as
the worst, the paired model assumes that the respondent calculates 12
utility differences as per the 12 above-mentioned pairs and that
the difference in utility between A1 and C2 is the maximum among
the 12 utility differences.
The marginal model assumes that there are K
possible best
attribute levels and K
possible worst attribute levels
in a profile, that attribute level i
is selected as
the best from K
possible best attribute levels in the profile,
and that attribute level j
is selected as the worst
from K
possible worst attribute levels. This is because the
utility for attribute level i
is the maximum among
the utilities for K
attribute levels and that for attribute level
j
is the minimum. Following the above example, the marginal model
assumes that there are four possible best attribute levels and four
possible worst attribute levels in the profile and interprets
the respondent's choice behavior as follows: utility for A1 is
the maximum among the four utilities for A1, B3, C2, and D3 and
that for C2 is the minimum among the four.
The assumption of the marginal model that the worst attribute level is
selected from K
attribute levels would not be appropriate
because the best attribute level in a profile must differ from
the worst one in the profile. Thus, the marginal sequential model
assumes that respondents select attribute level i
as the best
from K
attribute levels in the profile and then attribute level
j
as the worst from the remaining K - 1
attribute levels.
Following the above example, under the marginal sequential
model assumption, there are four possible best attribute levels and
three possible worst attribute levels in the profile.
The model considers that the respondent selects A1 as the best from
the four possible attribute levels because the utility for A1 is
the highest among the utilities for A1, B3, C2, and D3, but selects C2
as the worst from three possible worst levels, B3, C2, and D3,
because the utility for C2 is the least among the three.
The three models generally assume that the utility for attribute level
i
selected as the worst is the negative of the one selected as
the best. Under these assumptions, and given the assumption for
the stochastic component of utility, the probability of selecting
attribute level i
as the best and attribute level j
as
the worst can be expressed as a conditional logit model.
2) Role of the package and other packages needed to complete implementing Case 2 BWS
The package support.BWS2 provides functions to convert an OMED
into a series of Case 2 BWS questions, create a dataset for the analysis
from the OMED and the responses to the questions, and calculate BWS scores.
Other packages are needed to complete implementing Case 2 BWS with R:
a package to construct OMEDs and another to analyze the responses
on the basis of the modeling approach. For example,
the oa.design()
function in DoE.base (Groemping 2018) can
construct OMEDs, while the functions clogit()
in survival
(Therneau 2016), mlogit()
in mlogit (Croissant 2013),
and gmnl()
in gmnl (Sarrias and Daziano 2017) can fit
the conditional logit model. The latter two functions are also used
to fit advanced discrete choice models
such as a mixed (random parameters) logit model. Refer to the task views
about experimental designs (Groemping 2016) and econometrics (Zeileis 2017)
on CRAN for details on packages for experimental designs and
discrete choice models in R.
Acknowledgments
I would like to thank Professor Kazuo Sato for his kind support. This work was supported by JSPS KAKENHI Grant Numbers 25450341, 16K07886, and 20K06251.
Author(s)
Hideo Aizaki
References
Aizaki H, Fogarty J (2019) An R package and tutorial for case 2 best-worst scaling. Journal of Choice Modelling, 32, 100171. doi: 10.1016/j.jocm.2019.100171.
Flynn TN (2010) Valuing citizen and patient preferences in health: recent developments in three types of best-worst scaling. Expert Review of Pharmacoeconomics & Outcomes Research, 10(3), 259–267. doi: 10.1586/erp.10.29.
Flynn TN, Louviere JJ, Peters TJ, Coast J (2007) Best-Worst Scaling: What it can do for health care research and how to do it. Journal of Health Economics, 26, 171–189. doi: 10.1016/j.jhealeco.2006.04.002.
Flynn TN, Louviere JJ, Peters TJ, Coast J (2008) Estimating preferences for a dermatology consultation using best-worst scaling: Comparison of various methods of analysis. BMC Medical Research Methodology, 8(76). doi: 10.1186/1471-2288-8-76.
Croissant Y (2013) mlogit: multinomial logit model. R package version 0.2-4. https://CRAN.R-project.org/package=mlogit.
Groemping U (2018) R Package DoE.base for Factorial Experiments. Journal of Statistical Software, 85(5), 1–41. doi: 10.18637/jss.v085.i05.
Groemping U (2016) CRAN Task View: Design of Experiments (DoE) & Analysis of Experimental Data. https://CRAN.R-project.org/view=ExperimentalDesign.
Hensher DA, Rose JM, Greene WH (2015) Applied Choice Analysis. 2nd edition. Cambridge University Press. doi: 10.1017/CBO9781316136232.
Louviere JJ, Flynn TN, Marley AAJ (2015) Best-Worst Scaling: Theory, Methods and Applications. Cambridge University Press. doi: 10.1017/CBO9781107337855.
Marley AAJ, Flynn TN, Louviere JJ (2008) Probabilistic models of set-dependent and attribute-level best-worst choice. Journal of Mathematical Psychology, 52, 281–296. doi: 10.1016/j.jmp.2008.02.002.
Sarrias M, Daziano R (2017) Multinomial Logit Models with Continuous and Discrete Individual Heterogeneity in R: The gmnl Package. Journal of Statistical Software, 79(2), 1–46. doi: 10.18637/jss.v079.i02.
Therneau T (2015) A Package for Survival Analysis in S. Version 2.38, https://CRAN.R-project.org/package=survival.
Zeileis A (2017) CRAN Task View: Econometrics. https://CRAN.R-project.org/view=Econometrics.