dbchoice {DCchoice}

R Documentation

Parametric approach to analyze double-bounded dichotomous choice contingent valuation data

Description

This function analyzes double-bounded dichotomous choice contingent valuation (CV) data on the basis of the utility difference approach.

Usage

dbchoice(formula, data, subset, na.action = na.omit, dist = "log-logistic",
         par = NULL, ...)

## S3 method for class 'dbchoice'
print(x, digits = max(3, getOption("digits") - 1), ...)

## S3 method for class 'dbchoice'
vcov(object, ...)

## S3 method for class 'dbchoice'
logLik(object, ...)

Arguments

`formula`	an object of S3 class `"formula"` that specifies the model structure.
`data`	a data frame containing the variables in the model formula.
`subset`	an optional vector specifying a subset of observations.
`na.action`	a function which indicates what should happen when the data contains `NA`s.
`dist`	a character string setting the error distribution in the model, which takes one of `"logistic"`, `"normal"`, `"log-logistic"`, `"log-normal"` or `"weibull"`.
`par`	a vector of initial parameters over which the optimization is carried out.
`x`	an object of class `"dbchoice"`.
`digits`	the number of digits to display.
`object`	an object of class `"dbchoice"`.
`...`	optional arguments. Currently not in use.

Details

The function dbchoice() implements an analysis of double-bounded dichotomous choice contingent valuation (CV) data on the basis of the utility difference approach (Hanemann, 1984). A generic call to dbchoice() is given by

dbchoice(formula, data, dist = "log-logistic", ...)

The extractor function summary() is available for a "dbchoice" class object. See summary.dbchoice for details.

There are two functions available for computing the confidence intervals for the estimates of WTPs. krCI implements simulations to construct empirical distributions of the WTP while bootCI carries out nonparametric bootstrapping.

The argument formula defines the response variables and covariates. The argument data is mandatory and specifies the data frame containing the variables in the model. The argument dist sets the error distribution. Currently, one of "logistic", "normal", "log-logistic", "log-normal", or "weibull" is available. The default value is dist = "log-logistic", so it may be omitted if the user wants to estimate a model with a log-logistic error distribution.

The difference between the normal and log-normal models, or between the logistic and log-logistic ones, is how the bid variable is incorporated into the model to be estimated. For the Weibull model, the bid variable must be entered in the natural log. Therefore, the user must be careful in defining the model formula, which is explained in detail below.
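As a rough illustration of this difference, the sketch below assumes that the probability of answering "Yes" is the logistic CDF of a linear utility difference, with the bid entering in levels (logistic) or in natural logs (log-logistic, as in the section "Examples"). The coefficients a, b_lin, and b_log are hypothetical values chosen for illustration, not estimates from any data set.

```r
# Hypothetical coefficients (assumptions, not estimates)
a     <- 3      # constant term
b_lin <- -0.1   # coefficient on the bid in levels
b_log <- -1     # coefficient on the natural log of the bid

bid <- c(10, 20, 30, 40)

# Logistic model: the bid enters the utility difference as is
p_logistic <- plogis(a + b_lin * bid)

# Log-logistic model: the bid enters in the natural log
p_loglogistic <- plogis(a + b_log * log(bid))

# Both probabilities of "Yes" fall as the bid rises
data.frame(bid, p_logistic, p_loglogistic)
```

In both cases the same CDF is used; only the transformation of the bid in the formula changes.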

A typical structure of the formula for dbchoice() is defined as follows:

R1 + R2 ~ (the names of the covariates) | BD1 + BD2

The formula is an object of class "formula" and specifies the model structure. It has to be written as a symbolic expression in R. The formula consists of three parts. The first part, the left-hand side of the tilde sign (~), must contain the response variables for the suggested prices in the first and the second stage of CV questions. In the structure above, R1 denotes a binary or two-level factor response variable for a bid in the first stage and R2 for a bid in the second stage. Each of R1 and R2 contains either "Yes" or "No" as the response to the bid, or 1 for "Yes" and 0 for "No".

The covariates are defined in the second part, in place of (the names of the covariates). The covariates are connected with the operator +, so (the names of the covariates) in the above syntax should be replaced with var1 + var2 and the like. In the symbolic expression, the plus sign has nothing to do with addition of the two variables. When the model contains only a constant term, a value of 1 is set as the covariate (that is, R1 + R2 ~ 1 | BD1 + BD2).

The last part starts after the vertical bar (|). The names of the two variables (BD1 and BD2) containing suggested prices in the first and second stage of double-bounded dichotomous choice CV question are specified in this part. The two variables are also connected with the arithmetic operator (+).
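The three parts can be inspected with base R alone, as in the minimal sketch below. The variable names are the placeholder names used above, and the sketch only takes the formula object apart; it does not show how dbchoice() itself processes the formula.

```r
# A formula with the three-part structure described above
fm <- R1 + R2 ~ sex + age + income | BD1 + BD2
class(fm)                    # "formula"

lhs <- fm[[2]]               # first part: R1 + R2
rhs <- fm[[3]]               # everything to the right of the tilde

# The vertical bar binds more loosely than +, so it splits the
# right-hand side into the covariate part and the bid part
covariate_part <- rhs[[2]]   # sex + age + income
bid_part       <- rhs[[3]]   # BD1 + BD2

all.vars(lhs)
all.vars(covariate_part)
all.vars(bid_part)
```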

According to the structure of the formula, a data set (data frame) consists of three parts. An example of the data set is as follows (sex, age, and income are respondents' characteristics and are assumed to be covariates):

`R1`	`R2`	`sex`	`age`	`income`	`BD1`	`BD2`
Yes	Yes	Male	20	Low	100	250
Yes	No	Male	30	Low	500	1000
...

The second bid in the double-bounded dichotomous choice CV question is higher or lower than the first bid according to the response to the first stage: if the response to the first stage is "Yes", the second bid is higher than the first bid; if the response is "No", the second bid is lower than the first bid. In the example above, BD2 holds the second bid that each respondent faced in the second stage. However, the following style of data set is frequently prepared:

`R1`	`R2`	`sex`	`age`	`income`	`BD1`	`BD2H`	`BD2L`
Yes	Yes	Male	20	Low	100	250	50
Yes	No	Male	30	Low	500	1000	250
...

BD2H is the second (higher) bid when the respondent answers "Yes" in the first stage; BD2L is the second (lower) bid when the respondent answers "No" in the first stage. In this case, the users have to convert BD2H and BD2L into BD2 (see the section "Examples").
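A minimal sketch of this conversion on a toy data frame follows. The first two rows repeat the table above; the third row, with a "No" in the first stage, is hypothetical and added only to show the lower bid being picked.

```r
# Toy data: rows 1-2 from the table above, row 3 is hypothetical
dat <- data.frame(
  R1   = c("Yes", "Yes", "No"),
  R2   = c("Yes", "No", "Yes"),
  BD1  = c(100, 500, 100),
  BD2H = c(250, 1000, 250),
  BD2L = c(50, 250, 50)
)

# Take the higher second bid after a first-stage "Yes",
# and the lower second bid after a first-stage "No"
dat$BD2 <- ifelse(dat$R1 == "Yes", dat$BD2H, dat$BD2L)

dat[, c("R1", "R2", "BD1", "BD2")]
```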

The function dbchoice() analyzes double-bounded dichotomous choice CV data using the function optim on the basis of the initial coefficients that are estimated from a binary logit model analysis of the first-stage CV responses (the binary logit model is estimated internally by the function glm with the argument family = binomial(link = "logit")).
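The two-step procedure can be sketched with base R as follows. This is an illustration under stated assumptions, not the internal code of dbchoice(): the data are simulated, the model has only a constant and the bid in levels (a logistic model), the function negll() is a hypothetical helper, and the choice of method = "BFGS" is made here for illustration.

```r
set.seed(123)

# Simulated data (assumption): latent WTP implied by a logistic model
n      <- 400
a_true <- 3
b_true <- -0.1
bid1 <- sample(c(10, 20, 30, 40, 50), n, replace = TRUE)
wtp  <- -(a_true + rlogis(n)) / b_true
R1   <- as.integer(wtp > bid1)                     # first-stage response
bid2 <- ifelse(R1 == 1, 2 * bid1, bid1 / 2)        # higher or lower second bid
R2   <- as.integer(wtp > bid2)                     # second-stage response

# Step 1: binary logit on the first-stage responses gives starting values
start <- coef(glm(R1 ~ bid1, family = binomial(link = "logit")))

# Step 2: double-bounded negative log-likelihood (hypothetical helper)
negll <- function(par) {
  S1 <- plogis(par[1] + par[2] * bid1)   # Pr(WTP > first bid)
  S2 <- plogis(par[1] + par[2] * bid2)   # Pr(WTP > second bid)
  p <- ifelse(R1 == 1 & R2 == 1, S2,               # Yes-Yes
       ifelse(R1 == 1 & R2 == 0, S1 - S2,          # Yes-No
       ifelse(R1 == 0 & R2 == 1, S2 - S1,          # No-Yes
              1 - S2)))                            # No-No
  -sum(log(pmax(p, 1e-12)))              # guard against log(0)
}

fit <- optim(start, negll, method = "BFGS", hessian = TRUE)
fit$par     # estimates using both responses
-fit$value  # log-likelihood at the estimates
```

Starting from the first-stage logit estimates places the optimizer near a sensible region of the parameter space, because the first-stage responses alone already identify the constant and the bid coefficient.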

Nonparametric analysis of double-bounded dichotomous choice data can be done by turnbull.db. A single-bounded analogue of dbchoice is called sbchoice.

Value

This function returns an S3 class object "dbchoice" that is a list with the following components.

`f.stage`	a list of components returned from the function `glm` based on the responses to the first CV question. The coefficient estimates of the first-stage estimation are used as the initial coefficients for the full analysis using the function `optim`. If `par` is not `NULL`, the supplied vector is returned.
`dbchoice`	a list of components returned from the function `optim`.
`coefficients`	a named vector of estimated coefficients.
`call`	the matched call.
`formula`	the formula supplied.
`Hessian`	an estimate of the Hessian. See also `Hessian` in `optim`.
`distribution`	a character string showing the error distribution used.
`loglik`	a value of the log likelihood at the estimates.
`convergence`	a logical code: `TRUE` means a successful convergence.
`niter`	a vector of two integers describing the number of calls to the objective function and the numerical gradient, respectively. See also `counts` in `optim`.
`nobs`	the number of observations.
`covariates`	a named matrix of the covariates used in the model.
`bid`	a named matrix of the bids used in the model.
`yn`	a named matrix of the responses to the initial and follow-up CV questions used in the model.
`data.name`	the data matrix.
`terms`	terms
`contrast`	contrasts used for factors
`xlevels`	levels used for factors

References

Bateman IJ, Carson RT, Day B, Hanemann M, Hanley N, Hett T, Jones-Lee M, Loomes G, Mourato S, Özdemiroğlu E, Pearce DW, Sugden R, Swanson J (eds.) (2002). Economic Valuation with Stated Preference Techniques: A Manual. Edward Elgar, Cheltenham, UK.

Carson RT, Hanemann WM (2005). “Contingent Valuation.” In KG Mäler, JR Vincent (eds.), Handbook of Environmental Economics. Elsevier, New York.

Croissant Y (2011). Ecdat: Data Sets for Econometrics, R package version 0.1-6.1, https://CRAN.R-project.org/package=Ecdat.

Hanemann WM (1984). “Welfare Evaluations in Contingent Valuation Experiments with Discrete Responses.” American Journal of Agricultural Economics, 66(2), 332–341.

Hanemann M, Kanninen B (1999). “The Statistical Analysis of Discrete-Response CV Data.” In IJ Bateman, KG Willis (eds.), Valuing Environmental Preferences: Theory and Practice of the Contingent Valuation Methods in the US, EU, and Developing Countries, 302–441. Oxford University Press, New York.

Hanemann WM, Loomis JB, Kanninen BJ (1991). “Statistical Efficiency of Double-Bounded Dichotomous Choice Contingent Valuation.” American Journal of Agricultural Economics, 73(4), 1255–1263.

Examples

## Examples are based on a data set NaturalPark in the package 
## Ecdat (Croissant 2011): DBDCCV style question for measuring 
## willingness to pay for the preservation of the Alentejo Natural 
## Park. The data set (dataframe) contains seven variables: 
## bid1 (bid in the initial question), bidh (higher bid in the follow-up 
## question), bidl (lower bid in the follow-up question), answers 
## (response outcomes in a factor format with four levels of "nn", 
## "ny", "yn", "yy"), respondents' characteristic variables such 
## as age, sex and income (see NaturalPark for details).
data(NaturalPark, package = "Ecdat")
head(NaturalPark)

## The variable answers is converted into a format that is suitable for the 
## function dbchoice() as follows:
NaturalPark$R1 <- ifelse(substr(NaturalPark$answers, 1, 1) == "y", 1, 0)
NaturalPark$R2 <- ifelse(substr(NaturalPark$answers, 2, 2) == "y", 1, 0)

## We assume that the error distribution in the model is 
## log-logistic; therefore, the bid variable bid1 is converted 
## into LBD1 as follows:
NaturalPark$LBD1 <- log(NaturalPark$bid1)

## Further, the variables bidh and bidl are integrated into one 
## variable (bid2) and the variable is converted into LBD2 as follows:
NaturalPark$bid2 <- ifelse(NaturalPark$R1 == 1, NaturalPark$bidh, NaturalPark$bidl)
NaturalPark$LBD2 <- log(NaturalPark$bid2)

## The utility difference function is assumed to contain covariates (sex, age, and 
## income) as well as two bid variables (LBD1 and LBD2) as follows:
fmdb <- R1 + R2 ~ sex + age + income | LBD1 + LBD2

## Not run: 
## The formula may be alternatively defined as
fmdb <- R1 + R2 ~ sex + age + income | log(bid1) + log(bid2)

## End(Not run)

## The function dbchoice() is executed with the formula fmdb and the 
## data frame NaturalPark as follows:
NPdb <- dbchoice(fmdb, data = NaturalPark)
NPdb
NPdbs <- summary(NPdb)
NPdbs

## The confidence intervals for these WTPs are calculated using the 
## function krCI() or bootCI() as follows:
## Not run: 
krCI(NPdb)
bootCI(NPdb)

## End(Not run)
## The WTP of a female with age = 5 and income = 3 is calculated
## using function krCI() or bootCI() as follows:
## Not run: 
krCI(NPdb, individual = data.frame(sex = "female", age = 5, income = 3))
bootCI(NPdb, individual = data.frame(sex = "female", age = 5, income = 3))

## End(Not run)

## The variables age and income are deleted from the fitted model, 
## and the updated model is fitted as follows:
update(NPdb, .~. - age - income |.)

## The bid design used in this example is created as follows:
bid.design <- unique(NaturalPark[, c(1:3)])
bid.design <- log(bid.design)
colnames(bid.design) <- c("LBD1", "LBDH", "LBDL")
bid.design
## Respondents' utility and probability of choosing Yes-Yes, Yes-No, 
## No-Yes, and No-No under the fitted model and original data are 
## predicted as follows: 
head(predict(NPdb, type = "utility", bid = bid.design))
head(predict(NPdb, type = "probability", bid = bid.design))
## Utility and probability of choosing Yes for a female with age = 5 
## and income = 3 under bid = 10 are predicted as follows:
predict(NPdb, type = "utility",
    newdata = data.frame(sex = "female", age = 5, income = 3, LBD1 = log(10)))
predict(NPdb, type = "probability",
    newdata = data.frame(sex = "female", age = 5, income = 3, LBD1 = log(10)))

## The plot of the probabilities of choosing Yes is drawn as follows:
plot(NPdb)
## The range of bid can be limited (e.g., [log(10), log(20)]):
plot(NPdb, bid = c(log(10), log(20)))

[Package DCchoice version 0.2.0 Index]