cpd {pricelevels}    R Documentation

CPD and NLCPD methods

Description

The function cpd() estimates regional price levels using the Country-Product-Dummy (CPD) method, originally developed by Summers (1973). Auer and Weinand (2022) proposed a nonlinear generalization of this method (the NLCPD method), which is implemented in the function nlcpd().

Usage

cpd(p, r, n, q=NULL, w=NULL, base=NULL, simplify=TRUE, settings=list())

nlcpd(p, r, n, q=NULL, w=NULL, base=NULL, simplify=TRUE, settings=list(), ...)

Arguments

p

A numeric vector of prices.

r, n

A character vector or factor of regional entities r and products n, respectively.

q, w

A numeric vector of non-negative quantities q or weights w. By default, no weights are used in the regression (q=NULL and w=NULL). While w can be any weights considered appropriate for a weighted regression, q results in an expenditure share weighted regression (see Details). If both q and w are provided, q is used.

base

A character specifying the base region relative to which the estimated logarithmic regional price levels are expressed. When NULL, they are expressed relative to the (unweighted) regional average, similar to contr.sum.

simplify

A logical indicating whether the full regression-object should be provided (FALSE) or a named vector of estimated regional price levels (TRUE).

settings

A list of control settings to be used. The following settings are supported:

  • chatty : A logical specifying if warnings and info messages should be printed or not. The default is getOption("pricelevels.chatty").

  • connect : A logical specifying if the data should be checked for connectedness or not. The default is getOption("pricelevels.connect"). If the data are not connected, price levels are computed within the biggest block of connected regions or the block of regions to which the base region belongs. See also connect().

  • norm.weights : A logical specifying if the weights w should be renormalized such that they add up to 1 for each region r or not. The default is TRUE.

  • plot : A logical specifying if the calculated price levels should be plotted or not. If TRUE, the price ratios of each region are displayed as boxplots and the price levels are added as colored points. The default is getOption("pricelevels.plot").

  • self.start : The strategy by which nlcpd() internally derives parameter start values; only used if par=NULL. Currently, the values s1, s2 and s3 are allowed. For s1, simple price averages across products and regions are used as start values; for s2 and s3, the start values are derived by the CPD method. Start values for delta are set to 1, except for s3, where they are derived from their first-order condition. By default, self.start='s1'.

  • use.jac : A logical indicating if the Jacobian matrix should be used by nlcpd() for the nonlinear optimization or not. The default is FALSE.

  • w.delta : A named vector of weights for the delta-parameter (see Details). Vector length must be equal to the number of products, while names must match product names. If not supplied, \delta_i weights are derived internally by nlcpd() from the weights w.

...

Further arguments passed to nls.lm, typically arguments control, par, upper, and lower. For par, upper, and lower, vectors must have names for each parameter separated by a dot, e.g., lnP.1, pi.2, or delta.3.
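
The dotted parameter names can be produced by calling unlist() on a named list of named vectors, as the Examples below do with par.init. A minimal base R sketch, assuming made-up start values for three regions and two products:

```r
# Hypothetical start values (illustrative numbers only)
par.init <- list(
  lnP   = setNames(rep(0, 3), 1:3),  # one log price level per region
  pi    = setNames(rep(2, 2), 1:2),  # one general price per product
  delta = setNames(rep(1, 2), 1:2)   # one elasticity per product
)

# unlist() joins the list name and the element name with a dot
names(unlist(par.init))
# "lnP.1" "lnP.2" "lnP.3" "pi.1" "pi.2" "delta.1" "delta.2"
```

The same naming applies to vectors passed as lower and upper.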

Details

The CPD method is a linear regression model that explains the logarithmic price of product i in region r, \ln p_i^r, by the general product price, \ln \pi_i, and the overall price level, \ln P^r:

\ln p_i^r = \ln \pi_i + \ln P^r + u_i^r
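
As an illustration of this regression, the following base R sketch fits an unweighted CPD model with lm() on toy data generated exactly from the equation above. It is not the code used inside cpd(); the data, the object names, and the choice of region "1" as the base are assumptions made for the example.

```r
# Toy data: 3 regions x 4 products, generated exactly from the CPD model
# (all numbers are made up for illustration)
lnP  <- c("1" = 0, "2" = 0.10, "3" = -0.05)    # log regional price levels
lnpi <- c(a = 1.0, b = 2.0, c = 0.5, d = 1.5)  # log general product prices
d <- expand.grid(region = names(lnP), product = names(lnpi),
                 stringsAsFactors = TRUE)
d$price <- exp(lnpi[as.character(d$product)] + lnP[as.character(d$region)])

# Product dummies without intercept; region dummies use treatment contrasts,
# so the log price level of the first region ("1") is fixed at 0
fit <- lm(log(price) ~ 0 + product + region, data = d)

# Price levels relative to region "1"
P.base <- setNames(exp(c(0, coef(fit)[c("region2", "region3")])),
                   levels(d$region))

# Same price levels, re-expressed relative to the unweighted regional
# average of the log price levels (they then sum to zero in logs)
P.avg <- exp(log(P.base) - mean(log(P.base)))
```

Because the toy prices contain no error term, P.base reproduces exp(lnP) up to numerical precision.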

The NLCPD method extends the CPD model by product-specific elasticities \delta_i:

\ln p_i^r = \ln \pi_i + \delta_i \ln P^r + u_i^r

Note that both the CPD and the NLCPD method require a normalization of the estimated price levels \widehat{\ln P^r} to avoid multicollinearity. If base=NULL, normalization \sum_{r=1}^{R} \widehat{\ln P^r}=0 is used in both functions; otherwise, one price level is set to 0. The NLCPD method additionally imposes the restriction \sum_{i=1}^{N} w_i \widehat{\delta_i}=1, where the weights w_i can be defined by settings$w.delta. In nlcpd(), it is always the parameter \widehat{\delta_1} that is derived residually from this restriction.
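
The residual derivation of the first elasticity can be checked by hand. A small sketch with made-up weights and elasticities for four products, assuming the first weight is nonzero:

```r
# Hypothetical weights and elasticities for N = 4 products (made-up numbers)
w     <- c(0.4, 0.3, 0.2, 0.1)  # weights w_i
delta <- c(NA, 1.2, 0.8, 1.1)   # delta_2, ..., delta_N as if freely estimated

# delta_1 follows residually from the restriction sum_i w_i * delta_i = 1
delta[1] <- (1 - sum(w[-1] * delta[-1])) / w[1]

delta[1]        # 0.925
sum(w * delta)  # 1
```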

Before calculations start, missing values are excluded and duplicated observations for r and n are aggregated, that is, duplicated prices p and weights w are averaged and duplicated quantities q added up.
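
The aggregation rule can be mimicked in base R as follows. This is only a sketch on made-up data, assuming a simple arithmetic mean for duplicated prices; it does not reproduce the package's internal code.

```r
# Made-up observations; region "1" / product "a" appears twice
d <- data.frame(
  region   = c("1", "1", "1", "2"),
  product  = c("a", "a", "b", "a"),
  price    = c(2.0, 4.0, 1.0, 3.0),
  quantity = c(10, 30, 5, 8)
)

# Average duplicated prices and add up duplicated quantities
p.agg <- aggregate(price ~ region + product, data = d, FUN = mean)
q.agg <- aggregate(quantity ~ region + product, data = d, FUN = sum)
d.agg <- merge(p.agg, q.agg, by = c("region", "product"))

d.agg  # one row per region-product pair
```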

If q is provided, expenditure shares are derived as w_i^r = p_i^r q_i^r / \sum_{j=1}^{N} p_j^r q_j^r and used as weights in the regression. If only w is provided, the weights w are (re-)normalized by default. If the weights w do not represent expenditure shares, the (re-)normalization can be turned off by settings=list(norm.weights=FALSE).
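
The expenditure share formula translates directly into base R. The data below are made up, and the column names are chosen for the example only:

```r
# Made-up prices and quantities for 2 regions x 2 products
d <- data.frame(
  region   = c("1", "1", "2", "2"),
  product  = c("a", "b", "a", "b"),
  price    = c(2, 1, 3, 1),
  quantity = c(10, 20, 5, 45)
)

# Expenditure of each observation and its share in the region's total
d$expenditure <- d$price * d$quantity
d$share <- d$expenditure / ave(d$expenditure, d$region, FUN = sum)

# Shares add up to 1 within each region
tapply(d$share, d$region, sum)
```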

Value

For simplify=TRUE, a named vector of (unlogged) regional price levels. Otherwise, for cpd(), an lm object containing the full regression output, and for nlcpd(), the full output of nls.lm() plus the element w.delta.

Author(s)

Sebastian Weinand

References

Auer, L. v. and Weinand, S. (2022). A Nonlinear Generalization of the Country-Product-Dummy Method. Discussion Paper 2022/45, Deutsche Bundesbank.

Summers, R. (1973). International Price Comparisons based upon Incomplete Data. Review of Income and Wealth, 19 (1), 1-16.

See Also

lm, dummy.coef, nls.lm

Examples

# sample complete price data:
set.seed(123)
R <- 3 # number of regions
B <- 1 # number of product groups
N <- 5 # number of products
dt1 <- rdata(R=R, B=B, N=N)

# compute expenditure share weighted cpd and nlcpd index:
dt1[, cpd(p=price, r=region, n=product, q=quantity)]
dt1[, nlcpd(p=price, r=region, n=product, q=quantity)]

# set individual start values in nlcpd():
par.init <- list("lnP"=setNames(rep(0, R), 1:R),
                 "pi"=setNames(rep(2, N), 1:N),
                 "delta"=setNames(rep(1, N), 1:N))
dt1[, nlcpd(p=price, r=region, n=product, q=quantity, par=par.init)]

# use lower and upper bounds on parameters:
dt1[, nlcpd(p=price, r=region, n=product, q=quantity,
            lower=unlist(par.init)-0.1, upper=unlist(par.init)+0.1)]

# change internal calculation of start values:
dt1[, nlcpd(p=price, r=region, n=product, q=quantity, settings=list(self.start="s2"))]

# add price data:
dt2 <- rdata(R=4, B=1, N=4)
dt2[, "region":=factor(region, labels=4:7)]
dt2[, "product":=factor(product, labels=6:9)]
dt <- rbind(dt1, dt2)
dt[, is.connected(r=region, n=product)] # non-connected now

# compute expenditure share weighted cpd and nlcpd index:
dt[, cpd(p=price, r=region, n=product, q=quantity, base="1")]
dt[, nlcpd(p=price, r=region, n=product, q=quantity, base="1")]

# compare with toernqvist index:
dt[, toernqvist(p=price, r=region, n=product, q=quantity, base="1")]


# computational speed in nlcpd() usually increases if use.jac=TRUE:
set.seed(123)
dt3 <- rdata(R=20, B=1, N=30)
system.time(m1 <- dt3[, nlcpd(p=price, r=region, n=product, q=quantity,
                              settings=list(use.jac=FALSE), simplify=FALSE,
                              control=minpack.lm::nls.lm.control("maxiter"=200))])
system.time(m2 <- dt3[, nlcpd(p=price, r=region, n=product, q=quantity,
                              settings=list(use.jac=TRUE), simplify=FALSE,
                              control=minpack.lm::nls.lm.control("maxiter"=200))])
all.equal(m1$par, m2$par, tolerance=1e-05)


[Package pricelevels version 1.3.0 Index]