data_imputing {PriceIndices}    R Documentation

Imputing missing and (optionally) zero prices.

Description

This function imputes missing prices and (optionally) zero prices by carrying observed prices forward or backward.

Usage

data_imputing(data, start, end, zero_prices = TRUE, outlets = TRUE)

Arguments

data

The user's data frame with information about sold products. It must contain the columns: time (as Date in the format year-month-day, e.g. '2020-12-01'), prices (as numeric), quantities (as numeric, needed for later calculations) and prodID (as numeric, factor or character). A column retID (as factor, character or numeric) is also needed if the user wants to impute prices separately for each outlet.

start

The base period (as character) limited to the year and month, e.g. "2020-03".

end

The research period (as character) limited to the year and month, e.g. "2020-04".

zero_prices

A logical parameter indicating whether zero prices are to be imputed as well (if TRUE).

outlets

A logical parameter indicating whether imputation is to be done for each outlet separately (if TRUE) or for aggregated data (if FALSE).

Value

This function imputes missing prices (unit values) and (optionally) zero prices by carrying observed prices forward or backward. The imputation can be done for each outlet separately or for aggregated data (see the outlets parameter). If a product with a missing price has a previous price, that previous price is carried forward until the next real observation. If there is no previous price, the next real observation is found and carried backward. The quantities for imputed prices are set to zero. The function returns a data frame (monthly aggregated) which is ready for price index calculations.
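
The carry forward/backward rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: impute_series is a hypothetical helper written only for illustration, and it assumes a single product whose prices are already ordered by month, with NA marking a missing price.

```r
# Hypothetical helper (NOT part of PriceIndices): imputes one product's
# price series, assumed to be ordered by time, with NA for missing prices.
impute_series <- function(prices, quantities, zero_prices = TRUE) {
  missing <- is.na(prices)
  # Optionally treat zero prices as missing too
  if (zero_prices) missing <- missing | (!is.na(prices) & prices == 0)
  observed <- which(!missing)
  # With no real observation there is nothing to carry
  if (length(observed) == 0) return(data.frame(prices, quantities))
  # For each position: index of the latest real observation at or before it
  # (0 when no earlier observation exists)
  idx <- findInterval(seq_along(prices), observed)
  # No previous price: carry the first real observation backward
  idx[idx == 0] <- 1
  prices <- prices[observed[idx]]
  # Quantities that accompany imputed prices are set to zero
  quantities[missing] <- 0
  data.frame(prices = prices, quantities = quantities)
}

res <- impute_series(prices     = c(NA, 0, 12, NA, 15, NA),
                     quantities = c(0, 3, 5, 0, 7, 0))
res$prices      # 12 12 12 12 15 15
res$quantities  # 0 0 5 0 7 0
```

In this sketch the first two periods take the next real price (12, carried backward), while the later gaps take the most recent real price (carried forward). To mimic the outlets parameter, such a helper could be applied to each prodID separately, or to each combination of prodID and retID.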

Examples

# Creating a small data set with zero prices:
time.<-c("2018-12-01","2019-01-01")
time<-as.Date(c(time., time.))
p1<-c(0,23)
p2<-c(14,0)
q1<-c(15,25)
q2<-c(44,79)
quantities<-c(q1,q2)
prices<-c(p1,p2)
prodID<-c(1,1,2,2)
my_data<-data.frame(time, prices, quantities, prodID)
# Price imputing:
data_imputing(my_data, start="2018-12", end="2019-01",
              zero_prices=TRUE, outlets=FALSE)

# Preparing a data set with zero and missing prices:
dataMATCH$prodID<-dataMATCH$codeIN
data<-dplyr::select(dataMATCH, time, prices, quantities, prodID, retID)
set1<-data[1:5,]
set1$prices<-0
set2<-data[6:30,]
df<-rbind(set1, set2)
# Price imputing:
data_imputing(df, start="2018-12", end="2019-03",
              zero_prices=TRUE, outlets=TRUE)

[Package PriceIndices version 0.1.9 Index]