data_preparing {PriceIndices}    R Documentation

Preparing a data set for further data processing or price index calculations

Description

This function returns a prepared data frame based on the user's data set. The resulting data frame is ready for further data processing (such as data selecting, matching or filtering) and for price index calculations, provided that it contains the required columns.

Usage

data_preparing(
  data,
  time = NULL,
  prices = NULL,
  quantities = NULL,
  prodID = NULL,
  retID = NULL,
  description = NULL,
  codeIN = NULL,
  codeOUT = NULL,
  grammage = NULL,
  unit = NULL,
  additional = c(),
  zero_prices = FALSE,
  zero_quantities = TRUE
)

Arguments

data

The user's data frame to be prepared. The user must indicate the following columns: time (as Date or character type; allowed formats include, e.g., '2020-03' or '2020-12-28'), prices and quantities (as numeric). Optionally, the user may also indicate the columns prodID, codeIN, codeOUT, retID (as numeric, factor or character), description (as character), grammage (as numeric or character), unit (as character) and any other columns specified by the additional parameter.
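As an illustration of the input layout described above, the following base R sketch builds a small data frame with the three mandatory columns and a few optional ones, then checks the mandatory columns against the documented types. The data values and the object name raw are invented for this example and do not come from the package.

```r
# Hypothetical scanner-data extract; all values are invented for illustration.
raw <- data.frame(
  time        = c("2020-03-02", "2020-03-17", "2020-04-05"),
  prices      = c(2.49, 2.59, 2.45),
  quantities  = c(120, 95, 130),
  prodID      = c("A1", "A1", "B7"),
  retID       = c(1, 1, 2),
  description = c("whole milk 1l", "whole milk 1l", "skimmed milk 1l"),
  stringsAsFactors = FALSE
)

# Check the three mandatory columns against the documented types.
stopifnot(
  is.character(raw$time) || inherits(raw$time, "Date"),
  is.numeric(raw$prices),
  is.numeric(raw$quantities)
)

# Inspect the column types before handing the data to data_preparing.
str(raw)
```

A data frame shaped like this could then be passed as the data argument, with the column names supplied through time, prices, quantities and the optional parameters.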

time

A character name of the column which provides transaction dates.

prices

A character name of the column which provides product prices.

quantities

A character name of the column which provides product quantities.

prodID

A character name of the column which provides product IDs. The prodID column should include unique product IDs used for product matching (as numeric or character). This column is not required for data preparation, but it is required for price index calculations (to obtain such IDs, please see the data_matching function).

retID

A character name of the column which provides outlet IDs (retailer sale points). The retID column should include unique outlet IDs used for aggregating subindices over outlets. This column is not required for data preparation, but it is required for calculating the final price index (please see the final_index function).

description

A character name of the column which provides product descriptions. This column is not required for data preparation, but it is required for product selection (please see the data_selecting function).

codeIN

A character name of the column which provides internal product codes (from the retailer). This column is not required for data preparation, but it may be required for product matching (please see the data_matching function).

codeOUT

A character name of the column which provides external product codes (e.g. GTIN or SKU). This column is not required for data preparation, but it may be required for product matching (please see the data_matching function).

grammage

A character name of the numeric column which provides the grammage of products.

unit

A character name of the column which provides the unit of the grammage of products.

additional

A character vector of names of additional columns to be kept during data preparation (records with missing values in these columns are deleted).
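The effect of listing a column in additional can be pictured with a short base R sketch: rows with a missing value in any selected column are dropped. This is only an approximation of the documented behaviour under the assumption that the check covers the selected columns; it is not the package's own code, and the data values are invented. The column name coicop6 is borrowed from the Examples section.

```r
# Hypothetical data; 'coicop6' stands in for any column passed via 'additional'.
raw <- data.frame(
  time       = c("2020-03", "2020-03", "2020-04"),
  prices     = c(2.49, 2.59, 2.45),
  quantities = c(120, 95, 130),
  coicop6    = c("011411", NA, "011411"),
  stringsAsFactors = FALSE
)

additional <- c("coicop6")
cols <- c("time", "prices", "quantities", additional)

# Keep only the selected columns and drop rows with a missing value in any of them.
kept <- raw[complete.cases(raw[, cols, drop = FALSE]), cols, drop = FALSE]

nrow(kept)  # 2: the record with the missing 'coicop6' value is removed
```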

zero_prices

A logical parameter indicating whether records with zero prices are to be accepted (kept in the resulting data frame).

zero_quantities

A logical parameter indicating whether records with zero quantities are to be accepted (kept in the resulting data frame).

Value

The resulting data frame is free from missing values, negative prices (if zero_prices is set to TRUE), zero and negative prices (if zero_prices is set to FALSE), negative quantities (if zero_quantities is set to TRUE), and zero and negative quantities (if zero_quantities is set to FALSE). The time column is converted to Date type (in the format 'Year-Month-01'), and the prices and quantities columns are converted to numeric. If the description column is selected, it is converted to character type. If any of the columns prodID, retID, codeIN or codeOUT are selected, they are converted to factor type.
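The rules described in this section can be approximated in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: the helper name prepare_sketch and the sample data are hypothetical, the columns are assumed to be named time, prices and quantities, and missing values are checked across all columns rather than only the selected ones.

```r
# Hypothetical helper mimicking the documented cleaning rules (not package code).
prepare_sketch <- function(df, zero_prices = FALSE, zero_quantities = TRUE) {
  # 1. Drop records with missing values.
  df <- df[complete.cases(df), , drop = FALSE]

  # 2. Normalise time to the first day of the month ('Year-Month-01').
  #    Works for 'YYYY-MM', 'YYYY-MM-DD' and Date input.
  ym <- substr(as.character(df$time), 1, 7)
  df$time <- as.Date(sprintf("%s-01", ym))

  # 3. Coerce prices and quantities to numeric.
  df$prices <- as.numeric(df$prices)
  df$quantities <- as.numeric(df$quantities)

  # 4. Negative values are always removed; zeros only when the flag is FALSE.
  keep_p <- if (zero_prices) df$prices >= 0 else df$prices > 0
  keep_q <- if (zero_quantities) df$quantities >= 0 else df$quantities > 0
  df <- df[keep_p & keep_q, , drop = FALSE]

  # 5. Coerce the optional ID/code columns to factor and description to character.
  for (col in intersect(c("prodID", "retID", "codeIN", "codeOUT"), names(df))) {
    df[[col]] <- as.factor(df[[col]])
  }
  if ("description" %in% names(df)) {
    df$description <- as.character(df$description)
  }

  rownames(df) <- NULL
  df
}

# Invented sample data covering each rule.
raw <- data.frame(
  time       = c("2020-03-02", "2020-03-17", "2020-04-05", "2020-04-09", "2020-04-20"),
  prices     = c(2.49, 0, -1, 2.45, NA),
  quantities = c(120, 95, 130, 0, 10),
  prodID     = c("A1", "A1", "B7", "B7", "B7"),
  stringsAsFactors = FALSE
)

res <- prepare_sketch(raw)              # zero prices dropped, zero quantities kept
nrow(res)                               # 2
nrow(prepare_sketch(raw, zero_quantities = FALSE))  # 1
nrow(prepare_sketch(raw, zero_prices = TRUE))       # 3
```

In the sample data, the record with a missing price and the record with a negative price are removed under every setting; the zero-price and zero-quantity records are kept or dropped depending on the two flags.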

Examples

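# Prepare the 'milk' data set using the three mandatory columns
data_preparing(milk, time = "time", prices = "prices", quantities = "quantities")
# Also keep the additional 'coicop6' column of the 'dataCOICOP' data set
data_preparing(dataCOICOP, time = "time", prices = "prices",
               quantities = "quantities", additional = "coicop6")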

[Package PriceIndices version 0.1.9 Index]