data_preparing {PriceIndices} | R Documentation |
Preparing a data set for further data processing or price index calculations
Description
This function returns a prepared data frame based on the user's data set. The resulting data frame is ready for further data processing (such as data selecting, matching or filtering) and for price index calculations, provided that it contains the required columns.
Usage
data_preparing(
data,
time = NULL,
prices = NULL,
quantities = NULL,
prodID = NULL,
retID = NULL,
description = NULL,
codeIN = NULL,
codeOUT = NULL,
grammage = NULL,
unit = NULL,
additional = c(),
zero_prices = FALSE,
zero_quantities = TRUE
)
Arguments
data |
The user's data frame to be prepared. The user must indicate columns: |
time |
A character name of the column which provides transaction dates. |
prices |
A character name of the column which provides product prices. |
quantities |
A character name of the column which provides product quantities. |
prodID |
A character name of the column which provides product IDs. The |
retID |
A character name of the column which provides outlet IDs (retailer sale points). The |
description |
A character name of the column which provides product descriptions. This column is not required for data preparation, but it is required for product selection (please see the |
codeIN |
A character name of the column which provides internal product codes (from the retailer). This column is not required for data preparation, but it may be required for product matching (please see the |
codeOUT |
A character name of the column which provides external product codes (e.g. GTIN or SKU). This column is not required for data preparation, but it may be required for product matching (please see the |
grammage |
A character name of the numeric column which provides the grammage of products |
unit |
A character name of the column which provides the unit of the grammage of products |
additional |
A character vector of names of additional columns to be considered while data preparing (records with missing values are deleted). |
zero_prices |
A logical parameter indicating whether zero prices are accepted. |
zero_quantities |
A logical parameter indicating whether zero quantities are accepted. |
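As an illustration of how the column-name arguments are used, the sketch below maps a small hand-made data frame onto the arguments listed above. The data frame `sales` and its column names (`date`, `price`, `qty`, `shop`) are assumptions made for this example and do not come from the package; the call runs only if PriceIndices is installed.

```r
# Hypothetical scanner data; the column names date, price, qty and shop
# are assumptions made for this illustration.
sales <- data.frame(
  date  = as.Date(c("2020-01-15", "2020-01-20", "2020-02-03")),
  price = c(2.5, 2.7, 3.0),
  qty   = c(10, 5, 2),
  shop  = c("A", "A", "B")
)

# Run the call only if the PriceIndices package is installed.
if (requireNamespace("PriceIndices", quietly = TRUE)) {
  prepared <- PriceIndices::data_preparing(
    sales,
    time       = "date",   # column with transaction dates
    prices     = "price",  # column with product prices
    quantities = "qty",    # column with product quantities
    retID      = "shop"    # column with outlet IDs
  )
  str(prepared)
}
```

Each argument receives the name of a column as a character string, not the column itself.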
Value
The resulting data frame is free from: missing values, negative prices (if zero_prices is set to TRUE), zero or negative prices (if zero_prices is set to FALSE), negative quantities (if zero_quantities is set to TRUE) and zero or negative quantities (if zero_quantities is set to FALSE). The column time is converted to Date type (in the format 'Year-Month-01'), and the columns prices and quantities are converted to numeric. If the column description is selected, it is converted to character type. If the columns prodID, retID, codeIN or codeOUT are selected, they are converted to factor type.
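The cleaning rules described above can be approximated in base R. The following is only a minimal sketch under stated assumptions: `clean_sketch` is a hypothetical helper written for this illustration, it is not the PriceIndices implementation, and it assumes the input already has columns named time, prices and quantities.

```r
# Hypothetical helper: a base-R approximation of the rules described above,
# not the source code of data_preparing().
clean_sketch <- function(df, zero_prices = FALSE, zero_quantities = TRUE) {
  # Drop records with missing values
  df <- df[stats::complete.cases(df), , drop = FALSE]
  df$prices <- as.numeric(df$prices)
  df$quantities <- as.numeric(df$quantities)
  # Negative values are always removed; zeros are kept only if allowed
  keep_p <- if (zero_prices) df$prices >= 0 else df$prices > 0
  keep_q <- if (zero_quantities) df$quantities >= 0 else df$quantities > 0
  df <- df[keep_p & keep_q, , drop = FALSE]
  # Normalise dates to the first day of the month ('Year-Month-01')
  df$time <- as.Date(format(as.Date(df$time), "%Y-%m-01"))
  df
}

# Made-up records: one zero price, one negative price,
# one zero quantity and one missing date.
raw <- data.frame(
  time       = c("2020-01-15", "2020-01-20", "2020-02-03", "2020-02-10", NA),
  prices     = c(2.5, 0, -1, 3.0, 4.0),
  quantities = c(10, 5, 2, 0, 1),
  stringsAsFactors = FALSE
)

cleaned <- clean_sketch(raw)  # defaults: zero prices dropped, zero quantities kept
print(cleaned)
```

With the default settings, only the first and fourth records survive, and their dates become 2020-01-01 and 2020-02-01.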
Examples
data_preparing(milk, time = "time", prices = "prices", quantities = "quantities")
data_preparing(dataCOICOP, time = "time",
prices = "prices", quantities = "quantities", additional = "coicop6")