data_quality_check {PVplr}    R Documentation

Check the quality of the data before and after cleaning

Description

Calculates the percentage of anomalies, the percentage of missing and zero values, the largest gap, and the length of the data, and reports the data quality before and after cleaning.

Usage

data_quality_check(
  energy_data,
  col = "elec_cons",
  id = "pv_df",
  batch_days = 90
)

Arguments

energy_data

A structured energy data frame.

col

Name of the input column.

id

PV system ID

batch_days

Number of days in each batch of data to which anomaly detection is applied. Because time series decomposition is used, fitting a single seasonality to the whole data set is inefficient, so the data is processed in batches. If NA, the whole data set is passed at once.
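The batching idea behind batch_days can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name split_into_batches, the timestamp column, and the toy data are all hypothetical, and treating NA as "one batch containing everything" is an assumption based on the argument description.

```r
# Hypothetical helper (not part of PVplr): split a data frame into
# consecutive batches of `batch_days` days based on a timestamp column.
split_into_batches <- function(df, time_col = "timestamp", batch_days = 90) {
  # Assumption: NA means "do not batch", so the whole data set is one batch.
  if (is.na(batch_days)) {
    return(list(df))
  }
  # Days elapsed since the first observation.
  elapsed_days <- as.numeric(
    difftime(df[[time_col]], min(df[[time_col]]), units = "days")
  )
  # Integer batch index: 0 for the first `batch_days` days, 1 for the next, ...
  batch_index <- floor(elapsed_days / batch_days)
  unname(split(df, batch_index))
}

# Toy data (made up for illustration): 200 daily readings.
toy <- data.frame(
  timestamp = as.POSIXct("2020-01-01", tz = "UTC") + (0:199) * 86400,
  elec_cons = seq_len(200)
)

batches <- split_into_batches(toy, batch_days = 90)
whole <- split_into_batches(toy, batch_days = NA)
```

Each element of batches could then be decomposed separately, so a seasonality is estimated per batch rather than once for the whole series.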

Details

The quality grading criteria are as follows:

anomalies: A: less than 10 percent

missing percentage: A: less than 10 percent

largest gap: A: less than 120 hours, B: 120 to 164 hours, C: 164 to 240 hours, D: more than 240 hours

length: P: more than 2 years, F: less than 2 years
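The largest-gap and length grades can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper names are hypothetical, a gap is assumed to be the time between consecutive timestamps, and the boundary values (exactly 164 hours, exactly 240 hours, exactly 2 years) are assigned by assumption because the criteria do not say which grade they fall into.

```r
# Hypothetical helpers (not part of PVplr) illustrating the grading criteria.

# Largest gap between consecutive timestamps, in hours.
# Assumption: a "gap" is the time between two consecutive observations.
largest_gap_hours <- function(timestamps) {
  max(as.numeric(diff(sort(timestamps)), units = "hours"))
}

# Grade the largest gap. Assumption: 164 hours counts as C and
# 240 hours counts as C, since D requires "more than 240 hours".
grade_largest_gap <- function(gap_hours) {
  if (gap_hours < 120) {
    "A"
  } else if (gap_hours < 164) {
    "B"
  } else if (gap_hours <= 240) {
    "C"
  } else {
    "D"
  }
}

# Grade the data length. Assumption: exactly 2 years counts as a pass.
grade_length <- function(length_years) {
  if (length_years >= 2) "P" else "F"
}

# Toy timestamps (made up): three hourly readings, then a 130-hour gap.
ts <- as.POSIXct("2020-01-01", tz = "UTC") + c(0, 1, 2, 132) * 3600
gap <- largest_gap_hours(ts)
gap_grade <- grade_largest_gap(gap)
```

With the toy timestamps, the largest gap is 130 hours, which falls in the 120 to 164 hour band.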

Value

A table grading the data quality before and after cleaning.

Author(s)

Arash Khalilnejad


[Package PVplr version 0.1.2 Index]