analyze.p1 {gainML}    R Documentation

Apply Period 1 Analysis

Description

Conducts the period 1 analysis: selects the optimal set of predictor variables, i.e., the set that minimizes a k-fold cross-validation (CV) error measure, and fits a machine learning model that predicts the power output of the REF and CTR-b turbines from period 1 data.

Usage

analyze.p1(train, test, ratedPW)

Arguments

train

A list containing k datasets that will be used to train the machine learning model.

test

A list containing k datasets that will be used to test the machine learning model and calculate CV error measures.

ratedPW

The (common) rated power, in kW, of the selected turbines (REF and CTR-b).
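
To make the expected shape of train and test concrete, the following is a minimal sketch in base R of two parallel lists of k data frames built from a toy dataset. The helper name make.folds, the toy columns, and the random fold assignment are illustrative assumptions and not part of gainML; in practice these lists come from arrange.data (see Examples), whose internal splitting scheme may differ.

```r
# Hypothetical helper (not part of gainML): split a data frame into k folds
# and return parallel lists of training and test sets.
make.folds <- function(df, k) {
  # Assign each row to one of k folds at random (assumed scheme).
  fold.id <- sample(rep(seq_len(k), length.out = nrow(df)))
  train <- lapply(seq_len(k), function(i) df[fold.id != i, , drop = FALSE])
  test <- lapply(seq_len(k), function(i) df[fold.id == i, , drop = FALSE])
  list(train = train, test = test)
}

set.seed(1)
# Toy data; column names and values are made up for illustration.
toy <- data.frame(wind.spd = runif(20, 3, 15), power = runif(20, 0, 1000))
folds <- make.folds(toy, k = 2)

length(folds$train)  # one training set per fold
length(folds$test)   # one test set per fold
```

The ith element of train and the ith element of test together cover the full toy dataset, which is the pairing a k-fold CV error measure relies on.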

Value

The function returns a list containing period 1 analysis results as follows.

opt.cov

A character vector presenting the names of predictor variables chosen for the optimal set.

pred.REF

A list of k datasets, each representing the kth fold's period 1 prediction for the REF turbine.

pred.CTR

A list of k datasets, each representing the kth fold's period 1 prediction for the CTR-b turbine.

err.REF

A data frame containing the k-fold CV based RMSE and BIAS values for the REF turbine model (k values of each, one per fold). The first column holds the RMSE values and the second column holds the BIAS values.

err.CTR

A data frame containing the k-fold CV based RMSE and BIAS values for the CTR-b turbine model. Structured in the same way as err.REF.

biasCurve.REF

A k by m matrix describing the binned BIAS curve (technically speaking, the 'residuals', which are the negative BIAS) for the REF turbine model, where m is the number of power bins.

biasCurve.CTR

A k by m matrix describing the binned BIAS curve for the CTR-b turbine model.
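
As a rough illustration of how the error summaries above relate to fold-wise predictions, the sketch below computes per-fold RMSE and BIAS and a k by m binned residual curve from simulated data. Several choices here are assumptions rather than the package's documented method: BIAS is taken as mean(predicted - observed), so that the residual (observed - predicted) is its negative; errors are left in kW without normalization; the bins are m equal-width intervals of observed power on [0, ratedPW]; and the column names power and pred are hypothetical.

```r
# Toy stand-in for per-fold predictions; column names are assumptions.
set.seed(1)
k <- 2          # number of folds
m <- 5          # number of power bins (assumed)
ratedPW <- 1000 # rated power in kW

pred <- lapply(seq_len(k), function(i) {
  obs <- runif(50, 0, ratedPW)
  data.frame(power = obs, pred = obs + rnorm(50, sd = 30))
})

# Per-fold RMSE and BIAS; BIAS is taken here as mean(predicted - observed).
err <- do.call(rbind, lapply(pred, function(d) {
  data.frame(RMSE = sqrt(mean((d$pred - d$power)^2)),
             BIAS = mean(d$pred - d$power))
}))

# Binned residual curve: mean of (observed - predicted) within each of
# m equal-width power bins on [0, ratedPW]; one row per fold.
breaks <- seq(0, ratedPW, length.out = m + 1)
biasCurve <- t(vapply(pred, function(d) {
  bin <- cut(d$power, breaks = breaks, include.lowest = TRUE)
  as.numeric(tapply(d$power - d$pred, bin, mean))
}, numeric(m)))

dim(err)        # k rows, 2 columns
dim(biasCurve)  # k rows, m columns
```

A bin that receives no observations in a fold yields NA in this sketch; how the package treats empty bins is not stated here.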

Note

VERY IMPORTANT!

References

H. Hwangbo, Y. Ding, and D. Cabezon, 'Machine Learning Based Analysis and Quantification of Potential Power Gain from Passive Device Installation,' arXiv:1906.05776 [stat.AP], Jun. 2019. https://arxiv.org/abs/1906.05776.

Examples

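library(gainML)

# Build per-turbine data frames from the wtg dataset shipped with gainML.
df.ref <- with(wtg, data.frame(time = time, turb.id = 1, wind.dir = D,
 power = y, air.dens = rho))
df.ctrb <- with(wtg, data.frame(time = time, turb.id = 2, wind.spd = V,
 power = y))
# For illustration, the CTR-n turbine reuses the CTR-b data under a new id.
df.ctrn <- df.ctrb
df.ctrn$turb.id <- 3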

data <- arrange.data(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
 p1.end = '2014-10-25', p2.beg = '2014-10-25', p2.end = '2014-10-26',
 k.fold = 2)

p1.res <- analyze.p1(data$train, data$test, ratedPW = 1000)
p1.res$opt.cov # The optimal set of variables.
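
The remaining elements of the result can be summarized across folds in a similar way. The sketch below uses a mock list with made-up numbers in place of a real analyze.p1 result so that it runs without the package; only the element names and the column order of err.REF are taken from the Value section, and the column names RMSE and BIAS are assumptions.

```r
# Mock stand-in for the list returned by analyze.p1 (values are made up).
mock.res <- list(
  err.REF = data.frame(RMSE = c(40, 44), BIAS = c(-2, 1)),
  biasCurve.REF = matrix(c(5, -3, 2, 0, -1, 4), nrow = 2)
)

# Average the fold-wise errors: first column is RMSE, second is BIAS.
avg.err <- colMeans(mock.res$err.REF)

# Average the binned curve over folds to get one value per power bin.
avg.curve <- colMeans(mock.res$biasCurve.REF)
```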


[Package gainML version 0.1.0 Index]