bootstrap.gain {gainML} | R Documentation |
Construct a Confidence Interval of the Gain Estimate
Description
Estimates the gain and its confidence interval at a given confidence level using the bootstrap.
Usage
bootstrap.gain(df1, df2, df3, opt.cov, n.rep, p1.beg, p1.end, p2.beg,
p2.end, ratedPW, AEP, pw.freq, freq.id = 3,
time.format = "%Y-%m-%d %H:%M:%S", k.fold = 5, col.time = 1,
col.turb = 2, free.sec = NULL, neg.power = FALSE,
pred.return = FALSE)
Arguments
df1 |
A dataframe for reference turbine data. This dataframe must include five columns: timestamp, turbine id, wind direction, power output, and air density. |
df2 |
A dataframe for baseline control turbine data. This dataframe must include four columns: timestamp, turbine id, wind speed, and power output. |
df3 |
A dataframe for neutral control turbine data. This dataframe must
include four columns and have the same structure with |
opt.cov |
A character vector indicating the optimal set of variables (obtained from the period 1 analysis). |
n.rep |
An integer describing the total number of replications when
applying bootstrap. This number determines the confidence level; for
example, if |
p1.beg |
A string specifying the beginning date of period 1. By default,
the value needs to be specified in ‘%Y-%m-%d’ format, for example,
|
p1.end |
A string specifying the end date of period 1. For example, if
the value is |
p2.beg |
A string specifying the beginning date of period 2. |
p2.end |
A string specifying the end date of period 2. Defined similarly
as |
ratedPW |
A kW value that describes the (common) rated power of the selected turbines (REF and CTR-b). |
AEP |
A kWh value describing the annual energy production from a single turbine. |
pw.freq |
A matrix or a dataframe that includes power output bins and corresponding frequency in terms of the accumulated hours during an annual period. |
freq.id |
An integer indicating the column number of |
time.format |
A string describing the format of time stamps used in the
data to be analyzed. The default value is "%Y-%m-%d %H:%M:%S". |
k.fold |
An integer defining the number of data folds for the period 1
analysis and prediction. In the period 1 analysis, |
col.time |
An integer specifying the column number of time stamps in wind turbine datasets. The default value is 1. |
col.turb |
An integer specifying the column number of turbine ids in wind turbine datasets. The default value is 2. |
free.sec |
A list of vectors defining free sectors. Each vector in the
list has two scalars: one for starting direction and another for ending
direction, ordered clockwise. For example, a vector of |
neg.power |
Either |
pred.return |
A logical value indicating whether to return the full prediction
results; see Details below. The default value is FALSE. |
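As a sketch of how the free.sec argument might be built, the base R snippet below constructs a list of two-element vectors and checks whether a wind direction falls inside a clockwise sector. The sector bounds and the helper in_free_sector are hypothetical, chosen only for illustration; they are not part of gainML and do not reflect how the package filters data internally.

```r
# Hypothetical free sectors, each given as c(start, end) in degrees,
# ordered clockwise. These values are illustrative only.
free.sec <- list(c(310, 50), c(150, 260))

# Hypothetical helper (not a gainML function): TRUE when direction `d` lies in
# the clockwise sector running from sec[1] to sec[2], end points included.
in_free_sector <- function(d, sec) {
  d <- d %% 360
  if (sec[1] <= sec[2]) {
    d >= sec[1] & d <= sec[2]
  } else {
    # The sector wraps through north (0 degrees).
    d >= sec[1] | d <= sec[2]
  }
}

# Check a few wind directions against every free sector.
dirs <- c(0, 45, 100, 200, 320)
in_any <- sapply(dirs, function(d) {
  any(sapply(free.sec, function(s) in_free_sector(d, s)))
})
data.frame(wind.dir = dirs, in.free.sector = in_any)
```

The wrap-around branch matters for sectors such as c(310, 50), whose starting direction is numerically larger than its ending direction.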
Details
For each replication, this function makes k.fold period 1 predictions
for each of the REF and CTR-b turbine models and one additional
period 2 prediction for each model. This results in 2 * (k.fold + 1)
predictions for each replication. With n.rep replications, there
will be n.rep * 2 * (k.fold + 1) predictions in total.
To avoid storing this many datasets in memory, set pred.return
to FALSE, which is the default setting.
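The number of predictions held in memory can be estimated with a few lines of base R. This is only a sketch of the arithmetic, assuming k.fold period 1 predictions plus one period 2 prediction for each of the two models (REF and CTR-b); count_predictions is a hypothetical helper, not a gainML function.

```r
# Hypothetical helper (not part of gainML): counts the prediction datasets
# produced when the full prediction results are returned.
count_predictions <- function(n.rep, k.fold = 5) {
  per.model <- k.fold + 1    # k.fold period 1 predictions + 1 period 2 prediction
  per.rep <- 2 * per.model   # two models: REF and CTR-b
  c(per.replication = per.rep, total = n.rep * per.rep)
}

count_predictions(n.rep = 10, k.fold = 5)
# per.replication = 12, total = 120
```

Even a modest n.rep multiplies the storage requirement quickly, which is one reason to keep pred.return at FALSE unless the individual predictions are needed.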
Value
The function returns a list of n.rep replication objects (lists),
each of which includes the following.
gain.res
A list containing gain quantification results; see quantify.gain
for the details.
p1.pred
A list containing period 1 prediction results.
pred.REF: A list of k.fold datasets, one per fold, each holding
that fold's period 1 prediction for the REF turbine.
pred.CTR: A list of k.fold datasets, one per fold, each holding
that fold's period 1 prediction for the CTR-b turbine.
p2.pred
A list containing period 2 prediction results; see analyze.p2
for the details.
References
H. Hwangbo, Y. Ding, and D. Cabezon, 'Machine Learning Based Analysis and Quantification of Potential Power Gain from Passive Device Installation,' arXiv:1906.05776 [stat.AP], Jun. 2019. https://arxiv.org/abs/1906.05776.
Examples
df.ref <- with(wtg, data.frame(time = time, turb.id = 1, wind.dir = D,
power = y, air.dens = rho))
df.ctrb <- with(wtg, data.frame(time = time, turb.id = 2, wind.spd = V,
power = y))
df.ctrn <- df.ctrb
df.ctrn$turb.id <- 3
opt.cov = c('D','density','Vn','hour')
n.rep = 2 # just for illustration; a user may use at least 10 for this.
res <- bootstrap.gain(df.ref, df.ctrb, df.ctrn, opt.cov = opt.cov, n.rep = n.rep,
p1.beg = '2014-10-24', p1.end = '2014-10-25', p2.beg = '2014-10-25',
p2.end = '2014-10-26', ratedPW = 1000, AEP = 300000, pw.freq = pw.freq,
k.fold = 2)
length(res) #2
sapply(res, function(ls) ls$gain.res$gainCurve) #This provides 2 gain curves.
sapply(res, function(ls) ls$gain.res$gain) #This provides 2 gain values.
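Once the bootstrap gain values are collected, one way to summarize them is a percentile interval. The sketch below uses simulated placeholder values in place of real bootstrap.gain output, and both the percentile method and the 90% level are assumptions made for illustration; they are not necessarily the interval construction the package authors intend.

```r
set.seed(1)

# Placeholder standing in for sapply(res, function(ls) ls$gain.res$gain).
# These are simulated numbers, not results from any real turbine data.
gains <- rnorm(20, mean = 0.02, sd = 0.005)

# Percentile interval at an assumed 90% level.
ci <- quantile(gains, probs = c(0.05, 0.95), names = FALSE)

c(lower = ci[1], estimate = mean(gains), upper = ci[2])
```

With real output, gains would be replaced by the vector extracted from res, and the probs argument would be adjusted to the desired confidence level.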