bootstrap.gain {gainML} | R Documentation |
Construct a Confidence Interval of the Gain Estimate
Description
Estimates the gain and its confidence interval at a given confidence level using the bootstrap.
Usage
bootstrap.gain(df1, df2, df3, opt.cov, n.rep, p1.beg, p1.end, p2.beg,
p2.end, ratedPW, AEP, pw.freq, freq.id = 3,
time.format = "%Y-%m-%d %H:%M:%S", k.fold = 5, col.time = 1,
col.turb = 2, free.sec = NULL, neg.power = FALSE,
pred.return = FALSE)
Arguments
df1 |
A dataframe for reference turbine data. This dataframe must include five columns: timestamp, turbine id, wind direction, power output, and air density. |
df2 |
A dataframe for baseline control turbine data. This dataframe must include four columns: timestamp, turbine id, wind speed, and power output. |
df3 |
A dataframe for neutral control turbine data. This dataframe must
include four columns and have the same structure with |
opt.cov |
A character vector indicating the optimal set of variables (obtained from the period 1 analysis). |
n.rep |
An integer describing the total number of replications when
applying bootstrap. This number determines the confidence level; for
example, if |
p1.beg |
A string specifying the beginning date of period 1. By default,
the value needs to be specified in ‘%Y-%m-%d’ format, for example,
|
p1.end |
A string specifying the end date of period 1. For example, if
the value is |
p2.beg |
A string specifying the beginning date of period 2. |
p2.end |
A string specifying the end date of period 2. Defined similarly
as |
ratedPW |
A kW value that describes the (common) rated power of the selected turbines (REF and CTR-b). |
AEP |
A kWh value describing the annual energy production from a single turbine. |
pw.freq |
A matrix or a dataframe that includes power output bins and corresponding frequency in terms of the accumulated hours during an annual period. |
freq.id |
An integer indicating the column number of |
time.format |
A string describing the format of time stamps used in the
data to be analyzed. The default value is |
k.fold |
An integer defining the number of data folds for the period 1
analysis and prediction. In the period 1 analysis, |
col.time |
An integer specifying the column number of time stamps in wind turbine datasets. The default value is 1. |
col.turb |
An integer specifying the column number of turbine IDs in wind turbine datasets. The default value is 2. |
free.sec |
A list of vectors defining free sectors. Each vector in the
list has two scalars: one for starting direction and another for ending
direction, ordered clockwise. For example, a vector of |
neg.power |
Either |
pred.return |
A logical value indicating whether to return the full prediction
results; see Details below. The default value is |
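As a minimal sketch of how the date and time strings described above behave in base R, the following parses a few hypothetical timestamps with the default time.format and a period boundary written in '%Y-%m-%d' format. The sample strings and the variable names are assumptions for illustration; how bootstrap.gain parses or filters timestamps internally is not shown here.

```r
# Hypothetical timestamp strings written in the default time.format.
time.format <- "%Y-%m-%d %H:%M:%S"
stamps <- c("2014-10-24 00:10:00", "2014-10-24 00:20:00", "2014-10-25 00:00:00")
parsed <- as.POSIXct(stamps, format = time.format, tz = "UTC")

# A period boundary given as a date string in '%Y-%m-%d' format.
p1.beg <- as.POSIXct("2014-10-24", format = "%Y-%m-%d", tz = "UTC")

# A string that does not match the declared format parses to NA,
# which is a quick way to catch a wrong time.format before a long run.
bad <- as.POSIXct("24/10/2014 00:10", format = time.format, tz = "UTC")
is.na(bad)
```

Checking a handful of timestamps this way before calling the function can reveal a format mismatch early.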
Details
For each replication, this function makes k period 1 predictions (one per fold, with k given by k.fold) for each of the REF and CTR-b turbine models, plus one additional period 2 prediction for each model. This results in 2 × (k + 1) predictions per replication. With n.rep replications, there will be n.rep × 2 × (k + 1) predictions in total.
One can avoid storing this many datasets in memory by setting pred.return to FALSE, which is the default setting.
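The count described in this section is simple arithmetic, sketched below. The helper count.predictions is hypothetical and not part of gainML; it only restates the n.rep × 2 × (k + 1) rule.

```r
# Hypothetical helper (not part of gainML): number of prediction
# datasets produced when pred.return = TRUE.
count.predictions <- function(n.rep, k.fold) {
  per.model <- k.fold + 1   # k period 1 folds plus one period 2 prediction
  per.rep <- 2 * per.model  # two models: REF and CTR-b
  n.rep * per.rep
}

count.predictions(n.rep = 2, k.fold = 2)   # 12
count.predictions(n.rep = 10, k.fold = 5)  # 120
```

With the default k.fold = 5, every additional replication adds 12 prediction datasets, which is why keeping pred.return at FALSE matters for larger n.rep.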
Value
The function returns a list of n.rep replication objects (lists), each of which includes the following.
gain.res
A list containing gain quantification results; see quantify.gain for the details.
p1.pred
A list containing period 1 prediction results.
pred.REF: A list of k datasets, each representing the k-th fold's period 1 prediction for the REF turbine.
pred.CTR: A list of k datasets, each representing the k-th fold's period 1 prediction for the CTR-b turbine.
p2.pred
A list containing period 2 prediction results; see analyze.p2 for the details.
References
H. Hwangbo, Y. Ding, and D. Cabezon, 'Machine Learning Based Analysis and Quantification of Potential Power Gain from Passive Device Installation,' arXiv:1906.05776 [stat.AP], Jun. 2019. https://arxiv.org/abs/1906.05776.
Examples
library(gainML)  # wtg and pw.freq are assumed to be available once the package is loaded
# Build the REF, CTR-b, and CTR-n datasets from the example data.
df.ref <- with(wtg, data.frame(time = time, turb.id = 1, wind.dir = D,
                               power = y, air.dens = rho))
df.ctrb <- with(wtg, data.frame(time = time, turb.id = 2, wind.spd = V,
                                power = y))
df.ctrn <- df.ctrb
df.ctrn$turb.id <- 3
opt.cov <- c('D', 'density', 'Vn', 'hour')
n.rep <- 2  # just for illustration; a user may use at least 10 for this.
res <- bootstrap.gain(df.ref, df.ctrb, df.ctrn, opt.cov = opt.cov, n.rep = n.rep,
                      p1.beg = '2014-10-24', p1.end = '2014-10-25', p2.beg = '2014-10-25',
                      p2.end = '2014-10-26', ratedPW = 1000, AEP = 300000, pw.freq = pw.freq,
                      k.fold = 2)
length(res)  # 2
sapply(res, function(ls) ls$gain.res$gainCurve)  # This provides 2 gain curves.
sapply(res, function(ls) ls$gain.res$gain)  # This provides 2 gain values.
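Once the per-replication gain values have been collected as in the last line above, an interval can be summarized from them. The sketch below uses a plain percentile interval, which is one common bootstrap approach and an assumption here; it is not necessarily the rule the package applies. The gains vector is simulated, hypothetical data standing in for real output, and the 90% level is chosen only for illustration.

```r
# Hypothetical bootstrap gain values, standing in for
# sapply(res, function(ls) ls$gain.res$gain) from a larger run.
set.seed(1)
gains <- rnorm(200, mean = 0.015, sd = 0.004)

# Percentile interval at an assumed 90% confidence level.
conf.level <- 0.90
alpha <- 1 - conf.level
ci <- quantile(gains, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)

# Point estimate taken as the mean over replications (also an assumption).
point.est <- mean(gains)
```

With only a few replications the tail quantiles are poorly determined, so the achievable confidence level is tied to the number of replications.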