analyze.gain {gainML} | R Documentation |
Analyze Potential Gain from Passive Device Installation on WTGs by Using a Machine Learning-Based Tool
Description
Implements the gain analysis as a whole; this includes data arrangement, period 1 analysis, period 2 analysis, and gain quantification.
Usage
analyze.gain(df1, df2, df3, p1.beg, p1.end, p2.beg, p2.end, ratedPW, AEP,
pw.freq, freq.id = 3, time.format = "%Y-%m-%d %H:%M:%S",
k.fold = 5, col.time = 1, col.turb = 2, bootstrap = NULL,
free.sec = NULL, neg.power = FALSE)
Arguments
df1 |
A dataframe for reference turbine data. This dataframe must include five columns: timestamp, turbine id, wind direction, power output, and air density. |
df2 |
A dataframe for baseline control turbine data. This dataframe must include four columns: timestamp, turbine id, wind speed, and power output. |
df3 |
A dataframe for neutral control turbine data. This dataframe must
include four columns and have the same structure as |
p1.beg |
A string specifying the beginning date of period 1. By default,
the value needs to be specified in ‘%Y-%m-%d’ format, for example,
|
p1.end |
A string specifying the end date of period 1. For example, if
the value is |
p2.beg |
A string specifying the beginning date of period 2. |
p2.end |
A string specifying the end date of period 2. Defined similarly
as |
ratedPW |
A kW value that describes the (common) rated power of the selected turbines (REF and CTR-b). |
AEP |
A kWh value describing the annual energy production from a single turbine. |
pw.freq |
A matrix or a dataframe that includes power output bins and corresponding frequency in terms of the accumulated hours during an annual period. |
freq.id |
An integer indicating the column number of |
time.format |
A string describing the format of time stamps used in the
data to be analyzed. The default value is |
k.fold |
An integer defining the number of data folds for the period 1
analysis and prediction. In the period 1 analysis, |
col.time |
An integer specifying the column number of time stamps in wind turbine datasets. The default value is 1. |
col.turb |
An integer specifying the column number of turbine IDs in wind turbine datasets. The default value is 2. |
bootstrap |
An integer indicating the current replication (run) number
of bootstrap. If set to |
free.sec |
A list of vectors defining free sectors. Each vector in the
list has two scalars: one for starting direction and another for ending
direction, ordered clockwise. For example, a vector of |
neg.power |
Either |
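The clockwise ordering of free.sec means that a sector such as c(310, 50) wraps through north. The following is a minimal base R sketch of how a wind direction could be tested against such sectors. The helpers in.sector and in.any.sector are hypothetical names introduced here for illustration and are not part of gainML; treating both sector boundaries as inclusive is also an assumption, and the package may handle boundaries differently.

```r
# Hypothetical helper (not part of gainML): TRUE where a wind direction in
# degrees lies inside a sector running clockwise from sec[1] to sec[2].
# Assumption: both boundaries are inclusive.
in.sector <- function(dir, sec) {
  dir <- dir %% 360
  beg <- sec[1] %% 360
  end <- sec[2] %% 360
  if (beg <= end) {
    dir >= beg & dir <= end
  } else {
    # The sector wraps through north, e.g. from 310 to 50 degrees.
    dir >= beg | dir <= end
  }
}

# Hypothetical helper: TRUE where a direction falls in at least one sector.
in.any.sector <- function(dir, free.sec) {
  Reduce(`|`, lapply(free.sec, function(sec) in.sector(dir, sec)))
}

free.sec <- list(c(310, 50), c(150, 260))
dirs <- c(0, 45, 100, 200, 300, 320)
in.any.sector(dirs, free.sec)
# TRUE TRUE FALSE TRUE FALSE TRUE
```

A check like this can help confirm that a chosen set of sectors keeps the expected share of observations before running the full analysis.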
Details
Builds a machine learning model for a REF turbine (device installed) and a baseline CTR turbine (CTR-b; without device installation and preferably closest to the REF turbine) by using data measurements from a neutral CTR turbine (CTR-n; without device installation). Gain is quantified by evaluating predictions from the machine learning models and their differences during two different time periods, namely, period 1 (without device installation on the REF turbine) and period 2 (device installed on the REF turbine).
Value
The function returns a list of several objects (lists) that includes all the analysis results from all steps.
data
A list of arranged datasets including period 1 and period 2 data as well as k-folded training and test datasets generated from the period 1 data. See also arrange.data.
p1.res
A list containing period 1 analysis results. This includes the optimal set of predictor variables, period 1 prediction for the REF turbine and CTR-b turbine, the corresponding error measures such as RMSE and BIAS, and BIAS curves for both REF and CTR-b turbine models; see analyze.p1 for the details.
p2.res
A list containing period 2 analysis results. This includes period 2 prediction for the REF turbine and CTR-b turbine. See also analyze.p2.
gain.res
A list containing gain quantification results. This includes effect curve, offset curve, and gain curve as well as the measures of effect (gain without offset), offset, and (the final) gain; see quantify.gain for the details.
Note
This function executes four other functions in sequence, namely, arrange.data, analyze.p1, analyze.p2, and quantify.gain. Alternatively, a user can run the four functions by calling them individually in the same order.
References
H. Hwangbo, Y. Ding, and D. Cabezon, 'Machine Learning Based Analysis and Quantification of Potential Power Gain from Passive Device Installation,' arXiv:1906.05776 [stat.AP], Jun. 2019. https://arxiv.org/abs/1906.05776.
See Also
arrange.data, analyze.p1, analyze.p2, quantify.gain
Examples
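# `wtg` and `pw.freq` are assumed to be available in the session
# (for example, as example datasets loaded with the package).
# Build the REF, CTR-b, and CTR-n data frames from the same example data.
df.ref <- with(wtg, data.frame(time = time, turb.id = 1, wind.dir = D,
  power = y, air.dens = rho))
df.ctrb <- with(wtg, data.frame(time = time, turb.id = 2, wind.spd = V,
  power = y))
# CTR-n reuses the CTR-b structure with a different turbine id.
df.ctrn <- df.ctrb
df.ctrn$turb.id <- 3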
# For Full Sector Analysis
res <- analyze.gain(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
  p1.end = '2014-10-25', p2.beg = '2014-10-25', p2.end = '2014-10-26',
  ratedPW = 1000, AEP = 300000, pw.freq = pw.freq, k.fold = 2)
# In practice, one may use annual data for each of period 1 and period 2 analysis.
# One may typically use k.fold = 5 or 10.
# For Free Sector Analysis
free.sec <- list(c(310, 50), c(150, 260))
res <- analyze.gain(df.ref, df.ctrb, df.ctrn, p1.beg = '2014-10-24',
  p1.end = '2014-10-25', p2.beg = '2014-10-25', p2.end = '2014-10-26',
  ratedPW = 1000, AEP = 300000, pw.freq = pw.freq, k.fold = 2,
  free.sec = free.sec)
gain.res <- res$gain.res
gain.res$gain  # This will provide the final gain value.