R: Power curve comparison

ComparePCurve {DSWE}

R Documentation

Power curve comparison

Description

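Compares the power curves of two data sets. The function returns the difference between the two estimated mean functions, a confidence band for judging whether the two functions are the same, and several percentage difference metrics.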

Usage

ComparePCurve(
  data,
  xCol,
  xCol.circ = NULL,
  yCol,
  testCol,
  testSet = NULL,
  thrs = 0.2,
  conflevel = 0.95,
  gridSize = c(50, 50),
  powerbins = 15,
  baseline = 1,
  limitMemory = TRUE,
  opt_method = "nlminb",
  sampleSize = list(optimSize = 500, bandSize = 5000),
  rngSeed = 1
)

Arguments

`data`	A list of data sets to be compared. The difference in the mean function is always computed as (f(data2) - f(data1)).
`xCol`	A numeric value or vector stating the column number(s) of the covariates
`xCol.circ`	A numeric value or vector stating the column number(s) of the circular covariates
`yCol`	A numeric value stating the column number of the response
`testCol`	A numeric value or vector stating the column number(s) of the covariates to be used in generating the test set. A maximum of two columns may be used.
`testSet`	A matrix or data frame consisting of test points. Default is NULL, in which case test points are computed internally using the testCol variables. If not NULL, the total number of test points must be less than or equal to 2500.
`thrs`	A numeric value or vector representing the threshold for each covariate
`conflevel`	A numeric value in (0,1) giving the confidence level used to construct the band
`gridSize`	A numeric value or vector used in constructing the test set. It should be provided when testSet is NULL; otherwise it is ignored. Default is `c(50,50)` for 2-dim input, which is converted internally to a default of `c(1000)` for 1-dim input. The total number of test points (the product of the gridSize vector components) must be less than or equal to 2500.
`powerbins`	A numeric stating the number of power bins for computing the scaled difference, default is 15.
`baseline`	An integer from 0 to 2, where 1 uses the power curve of the first data set as the base for metric calculation, 2 uses the power curve of the second data set as the base, and 0 uses the average of both power curves as the base. Default is 1.
`limitMemory`	A logical (TRUE/FALSE) indicating whether to limit memory use. Default is TRUE. If TRUE, 5000 data points (the default `bandSize`; see `sampleSize`) are randomly sampled from each data set under comparison for inference.
`opt_method`	A string specifying the optimization method to be used for hyperparameter estimation. Current options are: `'L-BFGS-B'`, `'BFGS'`, and `'nlminb'`. Default is set to `'nlminb'`.
`sampleSize`	A named list of two integer items: `optimSize` and `bandSize`, denoting the sample size for each dataset for hyperparameter optimization and confidence band computation, respectively, when `limitMemory = TRUE`. Default value is `list(optimSize = 500, bandSize = 5000)`.
`rngSeed`	Random seed for sampling data when `limitMemory = TRUE`. Default is 1.
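
The 2500-point limit applies both to a user-supplied `testSet` and to the product of the `gridSize` components. The following is a minimal sketch of building a two-covariate `testSet` by hand. The covariate names and ranges are hypothetical and chosen only for illustration, and the way the package spaces its internally generated grid is not documented here, so this should not be read as a reproduction of it.

```r
# Hypothetical covariate ranges, chosen only for illustration.
gridSize <- c(50, 50)
stopifnot(prod(gridSize) <= 2500)  # limit stated in the argument description

windSpeed <- seq(3, 20, length.out = gridSize[1])
temperature <- seq(0, 30, length.out = gridSize[2])

# Every combination of the two covariates: one row per test point.
testSet <- as.matrix(expand.grid(windSpeed = windSpeed, temperature = temperature))
dim(testSet)
```

A matrix built this way could be passed as `testSet`, in which case `gridSize` is ignored.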

Value

A list containing:

weightedDiff - a numeric, % difference between the functions weighted using the density of the covariates
weightedStatDiff - a numeric, % statistically significant difference between the functions weighted using the density of the covariates
scaledDiff - a numeric, % difference between the functions scaled to the original data
scaledStatDiff - a numeric, % statistically significant difference between the functions scaled to the original data
unweightedDiff - a numeric, % difference between the functions unweighted
unweightedStatDiff - a numeric, % statistically significant difference between the functions unweighted
reductionRatio - a list of the shrinkage ratios of the features used in testSet
mu1 - a vector of predictions on the test set using the first data set
mu2 - a vector of predictions on the test set using the second data set
muDiff - a vector of the differences in prediction (mu2 - mu1) for each test point
band - a vector giving the confidence band at each test point for the two functions to be the same at a given confidence level
confLevel - a numeric representing the confidence level used to construct the band
testSet - a vector/matrix of the test points, either provided by the user or generated internally
estimatedParams - a list of estimated hyperparameters for the Gaussian process model
matchedData - a list of two matched datasets as generated by covariate matching
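
How `mu1`, `mu2`, `muDiff`, and `band` relate can be sketched on made-up numbers. The values below are hypothetical and do not come from the package. The rule that a difference counts as statistically significant when its absolute value exceeds the band is an assumption about how the band is meant to be read, based on the descriptions above.

```r
# Hypothetical stand-in for part of the list returned by ComparePCurve.
result <- list(
  mu1  = c(10.0, 25.0, 60.0, 95.0),
  mu2  = c(10.5, 27.0, 60.2, 99.0),
  band = c(1.0, 1.5, 1.0, 2.0)
)

# Difference in prediction, second data set minus the first.
muDiff <- result$mu2 - result$mu1

# Assumed reading: outside the band means the two functions differ there.
significant <- abs(muDiff) > result$band

# Keep only the significant part of the difference.
statDiff <- ifelse(significant, muDiff, 0)
```

Under this reading, test points whose difference falls inside the band contribute nothing to the statistically significant difference.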

References

For details, see Ding et al. (2021), available at doi:10.1016/j.renene.2021.02.136.

Examples


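library(DSWE)

# Use the first 100 rows of each data set to keep the example fast
data1 = data1[1:100, ]
data2 = data2[1:100, ]
data = list(data1, data2)
xCol = 2          # covariate column
xCol.circ = NULL  # no circular covariates
yCol = 7          # response column
testCol = 2       # covariate used to generate the test set
testSet = NULL    # generate test points internally
thrs = 0.2
conflevel = 0.95
gridSize = 20
function_comparison = ComparePCurve(data, xCol, xCol.circ, yCol,
  testCol, testSet, thrs, conflevel, gridSize)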

[Package DSWE version 1.8.2 Index]