sse {studyStrap} | R Documentation |
Trained-on-Observed-Studies Ensemble (Study-Specific Ensemble) for Multi-Study Learning: fits one or more models on each study and ensembles models.
Description
Trained-on-Observed-Studies Ensemble (Study-Specific Ensemble) for Multi-Study Learning: fits one or more models on each study and combines the fitted models into an ensemble.
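The idea in the description can be sketched in base R. This is an illustration only and not the studyStrap implementation: it assumes a single learner (`lm`) per study and a plain unweighted average of the study-specific predictions, whereas the package also supports stacking and similarity-based weighting. The names `dat`, `fits`, `newX`, `predMat` and `ensPred` are hypothetical.

```r
# Illustrative sketch only: NOT the studyStrap implementation.
set.seed(1)
n <- 300
dat <- data.frame(Study = sample.int(3, n, replace = TRUE),  # 3 simulated studies
                  V1 = rnorm(n), V2 = rnorm(n))
dat$Y <- 5 + 10 * dat$V1 + 15 * dat$V2 + rnorm(n)

# Fit one linear model per study
fits <- lapply(split(dat, dat$Study), function(d) lm(Y ~ V1 + V2, data = d))

# New covariate profiles to predict on
newX <- data.frame(V1 = rnorm(5), V2 = rnorm(5))

# One column of predictions per study-specific model
predMat <- sapply(fits, predict, newdata = newX)

# Ensemble by simple averaging (an assumption made for this sketch)
ensPred <- rowMeans(predMat)
```

Replacing `rowMeans` with a weighted combination of the columns of `predMat` gives the general form of a weighted ensemble.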
Usage
sse(formula = Y ~ ., data, target.study = NA, sim.covs = NA,
ssl.method = list("lm"), ssl.tuneGrid = list(c()),
sim.mets = FALSE, model = FALSE, customFNs = list(),
stack.standardize = FALSE)
Arguments
formula |
Model formula |
data |
A data frame containing all the studies, with the following columns in this order: "Study", "Y", "V1", ..., "Vp" |
target.study |
Data frame containing the design matrix (covariates only) of the study on which one aims to make predictions |
sim.covs |
A vector of covariate names or covariate column numbers to be used for the similarity measure. Default is to use all covariates. |
ssl.method |
A list of strings indicating which modeling methods to use. |
ssl.tuneGrid |
A list of tuning parameter grids in the format of the caret package. Each element must be a data frame (as required by caret). Use NA for a method that requires no tuning parameters. |
sim.mets |
Boolean indicating whether to calculate default covariate profile similarity measures. |
model |
Boolean indicating whether to attach the training data to the model object. |
customFNs |
Optional list of functions that can be used to add custom covariate profile similarity measures. |
stack.standardize |
Boolean determining whether stacking weights are standardized to sum to 1. Default is FALSE. |
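The argument conventions above can be sketched in base R. Several things here are assumptions: `raw`, `dat`, `w` and `wStd` are hypothetical names with made-up values; pairing `ssl.method` and `ssl.tuneGrid` by position is inferred from the examples on this page; and dividing weights by their sum is only one plausible reading of "standardized to sum to 1", not a statement about the package internals.

```r
# Hypothetical raw data whose columns are in an arbitrary order
set.seed(1)
raw <- data.frame(V2 = rnorm(6), Y = rnorm(6),
                  Study = rep(1:2, each = 3), V1 = rnorm(6))

# Reorder so that "Study" and "Y" come first, followed by the covariates
covs <- setdiff(names(raw), c("Study", "Y"))
dat <- raw[, c("Study", "Y", sort(covs))]

# Methods and tuning grids as parallel lists: NA for "lm" (no tuning),
# a one-row data frame for "pcr" (number of components)
ssl.method   <- list("lm", "pcr")
ssl.tuneGrid <- list(NA, data.frame(ncomp = 1))
stopifnot(length(ssl.method) == length(ssl.tuneGrid))

# Rescale made-up nonnegative weights so that they sum to 1
w <- c(0.2, 0.5, 0.1)
wStd <- w / sum(w)
```

Checking the column order and the list lengths before calling `sse` is a cheap way to catch a mismatched method and tuning grid early.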
Value
A model object of studyStrap class "ss" that can be used to make predictions.
Examples
##########################
##### Simulate Data ######
##########################
set.seed(1)
# create half of training dataset from 1 distribution
X1 <- matrix(rnorm(2000), ncol = 2) # design matrix - 2 covariates
B1 <- c(5, 10, 15) # true beta coefficients
y1 <- cbind(1, X1) %*% B1
# create 2nd half of training dataset from another distribution
X2 <- matrix(rnorm(2000, 1,2), ncol = 2) # design matrix - 2 covariates
B2 <- c(10, 5, 0) # true beta coefficients
y2 <- cbind(1, X2) %*% B2
X <- rbind(X1, X2)
y <- c(y1, y2)
study <- sample.int(10, 2000, replace = TRUE) # 10 studies
data <- data.frame( Study = study, Y = y, V1 = X[,1], V2 = X[,2] )
# create target study design matrix for covariate profile similarity weighting and
# accept/reject algorithm (covariate-matched study strap)
target <- matrix(rnorm(1000, 3, 5), ncol = 2) # design matrix
colnames(target) <- c("V1", "V2")
##########################
##### Model Fitting #####
##########################
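## Fit model without a Target Study ##
# Fit model with 1 Single-Study Learner (SSL): Principal Component Regression
sseMod <- sse(formula = Y ~.,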
data = data,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 1)),
model = FALSE,
customFNs = list() )
## Fit models with Target Study Specified ##
# Fit model with 1 Single-Study Learner (SSL): Linear Regression
sseMod1 <- sse(formula = Y ~.,
data = data,
target.study = target,
ssl.method = list("lm"),
ssl.tuneGrid = list(NA),
sim.mets = FALSE,
model = FALSE,
customFNs = list() )
# Fit model with 2 SSLs: Linear Regression and Principal Component Regression
sseMod2 <- sse(formula = Y ~.,
data = data,
target.study = target,
ssl.method = list("lm", "pcr"),
ssl.tuneGrid = list(NA,
data.frame("ncomp" = 1)),
sim.mets = TRUE,
model = FALSE,
customFNs = list() )
# Fit model with custom similarity function for
# covariate profile similarity weighting
fn1 <- function(x1,x2){
return( abs( cor( colMeans(x1), colMeans(x2) )) )
}
sseMod3 <- sse(formula = Y ~.,
data = data,
target.study = target,
ssl.method = list("lm", "pcr"),
ssl.tuneGrid = list(NA,
data.frame("ncomp" = 1)),
sim.mets = TRUE,
model = FALSE,
customFNs = list(fn1) )
#########################
##### Predictions ######
#########################
preds <- studyStrap.predict(sseMod1, target)