| metafuse {metafuse} | R Documentation | 
Fit a GLM with fusion penalty for data integration
Description
Fit a GLM with a fusion penalty on the coefficients of each covariate across datasets, and generate solution paths and fusograms to visualize the model selection.
Usage
metafuse(X = X, y = y, sid = sid, fuse.which = c(0:ncol(X)),
  family = "gaussian", intercept = TRUE, alpha = 0, criterion = "EBIC",
  verbose = TRUE, plots = FALSE, loglambda = TRUE)
Arguments
| X | a matrix (or vector) of predictor(s), with dimensions of  | 
| y | a vector of response, with length  | 
| sid | data source ID of length  | 
| fuse.which | a vector of integers from 0 to ncol(X), indicating which covariates are considered for fusion; 0 corresponds to the intercept | 
| family | response vector type,  | 
| intercept | if  | 
| alpha | the ratio of sparsity penalty to fusion penalty, default is 0 (i.e., no variable selection, only fusion) | 
| criterion | model selection criterion; the default is "EBIC" | 
| verbose | if  | 
| plots | if  | 
| loglambda | if  | 
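The arguments X, y and sid describe one stacked dataset, so per-study data have to be combined before the call. The base R sketch below shows one way to do that. The list `studies` and the column names `y`, `x1` and `x2` are hypothetical, and it assumes X holds only the predictors (no intercept column), as the separate `intercept` argument suggests.

```r
# Hypothetical per-study data frames; names and sizes are illustrative only.
set.seed(1)
studies <- lapply(1:3, function(k) {
  data.frame(y = rnorm(5), x1 = rnorm(5), x2 = rnorm(5))
})

# Stack the studies and record which study each row came from.
stacked <- do.call(rbind, studies)
sid     <- rep(seq_along(studies), times = vapply(studies, nrow, integer(1)))

y <- stacked$y                             # response vector
X <- as.matrix(stacked[, c("x1", "x2")])   # predictors only (assumed: no intercept column)

# Basic consistency checks before passing X, y and sid to metafuse()
stopifnot(length(y) == nrow(X), length(sid) == nrow(X))
```

The same three objects can then be supplied as the `X`, `y` and `sid` arguments shown under Usage.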
Details
An adaptive lasso penalty is used; see Zou (2006) for details.
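For orientation, the adaptive lasso of Zou (2006) weights each coefficient by the inverse of an initial estimate. The first display below states that idea for a generic loss. The second display is only one plausible way to carry it over to fusion across K datasets, with alpha as the sparsity-to-fusion ratio described under Arguments. The symbols, the set of fused pairs (k, k') and the exact weights are assumptions and are not confirmed by this page.

```latex
% Adaptive lasso (Zou, 2006): weights from an initial estimate \tilde\beta, with \gamma > 0
\hat\beta = \arg\min_{\beta}\; L(\beta) + \lambda \sum_{j} \hat w_j\,|\beta_j|,
\qquad \hat w_j = |\tilde\beta_j|^{-\gamma}

% Assumed fusion analogue: \beta_{j,k} is the coefficient of covariate j in dataset k
P_{\lambda,\alpha}(\beta) = \lambda \left[ \sum_{j} \sum_{k<k'} \hat w_{j,kk'}\,|\beta_{j,k}-\beta_{j,k'}|
  \;+\; \alpha \sum_{j} \sum_{k=1}^{K} \hat v_{j,k}\,|\beta_{j,k}| \right],
\qquad \hat w_{j,kk'} = |\tilde\beta_{j,k}-\tilde\beta_{j,k'}|^{-\gamma},\quad
\hat v_{j,k} = |\tilde\beta_{j,k}|^{-\gamma}
```

Under this reading, alpha = 0 removes the second sum, which matches the description of the default as fusion without variable selection.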
Value
A list containing the following items will be returned:
| family | the response/model type | 
| criterion | model selection criterion used | 
| alpha | the ratio of sparsity penalty to fusion penalty | 
| if.fuse | whether each covariate is assumed to be heterogeneous (1) or homogeneous (0) | 
| betahat | the estimated regression coefficients | 
| betainfo | additional information about the fit, including the degrees of freedom, the optimal lambda value, the maximum lambda value to fuse all coefficients, and the estimated friction of fusion | 
References
Lu Tang and Peter X.K. Song. Fused Lasso Approach in Regression Coefficients Clustering - Learning Parameter Heterogeneity in Data Integration. Journal of Machine Learning Research, 17(113):1-23, 2016.
Fei Wang, Lu Wang, and Peter X.K. Song. Fused lasso with the adaptation of parameter ordering in combining multiple studies with repeated measurements. Biometrics, DOI:10.1111/biom.12496, 2016.
Examples
########### generate data ###########
n <- 200    # sample size in each dataset (can also be a K-element vector)
K <- 10     # number of datasets for data integration
p <- 3      # number of covariates in X (including the intercept)
# the coefficient matrix of dimension K * p, used to specify the heterogeneous pattern
beta0 <- matrix(c(0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,   # beta_0 of intercept
                  0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,   # beta_1 of X_1
                  0.0,0.0,0.0,0.0,0.5,0.5,0.5,1.0,1.0,1.0),  # beta_2 of X_2
                K, p)
# generate a dataset; family is one of "gaussian", "binomial", "poisson", "cox"
data <- datagenerator(n=n, beta0=beta0, family="gaussian", seed=123)
# prepare the input for metafuse
y       <- data$y
sid     <- data$group
X       <- data[,-c(1,ncol(data))]
########### run metafuse ###########
# fuse slopes of X1 (which is heterogeneous with 2 clusters)
metafuse(X=X, y=y, sid=sid, fuse.which=c(1), family="gaussian", intercept=TRUE, alpha=0,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)
# fuse slopes of X2 (which is heterogeneous with 3 clusters)
metafuse(X=X, y=y, sid=sid, fuse.which=c(2), family="gaussian", intercept=TRUE, alpha=0,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)
# fuse all three covariates
metafuse(X=X, y=y, sid=sid, fuse.which=c(0,1,2), family="gaussian", intercept=TRUE, alpha=0,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)
# fuse all three covariates, with sparsity penalty
metafuse(X=X, y=y, sid=sid, fuse.which=c(0,1,2), family="gaussian", intercept=TRUE, alpha=1,
          criterion="EBIC", verbose=TRUE, plots=TRUE, loglambda=TRUE)
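The cluster counts mentioned in the comments above (2 clusters for X1, 3 for X2) can be read directly off the true coefficient matrix. The base R sketch below rebuilds `beta0` from the example and counts the distinct values in each column. The helper names `n_clusters` and `members_x2` are illustrative and not part of the package.

```r
K <- 10   # number of datasets
p <- 3    # number of covariates, including the intercept

# Same true coefficient matrix as in the example (filled column by column)
beta0 <- matrix(c(0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,   # intercept
                  0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,   # X_1
                  0.0,0.0,0.0,0.0,0.5,0.5,0.5,1.0,1.0,1.0),  # X_2
                K, p)

# Number of distinct coefficient values (true clusters) per covariate
n_clusters <- apply(beta0, 2, function(b) length(unique(b)))
print(n_clusters)

# Which datasets share a coefficient value for X_2
members_x2 <- split(seq_len(K), beta0[, 3])
print(members_x2)
```

A fit that recovers the true structure would fuse the X2 coefficients of datasets 1 to 4, 5 to 7, and 8 to 10 into three groups.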