coef.ecpc {ecpc}R Documentation

Obtain coefficients from 'ecpc' object

Description

Obtain regression coefficients or penalties from an existing model fit given in an 'ecpc' object, re-estimate regression coefficients for a given 'ecpc' object and ridge penalties, or obtain ridge penalties for given prior parameters and co-data.

Usage

## S3 method for class 'ecpc'
coef(object, penalties = NULL, 
          X = NULL, Y = NULL, 
          unpen = NULL, intrcpt = TRUE, 
          model = c("linear", "logistic", "cox"), 
          est_beta_method = c("glmnet", "multiridge"), ...)

penalties(object, tauglobal=NULL, sigmahat=NULL, gamma=NULL, gamma0=NULL, w=NULL,
          Z=NULL, groupsets=NULL,
          unpen=NULL, datablocks=NULL)

Arguments

object

An 'ecpc' object returned by ecpc.

penalties

Ridge penalties; p-dimensional vector. If provided to coef.ecpc, 'X' and 'Y' should be provided too.

tauglobal

Estimated global prior variance; scalar (or vector with datatype-specific global prior variances when multiple ‘datablocks’ are given).) If provided to penalties, 'Z' or 'groupsets' should be provided too.

sigmahat

(linear model) Estimated sigma^2. If provided to penalties, 'Z' or 'groupsets' should be provided too.

gamma

Estimated co-data variable weights; vector of dimension the total number of groups. If provided to penalties, 'Z' or 'groupsets' should be provided too.

gamma0

Estimated co-data variable intercept; scalar. If provided to penalties, 'Z' or 'groupsets' should be provided too.

w

Estimated group set weights; m-dimensional vector. If provided to penalties, 'Z' or 'groupsets' should be provided too.

X

Observed data; (nxp)-dimensional matrix (p: number of covariates) with each row the observed high-dimensional feature vector of a sample.

Y

Response data; n-dimensional vector (n: number of samples) for linear and logistic outcomes, or Surv object for Cox survival.

Z

List with m co-data matrices. Each element is a (pxG)-dimensional co-data matrix containing co-data on the p variables. Co-data should either be provided in ‘Z’ or ‘groupsets’.

groupsets

Co-data group sets; list with m (m: number of group sets) group sets. Each group set is a list of all groups in that set. Each group is a vector containing the indices of the covariates in that group.

unpen

Unpenalised covariates; vector with indices of covariates that should not be penalised.

intrcpt

Should an intercept be included? Included by default for linear and logistic, excluded for Cox for which the baseline hazard is estimated.

model

Type of model for the response; linear, logistic or cox.

est_beta_method

Package used for estimating regression coefficients, either "glmnet" or "multiridge".

datablocks

(optional) for multiple data types, the corresponding blocks of data may be given in datablocks; a list of B vectors of the indices of covariates in ‘X’ that belong to each of the B data blocks. Unpenalised covariates should not be given as seperate block, but can be omitted or included in blocks with penalised covariates. Each datatype obtains a datatype-specific ‘tauglobal’ as in multiridge.

...

Other parameters

Value

For coef.ecpc, a list with:

intercept

If included, the estimated intercept; scalar.

beta

Estimated regression coefficients; p-dimensional vector.

For penalties: a p-dimensional vector with ridge penalties.

See Also

penalties for obtaining penalties for given prior parameters and co-data.

Examples

 
#####################
# Simulate toy data #
#####################
p<-300 #number of covariates
n<-100 #sample size training data set
n2<-100 #sample size test data set

#simulate all betas i.i.d. from beta_k~N(mean=0,sd=sqrt(0.1)):
muBeta<-0 #prior mean
varBeta<-0.1 #prior variance
indT1<-rep(1,p) #vector with group numbers all 1 (all simulated from same normal distribution)

#simulate test and training data sets:
Dat<-simDat(n,p,n2,muBeta,varBeta,indT1,sigma=1,model='linear') 
str(Dat) #Dat contains centered observed data, response data and regression coefficients

###################
# Provide co-data #
###################
continuousCodata <- abs(Dat$beta) 
Z1 <- cbind(continuousCodata,sqrt(continuousCodata))

#setting 2: splines for informative continuous
Z2 <- createZforSplines(values=continuousCodata)
S1.Z2 <- createS(orderPen=2, G=dim(Z2)[2]) #create difference penalty matrix
Con2 <- createCon(G=dim(Z2)[2], shape="positive+monotone.i") #create constraints

#setting 3: 5 random groups
G <- 5
categoricalRandom <- as.factor(sample(1:G,p,TRUE))
#make group set, i.e. list with G groups:
groupsetRandom <- createGroupset(categoricalRandom)
Z3 <- createZforGroupset(groupsetRandom,p=p)
S1.Z3 <- createS(G=G, categorical = TRUE) #create difference penalty matrix
Con3 <- createCon(G=dim(Z3)[2], shape="positive") #create constraints

#fit ecpc for the three co-data matrices with following penalty matrices and constraints
#note: can also be fitted without paraPen and/or paraCon
Z.all <- list(Z1=Z1,Z2=Z2,Z3=Z3)
paraPen.all <- list(Z2=list(S1=S1.Z2), Z3=list(S1=S1.Z3))
paraCon <- list(Z2=Con2, Z3=Con3)

############
# Fit ecpc #
############
tic<-proc.time()[[3]]
fit <- ecpc(Y=Dat$Y,X=Dat$Xctd,
           Z = Z.all, paraPen = paraPen.all, paraCon = paraCon,
           model="linear",maxsel=c(5,10,15,20),
           Y2=Dat$Y2,X2=Dat$X2ctd)
toc <- proc.time()[[3]]-tic

#estimate coefficients for twice as large penalties
new_coefficients <- coef(fit, penalties=fit$penalties*2, X=Dat$Xctd, Y=Dat$Y)

#change some prior parameters and find penalties
gamma2 <- fit$gamma; gamma2[1:3] <- 1:3
new_penalties <- penalties(fit, gamma=gamma2, Z=Z.all)
new_coefficients2 <- coef(fit, penalties=new_penalties, X=Dat$Xctd, Y=Dat$Y)



[Package ecpc version 3.1.1 Index]