squant {squant}R Documentation

The SQUANT method

Description

squant conducts subgroup identification based on quantitative criteria.

Usage

squant(
  yvar,
  censorvar = NULL,
  xvars,
  trtvar = NULL,
  trtcd = 1,
  data,
  type = "c",
  weight = NULL,
  dir = "larger",
  quant = NULL,
  xvars.keep = NULL,
  alpha = 1,
  fold = 5,
  n.cv = 50,
  FDR = 0.15,
  progress = TRUE
)

Arguments

yvar

A character. The response variable name in the data. The corresponding column in the data should be numeric.

censorvar

A character or NULL. The event indicator variable name in the data. The corresponding column in the data should be 0(censor) or 1(event). Use NULL when it is not a time-to-event case.

xvars

A vector of characters. The covariates (predictors) variable names in the data. The corresponding columns in the data should be numeric.

trtvar

A character or NULL. The trt variable name in the data for the predictive case. The corresponding column in the data should contain the treatment assignments, and can be either numeric or character. Use NULL for the prognostics case.

trtcd

The code for the treatment arm for the predictive case, e.g. trtcd="treatment" or trtcd=1, etc.

data

The data frame for training.

type

The response type. Use "s" for survival, "b" for binary, and "c" for continuous.

weight

The weight of every observation, has to be a numeric vector>0 or NULL (equivalent to all 1).

dir

A character, "larger" or "smaller". When dir == "larger", larger response is preferred for the target subgroup. In the predictive case, it means selecting patients satisfying E(Y|X,TRT)-E(Y|X,CTRL)>=quant. In the prognostic case, it means selecting patients satisfying E(Y|X)>=quant. When dir == "smaller", smaller response is preferred for the target subgroup. In the predictive case, it means selecting patients satisfying E(Y|X,CTRL)-E(Y|X,TRT)>=quant. In the prognostic case, it means selecting patients satisfying E(Y|X)<=quant.

quant

A numeric value or NULL. The quantitative subgroup selection criterion. Please see dir. When NULL, the program will automatically select the best quant based on cross validation.

xvars.keep

A character vector. The names of variables that we want to keep in the final model.

alpha

The same alpha as in glmnet. alpha=1 is the lasso penalty.

fold

A numeric value. The number of folds for internal cross validation for variable selection.

n.cv

A numeric value. The number of different values of quant used for cross validation. It's also the number of CV to conduct variable selection.

FDR

A numeric value. The level of FDR control for variable selection and the entire training process.

progress

a logical value (TRUE/FALSE), whether to display the program progress.

Details

This is the main function of SQUANT to train subgroup signatures. This method can handle continuous, binary and survival endpoint for both prognostic and predictive case. For the predictive case, the method aims at identifying a subgroup for which treatment is better than control by at least a pre-specified or auto-selected constant. For the prognostic case, the method aims at identifying a subgroup that is at least better than a pre-specified/auto-selected constant. The derived signature is a linear combination of predictors, and the selected subgroup are subjects with the signature > 0. The false discover rate when no true subgroup exists is strictly controlled at a user-specified level.

Value

An object of "squant". A list containing the following elements.

squant.fit

The fitted signature from training, which is the coefficients of the linear combination of predictors plus an intercept.

data.pred

The training data with the predicted subgroup in the last column.

performance

The output of eval_squant (excluding the data.pred). The performance of subgroup identification. In the predictive case, the performance includes the interaction p value, the p value of the trt difference in the selected positive group, the p value of the trt difference in the unselected negative group, and the stats for each arm in each group. In the prognostic case, the performance includes p value of group comparison and the stats of each group.

d.sel

Closely related to quant.Please see element: interpretation.

min.diff, threshold

Please see interpretation.

xvars.top

The ordered variable importance list.

FDR.min

The minimum achieveable FDR threshold so that a signature can be derived. This is useful when a pre-specified FDR does not lead to a signature, in which case the FDR.min can be used instead.

interpretation1

Interpretation of the result.

interpretation2

Interpretation of the result.

References

Yan Sun, Samad Hedayat. Subgroup Identification based on Quantitative Objectives. (submitted)

Examples

#toy example#
set.seed(888)
x=as.data.frame(matrix(rnorm(200),100,2))
names(x) = c("x1", "x2")
trt = sample(0:1, size=100, replace=TRUE)
y= 2*x[,2]*trt+rnorm(100)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=c("x1", "x2"),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=2, n.cv = 10, FDR = 0.1, progress=FALSE)



#predictive case with continuous response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
trt = sample(0:1, size=200, replace=TRUE)
y=x[,1]+x[,2]*trt+rnorm(200)
data = cbind(y=y, trt=trt, x)
res = squant(yvar="y", censorvar=NULL, xvars=paste("x", 1:100,sep=""),
             trtvar="trt", trtcd=1, data=data, type="c", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res
#fitted signature#
res$squant.fit
#performance of the identified subgroup#
#including:
#  interaction p value,
#  p valve of trt difference in positive group,
#  p value of trt difference in negative group,
#  and stats for each arm in each group.
res$performance
#interpretation#
res$interpretation1
res$interpretation2

#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar=NULL, trtvar="trt", trtcd=1, dir="larger",
                       type="c", data=data, squant.out=res, brief=FALSE)
#plot the subgroups#
plot(res, trt.name="Trt", ctrl.name="Ctrl")
plot(eval.res, trt.name="Trt", ctrl.name="Ctrl")



#prognostic case with survival response#
set.seed(888)
x=as.data.frame(matrix(rnorm(20000),200,100))
names(x) = paste("x", 1:100,sep="")
y=10*(10+x[,1]+rnorm(200))
data = cbind(y=y, x)
data$event = sample(c(rep(1,150),rep(0,50)))
res = squant(yvar="y", censorvar="event", xvars=paste("x", 1:100,sep=""),
             trtvar=NULL, trtcd=NULL, data=data, type="s", weight=NULL,
             dir="larger", quant=NULL, xvars.keep=NULL, alpha=1,
             fold=5, n.cv = 50, FDR = 0.1)
res

#fitted signature#
res$squant.fit
#performance of the identified subgroup#
res$performance
#evaluation of prediction performance#
eval.res = eval_squant(yvar="y", censorvar="event", trtvar=NULL, trtcd=NULL, dir="larger",
                       type="s", data=data, squant.out=res, brief=FALSE)

#plot the subgroups#
plot(res, trt.name=NULL, ctrl.name=NULL)
plot(eval.res, trt.name=NULL, ctrl.name=NULL)



[Package squant version 1.1.4 Index]