simml {simml} | R Documentation |
Single-index models with multiple-links (main function)
Description
simml
is the wrapper function for Single-index models with multiple-links (SIMML).
The function estimates a linear combination (a single-index) of covariates X, and models the treatment-specific outcome y, via treatment-specific nonparametrically-defined link functions.
Usage
simml(y, A, X, Xm = NULL, aug = NULL, family = "gaussian",
R = NULL, bs = "cr", k = 8, sp = NULL, linear.link = FALSE,
method = "GCV.Cp", gamma = 1, rho = 0, beta.ini = NULL,
ind.to.be.positive = NULL, scale.si.01 = FALSE, max.iter = 20,
eps.iter = 0.01, trace.iter = TRUE, lambda = 0, pen.order = 0,
scale.X = TRUE, center.X = TRUE, ortho.constr = TRUE,
si.main.effect = FALSE, random.effect = FALSE, z = NULL,
plots = FALSE, bootstrap = FALSE, nboot = 200, boot.conf = 0.95,
seed = 1357)
Arguments
y |
a n-by-1 vector of treatment outcomes; y is a member of the exponential family; any distribution supported by |
A |
a n-by-1 vector of treatment variable; each element is assumed to take a value in a finite discrete space. |
X |
a n-by-p matrix of baseline covarates. |
Xm |
a n-by-q design matrix associated with an X main effect model; the defult is |
aug |
a n-by-1 additional augmentation vector associated with the X main effect; the default is |
family |
specifies the distribution of y; e.g., "gaussian", "binomial", "poisson"; can be any family supported by |
R |
the number of response categories for the case of family = "ordinal". |
bs |
basis type for the treatment (A) and single-index joint effect; the defult is "ps" (p-splines); any basis supported by |
k |
basis dimension for the spline-type-represented treatment-specific link functions. |
sp |
smoothing paramter for the treatment-specific link functions; if |
linear.link |
if |
method |
the smoothing parameter estimation method; "GCV.Cp" to use GCV for unknown scale parameter and Mallows' Cp/UBRE/AIC for known scale; any method supported by |
gamma |
increase this beyond 1 to produce smoother models. |
rho |
a tuning parameter associated with the additional augmentation vector |
beta.ini |
an initial value for |
ind.to.be.positive |
for identifiability of the solution |
scale.si.01 |
if |
max.iter |
an integer specifying the maximum number of iterations for |
eps.iter |
a value specifying the convergence criterion of algorithm. |
trace.iter |
if |
lambda |
a regularization parameter associated with the penalized LS for |
pen.order |
0 indicates the ridge penalty; 1 indicates the 1st difference penalty; 2 indicates the 2nd difference penalty, used in a penalized least squares (LS) estimation of |
scale.X |
if |
center.X |
if |
ortho.constr |
separates the interaction effects from the main effect (without this, the interaction effect can be confounded by the main effect; the default is |
si.main.effect |
if |
random.effect |
if |
z |
a factor that specifies the random intercepts when |
plots |
if |
bootstrap |
if |
nboot |
when |
boot.conf |
a value specifying the confidence level of the bootstrap confidence intervals; the defult is |
seed |
when |
Details
SIMML captures the effect of covariates via a single-index and their interaction with the treatment via nonparametric link functions.
Interaction effects are determined by distinct shapes of the link functions.
The estimated single-index is useful for comparing differential treatment efficacy.
The resulting simml
object can be used to estimate an optimal treatment decision rule
for a new patient with pretreatment clinical information.
Value
a list of information of the fitted SIMML including
beta.coef |
the estimated single-index coefficients. |
g.fit |
a |
beta.ini |
the initial value used in the estimation of |
beta.path |
solution path of |
d.beta |
records the change in |
scale.X |
sd of pretreatment covariates X |
center.X |
mean of pretreatment covariates X |
L |
number of different treatment options |
p |
number of pretreatment covariates X |
n |
number of subjects |
boot.ci |
(1-boot.alpha/2) percentile bootstrap CIs (LB, UB) associated with |
Author(s)
Park, Petkova, Tarpey, Ogden
See Also
pred.simml
, fit.simml
Examples
family <- "gaussian" #"poisson"
delta = 1 # moderate main effect
s=2 # if s=2 (s=1), a nonlinear (linear) contrast function
n=500 # number of subjects
p=10 # number of pretreatment covariates
# generate training data
data <- generate.data(n= n, p=p, delta = delta, s= s, family = family)
data$SNR # the ratio of interactions("signal") vs. main effects("noise")
A <- data$A
y <- data$y
X <- data$X
# generate testing data
data.test <- generate.data(n=10^5, p=p, delta = delta, s= s, family = family)
A.test <- data.test$A
y.test <- data.test$y
X.test <- data.test$X
data.test$value.opt # the optimal "value"
# fit SIMML
#1) SIMML without X main effect
simml.obj1 <- simml(y, A, X, family = family)
#2) SIMML with X main effect (estimation efficiency for the g term of SIMML can be improved)
simml.obj2 <- simml(y, A, X, Xm = X, family = family)
# apply the estimated SIMML to the testing set and obtain treatment assignment rules.
simml.trt.rule1 <- pred.simml(simml.obj1, newX= X.test)$trt.rule
# "value" estimation (estimated by IPWE)
simml.value1 <- mean(y.test[simml.trt.rule1 == A.test])
simml.value1
simml.trt.rule2 <- pred.simml(simml.obj2, newX= X.test)$trt.rule
simml.value2 <- mean(y.test[simml.trt.rule2 == A.test])
simml.value2
# compare these to the optimal "value"
data.test$value.opt
# fit MC (modified covariates) model of Tien et al 2014
n.A <- summary(as.factor(A)); pi.A <- n.A/sum(n.A)
mc <- (as.numeric(A) + pi.A[1] -2) *cbind(1, X) # 0.5*(-1)^as.numeric(A) *cbind(1, X)
mc.coef <- coef(glm(y ~ mc, family = family))
mc.trt.rule <- (cbind(1, X.test) %*% mc.coef[-1] > 0) +1
# "value" estimation (estimated by IPWE)
mc.value <- mean(y.test[mc.trt.rule == A.test])
mc.value
# visualization of the estimated link functions of SIMML
simml.obj1$beta.coef # estimated single-index coefficients
g.fit <- simml.obj1$g.fit # estimated trt-specific link functions; "g.fit" is a mgcv::gam object.
#plot(g.fit)
# can improve visualization by using the package "mgcViz"
#install.packages("mgcViz")
# mgcViz depends on "rgl". "rgl" depends on XQuartz, which you can download from xquartz.org
#library(mgcViz)
# transform the "mgcv::gam" object to a "mgcViz" object (to improve visualization)
g.fit <- getViz(g.fit)
plot1 <- plot( sm(g.fit,1) ) # for treatment group 1
plot1 + l_fitLine(colour = "red") + l_rug(mapping = aes(x=x, y=y), alpha = 0.8) +
l_ciLine(mul = 5, colour = "blue", linetype = 2) +
l_points(shape = 19, size = 1, alpha = 0.1) +
xlab(expression(paste("z = ", alpha*minute, "x"))) + ylab("y") +
ggtitle("Treatment group 1 (Trt =1)") + theme_classic()
plot2 <- plot( sm(g.fit,2) ) # for treatment group 2
plot2 + l_fitLine(colour = "red") + l_rug(mapping = aes(x=x, y=y), alpha = 0.8) +
l_ciLine(mul = 5, colour = "blue", linetype = 2) +
l_points(shape = 19, size = 1, alpha = 0.1) +
xlab(expression(paste("z = ", alpha*minute, "x"))) +ylab("y") +
ggtitle("Treatment group 2 (Trt =2)") + theme_classic()
trans = function(x) x + g.fit$coefficients[2]
plotDiff(s1 = sm(g.fit, 2), s2 = sm(g.fit, 1), trans=trans) + l_ciPoly() +
l_fitLine() + geom_hline(yintercept = 0, linetype = 2) +
xlab(expression(paste("z = ", alpha*minute, "x")) ) +
ylab("(Treatment 2 effect) - (Treatment 1 effect)") +
ggtitle("Contrast between two treatment effects") +
theme_classic()
# yet another way of visualization, using ggplot2
#library(ggplot2)
dat <- data.frame(y= simml.obj1$g.fit$model$y,
x= simml.obj1$g.fit$model$single.index,
Treatment= simml.obj1$g.fit$model$A)
g.plot<- ggplot(dat, aes(x=x,y=y,color=Treatment,shape=Treatment,linetype=Treatment))+
geom_point(aes(color=Treatment, shape=Treatment), size=1, fill="white") +
scale_colour_brewer(palette="Set1", direction=-1) +
xlab(expression(paste(beta*minute,"x"))) + ylab("y")
g.plot + geom_smooth(method=gam, formula= y~ s(x, bs=simml.obj1$bs, k=simml.obj1$k),
se=TRUE, fullrange=TRUE, alpha = 0.35)
# can obtain bootstrap CIs for beta.coef.
simml.obj <- simml(y,A,X,Xm=X, family=family,bootstrap=TRUE,nboot=15) #nboot=500.
simml.obj$beta.coef
round(simml.obj$boot.ci,3)
# compare the estimates to the true beta.coef.
data$true.beta
# an application to data with ordinal categorical response
dat <- ordinal.data(n=500, p=5, R = 11, # 11 response levels
s = "nonlinear", # nonlinear interactions
delta = 1)
dat$SNR
y <- dat$y # ordinal response
X <- dat$X # X matrix
A <- dat$A # treatment
dat$true.beta # the "true" single-index coefficient
# 1) fit a cumulative logit simml, with a flexible link function
res <- simml(y,A,X, family="ordinal", R=11)
res$beta.coef # single-index coefficients.
res$g.fit$family$getTheta(TRUE) # the estimated R-1 threshold values.
# 2) fit a cumulative logit simml, with a linear link function
res2 <- simml(y,A,X, family="ordinal", R=11, linear.link = TRUE)
res2$beta.coef # single-index coefficients.
family = mgcv::ocat(R=11) # ocat: ordered categorical response family, with R categories.
# the treatment A's effect.
tmp <- mgcv::gam(y ~ A, family =family)
exp(coef(tmp)[2]) #odds ratio (OR) comparing treatment A=2 vs. A=1.
ind2 <- pred.simml(res)$trt.rule ==2 # subgroup recommended with A=2 under SIMML ITR
tmp2 <- mgcv::gam(y[ind2] ~ A[ind2], family = family)
exp(coef(tmp2)[2]) #OR comparing treatment A=2 vs. A=1, for subgroup recommended with A=2
ind1 <- pred.simml(res)$trt.rule ==1 # subgroup recommended with A=1 under SIMML ITR
tmp1 <- mgcv::gam(y[ind1] ~ A[ind1], family = family)
exp(coef(tmp1)[2]) #OR comparing treatment A=2 vs. A=1, for subgroup recommended with A=2