maximin {FRESHD} | R Documentation |
Maximin signal estimation
Description
Efficient procedure for solving the maximin estimation problem with identical design across groups, see (Lund, 2022).
Usage
maximin(y,
x,
penalty = "lasso",
alg ="aradmm",
kappa = 0.99,
nlambda = 30,
lambda_min_ratio = 1e-04,
lambda = NULL,
penalty_factor = NULL,
standardize = TRUE,
tol = 1e-05,
maxiter = 1000,
delta = 1,
gamma = 1,
eta = 0.1,
aux_par = NULL)
Arguments
y |
Array of size |
x |
Either i) the design matrix, ii) a list containing the Kronecker
components (2 or 3) if the design matrix has Kronecker structure or iii) a
string indicating the name of the wavelet to use (see |
penalty |
string specifying the penalty type. Possible values are
|
alg |
string specifying the optimization algorithm. Possible values are
|
kappa |
Strictly positive float controlling the maximum sparsity in the solution. Only used with ADMM type algorithms. Should be between 0 and 1. |
nlambda |
Positive integer giving the number of |
lambda_min_ratio |
strictly positive float giving the smallest value for
|
lambda |
Sequence of strictly positive floats used as penalty parameters. |
penalty_factor |
A vector of length |
standardize |
Boolean indicating if response |
tol |
Strictly positive float controlling the convergence tolerance. |
maxiter |
Positive integer giving the maximum number of iterations
allowed for each |
delta |
Positive float controlling the step size in the algorithm. |
gamma |
Positive float controlling the relaxation parameter in the algorithm. Should be between 0 and 2. |
eta |
Scaling parameter for the step size in the accelerated TOS algorithm. Should be between 0 and 1. |
aux_par |
Auxiliary parameters for the algorithms. |
Details
For n
heterogeneous data points divided into G
equal sized
groups with m<n
data points in each, let y_g=(y_{g,1},\ldots,y_{g,m})
denote the vector of observations in group g
. For a m\times p
design matrix X
consider the model
y_g=Xb_g+\epsilon_g
for b_g
a random group specific coefficient vector and \epsilon_g
an error term, see Meinshausen and Buhlmann (2015). For the model above following
Lund (2022) this package solves the maximin estimation problem
\min_{\beta} -\hat V_g(\beta)) + \lambda\Vert\beta\Vert_1,\lambda \ge 0
where
\hat V_g(\beta):=\frac{1}{n}(2\beta^\top X^\top y_g - \beta^\top X^\top X\beta),
is the empirical explained variance in group g
. See Lund, 2022
for more details and references.
The package solves the problem using different algorithms depending on X
:
i) If X
is orthogonal (e.g. the inverse wavelet transform) either
an ADMM algorithm (standard or relaxed) or an adaptive relaxed
ADMM (ARADMM) with auto tuned step size is used, see Xu et al (2017).
ii) For general X
, a three operator splitting (TOS) algorithm
is implemented, see Damek and Yin (2017). Note if the design is
tensor structured, X = \bigotimes_{i=1}^d X_i
for d\in\{1, 2,3\}
,
the procedure accepts a list containing the tensor components (matrices).
Value
An object with S3 Class "FRESHD".
spec |
A string indicating the array dimension (1, 2 or 3) and the penalty. |
coef |
A |
lambda |
A vector containing the sequence of penalty values used in the estimation procedure. |
df |
The number of nonzero coefficients for each value of |
dimcoef |
A vector giving the dimension of the model coefficient array
|
dimobs |
A vector giving the dimension of the observation (response) array |
dim |
Integer indicating the dimension of of the array model. Equal to 1 for non array. |
wf |
A string indicating the wavelet name if used. |
diagnostics |
A list where item |
Author(s)
Adam Lund
Maintainer: Adam Lund, adam.lund@math.ku.dk
References
Lund, Adam (2022). Fast Robust Signal Estimation for Heterogeneous data. In preparation.
Meinshausen, N and P. Buhlmann (2015). Maximin effects in inhomogeneous large-scale data. The Annals of Statistics. 43, 4, 1801-1830.
Davis, Damek and Yin, Wotao, (2017). A three-operator splitting scheme and its optimization applications. Set-valued and variational analysis. 25, 4, 829-858.
Xu, Zheng and Figueiredo, Mario AT and Yuan, Xiaoming and Studer, Christoph and Goldstein, Tom (2017). Adaptive relaxed admm: Convergence theory and practical implementation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 7389-7398.
Examples
## general 3d tensor design matrix
set.seed(42)
G <- 20; n <- c(65, 26, 13)*3; p <- c(13, 5, 4)*3
sigma <- 1
##marginal design matrices (Kronecker components)
x <- list()
for(i in 1:length(n)){x[[i]] <- matrix(rnorm(n[i] * p[i], 0, sigma), n[i], p[i])}
##common features and effects
common_features <- rbinom(prod(p), 1, 0.1)
common_effects <- rnorm(prod(p), 0, 0.1) * common_features
##simulate group response
y <- array(NA, c(n, G))
for(g in 1:G){
bg <- rnorm(prod(p), 0, 0.1) * (1 - common_features) + common_effects
Bg <- array(bg, p)
mu <- RH(x[[3]], RH(x[[2]], RH(x[[1]], Bg)))
y[,,, g] <- array(rnorm(prod(n), 0, var(mu)), dim = n) + mu
}
##fit model for range of lambda
system.time(fit <- maximin(y, x, penalty = "lasso", alg = "tosacc"))
##estimated common effects for specific lambda
modelno <- 10
betafit <- fit$coef[, modelno]
plot(common_effects, type = "h", ylim = range(betafit, common_effects), col = "red")
lines(betafit, type = "h")
##size of example
set.seed(42)
G <- 50; p <- n <- c(2^6, 2^5, 2^6);
##common features and effects
common_features <- rbinom(prod(p), 1, 0.1) #sparsity of comm. feat.
common_effects <- rnorm(prod(p), 0, 1) * common_features
##group response
y <- array(NA, c(n, G))
for(g in 1:G){
bg <- rnorm(prod(p), 0, 0.1) * (1 - common_features) + common_effects
Bg <- array(bg, p)
mu <- iwt(Bg)
y[,,, g] <- array(rnorm(prod(n),0, 0.5), dim = n) + mu
}
##orthogonal wavelet design with 1d data
G = 50; N1 = 2^10; p = 101; J = 2; amp = 20; sigma2 = 10
y <- matrix(0, N1, G)
z <- seq(0, 2, length.out = N1)
sig <- cos(10 * pi * z) + 1.5 * sin(5 * pi * z)
for (i in 1:G){
freqs <- sample(1:100, size = J, replace = TRUE)
y[, i] <- sig * 2 + rnorm(N1, sd = sqrt(sigma2))
for (j in 1:J){
y[, i] <- y[, i] + amp * sin(freqs[j] * pi * z + runif(1, -pi, pi))
}
}
system.time(fit <- maximin(y, "la8", alg = "aradmm", kappa = 0.9))
mmy <- predict(fit, "la8")
plot(mmy[,1], type = "l")
lines(sig, col = "red")