cve.call {CVarE}  R Documentation 
This is the main function in the CVE package. It creates objects of class "cve" to estimate the mean subspace. Helper functions that require a "cve" object can then be applied to the output from this function.
Conditional Variance Estimation (CVE) is a sufficient dimension reduction (SDR) method for regressions studying E(Y|X), the conditional expectation of a response Y given a set of predictors X. This function provides methods for estimating the dimension and the subspace spanned by the columns of a p x k matrix B of minimal rank k such that
E(Y|X) = E(Y|B'X)
or, equivalently,
Y = g(B'X) + ε
where X is independent of ε and has a positive definite variance-covariance matrix Var(X) = Σ_X, ε is a mean zero random variable with finite Var(ε) = E(ε^2), g is an unknown, continuous, non-constant function, and B = (b_1, ..., b_k) is a real p x k matrix of rank k <= p.
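As a minimal sketch of the model above, the following base R code simulates data from Y = g(B'X) + ε with k = 1. The sample size, the dimension p, the link function g and the noise level are all illustrative assumptions and do not come from the package.

```r
# Simulate n observations from Y = g(B'X) + eps with k = 1.
# n, p, g and the noise level are illustrative assumptions.
set.seed(1)
n <- 200L
p <- 5L
B <- matrix(rep(1, p) / sqrt(p), nrow = p, ncol = 1L)  # p x k with a unit-length column
X <- matrix(rnorm(n * p), nrow = n, ncol = p)          # rows are draws of X ~ N(0, I_p)
g <- function(z) cos(z)                                # hypothetical link function
eps <- rnorm(n, mean = 0, sd = 0.25)                   # mean zero, independent of X
Y <- g(X %*% B) + eps                                  # n x 1 response

# B has orthonormal columns, so B'B is the k x k identity
crossprod(B)
```

Only the k-dimensional projection B'X enters the mean of Y here, which is the structure that CVE tries to recover from X and Y alone.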
Both the dimension k and the subspace span(B) are unknown. The CVE method makes very few assumptions.
A kernel matrix Bhat is estimated such that the column space of Bhat should be close to the mean subspace span(B). The primary output from this method is a set of orthonormal vectors, Bhat, whose span estimates span(B).
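Because only the span of Bhat is of interest, an estimate is usually compared with B through the orthogonal projections onto the two subspaces rather than entry by entry. The sketch below illustrates this in base R; proj and subspace_dist are hypothetical helper names and are not part of the package.

```r
# Distance between span(B) and span(Bhat) based on orthogonal projections.
# `proj` and `subspace_dist` are hypothetical helpers for illustration only.
proj <- function(A) A %*% solve(crossprod(A), t(A))   # projection onto span(A)
subspace_dist <- function(B, Bhat) {
  norm(proj(B) - proj(Bhat), type = "F")              # Frobenius norm of the difference
}

B <- matrix(rep(1, 5) / sqrt(5), ncol = 1)
Bhat_same <- -B                                             # same span, opposite sign
Bhat_orth <- matrix(c(1, -1, 0, 0, 0) / sqrt(2), ncol = 1)  # orthogonal to B

subspace_dist(B, Bhat_same)   # zero up to rounding, the sign of Bhat does not matter
subspace_dist(B, Bhat_orth)   # sqrt(2) for two orthogonal one-dimensional subspaces
```

A distance of this kind is invariant to the choice of basis, so two estimates that differ by a rotation or sign change of their columns count as the same subspace.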
The method central implements the Ensemble Conditional Variance Estimation (ECVE) as described in [2]. It augments the CVE method by applying an ensemble of functions (parameter func_list) to the response to estimate the central subspace. This corresponds to the generalization
F(Y|X) = F(Y|B'X)
or, equivalently,
Y = g(B'X, ε)
where F is the conditional cumulative distribution function.
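To make the idea of an ensemble of functions concrete, the following sketch builds a list of indicator functions over four quantile slices of the response. The slicing into quartiles is an assumption chosen for illustration and is not a statement about the package default for func_list.

```r
# Hypothetical ensemble: indicators of whether a response value falls into
# one of four quantile slices of Y. Illustrative choice only.
set.seed(2)
Y <- rnorm(100)
breaks <- quantile(Y, probs = seq(0, 1, by = 0.25))
func_list <- lapply(seq_len(length(breaks) - 1L), function(i) {
  lower <- breaks[[i]]
  upper <- breaks[[i + 1L]]
  function(y) as.numeric(lower <= y & y <= upper)
})

# Each element maps the response to a transformed response f(Y)
transformed <- sapply(func_list, function(f) f(Y))   # 100 x 4 matrix of 0/1 values
colSums(transformed)                                 # observations per slice
```

A list of this shape could then be supplied through the func_list argument together with method = "central" or method = "weighted.central".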
cve.call(
  X,
  Y,
  method = c("mean", "weighted.mean", "central", "weighted.central"),
  func_list = NULL,
  nObs = sqrt(nrow(X)),
  h = NULL,
  min.dim = 1L,
  max.dim = 10L,
  k = NULL,
  momentum = 0,
  tau = 1,
  tol = 0.001,
  slack = 0,
  gamma = 0.5,
  V.init = NULL,
  max.iter = 50L,
  attempts = 10L,
  nr.proj = 1L,
  logger = NULL
)
X 
Design predictor matrix. 
Y 
n-dimensional vector of responses. 
method 
This character string specifies the method of fitting. The options are "mean", "weighted.mean", "central" and "weighted.central".
func_list 
a list of functions applied to the response Y (see the description of the method central above). 
nObs 
parameter for choosing bandwidth 
h 
bandwidth or function to estimate bandwidth, defaults to internally estimated bandwidth. 
min.dim 
lower bound for the dimension k. 
max.dim 
upper bound for the dimension k. 
k 
Dimension of lower dimensional projection, if 
momentum 
number in [0, 1) giving the ratio of momentum for the
Euclidean gradient update with a momentum term. 
tau 
Initial stepsize. 
tol 
Tolerance for break condition. 
slack 
Positive scaling to allow small increases of the loss while
optimizing, i.e. 
gamma 
stepsize reduction multiple. If gradient step with step size

V.init 
Semi-orthogonal matrix of dimensions (ncol(X), ncol(X) - k)
used as starting value in the optimization. (If supplied,

max.iter 
maximum number of optimization steps. 
attempts 
If 
nr.proj 
The number of projections used for projective resampling for multivariate response Y (under active development, ignored for univariate response). 
logger 
a logger function (only for advanced users, slows down the computation). 
an S3 object of class cve with components:
design matrix of predictor vectors used for calculating the cve-estimate,
n-dimensional vector of responses used for calculating the cve-estimate,
Name of used method,
the matched call,
list of components V, L, B, loss, h for each k = min.dim, ..., max.dim. If k was supplied in the call, min.dim = max.dim = k.
B is the cve-estimate with dimension p x k.
V is the orthogonal complement of B.
L is the loss for each sample separately, such that its mean is loss.
loss is the value of the target function that is minimized, evaluated at V.
h is the bandwidth parameter used to calculate B, V, loss, L.
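The relation between the components B and V can be illustrated in base R. The sketch below starts from a p x k matrix B with orthonormal columns and constructs a p x (p - k) matrix V spanning the orthogonal complement of span(B). The QR-based construction is an assumption made for illustration; it is not taken from the package, which may relate V and B by a different computation.

```r
# Build V (p x (p - k)) spanning the orthogonal complement of span(B).
# The QR-based construction is an illustrative assumption.
p <- 5L
k <- 1L
B <- matrix(rep(1, p) / sqrt(p), nrow = p, ncol = k)
Q <- qr.Q(qr(B), complete = TRUE)      # p x p orthogonal matrix
V <- Q[, (k + 1L):p, drop = FALSE]     # last p - k columns are orthogonal to span(B)

dim(V)                                  # p rows and p - k columns
max(abs(crossprod(V, B)))               # zero up to rounding: V'B = 0
max(abs(crossprod(V) - diag(p - k)))    # zero up to rounding: V'V = I
```

The dimensions of V here match the (ncol(X), ncol(X) - k) shape documented for the V.init argument.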
[1] Fertl, L. and Bura, E. (2021) "Conditional Variance Estimation for Sufficient Dimension Reduction" <arXiv:2102.08782>
[2] Fertl, L. and Bura, E. (2021) "Ensemble Conditional Variance Estimation for Sufficient Dimension Reduction" <arXiv:2102.13435>
# create B for simulation (k = 1)
B <- rep(1, 5) / sqrt(5)

set.seed(21)
# create predictor data X ~ N(0, I_p)
X <- matrix(rnorm(500), 100, 5)
# simulate response variable
#     Y = f(B'X) + err
# with f(x1) = x1 and err ~ N(0, 0.25^2)
Y <- X %*% B + 0.25 * rnorm(100)

# calculate cve with the default method ("mean") for k = 1
set.seed(21)
cve.obj.simple1 <- cve(Y ~ X, k = 1)

# same as
set.seed(21)
cve.obj.simple2 <- cve.call(X, Y, k = 1)

# extract the estimated B's
coef(cve.obj.simple1, k = 1)
coef(cve.obj.simple2, k = 1)