fit.pfc {abundant}    R Documentation

Fit a high-dimensional principal fitted components model using the method of Cook, Forzani, and Rothman (2012).

Description

Let (x_1, y_1), \ldots, (x_n, y_n) denote the n measurements of the predictor and response, where x_i \in R^p and y_i \in R. The model assumes that these measurements are a realization of n independent copies of the random vector (X, Y)', where

X = \mu_X + \Gamma\beta\{f(Y) - \mu_f\} + \epsilon,

\mu_X \in R^p; \Gamma \in R^{p \times d} with rank d; \beta \in R^{d \times r} with rank d; f: R \rightarrow R^r is a known vector-valued function; \mu_f = E\{f(Y)\}; \epsilon \sim N_p(0, \Delta); and Y is independent of \epsilon. The central subspace is \Delta^{-1} {\rm span}(\Gamma).

This function computes estimates of these model parameters by imposing constraints for identifiability. The mean parameters \mu_X and \mu_f are estimated with \bar x = n^{-1}\sum_{i=1}^n x_i and \bar f = n^{-1}\sum_{i=1}^n f(y_i). Let \widehat\Phi = n^{-1}\sum_{i=1}^n \{f(y_i) - \bar f\}\{f(y_i) - \bar f\}', which we require to be positive definite. Given a user-specified weight matrix \widehat W, let

(\widehat\Gamma, \widehat\beta) = \arg\min_{G \in R^{p \times d},\, B \in R^{d \times r}} \sum_{i=1}^n [x_i - \bar x - GB\{f(y_i) - \bar f\}]' \widehat W [x_i - \bar x - GB\{f(y_i) - \bar f\}],

subject to the constraints that G'\widehat W G is diagonal and B\widehat\Phi B' = I. The sufficient reduction estimate \widehat R: R^p \rightarrow R^d is defined by

\widehat R(x) = (\widehat\Gamma' \widehat W \widehat\Gamma)^{-1} \widehat\Gamma' \widehat W (x - \bar x).
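The estimated sufficient reduction is just a linear map of the centered predictor. As an illustration of the formula only (a NumPy sketch with made-up placeholder values for \widehat\Gamma, \widehat W, and \bar x, not the package's internal code):

```python
import numpy as np

rng = np.random.default_rng(0)
p, d = 30, 2

# Hypothetical estimates (placeholders, not taken from a fitted model)
Gam = rng.normal(size=(p, d))      # Gamma-hat, p x d
A = rng.normal(size=(p, p))
W = A @ A.T + np.eye(p)            # W-hat, symmetric positive definite
xbar = rng.normal(size=p)          # sample mean of the predictors

def R_hat(x, Gam, W, xbar):
    """Sufficient reduction R(x) = (Gam' W Gam)^{-1} Gam' W (x - xbar)."""
    GWG = Gam.T @ W @ Gam          # d x d
    return np.linalg.solve(GWG, Gam.T @ W @ (x - xbar))

x = rng.normal(size=p)
print(R_hat(x, Gam, W, xbar).shape)   # (2,): the reduction lives in R^d
```

Note that the reduction of the sample mean itself is the zero vector, since x - \bar x vanishes there.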

Usage

fit.pfc(X, y, r=4, d=NULL, F.user=NULL, weight.type=c("sample", "diag", "L1"), 
        lam.vec=NULL, kfold=5, silent=TRUE, qrtol=1e-10, cov.tol=1e-4, 
        cov.maxit=1e3, NPERM=1e3, level=0.01)

Arguments

X

The predictor matrix with n rows and p columns. The ith row is x_i defined above.

y

The vector of measured responses with n entries. The ith entry is y_i defined above.

r

When polynomial basis functions are used (which is the case when F.user=NULL), r is the polynomial order, i.e., f(y) = (y, y^2, \ldots, y^r)'. The default is r=4. This argument is not used when F.user is specified.
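The basis matrix whose ith row is f(y_i) can be built column by column. A NumPy sketch of this construction with a toy response vector (an illustration, not the package's code):

```python
import numpy as np

y = np.array([0.5, 1.0, 2.0])   # toy responses
r = 4                           # polynomial order

# Row i is f(y_i) = (y_i, y_i^2, ..., y_i^r)
F = np.column_stack([y**k for k in range(1, r + 1)])
print(F.shape)                  # (3, 4): n rows, r columns
```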

d

The dimension of the central subspace defined above. This must be specified by the user when weight.type="L1". If unspecified by the user, this function will use the sequential permutation testing procedure, described in Section 8.2 of Cook, Forzani, and Rothman (2012), to select d.

F.user

A matrix with n rows and r columns, where the ith row is f(y_i) defined above. This argument is optional, and will typically be used when polynomial basis functions are not desired.

weight.type

The type of weight matrix estimate \widehat W to use. Let \widehat\Delta be the observed residual sample covariance matrix for the multivariate regression of X on f(Y), with n-r-1 scaling. There are three options for \widehat W:

  • weight.type="sample" uses a Moore-Penrose generalized inverse of \widehat\Delta for \widehat W; when p \leq n-r-1 this becomes the inverse of \widehat\Delta;

  • weight.type="diag" uses for \widehat W the inverse of the diagonal matrix with the same diagonal as \widehat\Delta;

  • weight.type="L1" uses the L1-penalized inverse of \widehat\Delta described in equation (5.4) of Cook, Forzani, and Rothman (2012). In this case, lam.vec and d must be specified by the user. The glasso algorithm of Friedman et al. (2008) is used through the R package glasso.
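The first two options amount to (pseudo)inverting \widehat\Delta or just its diagonal. A NumPy sketch with a made-up residual matrix (the L1 option needs a graphical-lasso solver and is omitted; none of these names come from the package):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 30
resid = rng.normal(size=(n, p))        # toy residuals from the fit of X on f(Y)
Delta = resid.T @ resid / (n - 5)      # toy Delta-hat with n-r-1 scaling (r=4);
                                       # here p > n, so Delta is singular

W_sample = np.linalg.pinv(Delta)       # weight.type="sample": Moore-Penrose inverse
W_diag = np.diag(1.0 / np.diag(Delta)) # weight.type="diag": invert the diagonal part
```

When p \leq n-r-1, Delta is invertible and the pseudoinverse coincides with the ordinary inverse.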

lam.vec

A vector of candidate tuning parameter values to use when weight.type="L1". If this vector has more than one entry, then kfold cross validation will be performed to select the optimal tuning parameter value.

kfold

The number of folds to use in cross-validation to select the optimal tuning parameter when weight.type="L1". Only used if lam.vec has more than one entry.

silent

Logical. When silent=FALSE, progress updates are printed.

qrtol

The tolerance for calls to qr.solve().

cov.tol

The convergence tolerance for the glasso algorithm used when weight.type="L1".

cov.maxit

The maximum number of iterations allowed for the glasso algorithm used when weight.type="L1".

NPERM

The number of permutations to use in the sequential permutation testing procedure to select d. Only used when d is unspecified.

level

The significance level used to terminate the sequential permutation testing procedure to select d.

Details

See Cook, Forzani, and Rothman (2012) for more information.

Value

A list with

Gamhat

this is \widehat\Gamma described above.

bhat

this is \widehat\beta described above.

Rmat

this is \widehat W \widehat\Gamma (\widehat\Gamma' \widehat W \widehat\Gamma)^{-1}.

What

this is \widehat W described above.

d

this is d described above.

r

this is r described above.

GWG

this is \widehat\Gamma' \widehat W \widehat\Gamma.

fc

a matrix with n rows and r columns, where the ith row is f(y_i) - \bar f.

Xc

a matrix with n rows and p columns, where the ith row is x_i - \bar x.

y

the vector of n response measurements.

mx

this is \bar x described above.

mf

this is \bar f described above.

best.lam

this is the selected tuning parameter value used when weight.type="L1"; will be NULL otherwise.

lam.vec

this is the vector of candidate tuning parameter values used when weight.type="L1"; will be NULL otherwise.

err.vec

this is the vector of validation errors from cross-validation, one error for each entry in lam.vec. Will be NULL unless weight.type="L1" and lam.vec has more than one entry.

test.info

a data frame that summarizes the results from the sequential permutation testing procedure. Will be NULL unless d is unspecified.

Author(s)

Adam J. Rothman

References

Cook, R. D., Forzani, L., and Rothman, A. J. (2012). Estimating sufficient reductions of the predictors in abundant high-dimensional regressions. Annals of Statistics 40(1), 353-384.

Friedman, J., Hastie, T., and Tibshirani R. (2008). Sparse inverse covariance estimation with the lasso. Biostatistics 9(3), 432-441.

See Also

pred.response

Examples

set.seed(1)
n <- 20
p <- 30
d <- 2

## y is Uniform(0, sqrt(12)), so f(y) = (y, y^2)' has mean (sqrt(3), 4)'
y <- sqrt(12) * runif(n)
Gam <- matrix(rnorm(p * d), nrow = p, ncol = d)
beta <- diag(2)
E <- matrix(0.5 * rnorm(n * p), nrow = n, ncol = p)

## V is the covariance matrix of f(y); use its inverse square root
## to standardize the centered basis
V <- matrix(c(1, sqrt(12), sqrt(12), 12.8), nrow = 2, ncol = 2)
tmp <- eigen(V, symmetric = TRUE)
V.msqrt <- tcrossprod(tmp$vec * rep(tmp$val^(-0.5), each = 2), tmp$vec)
Fyc <- cbind(y - sqrt(3), y^2 - 4) %*% V.msqrt

## generate the predictors from the model with mu_X = 0
X <- Fyc %*% t(beta) %*% t(Gam) + E

fit <- fit.pfc(X = X, y = y, r = 3, weight.type = "sample")
## display hypothesis testing information for selecting d
fit$test.info
## make a response versus fitted values plot
plot(pred.response(fit), y)

[Package abundant version 1.2 Index]