R: Adjusting a data matrix for underlying factors

farm.res {FarmSelect}

R Documentation

Adjusting a data matrix for underlying factors

Description

Given a matrix of covariates, this function estimates the underlying factors and computes data residuals after regressing out those factors.
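The idea of regressing out estimated factors can be sketched in base R. The following is a minimal, non-robust illustration using principal components; `adjust_factors()` is a hypothetical helper written for this example and is not the implementation used by `farm.res`, which may differ (for instance by using robust estimators).

```r
# Hypothetical helper (not part of FarmSelect): non-robust factor adjustment
# via principal components, assuming the number of factors K is known.
adjust_factors <- function(X, K) {
  n <- nrow(X)
  Xc <- scale(X, center = TRUE, scale = FALSE)        # remove column means
  eig <- eigen(tcrossprod(Xc) / n, symmetric = TRUE)  # n x n Gram matrix
  factors <- sqrt(n) * eig$vectors[, seq_len(K), drop = FALSE]  # n x K
  loadings <- crossprod(Xc, factors) / n              # p x K, least squares fit
  residual <- Xc - factors %*% t(loadings)            # n x p adjusted data
  list(residual = residual, loadings = loadings, factors = factors)
}

set.seed(1)
n <- 50; p <- 20; K <- 2
Fmat <- matrix(rnorm(n * K), n, K)                    # simulated factors
Lambda <- matrix(rnorm(p * K), p, K)                  # simulated loadings
X <- Fmat %*% t(Lambda) + matrix(rnorm(n * p), n, p)
out <- adjust_factors(X, K)
dim(out$residual)
max(abs(crossprod(out$factors, out$residual)))        # numerically zero
```

By construction the residual in this sketch is orthogonal to the estimated factors, which is what "adjusted for underlying factors" means in the least squares sense.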

Usage

farm.res(X, K.factors = NULL, robust = TRUE, cv = FALSE, tau = 2,
  verbose = TRUE)

Arguments

`X`	an n x p data matrix with each row being a sample.
`K.factors`	an optional number of factors to be estimated. If not supplied, it is estimated internally. Must be greater than 0.
`robust`	a boolean, specifying whether or not to use robust estimators for mean and variance. Default is TRUE.
`cv`	a boolean, specifying whether or not to run cross-validation for the tuning parameter. Default is FALSE. Only used if `robust` is TRUE.
`tau`	a positive (`>0`) multiplier for the tuning parameter of the Huber loss function. Default is 2. Only used if `robust` is TRUE and `cv` is FALSE. See details.
`verbose`	a boolean specifying whether to print runtime updates to the console. Default is TRUE.

Details

For details about the method, see Fan et al. (2017).

Setting robust = TRUE uses Huber's loss to estimate parameters robustly. For details of the covariance estimation method, see Fan et al. (2017).

The number of rows and the number of columns of the data matrix must be at least 4 for the latent factors to be calculated.

The number of latent factors, if not provided, is estimated by the eigenvalue ratio test. See Ahn and Horenstein (2013). The maximum number considered is min(n,p)/2. The user can supply a larger number if desired.
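The eigenvalue ratio idea can be illustrated in a few lines of base R. This is a sketch under simplifying assumptions: `estimate_nfactors()` is a hypothetical helper, it uses the ordinary (non-robust) sample covariance, and it is not the routine used internally by `farm.res`.

```r
# Hypothetical helper (not part of FarmSelect): pick the k that maximizes the
# ratio of consecutive eigenvalues, searching k = 1, ..., kmax.
estimate_nfactors <- function(X, kmax = floor(min(dim(X)) / 2)) {
  # Eigenvalues of the sample covariance matrix, in decreasing order.
  ev <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values
  k <- seq_len(kmax)
  ratios <- ev[k] / ev[k + 1]
  which.max(ratios)
}

set.seed(100)
n <- 50; p <- 200; K <- 3
Lambda <- matrix(rnorm(p * K), p, K)                  # simulated loadings
Fmat <- matrix(rnorm(n * K), n, K)                    # simulated factors
X <- Fmat %*% t(Lambda) + matrix(rnorm(n * p), n, p)  # n x p data matrix
k_hat <- estimate_nfactors(X)
k_hat
```

When the factors are strong relative to the noise, the largest gap between consecutive eigenvalues tends to fall right after the true number of factors, which is why the maximizer of the ratio is used as the estimate.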

The tuning parameter is computed as tau * sigma * optimal rate, where sigma is the standard deviation of the data and optimal rate is the optimal rate for the tuning parameter. For details, see Fan et al. (2017).
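To make the role of the tuning parameter concrete, here is a minimal sketch of Huber's loss and a Huber-type mean estimate in base R. Both `huber_loss()` and `huber_mean()` are hypothetical helpers for illustration only. The argument `rate` is a placeholder for the optimal rate from Fan et al. (2017); its formula is not given on this page, so the caller must supply a value, and the iteratively reweighted averaging shown here is an assumed solver, not necessarily the one the package uses.

```r
# Huber loss with threshold thr: quadratic near zero, linear in the tails.
huber_loss <- function(r, thr) {
  ifelse(abs(r) <= thr, 0.5 * r^2, thr * abs(r) - 0.5 * thr^2)
}

# Hypothetical helper: Huber estimate of a mean by iteratively reweighted
# averaging. `rate` is a placeholder and must be supplied by the caller.
huber_mean <- function(x, tau = 2, rate = 1, iters = 100, tol = 1e-8) {
  thr <- tau * sd(x) * rate  # tuning parameter = tau * sigma * optimal rate
  mu <- median(x)            # robust starting value
  for (i in seq_len(iters)) {
    r <- x - mu
    # Huber weights: 1 inside the threshold, thr / |r| outside it.
    w <- pmin(1, thr / pmax(abs(r), .Machine$double.eps))
    mu_new <- sum(w * x) / sum(w)
    if (abs(mu_new - mu) < tol) {
      mu <- mu_new
      break
    }
    mu <- mu_new
  }
  mu
}

x <- c(-2, -1, 0, 1, 2, 100)  # five well-behaved points and one outlier
c(mean = mean(x), huber = huber_mean(x, rate = 0.05))
```

A smaller tuning parameter downweights outliers more aggressively, while a very large one makes the estimate behave like the ordinary mean.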

Value

A list with the following items:

`residual`	the data after being adjusted for underlying factors
`loadings`	estimated factor loadings
`factors`	estimated factors
`nfactors`	the number of (estimated) factors

References

Ahn, S. C., and A. R. Horenstein (2013): "Eigenvalue Ratio Test for the Number of Factors," Econometrica, 81 (3), 1203–1227.

Fan, J., Y. Ke, and K. Wang (2017): "Decorrelation of Covariates for High Dimensional Sparse Regression." https://arxiv.org/abs/1612.08490

Examples

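library(FarmSelect)
set.seed(100)
P <- 200 # dimension
N <- 50  # samples
K <- 3   # number of factors
Q <- 3   # model size (not used below)
Lambda <- matrix(rnorm(P * K, 0, 1), P, K) # factor loadings (P x K)
Fmat <- matrix(rnorm(N * K, 0, 1), N, K)   # latent factors (N x K); named Fmat to avoid masking F
U <- matrix(rnorm(P * N, 0, 1), P, N)      # idiosyncratic noise (P x N)
X <- t(Lambda %*% t(Fmat) + U)             # N x P data matrix, one sample per row
output <- farm.res(X) # default options
output$nfactors       # estimated number of factors
output <- farm.res(X, K.factors = 10) # supplying the number of factors
names(output)         # items in the returned list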

[Package FarmSelect version 1.0.2 Index]