farm.res {FarmSelect} | R Documentation |
Adjusting a data matrix for underlying factors
Description
Given a matrix of covariates, this function estimates the underlying factors and computes data residuals after regressing out those factors.
Usage
farm.res(X, K.factors = NULL, robust = TRUE, cv = FALSE, tau = 2,
verbose = TRUE)
Arguments
X |
an n x p data matrix with each row being a sample. |
K.factors |
an optional number of factors to be estimated; if not supplied, it is estimated internally. Must be positive (K > 0). |
robust |
a boolean, specifying whether or not to use robust estimators for mean and variance. Default is TRUE. |
cv |
a boolean, specifying whether or not to run cross-validation for the tuning parameter. Default is FALSE. Only used if |
tau |
a multiplier for the tuning parameter (see Details). Default is 2. |
verbose |
a boolean specifying whether to print runtime updates to the console. Default is TRUE. |
Details
For details about the method, see Fan et al. (2017).
Setting robust = TRUE uses Huber's loss to estimate parameters robustly. For details of the covariance estimation method, see Fan et al. (2017).
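To illustrate what robust estimation with Huber's loss means, here is a minimal base-R sketch of a Huber location estimate. The helper names `huber_loss` and `huber_mean` and the fixed `threshold` value are assumptions made for illustration; this is not the code used inside farm.res, and `threshold` here is the Huber cutoff itself, not the `tau` multiplier from Usage.

```r
# Huber loss: quadratic for small residuals, linear for large ones.
# (Hypothetical helper for illustration, not part of FarmSelect.)
huber_loss <- function(r, threshold) {
  ifelse(abs(r) <= threshold,
         0.5 * r^2,
         threshold * abs(r) - 0.5 * threshold^2)
}

# Robust location estimate: minimise the total Huber loss over mu.
huber_mean <- function(x, threshold) {
  objective <- function(mu) sum(huber_loss(x - mu, threshold))
  optimize(objective, interval = range(x))$minimum
}

set.seed(1)
x <- c(rnorm(50), 100)                 # 50 clean draws plus one gross outlier
sample_mean <- mean(x)                 # pulled towards the outlier
robust_mean <- huber_mean(x, threshold = 2)
```

Because the loss grows only linearly beyond the threshold, the single outlier has a bounded influence on `robust_mean`, while it shifts `sample_mean` noticeably.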
The data matrix must have at least 4 rows and 4 columns for the latent factors to be computed.
If not provided, the number of latent factors is estimated by the eigenvalue ratio test; see Ahn and Horenstein (2013). The maximum number considered is min(n,p)/2. The user can supply a larger number if desired.
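The eigenvalue ratio idea can be sketched in a few lines of base R: compute the eigenvalues of the sample covariance matrix and pick the index at which the ratio of consecutive eigenvalues is largest, searching up to min(n,p)/2. The helper `eigen_ratio_k` is hypothetical, and using `cov(X)` as the input matrix is an assumption; farm.res may compute this differently internally.

```r
# Hypothetical helper, not part of FarmSelect: choose the number of
# factors as the k maximising eigenvalue[k] / eigenvalue[k + 1].
eigen_ratio_k <- function(X, kmax = floor(min(dim(X)) / 2)) {
  ev <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values
  ratios <- ev[1:kmax] / ev[2:(kmax + 1)]
  which.max(ratios)
}

# Simulated data with 3 strong factors, mirroring the Examples section.
set.seed(100)
n <- 50; p <- 200; k <- 3
loadings <- matrix(rnorm(p * k), p, k)
factors  <- matrix(rnorm(n * k), n, k)
X <- factors %*% t(loadings) + matrix(rnorm(n * p), n, p)

k_hat <- eigen_ratio_k(X)
k_hat
```

With strong factors the leading eigenvalues are far larger than the rest, so the largest ratio tends to sit at the true number of factors.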
The tuning parameter = tau * sigma * optimal rate, where optimal rate is the optimal rate for the tuning parameter and sigma is the standard deviation of the data. For details, see Fan et al. (2017).
Value
A list with the following items:
residual |
the data after being adjusted for underlying factors |
loadings |
estimated factor loadings |
factors |
estimated factors |
nfactors |
the number of (estimated) factors |
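How the four returned items relate to one another can be sketched with a plain (non-robust) principal-components factor adjustment. The helper `pca_adjust`, the column centring, and the normalisation of the factors are assumptions made for illustration; farm.res itself may use robust estimates and a different scaling.

```r
# Hypothetical helper, not FarmSelect's implementation: estimate k factors
# by principal components and return the data with them regressed out.
pca_adjust <- function(X, k) {
  n <- nrow(X)
  Xc <- scale(X, center = TRUE, scale = FALSE)       # centre each column
  eig <- eigen(tcrossprod(Xc), symmetric = TRUE)     # n x n Gram matrix
  factors  <- sqrt(n) * eig$vectors[, 1:k, drop = FALSE]  # n x k, F'F/n = I
  loadings <- crossprod(Xc, factors) / n             # p x k least-squares fit
  residual <- Xc - factors %*% t(loadings)           # n x p adjusted data
  list(residual = residual, loadings = loadings,
       factors = factors, nfactors = k)
}

set.seed(100)
X <- matrix(rnorm(50 * 3), 50, 3) %*% matrix(rnorm(3 * 200), 3, 200) +
  matrix(rnorm(50 * 200), 50, 200)
out <- pca_adjust(X, k = 3)
dim(out$residual)   # same shape as X
```

In this sketch the residual is orthogonal to the estimated factors by construction, which is what "adjusted for underlying factors" means here.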
References
Ahn, S. C., and A. R. Horenstein (2013): "Eigenvalue Ratio Test for the Number of Factors," Econometrica, 81 (3), 1203–1227.
Fan, J., Y. Ke, and K. Wang (2017): "Decorrelation of Covariates for High Dimensional Sparse Regression," https://arxiv.org/abs/1612.08490
Examples
set.seed(100)
P = 200 #dimension
N = 50 #samples
K = 3 #nfactors
Q = 3 #model size
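Lambda = matrix(rnorm(P*K, 0,1), P,K) #factor loadings (P x K)
Fmat = matrix(rnorm(N*K, 0,1), N,K) #latent factors (N x K); named Fmat to avoid masking F (FALSE)
U = matrix(rnorm(P*N, 0,1), P,N) #idiosyncratic noise (P x N)
X = Lambda%*%t(Fmat)+U
X = t(X) #N x P, one sample per row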
output = farm.res(X) #default options
output$nfactors
output = farm.res(X, K.factors = 10) #supplying the number of factors
names(output) #names of the returned items