Robust_regression {RobRegression}

R Documentation

Robust_regression

Description

This function gives robust estimates of the parameters of the multivariate linear regression model, using either the Euclidean distance or the Mahalanobis distance associated with a matrix Sigma. More precisely, the aim is to minimize

G(\hat{\beta}) = \mathbb{E}[ \| Y-X\hat{\beta} \|_{\Sigma}] + \lambda \| \hat{\beta}\|^{\text{ridge}}.
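
As a rough illustration of the criterion above, the following R sketch evaluates its empirical counterpart for a given candidate `beta`. It assumes the usual Mahalanobis norm, `sqrt(t(u) %*% solve(Sigma) %*% u)`, and a Frobenius norm in the penalty. The helper name `empirical_criterion` is hypothetical and not part of the package, whose internal parametrization of `Sigma` may differ.

```r
# Empirical version of G: mean Mahalanobis norm of the residuals plus a penalty.
# Assumption: ||u||_Sigma = sqrt(t(u) %*% solve(Sigma) %*% u), and the penalty
# uses the Frobenius norm of beta raised to the power `ridge`.
empirical_criterion <- function(X, Y, beta, Sigma = diag(ncol(Y)),
                                lambda = 0, ridge = 1) {
  residuals <- Y - X %*% beta                   # (n,q) residual matrix
  Sigma_inv <- solve(Sigma)                     # inverse of the (q,q) matrix
  # Row-wise quadratic forms r_i' Sigma^{-1} r_i
  quad <- rowSums((residuals %*% Sigma_inv) * residuals)
  penalty <- lambda * sqrt(sum(beta^2))^ridge   # lambda * ||beta||^ridge
  mean(sqrt(quad)) + penalty
}
```

With `Sigma` left at the identity, the first term reduces to the mean Euclidean norm of the residuals, which corresponds to `Mahalanobis_distance = FALSE`.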

Usage

Robust_regression(X,Y, Mat_Mahalanobis=diag(rep(1,ncol(Y))),
                  niter=50,lambda=0,c='default',method='Offline',
                  alpha=0.66,w=2,ridge=1,nlambda=50,
                  init=matrix(runif(ncol(X)*ncol(Y))-0.5,nrow=ncol(X),ncol=ncol(Y)),
                  epsilon=10^(-8), Mahalanobis_distance = FALSE,
                  par=TRUE,scale='none',tol=10^(-3))

Arguments

`X`	A (n,p)-matrix whose rows are the explanatory data.
`Y`	A (n,q)-matrix whose rows are the variables to be explained.
`method`	The method used for estimating the parameter. Should be `'Offline'` if the fixed-point algorithm is used, and `'Online'` if the (weighted) averaged stochastic gradient algorithm is used. Default is `'Offline'`.
`Mat_Mahalanobis`	A (q,q)-matrix giving `\Sigma` for the Mahalanobis distance. Default is identity.
`Mahalanobis_distance`	A logical indicating whether the Mahalanobis distance is used. Default is `FALSE`.
`scale`	The scaling applied to `Y`. Use `scale='robust'` if a robust scaling of `Y` is desired. Default is `'none'`.
`niter`	The maximum number of iterations if `method='Offline'`.
`init`	A (p,q)-matrix which gives the initialization of the algorithm.
`ridge`	The power of the penalty: should be `2` if the squared norm is considered, or `1` if the norm is considered.
`lambda`	A vector giving the penalization levels to be studied. If `lambda='default'`, a vector of preselected penalization levels is used.
`nlambda`	The number of tested penalizations if `lambda='default'`.
`par`	Should be `TRUE` if parallelization of the algorithm that robustly estimates the variance of the noise is allowed.
`c`	The constant in the step sequence of the averaged stochastic gradient algorithm, i.e. if `method='Online'`.
`alpha`	A scalar between 1/2 and 1 used in the step sequence of the stochastic gradient algorithm if `method='Online'`.
`w`	The power for the weighted averaged Robbins-Monro algorithm if `method='Online'`.
`epsilon`	Stopping condition for the fixed-point algorithm if `method='Offline'`.
`tol`	A scalar that avoids numerical problems if `method='Offline'`. Default is `10^(-3)`.
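
A call using the online algorithm might look like the sketch below. It relies only on the arguments documented above. The simulated data, the seed, and the sizes are illustrative assumptions, and the call is skipped if the package is not installed.

```r
set.seed(1)                                    # illustrative seed
n <- 500; p <- 3; q <- 2                       # illustrative sizes
X <- matrix(rnorm(n * p), nrow = n)            # (n,p) explanatory data
beta_true <- matrix(rnorm(p * q), nrow = p)    # (p,q) parameter used to simulate
Y <- X %*% beta_true + matrix(rnorm(n * q), nrow = n)

if (requireNamespace("RobRegression", quietly = TRUE)) {
  fit_online <- RobRegression::Robust_regression(
    X, Y,
    method = 'Online',   # (weighted) averaged stochastic gradient algorithm
    alpha = 0.66,        # step-sequence exponent, between 1/2 and 1
    w = 2                # power of the weighted averaging
  )
  # Squared estimation error against the simulated parameter
  print(sum((fit_online$beta - beta_true)^2))
}
```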

Value

A list with:

`beta`	A (p,q)-matrix giving the estimation of the parameters.
`criterion`	A vector giving the loss for the different chosen `lambda`. If `scale='robust'`, it is calculated on the scaled data.
`all_beta`	A list containing the different estimates of the parameters (with respect to the different choices of `lambda`).
`lambda_opt`	A scalar giving the selected `lambda`.
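
The returned list could be inspected as in the sketch below, which fits the model over a small user-supplied grid of penalizations. The grid values, the seed, and the simulated data are illustrative assumptions; only the documented components `beta`, `criterion`, `all_beta` and `lambda_opt` are accessed, and the call is skipped if the package is not installed.

```r
set.seed(2)                                    # illustrative seed
n <- 500; p <- 3; q <- 2                       # illustrative sizes
X <- matrix(rnorm(n * p), nrow = n)            # (n,p) explanatory data
beta_true <- matrix(rnorm(p * q), nrow = p)    # (p,q) parameter used to simulate
Y <- X %*% beta_true + matrix(rnorm(n * q), nrow = n)
lambda_grid <- c(0, 0.01, 0.1)                 # hypothetical penalization grid

if (requireNamespace("RobRegression", quietly = TRUE)) {
  fit <- RobRegression::Robust_regression(X, Y, lambda = lambda_grid, ridge = 2)
  print(dim(fit$beta))         # dimensions of the selected estimate
  print(fit$criterion)         # loss for each penalization in the grid
  print(length(fit$all_beta))  # number of stored estimates
  print(fit$lambda_opt)        # selected penalization
}
```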

References

Godichon-Baggioni, A., Robin, S. and Sansonnet, L. (2023): A robust multivariate linear regression based on the Mahalanobis distance

Examples


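library(RobRegression)
# Simulate a multivariate linear model Y = X beta + noise
p=5      # number of explanatory variables
q=10     # number of response variables
n=2000   # sample size
mu=rep(0,q)
epsilon=mvtnorm::rmvnorm(n = n,mean = mu)   # (n,q) standard Gaussian noise
X=mvtnorm::rmvnorm(n=n,mean=rep(0,p))       # (n,p) explanatory data
beta=matrix(rnorm(p*q),ncol=q)              # true (p,q) parameter
Y=X %*% beta+epsilon
# Fit with the default settings (method='Offline', lambda=0)
Res_reg=Robust_regression(X,Y)
# Squared error between the estimate and the true parameter
sum((Res_reg$beta-beta)^2)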

[Package RobRegression version 0.1.0 Index]