R: Creates an 'FSR_control' object

FSR_control {fsdaR}

R Documentation

Creates an `FSR_control` object

Description

Creates an object of class FSR_control to be used with the fsreg() function, containing various control parameters.

Usage

FSR_control(intercept = TRUE, h, nsamp = 1000, lms = 1, init, nocheck = FALSE, 
    bonflev = "", msg = TRUE, bsbmfullrank = TRUE, 
    plot = FALSE, bivarfit = FALSE, multivarfit = FALSE, 
    labeladd = FALSE, nameX, namey, ylim, xlim)

Arguments

`intercept`	Indicator for constant term. Scalar. If `intercept=TRUE` (default), a model with a constant term is fitted; otherwise no constant term is included.
`h`	The number of observations that determine the least trimmed squares estimator, scalar. `h` is an integer greater than or equal to `p` but smaller than `n`. Generally, if the purpose is outlier detection, `h=[0.5*(n+p+1)]` (default value). `h` can be smaller than this threshold if the purpose is to find subgroups of homogeneous observations. In this function the LTS/LMS estimator is used just to initialize the search.
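
As a rough illustration of the default value of `h` described above, the following base R sketch evaluates `[0.5*(n+p+1)]`. It assumes that the square brackets denote the integer part and that `p` counts the columns of the design matrix including the intercept; the helper name `default_h` is hypothetical and is not part of fsdaR.

```r
# Hypothetical helper (not part of fsdaR): default h as documented,
# assuming [.] means "integer part" and p includes the intercept column.
default_h <- function(n, p) {
  floor(0.5 * (n + p + 1))
}

# Illustrative sizes only: 75 observations, 4 columns in X
default_h(n = 75, p = 4)
```

A value such as `h=56` passed to `FSR_control()` would then simply override this default.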
`nsamp`	Number of subsamples to be extracted to find the robust estimator, scalar. If `nsamp=0`, all subsets are extracted; there are `(n choose p)` of them. If the number of all possible subsets is `<1000`, the default is to extract all subsets; otherwise just 1000 are extracted.
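
The default rule for `nsamp` described above can be sketched in base R as follows. The helper `default_nsubsets` is hypothetical (it is not an fsdaR function) and only reproduces the documented rule: extract all `(n choose p)` subsets when there are fewer than 1000 of them, otherwise extract 1000.

```r
# Hypothetical helper (not part of fsdaR): number of subsets extracted
# under the documented default rule.
default_nsubsets <- function(n, p, cap = 1000) {
  total <- choose(n, p)          # number of all possible subsets of size p
  if (total < cap) total else cap
}

default_nsubsets(n = 10, p = 3)  # few subsets: all of them are extracted
default_nsubsets(n = 75, p = 4)  # many subsets: capped at 1000
```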
`lms`	Criterion used to find the initial subset that initializes the search (LMS, LTS with concentration steps, LTS without concentration steps, or a subset supplied directly by the user). The default value is 1 (Least Median of Squares is computed to initialize the search). If the user wants to initialize the search with LTS using all the default options for concentration steps, then `lms=2`. If the user wants to use LTS without concentration steps, `lms` can be a scalar different from 1 or 2. If `lms` is a list, it is possible to control a series of options for concentration steps (for more details see option `lms` inside `LXS_control`). If the user wants to initialize the search with a prespecified set of units, there are two possibilities: (i) `lms` can be a vector with length greater than 1 which contains the units forming the initial subset. For example, if the user wants to initialize the search with units 4, 6 and 10, then `lms=c(4, 6, 10)`; (ii) `lms` can be a list which contains a component named `bsb` holding the units that initialize the search. For example, in the case of simple regression through the origin with just one explanatory variable, if the user wants to initialize the search with unit 3, then `lms=list(bsb=3)`.
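
The different forms that `lms` can take, as described above, can be summarised with a small base R sketch. The function `describe_lms` is a hypothetical helper written only to mirror the documented rules; it is not how fsdaR itself dispatches on `lms`.

```r
# Hypothetical helper (not part of fsdaR): classify an lms value
# according to the rules stated in the argument description.
describe_lms <- function(lms) {
  if (is.list(lms)) {
    if (!is.null(lms$bsb)) {
      "initial subset supplied in list component bsb"
    } else {
      "LTS with user-controlled concentration-step options"
    }
  } else if (length(lms) > 1) {
    "initial subset supplied as a vector of units"
  } else if (lms == 1) {
    "LMS"
  } else if (lms == 2) {
    "LTS with default concentration steps"
  } else {
    "LTS without concentration steps"
  }
}

describe_lms(1)
describe_lms(2)
describe_lms(c(4, 6, 10))
describe_lms(list(bsb = 3))
```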
`init`	Search initialization, scalar. It specifies the initial subset size at which to start monitoring exceedances of the minimum deletion residual. If `init` is not specified, it is set equal to `p+1` if the sample size is smaller than 40, or to `min(3*p+1, floor(0.5*(n+p+1)))` otherwise. For example, if `init=100`, the procedure starts monitoring from step `m=100`.
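
The default for `init` described above can be written as a short base R sketch. The helper name `default_init` is hypothetical (not an fsdaR function), and `p` is assumed to be the number of columns of the design matrix.

```r
# Hypothetical helper (not part of fsdaR): default starting step for
# monitoring, following the rule in the description of init.
default_init <- function(n, p) {
  if (n < 40) {
    p + 1
  } else {
    min(3 * p + 1, floor(0.5 * (n + p + 1)))
  }
}

default_init(n = 30, p = 4)  # small sample: p + 1
default_init(n = 75, p = 4)  # larger sample: min(3p + 1, floor(0.5(n + p + 1)))
```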
`nocheck`	Check input arguments, scalar. If `nocheck=TRUE`, no check is performed on matrix `y` and matrix `X`. Notice that in this case `y` and `X` are left unchanged; in other words, the additional column of ones for the intercept is not added. By default `nocheck=FALSE`.
`bonflev`	Option to be used if the distribution of the data is strongly non-normal and, thus, the general signal detection rule based on consecutive exceedances cannot be used. In this case `bonflev` can be: (i) a scalar smaller than 1, which specifies the confidence level for a signal and a stopping rule based on the comparison of the minimum MD with a Bonferroni bound. For example, if `bonflev=0.99` the procedure stops when the trajectory exceeds the 99% Bonferroni bound for the first time; (ii) a scalar greater than 1. In this case the procedure stops when the residual trajectory exceeds this value for the first time. The default value is `""` (empty string), which means relying on the general rules based on consecutive exceedances.
`msg`	Controls whether to display messages on the screen. If `msg=TRUE` (default), messages about the step in which the signal took place are displayed on the screen; otherwise no message is displayed.
`bsbmfullrank`	How to behave when the subset at step m (say bsbm) produces a singular X. In other words, this option controls what to do when `rank(X[bsbm, ])` is smaller than the number of explanatory variables. If `bsbmfullrank=TRUE` (default), these units (whose number is, say, mnofullrank) are constrained to enter the search in the final n-mnofullrank steps; otherwise the search continues, using as the estimate of beta at step m the estimate of beta found in the previous step.
`plot`	Plot on the screen. Scalar. If `plot=TRUE`, the plot of the minimum deletion residual with envelopes based on n observations and the scatterplot matrix with the outliers highlighted are produced. If `plot=2`, the user can also monitor the intermediate plots based on envelope superimposition. If `plot=FALSE` (default), no plot is produced.
`bivarfit`	Whether to superimpose bivariate least squares lines on the plot (if `plot=TRUE`). This option adds one or more least squares lines, based on SIMPLE REGRESSION of y on Xi, to the plots of y|Xi. The default is `bivarfit=FALSE`: no line is fitted. If `bivarfit=1`, a single OLS line is fitted to all points of each bivariate plot in the scatter matrix y|X. If `bivarfit=2`, two OLS lines are fitted: one to all points and another to the group of the genuine observations. The group of the potential outliers is not fitted. If `bivarfit=0`, one OLS line is fitted to each group. This is useful for the purpose of fitting mixtures of regression lines. If `bivarfit='i1'` or `bivarfit='i2'`, etc., an OLS line is fitted to a specific group, the one with index 'i' equal to 1, 2, 3, etc. Again, this is useful in the case of mixtures.
`multivarfit`	Whether to superimpose multivariate least squares lines. This option adds one or more least squares lines, based on MULTIVARIATE REGRESSION of y on X, to the plots of y|Xi. The default is `multivarfit=FALSE`: no line is fitted. If `multivarfit=1`, a single OLS line is fitted to all points of each bivariate plot in the scatter matrix y|X. The line added to the scatter plot y|Xi is avconst + Ci*Xi, where Ci is the coefficient of Xi in the multivariate regression and avconst is the effect of all the other explanatory variables different from Xi evaluated at their centroid (that is, overline(y)'C). If `multivarfit=2`, the same action is taken as with `multivarfit=1`, but this time we also add the line based on the group of unselected observations (i.e. the normal units).
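
The line avconst + Ci*Xi described above can be reproduced approximately in base R. This is only a sketch under stated assumptions: it uses the built-in `stackloss` data rather than any fsdaR output, it assumes that avconst includes the intercept plus the contribution of the other explanatory variables evaluated at their means, and the names `avconst`, `Ci` and `line_at` are illustrative.

```r
# Sketch (base R only): line superimposed on the plot of y against one
# regressor Xi, derived from the regression of y on all regressors.
fit <- lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data = stackloss)
cf  <- coef(fit)

xi     <- "Air.Flow"                                  # variable on the x axis
others <- setdiff(names(cf), c("(Intercept)", xi))    # remaining regressors

# Assumed definition: intercept plus effect of the other regressors
# evaluated at their centroid (column means)
avconst <- cf[["(Intercept)"]] +
  sum(cf[others] * colMeans(stackloss[, others, drop = FALSE]))
Ci <- cf[[xi]]                                        # slope of Xi

line_at <- function(x) avconst + Ci * x

plot(stackloss$Air.Flow, stackloss$stack.loss,
     xlab = xi, ylab = "stack.loss")
abline(a = avconst, b = Ci)
```

Under this assumption the line passes through the point (mean of Xi, mean of y), which is a convenient sanity check.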
`labeladd`	Add outlier labels in plot. If `labeladd=TRUE`, we label the outliers with the unit row index in matrices X and y. The default value is `labeladd=FALSE`, i.e. no label is added.
`nameX`	Add variable labels in plot. A vector of strings of length `p` containing the labels of the variables of the regression dataset. If it is empty (default), the sequence `X1, ..., Xp` will be created automatically.
`namey`	Add response label. A string containing the label of the response.
`ylim`	Control `y` scale in plot. Vector with two elements controlling minimum and maximum on the y axis. Default is to use automatic scale.
`xlim`	Control `x` scale in plot. Vector with two elements controlling minimum and maximum on the x axis. Default is to use automatic scale.

Details

Creates an object of class FSR_control to be used with the fsreg() function, containing various control parameters.

Value

An object of class "FSR_control", which is basically a list whose components are the input arguments of the function, mapped to the arguments of the corresponding MATLAB function.

Author(s)

FSDA team

Examples

## Not run: 
data(hbk, package="robustbase")
(out <- fsreg(Y~., data=hbk, method="FS", control=FSR_control(h=56, nsamp=500, lms=2)))
summary(out)

## End(Not run)

[Package fsdaR version 0.9-0 Index]
