fsmmmdrs {fsdaR} | R Documentation |
Performs random start monitoring of minimum Mahalanobis distance
Description
The trajectories originate from many different random initial subsets and provide information on the presence of groups in the data. Groups are investigated by monitoring the minimum Mahalanobis distance outside the forward search subset.
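The monitoring idea can be sketched in base R. The block below is an illustrative sketch only, not the fsdaR or FSDA implementation: the function name mmd_random_start, its default init of p + 1, and the simulated two-group data are all assumptions made for this example. It runs one forward search from a random initial subset and records, at each step, the minimum Mahalanobis distance among the units outside the subset.

```r
# Illustrative base-R sketch (hypothetical helper, not part of fsdaR).
# One forward search from a random initial subset, monitoring the minimum
# Mahalanobis distance among the units outside the subset.
mmd_random_start <- function(x, init = ncol(x) + 1) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Random initial subset of size init (assumed default: p + 1)
  bsb <- sample(n, init)
  steps <- init:(n - 1)
  mmd <- numeric(length(steps))
  for (i in seq_along(steps)) {
    m <- steps[i]
    xs <- x[bsb, , drop = FALSE]
    # Squared Mahalanobis distances of all units from the subset estimates
    d2 <- mahalanobis(x, colMeans(xs), cov(xs))
    # Minimum distance among the units not in the subset
    mmd[i] <- sqrt(min(d2[-bsb]))
    # Next subset: the m + 1 units with the smallest distances
    bsb <- order(d2)[seq_len(m + 1)]
  }
  data.frame(m = steps, mmd = mmd)
}

set.seed(1)
# Simulated data with two well-separated groups (assumption for illustration)
x <- rbind(matrix(rnorm(100), ncol = 2),
           matrix(rnorm(100, mean = 6), ncol = 2))
# 20 random starts; each column of traj is one trajectory
traj <- replicate(20, mmd_random_start(x)$mmd)
dim(traj)
```

Calling matplot(traj, type = "l") would overlay the trajectories. With grouped data, searches that start inside one group tend to show a peak when the subset is about to absorb units from another group.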
Usage
fsmmmdrs(
x,
plot = FALSE,
init,
bsbsteps,
nsimul = 200,
nocheck = FALSE,
numpool,
cleanpool = FALSE,
msg = FALSE,
trace = FALSE,
...
)
Arguments
x |
An n x p data matrix (n observations and p variables). Rows of x represent observations, and columns represent variables. Missing values (NA's) and infinite values (Inf's) are allowed, since observations (rows) with missing or infinite values will automatically be excluded from the computations. |
plot |
Plots the random starts minimum Mahalanobis distance with 1
If
Remark: the plot which is produced is very simple. In order to control a series of options
in this plot (including the y scale) and in order to connect it dynamically to the other
forward plots it is necessary to use function |
init |
Point where to start monitoring required diagnostics.
If |
bsbsteps |
A vector specifying the steps of the forward
search at which the units forming the subset are saved for each
random start. If REMARK: The vector bsbsteps must contain numbers from init to n.
If |
nsimul |
Number of random starts. Default value is |
nocheck |
Controls whether to perform checks on matrix x. If |
numpool |
If
REMARK 1: Up to MATLAB R2013b, there was a limitation on the maximum number of cores that could be addressed by the parallel processing toolbox (8 and, more recently, 12). From R2014a, it is possible to run a local cluster of more than 12 workers.
REMARK 2: Unless you adjust the cluster profile, the default maximum number of workers is the same as the number of computational (physical) cores on the machine.
REMARK 3: In modern computers the number of logical cores is larger than the number of physical cores. By default, MATLAB does not use all logical cores because, normally, hyper-threading is enabled and some cores are reserved for this feature.
REMARK 4: It is because of Remark 3 that we have chosen the number of physical cores, rather than the number of logical ones, as the default value for numpool. The user can increase the number of parallel pool workers allocated to the multiple start monitoring by:
Therefore, *if a parallel pool is not already open*, the user option numpool (if set)
overrides the number of workers set in the local/current profile. Similarly,
the number of workers in the local/current profile overrides the default value of
|
cleanpool |
Set cleanpool |
msg |
Level of output to display. It controls whether messages about random start
progress are displayed. More precisely, if previous option REMARK: in order to create the progress bar when Error using ProgressBar (line 57) Do you have write permissions for C:/Program Files/MATLAB?" |
trace |
Whether to print intermediate results. Default is |
... |
potential further arguments passed to lower level functions. |
Value
Returns an object of class fsmmmdrs.object.
Author(s)
FSDA team, valentin.todorov@chello.at
References
Atkinson, A.C., Riani, M., and Cerioli, A. (2006), Random Start Forward Searches with Envelopes for Detecting Clusters in Multivariate Data, in: Zani S., Cerioli A., Riani M., Vichi M., Eds., Data Analysis, Classification and the Forward Search, pp. 163-172, Springer Verlag.
Atkinson, A.C. and Riani, M. (2007), Exploratory Tools for Clustering Multivariate Data, Computational Statistics and Data Analysis, Vol. 52, pp. 272-285, doi:10.1016/j.csda.2006.12.034
Riani, M., Cerioli, A., Atkinson, A.C., Perrotta, D. and Torti, F. (2008), Fitting Mixtures of Regression Lines with the Forward Search, in: Mining Massive Data Sets for Security, F. Fogelman-Soulie et al. Eds., pp. 271-286, IOS Press.
Examples
## Not run:
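library(fsdaR)
# Hawkins, Bradu and Kass artificial data from package robustbase
data(hbk, package="robustbase")
# Random start monitoring on the three explanatory variables
out <- fsmmmdrs(hbk[,1:3])
class(out)
summary(out)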
## End(Not run)