ICSShiny {ICSShiny} | R Documentation |
Invariant Coordinate Selection With a Shiny App
Description
Performs ICS via a shiny app where the user can change the scatter matrices, explore the output and download graphs and components.
Also the ICS outlier detection framework, from the ICSOutlier
package is available.
It is inspired from the Factoshiny
application of the FactoMineR
package.
Usage
ICSShiny(x, S1 = MeanCov, S2 = Mean3Cov4,
S1args = list(), S2args = list(), seed = NULL,
ncores = NULL, iseed = NULL,
pkg = "ICSOutlier")
Arguments
x |
data matrix or dataframe with at least two numeric variables. Please note that it can contain non-numeric variables, but ICS is only performed on numeric variables. |
S1 |
name of the function which returns the first location vector T1 and scatter matrix S1. See details and |
S2 |
name of the function which returns the second location vector T2 and scatter matrix S2. See details and |
S1args |
list with optional additional arguments when calling function S1. |
S2args |
list with optional additional arguments when calling function S2. |
seed |
to fix a seed when needed in order to fix the thresholds. Default is |
ncores |
number of cores to be used in |
iseed |
If parallel computation is used the seed passed on to |
pkg |
When using parallel computing, a character vector listing all the packages which need to be loaded on the different cores via |
Details
- Choice of the parameters
-
The scatter matrices and their associated location estimators can be selected through the list out of the options:
MeanCov
,Mean3Cov4
,MCD
,TM
. It is also possible to run the application with your own functions as long as they are passed as an argument of the call toICSShiny
. However, in this case it is not possible to run the simulations' steps for now.ICS is only performed on numeric variables. Only non-numeric variables are proposed for labelling and/or categorizing the observations.
- Component selection
-
For computing the kernel densities in the second sub-tab, the weight is given by the Gaussian function and the bandwidth follows the rule of thumb of Silverman (1986).
For the automatic selection of the Invariant Components (IC), the referenced normality tests are the same as in the
comp.norm.test
function:"jarque.test"
,"anscombe.test"
,"bonett.test"
,"agostino.test"
,"shapiro.test"
. All the decisions are corrected from multiple testing by adjusting the levels as incomp.norm.test
. The number of components to keep can also be decided from Monte Carlo simulations trough thecomp.simu.test
function. This parallel analysis method may need a very long time to compute, so it is used only if the user clicks on the 'Launch the test' button.
Value
Returns several tabs on the navigator:
Choice of the parameters |
The scatterplot matrix of an ICS object for the parameters chosen on the left part (variables included/excluded, the location vectors and scatter matrices). |
Component selection |
Three different subtabs to help the user to choose the interesting components. The first sub-tab is the screeplot of the eigenvalues of the ICS object followed by the summary of the analysis. The second sub-tab plots the kernel density of the ICS components. The third sub-tab suggests which components to select, starting from the highest and/or the lowest kurtosis, through different normality tests or simulations. The default values of the slidebar in the left are obtained from |
Matrix scatterplot of invariant components |
The two sub-tabs aim at identifying groups or outliers by using pairwise plots of invariant coordinates. It offers two ways of plotting them: only two invariant components or a scatterplot matrix with up to six invariant components. The left panel allows to color the groups identified by the user and label the observations. |
Outlier identification |
This tab plots outlyingness values for each observation based on the selected components. These squared ICS distances are computed through the |
Descriptive statistics |
This tab gives some descriptive statistics on different subsets of the data (for all the observations, for the observations from a given cluster, for the outlying observations) and enables to compare the sub-populations. The application includes a boxplot, a kernel density, an histogram and some basic statistics: Min, Q1, Mean, Median, Q3 and Max. |
Data Table |
This tab contains the dataset with a nice display and the possibility to choose different sub-populations of the data: all the observations, the observations from a given cluster or the outlying observations. |
Save |
This tab allows to display and save the data table of components and the summary of operations. The data frame contains the components kept in the analysis as well as the distance generated by these components. It also includes the cluster the observation belongs to whether the observation is defined as an outlier, as well as the variables used for labelling and categorizing the data. The data are saved in a csv format. The summary of operations contains a summary of all parameters that were used to obtain the current result, it may be useful for another user who may want to get the same result as the original user. It is saved in a txt format. |
The "Close the session" button closes the application and saves the icsshiny object into the global environment.
Author(s)
Aurore Archimbaud and Joris May
References
Nordhausen, K., Oja, H. and Tyler, D.E. (2008), Tools for exploring multivariate data: The package ICS, Journal of Statistical Software, 28, 1–31. <doi:10.18637/jss.v028.i06>.
Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A. (2016), ICS for multivariate outlier detection with application to quality control, <https://arxiv.org/pdf/1612.06118.pdf>.
See Also
ics2
,ics.outlier
,
shiny website
Examples
if(interactive()){
library(ICSShiny)
# ICS with ICSShiny:
res.shiny <- ICSShiny(iris)
# Close the session by clicking on the button or closing the navigator's tab
# ICS on a result of an ICSshiny object
ICSShiny(res.shiny)
# ICS with ICSShiny and different parameters
res.shiny <- ICSShiny(iris, S1 = MCD, S1args=list(alpha=0.7), seed = 7587)
# ICS with ICSShiny with parallelization of computations and seed
res.shiny <- ICSShiny(iris, iseed = 1234, ncores = 2)
}