get_stats.coin {COINr}  R Documentation 
Statistics of indicators
Description
Given a coin and a specified data set (dset), returns a table of statistics with entries for each column.
Usage
## S3 method for class 'coin'
get_stats(
x,
dset,
t_skew = 2,
t_kurt = 3.5,
t_avail = 0.65,
t_zero = 0.5,
t_unq = 0.5,
nsignif = 3,
out2 = "df",
...
)
Arguments
x 
A coin 
dset 
A data set present in 
t_skew 
Absolute skewness threshold. See details. 
t_kurt 
Kurtosis threshold. See details. 
t_avail 
Data availability threshold. See details. 
t_zero 
A threshold between 0 and 1 for flagging indicators with high proportion of zeroes. See details. 
t_unq 
A threshold between 0 and 1 for flagging indicators with low proportion of unique values. See details.
nsignif 
Number of significant figures to round the output table to. 
out2 
Either 
... 
arguments passed to or from other methods. 
Details
The statistics (columns in the output table) are as follows (entries correspond to each column):

Min: the minimum
Max: the maximum
Mean: the (arithmetic) mean
Median: the median
Std: the standard deviation
Skew: the skew
Kurt: the kurtosis
N.Avail: the number of non-NA values
N.NonZero: the number of non-zero values
N.Unique: the number of unique values
Frc.Avail: the fraction of non-NA values
Frc.NonZero: the fraction of non-zero values
Frc.Unique: the fraction of unique values
Flag.Avail: a data availability flag. Columns with Frc.Avail < t_avail will be flagged as "LOW", else "ok".
Flag.NonZero: a flag for columns with a high proportion of zeros. Any columns with Frc.NonZero < t_zero are flagged as "LOW", otherwise "ok".
Flag.Unique: a unique value flag. Any columns with Frc.Unique < t_unq are flagged as "LOW", otherwise "ok".
Flag.SkewKurt: a skew and kurtosis flag which is an indication of possible outliers. Any columns with abs(Skew) > t_skew AND Kurt > t_kurt are flagged as "OUT", otherwise "ok".
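
The flag rules above can be sketched in base R. This is a minimal illustration only, not COINr's implementation: flag_stats() is a hypothetical helper, the input values are made up, and it assumes a data frame that already holds the Frc.Avail, Frc.NonZero, Frc.Unique, Skew and Kurt columns described above.

```r
# Hypothetical helper (not part of COINr): applies the flag rules described
# above to a data frame that already contains the statistic columns.
flag_stats <- function(stats, t_skew = 2, t_kurt = 3.5,
                       t_avail = 0.65, t_zero = 0.5, t_unq = 0.5) {
  stats$Flag.Avail    <- ifelse(stats$Frc.Avail < t_avail, "LOW", "ok")
  stats$Flag.NonZero  <- ifelse(stats$Frc.NonZero < t_zero, "LOW", "ok")
  stats$Flag.Unique   <- ifelse(stats$Frc.Unique < t_unq, "LOW", "ok")
  # Both conditions must hold for a column to be flagged as "OUT"
  stats$Flag.SkewKurt <- ifelse(abs(stats$Skew) > t_skew & stats$Kurt > t_kurt,
                                "OUT", "ok")
  stats
}

# Made-up statistics for three illustrative columns
toy <- data.frame(
  Frc.Avail   = c(0.90, 0.50, 1.00),
  Frc.NonZero = c(0.95, 0.80, 0.30),
  Frc.Unique  = c(0.85, 0.40, 0.90),
  Skew        = c(0.4, -2.6, 3.1),
  Kurt        = c(2.1, 9.0, 3.0)
)
flagged <- flag_stats(toy)
```

In this toy input the third column has high absolute skew but is not flagged in Flag.SkewKurt, because its kurtosis is below t_kurt. The second column fails both the availability and unique-value checks and is also flagged as a possible outlier case.
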
The aim of this table, among other things, is to check the basic statistics of each column/indicator and to identify any possible issues, such as low data availability, a high proportion of zeros and/or a low proportion of unique values. Further, the combination of skew and kurtosis (i.e. the Flag.SkewKurt column) is a simple test for possible outliers, which may require treatment using Treat().
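
One way to act on the table is to list the columns that fail each check. The sketch below uses a hand-made stand-in for the output table: the Indicator column name and all values are assumptions for illustration only, so substitute the data frame actually returned by get_stats() and its own identifier column.

```r
# Stand-in for a get_stats() output table (values and the "Indicator"
# column name are made up for illustration).
stats <- data.frame(
  Indicator     = c("ind_a", "ind_b", "ind_c"),
  Flag.Avail    = c("ok", "LOW", "ok"),
  Flag.NonZero  = c("ok", "ok", "LOW"),
  Flag.Unique   = c("ok", "ok", "ok"),
  Flag.SkewKurt = c("OUT", "ok", "ok"),
  stringsAsFactors = FALSE
)

flag_cols <- c("Flag.Avail", "Flag.NonZero", "Flag.Unique", "Flag.SkewKurt")

# For each flag column, collect the indicators whose entry is not "ok"
issues <- lapply(stats[flag_cols], function(flag) stats$Indicator[flag != "ok"])

# Indicators that may need outlier treatment
outlier_candidates <- issues$Flag.SkewKurt
```

Here issues is a named list with one character vector per flag column, so an empty vector means no column failed that check.
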
The table can be returned either to the coin or as a standalone data frame; see out2.
See also vignette("analysis").
Value
Either a data frame or an updated coin; see out2.
Examples
# build example coin
coin <- build_example_coin(up_to = "new_coin", quietly = TRUE)
# get table of indicator statistics for raw data set
get_stats(coin, dset = "Raw", out2 = "df")