overview-sn {sn} | R Documentation |
Package sn: overview of the structure and main commands
Description
The package provides facilities to build and manipulate probability
distributions of the skew-normal (SN) and some related families,
notably the skew-t (ST) and the ‘unified skew-normal’ (SUN) families.
For the SN, ST and skew-Cauchy (SC) families,
statistical methods are made available for data fitting and model diagnostics,
in the univariate and the multivariate case.
Two main sides
The package comprises two main sides: one provides facilities for the probability distributions of interest; the other deals with the related statistical methods. The underlying formulation, the parameterizations of the distributions and the terminology agree with the monograph of Azzalini and Capitanio (2014).
Probability side
There are two layers of support for the probability distributions of interest. At the basic level, there exist functions which follow the classical R scheme for distributions. In addition, there exist facilities to build an object which encapsulates a probability distribution, and certain operations can then be performed on such an object; these probability objects operate according to the S4 protocol. The two schemes are described next.
- Classical R scheme
The following functions work similarly to {d,p,q,r}norm and other R functions for probability distributions:
  - skew-normal (SN): functions {d,p,q,r}sn for the univariate case, functions {d,p,r}msn for the multivariate case, where in both cases the ‘Extended skew-normal’ (ESN) variant form is included;
  - skew-t (ST): functions {d,p,q,r}st for the univariate case, functions {d,p,r}mst for the multivariate case;
  - skew-Cauchy (SC): functions {d,p,q,r}sc for the univariate case, functions {d,p,r}msc for the multivariate case.
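As a minimal sketch of the classical scheme listed above, the following calls evaluate a skew-normal distribution with its parameters passed as individual components. The numerical values (location 0, scale 1, slant 3, and the 2 x 2 matrix `Omega`) are arbitrary choices made for illustration only, and the sketch assumes that package sn is installed.

```r
library(sn)

# Univariate SN with location xi = 0, scale omega = 1, slant alpha = 3
# (illustrative values, not taken from the documentation)
x <- c(0, 0.5, 1, 2)
dens  <- dsn(x, xi = 0, omega = 1, alpha = 3)      # density
prob  <- psn(x, xi = 0, omega = 1, alpha = 3)      # distribution function
quant <- qsn(prob, xi = 0, omega = 1, alpha = 3)   # quantiles, should recover x
set.seed(1)
draws <- rsn(5, xi = 0, omega = 1, alpha = 3)      # random numbers

# Bivariate SN: density at the origin and a small random sample
Omega  <- matrix(c(1, 0.5, 0.5, 1), 2, 2)          # scale matrix
alpha2 <- c(2, -1)                                 # slant vector
d0 <- dmsn(c(0, 0), xi = c(0, 0), Omega = Omega, alpha = alpha2)
set.seed(2)
sample2 <- rmsn(5, xi = c(0, 0), Omega = Omega, alpha = alpha2)
```

The ST and SC functions follow the same calling pattern, with the additional or different parameters documented on their own help pages.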
In addition to the usual specification of their parameters as a sequence of individual components, a parameter set can be specified as a single dp entity, namely a vector in the univariate case, a list in the multivariate case; dp stands for ‘Direct Parameters’ (DP).
Another set of parameters is in use, denoted Centred Parameters (CP), which are more convenient for interpretability, since they correspond to familiar quantities, such as the mean and the standard deviation. Conversion from the dp parameter set to the corresponding CP parameters can be accomplished using the function dp2cp, while function cp2dp performs the inverse transformation.
The SUN family is mainly targeted to the multivariate context, and this is reflected in the organization of the relevant functions, although univariate SUN distributions are supported. Density, distribution function and random numbers are handled by {d,p,r}sun. Mean value, variance matrix and Mardia's measures of multivariate skewness and kurtosis are computed by sun{Mean,Vcov,Mardia}.
In addition, one can introduce a user-specified density function using dSymmModulated and dmSymmModulated, in the univariate and the multivariate case, respectively. These densities are of the ‘symmetry-modulated’ type, also called ‘skew-symmetric’, where one can specify the base density and the modulation factor with a high degree of flexibility. Random numbers can be sampled using the corresponding functions rSymmModulated and rmSymmModulated. In the bivariate case, a dedicated plotting function exists.
- Probability distribution objects: SEC families
Function makeSECdistr can be used to build a ‘SEC distribution’ object representing a member of a specified parametric family (among the types SN, ESN, ST, SC) with a given dp parameter set. This object can be used for various operations such as plotting or extraction of moments and other summary quantities. Another way of constructing a SEC distribution object is via extractSECdistr, which extracts suitable components of an object produced by function selm, described below.
Additional operations on these objects are possible in the multivariate case, namely marginalSECdistr and affineTransSECdistr for marginalization and affine transformations. For the multivariate SN family only (but including ESN), conditionalSECdistr performs a conditioning on the values taken on by some components of the multivariate variable.
- Probability distribution objects: the SUN family
Function makeSUNdistr can be used to build a SUN distribution object representing a member of the SUN parametric family. This object can be used for various operations such as plotting or extraction of moments and other summary quantities.
Moreover, there are several transformation operations which can be performed on a SUN distribution object, or on two such objects in some cases: computing a (multivariate) marginal distribution, a conditional distribution (on given values of some components or on one-sided intervals), an affine transformation, a convolution (that is, the distribution of the sum of two independent variables), and joining two distributions under the assumption of independence.
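The dp and CP parameter sets and the SEC distribution objects described above can be sketched as follows for a univariate SN distribution. The dp values and the object name are arbitrary illustrative choices; the sketch assumes that package sn is installed, and the comparison with the closed-form SN mean is included only as a sanity check.

```r
library(sn)

# Direct parameters (location, scale, slant) of a univariate SN distribution;
# the values are arbitrary and chosen only for illustration
dp <- c(0, 1, 5)

# Convert to centred parameters (mean, standard deviation, skewness index)
cp <- dp2cp(dp, family = "SN")

# Sanity check: for SN(0, 1, alpha) the mean is sqrt(2/pi) * alpha / sqrt(1 + alpha^2)
mean_formula <- sqrt(2 / pi) * 5 / sqrt(1 + 5^2)

# The inverse transformation should recover the original dp
dp_back <- cp2dp(cp, family = "SN")

# Encapsulate the same distribution in an S4 'SEC distribution' object
obj <- makeSECdistr(dp = dp, family = "SN", name = "illustrative SN")
summary(obj)   # prints moments and other summary quantities
```

Working with the object rather than with the bare dp vector keeps the family and the parameters together, which is what the marginalization, affine transformation and conditioning functions operate on in the multivariate case.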
Statistics side
The main function for data fitting is selm, which allows the user
to specify a linear regression model for the location parameter, similarly
to function lm, but assuming a skew-elliptical distribution
of the random component;
this explains the name selm = (se + lm). Allowed types of distributions
are SN (but not ESN), ST and SC.
The fitted distribution is univariate or multivariate, depending on the nature
of the response variable of the posited regression model. The model fitting
method is either maximum likelihood or maximum penalized likelihood;
the latter option effectively allows the introduction of a prior distribution
on the slant parameter of the error distribution, hence leading to a
‘maximum a posteriori’ estimate.
Once the fitting process has been accomplished, an object of class either
selm (for univariate response) or mselm (for multivariate
response) is produced.
A number of ‘methods’ are available for these objects: show, plot,
summary, coef, residuals, logLik and others.
For univariate selm-class objects, univariate and bivariate profile
log-likelihood functions can be obtained; a predict method also exists.
These methods are built following the S4 protocol; however, the user need not
be concerned with the choice of the adopted protocol (unless this is wished).
The fitting process invoked via selm is actually performed by a
set of lower-level procedures. These are accessible for direct call,
if so wished, typically for improved efficiency, at the expense of a little
additional programming effort. Similarly, functions to compute the Fisher
information matrix are available, in the expected and the observed form
(with some restrictions depending on the selected distribution).
The extractSECdistr function extracts the fitted SEC
distribution from selm-class and mselm-class objects, hence
providing a bridge with the probability side of the package.
The facilities for statistical work do not support the SUN family.
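The fitting workflow described in this section can be sketched on simulated data as follows. The data-generating model, the sample size and the variable names (`x`, `y`, `dat`) are assumptions made purely for illustration; the sketch assumes that package sn is installed and uses the default fitting options of selm.

```r
library(sn)

# Simulate a linear regression with skew-normal errors
# (all numerical settings here are arbitrary illustrative choices)
set.seed(123)
n <- 200
x <- runif(n)
err <- as.numeric(rsn(n, xi = 0, omega = 1, alpha = 3))  # drop extra attributes
dat <- data.frame(x = x, y = 1 + 2 * x + err)

# Fit a linear model for the location parameter with SN errors
fit <- selm(y ~ x, family = "SN", data = dat)

summary(fit)              # parameter estimates and standard errors
est <- coef(fit)          # estimated coefficients
ll  <- logLik(fit)        # maximized log-likelihood
res <- residuals(fit)     # residuals of the fitted model

# Bridge to the probability side: the fitted SEC distribution as an object
fitted_distr <- extractSECdistr(fit)
```

Replacing `family = "SN"` with `"ST"` or `"SC"` selects the other supported error distributions; a matrix response on the left-hand side of the formula would lead to an mselm-class object instead.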
Additional material
Additional material is available in the section
‘User guides, package vignettes and other documentation’
accessible from the front page of the documentation.
See especially the document pkg_sn-intro.pdf.
Author
Adelchi Azzalini. Please send comments, error reports et cetera to the author, whose web page is http://azzalini.stat.unipd.it/.
References
Azzalini, A. with the collaboration of Capitanio, A. (2014). The Skew-Normal and Related Families. Cambridge University Press, IMS Monographs series.