tclustreg {fsdaR} | R Documentation |
Computes robust linear grouping analysis
Description
Performs robust linear grouping analysis.
Usage
tclustreg(
y,
x,
k,
alphaLik,
alphaX,
restrfactor = 12,
intercept = TRUE,
plot = FALSE,
nsamp,
refsteps = 10,
reftol = 1e-13,
equalweights = FALSE,
mixt = 0,
wtrim = 0,
we,
msg = TRUE,
RandNumbForNini,
trace = FALSE,
...
)
Arguments
y |
Response variable. A vector with |
x |
An n x p data matrix (n observations and p variables). Rows of x represent observations, and columns represent variables. Missing values (NA's) and infinite values (Inf's) are allowed, since observations (rows) with missing or infinite values will automatically be excluded from the computations. |
k |
Number of groups. |
alphaLik |
Trimming level, a scalar between 0 and 0.5 or an
integer specifying the number of observations which have to be trimmed.
If |
alphaX |
Second-level trimming or constrained weighted model for |
restrfactor |
Restriction factor for regression residuals and
covariance matrices of the explanatory variables. Scalar or vector
with two elements. If |
intercept |
Whether to use a constant term (default is |
plot |
If |
nsamp |
If a scalar, it contains the number of subsamples which will be extracted.
If |
refsteps |
Number of refining iterations in each subsample. Default is |
reftol |
Tolerance of the refining steps. The default value is 1e-13 |
equalweights |
A logical specifying whether cluster weights in the concentration
and assignment steps shall be considered. If |
mixt |
Specifies whether the mixture modelling or the crisp assignment approach to
model-based clustering is used. Under mixture modelling, mixt also controls
the criterion used to find the untrimmed units in each step of the maximization.
If |
wtrim |
How to apply the weights on the observations - a flag taking values in c(0, 1, 2, 3, 4).
|
we |
Weights. A vector of size n-by-1 containing application-specific weights. Default is a vector of ones. |
msg |
Controls whether messages are displayed on the screen. If |
RandNumbForNini |
Pre-extracted random numbers used to initialize the proportions.
Matrix of size k-by-nrow(nsamp) containing the random numbers which
are used to initialize the proportions of the groups. This option is effective only if
|
trace |
Whether to print intermediate results. Default is |
... |
Potential further arguments passed to lower-level functions. |
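The two ways of specifying alphaLik described above (a proportion between 0 and 0.5, or an integer number of units to trim) can be illustrated with a minimal sketch. The helper n_trimmed is hypothetical and not part of fsdaR, and the use of floor() to turn a proportion into a count is an assumption; the rounding rule actually applied by tclustreg is not stated on this page.

```r
# Hypothetical helper (NOT part of fsdaR): number of units implied by a
# trimming level given either as a proportion or as an integer count.
n_trimmed <- function(alphaLik, n) {
  stopifnot(is.numeric(alphaLik), length(alphaLik) == 1, alphaLik >= 0)
  if (alphaLik <= 0.5) {
    # Interpreted as a proportion; floor() is an assumed rounding rule.
    floor(n * alphaLik)
  } else {
    # Interpreted as a count: must be a whole number smaller than n.
    stopifnot(alphaLik == round(alphaLik), alphaLik < n)
    alphaLik
  }
}

n <- 200
n_trimmed(0.05, n)  # trimming level given as a proportion of n
n_trimmed(10, n)    # trimming level given directly as a count
```

Under these assumptions both calls describe the same amount of trimming for n = 200.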
Value
An S3 object of class tclustreg.object
Author(s)
FSDA team, valentin.todorov@chello.at
References
Mayo-Iscar A. (2016). The joint role of trimming and constraints in robust estimation for mixtures of Gaussian factor analyzers, Computational Statistics and Data Analysis, Vol. 99, pp. 131-147.
Garcia-Escudero, L.A., Gordaliza, A., Greselin, F., Ingrassia, S. and Mayo-Iscar, A. (2017), Robust estimation of mixtures of regressions with random covariates, via trimming and constraints, Statistics and Computing, Vol. 27, pp. 377-402.
Garcia-Escudero, L.A., Gordaliza, A., Mayo-Iscar, A. and San Martin, R. (2010). Robust clusterwise linear regression through trimming, Computational Statistics and Data Analysis, Vol. 54, pp. 3057-3069.
Cerioli, A. and Perrotta, D. (2014). Robust Clustering Around Regression Lines with High Density Regions. Advances in Data Analysis and Classification, Vol. 8, pp. 5-26.
Torti, F., Perrotta, D., Riani, M. and Cerioli, A. (2019). Assessing Robust Methodologies for Clustering Linear Regression Data, Advances in Data Analysis and Classification, Vol. 13, pp. 227-257.
Examples
## Not run:
## The X data have been introduced by Gordaliza, Garcia-Escudero & Mayo-Iscar (2013).
## The dataset presents two parallel components without contamination.
data(X)
y1 <- X[, ncol(X)]
X1 <- X[, -ncol(X), drop=FALSE]
(out <- tclustreg(y1, X1, k=2, alphaLik=0.05, alphaX=0.01, restrfactor=5, plot=TRUE, trace=TRUE))
(out <- tclustreg(y1, X1, k=2, alphaLik=0.05, alphaX=0.01, restrfactor=2,
mixt=2, plot=TRUE, trace=TRUE))
## Examples with fishery data
data(fishery)
X <- fishery
## some jittering is necessary because duplicated units are not treated:
## this needs to be addressed
X <- X + 10^(-8) * abs(matrix(rnorm(nrow(X)*ncol(X)), ncol=ncol(X)))
y1 <- X[, ncol(X)]
X1 <- X[, -ncol(X), drop=FALSE]
(out <- tclustreg(y1, X1, k=3, restrfact=50, alphaLik=0.04, alphaX=0.01, trace=TRUE))
## Example 2:
## Define some arbitrary weights for the units
we <- sqrt(X1)/sum(sqrt(X1))
## tclustreg required parameters
k <- 2; restrfact <- 50; alpha1 <- 0.04; alpha2 <- 0.01
## Now tclustreg is run on each combination of mixt and wtrim options
cat("\nmixt=0; wtrim=0",
"\nStandard tclustreg, with classification likelihood and without thinning\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=0, wtrim=0, trace=TRUE))
cat("\nmixt=2; wtrim=0",
"\nMixture likelihood, no thinning\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=2, wtrim=0, trace=TRUE))
cat("\nmixt=0; wtrim=1",
"\nClassification likelihood, thinning based on user weights\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=0, we=we, wtrim=1, trace=TRUE))
cat("\nmixt=2; wtrim=1",
"\nMixture likelihood, thinning based on user weights\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=2, we=we, wtrim=1, trace=TRUE))
cat("\nmixt=0; wtrim=2",
"\nClassification likelihood, thinning based on retention probabilities\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=0, wtrim=2, trace=TRUE))
cat("\nmixt=2; wtrim=2",
"\nMixture likelihood, thinning based on retention probabilities\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=2, wtrim=2, trace=TRUE))
cat("\nmixt=0; wtrim=3",
"\nClassification likelihood, thinning based on Bernoulli weights\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=0, wtrim=3, trace=TRUE))
cat("\nmixt=2; wtrim=3",
"\nMixture likelihood, thinning based on Bernoulli weights\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=2, wtrim=3, trace=TRUE))
cat("\nmixt=0; wtrim=4",
"\nClassification likelihood, tandem thinning based on Bernoulli weights\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=0, wtrim=4, trace=TRUE))
cat("\nmixt=2; wtrim=4",
"\nMixture likelihood, tandem thinning based on Bernoulli weights\n")
(out <- tclustreg(y1, X1, k=k, restrfact=restrfact, alphaLik=alpha1, alphaX=alpha2,
mixt=2, wtrim=4, trace=TRUE))
## End(Not run)
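The fishery example above jitters the data because duplicated units are not treated. The following sketch, which uses only base R and a small synthetic matrix Z (an assumption made for illustration, not one of the package datasets), shows how one might count exact duplicate rows before and after a jitter of the same order of magnitude, and check that the perturbation stays negligible.

```r
set.seed(1)

# Synthetic two-column data containing exactly duplicated rows
# (illustrative only; not one of the fsdaR datasets).
Z <- cbind(x = c(1, 1, 2, 3, 3, 3),
           y = c(5, 5, 6, 7, 7, 8))

# Number of rows that repeat an earlier row exactly
n_dup_before <- sum(duplicated(Z))

# Small positive jitter, same order of magnitude as in the fishery example
Zj <- Z + 1e-8 * abs(matrix(rnorm(nrow(Z) * ncol(Z)), ncol = ncol(Z)))

# Duplicates remaining after jittering, and the largest change introduced
n_dup_after <- sum(duplicated(Zj))
max_shift <- max(abs(Zj - Z))

cat("duplicates before:", n_dup_before,
    " after:", n_dup_after,
    " max shift:", max_shift, "\n")
```

A check of this kind can help decide whether jittering is needed at all for a given dataset.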