regSGB {SGB} | R Documentation |
Regression for compositions following a SGB distribution
Description
Explanatory variables may influence the scale vector through a linear model applied to a log-ratio transform of the compositions. The shape parameters do not depend on explanatory variables. The overall shape parameter shape1 is common to all parts, whereas the elements of the Dirichlet shape parameter vector shape2 are specific to each part, i.e. shape2[j] is the Dirichlet parameter for u[i,j], i=1,...,n (n = number of compositions in the dataset u).
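As a rough illustration of what a log-ratio transform of compositions looks like, the base-R sketch below applies a transformation matrix to made-up compositions. It assumes the transform has the form log(u) %*% V; the toy data and the names u, V and z are illustrative only, and the matrix has the same pattern as the one used in the Examples section.

```r
# Toy compositions: 3 observations, 5 parts, each row sums to 1 (made-up numbers)
u <- matrix(c(0.10, 0.20, 0.30, 0.25, 0.15,
              0.05, 0.25, 0.40, 0.20, 0.10,
              0.20, 0.20, 0.20, 0.20, 0.20),
            ncol = 5, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C", "D", "E")))

# Log-ratio transformation matrix: each column is a log-contrast
V <- matrix(c( 1,  0,  0,  0,
              -1,  1,  0,  0,
               0, -1,  1,  0,
               0,  0, -1,  1,
               0,  0,  0, -1),
            ncol = 4, byrow = TRUE,
            dimnames = list(colnames(u), c("AB", "BC", "CD", "DE")))

# Every column of V sums to zero, so each transform is a ratio of parts
stopifnot(all(abs(colSums(V)) < 1e-12))

# Assumed form of the transform: one column of z per column of V
z <- log(u) %*% V
z[1, "AB"]   # log(u[1, "A"] / u[1, "B"])
```

With this V, column AB of z is the log-ratio of part A to part B, and a composition with all parts equal maps to a row of zeros.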
Usage
regSGB(d, ...)
## Default S3 method:
regSGB(d, u, V, weight=rep(1,dim(d)[1]),
       shape10 = 1, bound = 2.1, ind = NULL, shape1 = NULL, Mean2 = TRUE,
       control.optim = list(trace=0,fnscale=-1),
       control.outer = list(itmax=1000,ilack.max=200,trace=TRUE, kkt2.check =TRUE,
                            method = "BFGS"),...)
## S3 method for class 'formula'
regSGB(Formula, data= list(), weight=rep(1,dim(d)[1]),
       shape10 = 1, bound = 2.1, ind = NULL, shape1 = 1, Mean2=TRUE,
       control.optim = list(trace=0,fnscale=-1),
       control.outer = list(itmax=1000,ilack.max=200,trace=TRUE,kkt2.check =TRUE,
                            method = "BFGS"),...)
## S3 method for class 'regSGB'
print(x, ...)
## S3 method for class 'regSGB'
summary(object, digits=3,...)
Arguments
Formula |
formula of class Formula, see |
d |
data matrix of explanatory variables (without constant vector) |
u |
data matrix of compositions (dependent variables) |
V |
log-ratio transformation matrix |
data |
a list with 3 components |
weight |
vector of length |
shape10 |
positive number, initial value of the overall shape parameter, default 1. |
bound |
inequality constraints on the estimates of shapes: |
ind |
vector of length equal to the number of fixed parameters; see |
shape1 |
fixed value of the overall shape parameter if |
Mean2 |
logical, if TRUE (default), the computed |
control.optim |
list of control parameters for optim, see |
control.outer |
list of control parameters to be used by the outer loop in |
object |
an object of class "regSGB". |
digits |
number of decimal places for print, default 3. |
x |
an object of class "regSGB". |
... |
not used. |
Details
It is advisable to use the formula to specify the model for easy comparison between models. Without a formula, the d matrix of explanatory variables must contain exactly the variables used in the model, whereas with a formula other variables can be included as well. Variable transformations can be used within the formula; see Example 4 below, which uses I() and log().
Constraints on parameters can be introduced; see Example 5 and EqualityConstr for more details.
Use weight for pseudo-likelihood estimation. weight is scaled to n, the sample size.
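The rescaling of weights can be pictured with a few lines of base R. This is only a sketch under the assumption that "scaled to n" means the weights are multiplied by a constant so that they sum to the sample size; the weights and per-observation log-likelihood values are made up, and the names w, w_scaled and loglik_i are illustrative.

```r
# Hypothetical sampling weights for n = 5 compositions
w <- c(120, 80, 200, 50, 150)
n <- length(w)

# Rescale so that the weights sum to n (assumed meaning of "scaled to n")
w_scaled <- w * n / sum(w)
sum(w_scaled)

# A pseudo-log-likelihood is then a weighted sum of per-observation contributions
loglik_i <- c(-1.2, -0.7, -2.1, -0.9, -1.5)   # made-up values
pseudo_loglik <- sum(w_scaled * loglik_i)
pseudo_loglik
```

Rescaling in this way leaves the relative weights unchanged, so the location of the maximum is unaffected; only the scale of the objective changes.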
A design-based covariance matrix of the parameters can be obtained by linearization as the covariance matrix of the scores.
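The linearization idea can be sketched on a much simpler model than the SGB. The base-R example below uses a weighted linear regression as a stand-in (an assumption made purely for illustration, not the package's code): the parameter estimate solves the weighted score equations, and a robust covariance is formed from the per-observation scores in the usual sandwich form. All names (X, S, bread, meat, vcov_robust) and the simulated data are hypothetical.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# Hypothetical weights, rescaled to sum to n
w <- runif(n, 0.5, 2)
w <- w * n / sum(w)

X <- cbind("(Intercept)" = 1, x = x)

# Pseudo-likelihood estimate for a Gaussian working model with unit variance
XtWX <- crossprod(X, w * X)                 # X' W X
beta <- solve(XtWX, crossprod(X, w * y))    # solves the weighted score equations

# Weighted score contributions, one row per observation
res <- as.vector(y - X %*% beta)
S <- X * (w * res)

# Sandwich: inverse information around the sum of outer products of the scores
bread <- solve(XtWX)
meat <- crossprod(S)
vcov_robust <- bread %*% meat %*% bread
sqrt(diag(vcov_robust))                     # robust standard errors
```

The column sums of S are zero at the estimate, which is the defining property of the score equations; the spread of the rows of S is what drives the robust covariance.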
Value
A list of class 'regSGB' with the following components. The first 13 form the output from auglag.
par |
Vector of length |
value |
The value of the objective function at termination. |
counts |
A vector of length 2 denoting the number of times the objective and its gradient were evaluated, respectively. |
convergence |
An integer code indicating the type of convergence. 0 indicates successful convergence. Positive integer codes indicate failure to converge. |
message |
A character string giving any additional information on convergence returned by |
outer.iteration |
Number of outer iterations. |
lambda |
Values of the Lagrangian parameter. This is a vector of the same length as the total number of inequalities and equalities. It must be zero for inactive inequalities; non-negative for active inequalities; and can have any sign for equalities. |
sigma |
Value of augmented penalty parameter for the quadratic term. |
gradient |
Gradient of the augmented Lagrangian function at convergence. It should be small. |
hessian |
Hessian of the augmented Lagrangian function at convergence. It should be negative definite for maximization. |
ineq |
Values of inequality constraints at convergence. All of them must be non-negative. |
equal |
Values of equality constraints at convergence. All of them must be close to zero. |
kkt1 |
A logical variable indicating whether or not the first-order KKT conditions were satisfied (printed 1 if conditions satisfied and 0 otherwise). |
kkt2 |
A logical variable indicating whether or not the second-order KKT conditions were satisfied (printed 1 if conditions satisfied and 0 otherwise). |
scale |
|
meanA |
Aitchison expectation at estimated parameters. |
fitted.values |
|
residuals |
Observed minus estimated log-ratio transforms. |
scores |
matrix |
Rsquare |
ratio of total variation of |
vcov |
The robust covariance matrix of parameters estimates, see |
StdErr1 |
Ordinary asymptotic standard errors of parameters. |
StdErr |
Robust asymptotic standard errors of parameters. |
fixed.par |
Indices of the fixed parameters. |
summary |
The summary from |
AIC |
AIC criterion. |
V |
log-ratio transformation matrix (same as corresponding input parameter |
call |
Arguments for calling |
Formula |
Expression for formula. |
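The optimizer diagnostics described above can be checked together before interpreting a fit. The helper below is a hypothetical convenience function, not part of the package; it only reads the components convergence, kkt1, kkt2, gradient and ineq by name, and the gradient tolerance is an arbitrary assumption. It is demonstrated on a mock list with made-up values so that it runs without fitting a model.

```r
# Hypothetical helper: collect the optimizer diagnostics of a fitted object
check_fit <- function(fit, grad_tol = 1e-3) {
  c(converged  = fit[["convergence"]] == 0,               # 0 means success
    kkt1       = isTRUE(as.logical(fit[["kkt1"]])),       # first-order KKT
    kkt2       = isTRUE(as.logical(fit[["kkt2"]])),       # second-order KKT
    small_grad = max(abs(fit[["gradient"]])) < grad_tol,  # gradient near zero
    ineq_ok    = all(fit[["ineq"]] >= 0))                 # constraints satisfied
}

# Mock list standing in for a regSGB result (made-up values)
mock <- list(convergence = 0, kkt1 = TRUE, kkt2 = TRUE,
             gradient = c(1e-5, -2e-6), ineq = c(0.3, 1.2))
check_fit(mock)
```

On a real fit one would call check_fit(object1) or similar; any FALSE entry suggests looking at message and rerunning with different starting values or control settings.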
References
Graf, M. (2017). A distribution on the simplex of the Generalized Beta type. In J. A. Martin-Fernandez (Ed.), Proceedings CoDaWork 2017, University of Girona (Spain), 71-90.
Hijazi, R. H. and R. W. Jernigan (2009). Modelling compositional data using Dirichlet regression models. Journal of Applied Probability and Statistics, 4 (1), 77-91.
Kotz, S., N. Balakrishnan, and N. L. Johnson (2000). Continuous Multivariate Distributions, Volume 1, Models and Applications. John Wiley & Sons.
Madsen, K., H. Nielsen, and O. Tingleff (2004). Optimization With Constraints. Informatics and Mathematical Modelling, Technical University of Denmark.
Monti, G. S., G. Mateu-Figueras, and V. Pawlowsky-Glahn (2011). Notes on the scaled Dirichlet distribution. In V. Pawlowsky-Glahn and A. Buccianti (Eds.), Compositional data analysis. Theory and applications. Wiley.
Varadhan, R. (2015). alabama: Constrained Nonlinear Optimization. R package version 2015.3-1.
Wicker, N., J. Muller, R. K. R. Kalathur, and O. Poch (2008). A maximum likelihood approximation method for Dirichlet parameter estimation. Computational Statistics & Data Analysis 52 (3), 1315-1322.
Zeileis, A. and Y. Croissant (2010). Extended model formulas in R: Multiple parts and multiple responses. Journal of Statistical Software 34 (1), 1-13.
See Also
stepSGB, for an experimental stepwise descending regression; initpar.SGB, for the computation of initial parameters.
This function uses Formula and auglag.
Examples
## Regression for car segment shares
## ---------------------------------
library(SGB)   # load the package that provides regSGB() and the carseg data
data(carseg)
## Extract the compositions
uc <- as.matrix(carseg[,(1:5)])
## Extract the explanatory variables
attach(carseg)
## Example 1: without formula
## --------------------------
## Change some variables
dc <- data.frame(l.exp1=log(expend)*PAC, l.exp0=log(expend)*(1-PAC), l.sent=log(sent),
                 l.FBCF=log(FBCF), l.price=log(price), rates)
## Define the log-ratio transformation matrix
Vc <- matrix(c( 1, 0, 0, 0,
               -1, 1, 0, 0,
                0,-1, 1, 0,
                0, 0,-1, 1,
                0, 0, 0,-1), ncol=4, byrow=TRUE)
colnames(Vc) <- c("AB","BC","CD","DE")
rownames(Vc) <- colnames(uc)
Vc
# 2 next rows only necessary when calling regSGB without a formula.
dc1 <- cbind("(Intercept)"= 1 , dc)
dc1 <- as.matrix(dc1)
object10 <- regSGB(dc1,uc, Vc,shape10=4.4)
summary(object10)
## Example 2: same with formula
## ----------------------------
## Define the formula
Form <- Formula(AB | BC | CD | DE ~ l.exp1 + l.exp0 + l.sent + l.FBCF + l.price + rates)
## Regression with formula
object1 <- regSGB(Form, data= list(dc, uc, Vc),shape10=4.4)
summary(object1)
## Example 3: Usage of I()
## -----------------------
Form2 <- Formula(AB | BC | CD | DE ~ I(l.exp1 + l.exp0) + l.exp1 + l.sent +
                 l.FBCF + l.price + rates)
object2 <- regSGB(Form2,data= list(dc, uc, Vc),shape10=4.4)
object2
## Example 4: Usage of variable transformations on the original file
## -----------------------------------------------------------------
Form3 <- Formula(AB | BC | CD | DE ~ log(expend) + I(PAC*log(expend)) + log(sent) + log(FBCF) +
                 log(price) + rates)
object3 <- regSGB(Form3, data=list(carseg, uc, Vc),shape10=4.4)
object3
object2[["par"]]-object3[["par"]] # same results
## Example 5: Fixing parameter values
## ----------------------------------
## 1. In the following regression we condition on shape1 = 2.36.
object4 <- regSGB(Form3, data=list(carseg, uc, Vc),
                  shape10 = 4.4, bound = 2.0, ind = 1, shape1 = 2.36)
summary(object4)
## 2. In the following regression we condition on shape1 = 2.36 and the coefficient of
## log(FBCF).BC = 0. Notice that it is the 19th parameter.
object5 <- regSGB(Form3, data=list(carseg, uc, Vc),
                  shape10 = 4.4, bound = 2.0, ind = c(1,19), shape1 = 2.36)
summary(object5)
object3[["AIC"]]
object4[["AIC"]] # largest AIC
object5[["AIC"]]