gamlssML {gamlss} | R Documentation |
Maximum Likelihood estimation of a simple GAMLSS model
Description
The function gamlssML()
fits a gamlss.family
distribution to single data set using a non linear maximisation algorithm in R
.
This is relevant only when explanatory variables do not exist.
The function gamlssMLpred()
is similar to gamlssML()
but it saves the predictive global deviance for the newdata
. The new data in gamlssMLpred()
can be given with the arguments newdata
or defining the factor rand
. rand
should be a binary factor rand
splitting the original data set into a training set (value 1) and a validation/test set (values 2), see
also gamlssVGD
Usage
gamlssML(formula, family = NO, weights = NULL, mu.start = NULL,
sigma.start = NULL, nu.start = NULL, tau.start = NULL,
mu.fix = FALSE, sigma.fix = FALSE, nu.fix = FALSE,
tau.fix = FALSE, data, start.from = NULL, ...)
gamlssMLpred(response = NULL, data = NULL, family = NO,
rand = NULL, newdata = NULL, ...)
Arguments
formula , response |
a vector of data requiring the fit of a |
family |
|
weights |
a vector of weights.
Here weights can be used to weight out observations (like in |
mu.start |
a scalar of initial values for the location parameter |
sigma.start |
a scalar of initial values for the scale parameter |
nu.start |
scalar of initial values for the parameter |
tau.start |
scalar of initial values for the parameter |
mu.fix |
whether the mu parameter should be kept fixed in the fitting processes e.g. |
sigma.fix |
whether the sigma parameter should be kept fixed in the fitting processes e.g. |
nu.fix |
whether the nu parameter should be kept fixed in the fitting processes e.g. |
tau.fix |
whether the tau parameter should be kept fixed in the fitting processes e.g. |
data |
a data frame containing the variable |
start.from |
a gamlss object to start from the fitting or vector of length as many parameters in the distribution |
rand |
For |
newdata |
The prediction data set (validation or test). |
... |
for extra arguments |
Details
The function gamlssML()
fits a gamlss.family
distribution to a single data set is using a non linear maximisation.
in fact it uses the internal function MLE()
which is a copy of the mle()
function of package stat4
.
The function gamlssML()
could be for large data faster than the equivalent gamlss()
function which is designed for regression type of models.
The function gamlssMLpred()
uses the function gamlssML()
to fit the model but then uses predict.gamlssML()
to predict for new data and saves the the prediction i) deviance increments, ii) global deviance iii) residuals.
Value
Returns a gamlssML
object which behaves like a gamlss
fitted objected
Author(s)
Mikis Stasinopoulos, Bob Rigby, Vlasis Voudouris and Majid Djennad
References
Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape,(with discussion), Appl. Statist., 54, part 3, pp 507-554.
Rigby, R. A., Stasinopoulos, D. M., Heller, G. Z., and De Bastiani, F. (2019) Distributions for modeling location, scale, and shape: Using GAMLSS in R, Chapman and Hall/CRC. An older version can be found in https://www.gamlss.com/.
Stasinopoulos D. M. Rigby R.A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, https://www.jstatsoft.org/v23/i07/.
Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.
(see also https://www.gamlss.com/).
See Also
Examples
#-------- negative binomial 1000 observations
y<- rNBI(1000)
system.time(m1<-gamlss(y~1, family=NBI))
system.time(m1a<-gamlss(y~1, family=NBI, trace=FALSE))
system.time(m11<-gamlssML(y, family=NBI))
AIC(m1,m1a,m11, k=0)
# neg. binomial n=10000
y<- rNBI(10000)
system.time(m1<-gamlss(y~1, family=NBI))
system.time(m1a<-gamlss(y~1, family=NBI, trace=FALSE))
system.time(m11<-gamlssML(y, family=NBI))
AIC(m1,m1a,m11, k=0)
# binomial type data
data(aep)
m1 <- gamlssML(aep$y, family=BB) # ok
m2 <- gamlssML(y, data=aep, family=BB) # ok
m3 <- gamlssML(y~1, data=aep, family=BB) # ok
m4 <- gamlssML(aep$y~1, family=BB) # ok
AIC(m1,m2,m3,m4)
## Not run:
#-----------------------------------------------------------
# neg. binomial n=10000
y<- rNBI(10000)
rand <- sample(2, length(y), replace=TRUE, prob=c(0.6,0.4))
table(rand)
Y <- subset(y, rand==1)
YVal <- subset(y, rand==2)
length(Y)
length(YVal)
da1 <- data.frame(y=y)
dim(da1)
da2 <- data.frame(y=Y)
dim(da2)
danew <- data.frame(y=YVal)
# using gamlssVGD to fit the models
g1 <- gamlssVGD(y~1, rand=rand, family=NBI, data=da1)
g2 <- gamlssVGD(y~1, family=NBI, data=da2, newdata=dan)
AIC(g1,g2)
VGD(g1,g2)
# using gamlssMLpred to fit the models
p1 <- gamlssMLpred(y, rand=rand, family=NBI)
p2 <- gamlssMLpred(Y, family=NBI, newdata=YVal)
# AIC and VGD should produce identical results
AIC(p1,p2,g1,g2)
VGD(p1,p2, g1,g2)
# the fitted residuals
wp(p1, ylim.all=1)
# the prediction residuals
wp(resid=p1$residVal, ylim.all=.5)
#-------------------------------------------------------------
# chossing between distributions
p2<-gamlssMLpred(y, rand=rand, family=PO)
p3<-gamlssMLpred(y, rand=rand, family=PIG)
p4<-gamlssMLpred(y, rand=rand, family=BNB)
AIC(p1, p2, p3, p4)
VGD(p1, p2, p3, p4)
#--------------------------------------------------
## End(Not run)