GPUglm {GPUmatrix}    R Documentation

Fitting Generalized Linear Models using GPUmatrix objects

Description

These functions mimic speedglm and speedglm.wfit from the 'speedglm' package, but operate on gpu.matrix-class objects. They also mirror the interface of glm and glm.fit from the 'stats' package, so that generalized linear models can be fitted on large data sets.

Usage

glm.fit.GPU(x, y, intercept = TRUE, weights = NULL, family =
                   gaussian(), start = NULL, etastart = NULL, mustart =
                   NULL, offset = NULL, acc = 1e-08, maxit = 25, k = 2,
                   sparse = NULL, trace = FALSE, dtype = "float64", device =
                   NULL, type = NULL, ...)

GPUglm(...)

Arguments

As mentioned in the description, these functions mimic speedglm, and so do almost all of their parameters. There are only three new parameters (dtype, device and type), which are explained below.

The common parameters with speedglm:

x

the same as speedglm: the design matrix of dimension n*p where n is the number of observations and p is the number of features. x can be either a 'matrix', 'Matrix' or 'gpu.matrix-class' object.

y

the same as speedglm: a vector of n observations. y can be either a 'matrix', 'Matrix' or 'gpu.matrix-class' object.

intercept

the same as speedglm: logical. Whether the first column of x should be considered the intercept (the default) or not. Note that setting this parameter to TRUE or FALSE does not change the design matrix used to fit the model.

weights

the same as speedglm: an optional vector of ‘prior weights’ to be used in the fitting process. Should be NULL (default) or a numeric vector.

family

the same as speedglm: a description of the error distribution and link function to be used in the model. For glm.fit.GPU this can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.)

start

the same as speedglm: starting values for the parameters in the linear predictor.

etastart

the same as speedglm: starting values for the linear predictor.

mustart

the same as speedglm: starting values for the vector of means.

offset

the same as speedglm: this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. One or more offset terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See model.offset.

acc

the same as speedglm: tolerance to be used for the estimation (by default equal to: 1e-08).

maxit

the same as speedglm: maximum number of iterations.

k

the same as speedglm: numeric, the penalty per parameter to be used; the default k = 2 is the classical AIC.

sparse

logical. Whether the matrix x should be treated as sparse. Not yet implemented.

trace

logical. Whether to print the progress of the iterations. Defaults to FALSE.

...

For GPUglm: arguments to be used to form the default control argument if it is not supplied directly.
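
The role of the penalty k described above can be checked with base R alone. The sketch below uses stats::glm rather than glm.fit.GPU, and assumes the usual definition of the criterion, -2 * logLik + k * df; the data are taken from the Poisson example in the Examples section.

```r
# Data from the Poisson example in ?glm.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit <- glm(counts ~ outcome + treatment, family = poisson())

ll <- logLik(fit)       # log-likelihood of the fitted model
p  <- attr(ll, "df")    # number of estimated parameters
n  <- nobs(fit)         # number of observations

# k = 2 gives the classical AIC; k = log(n) gives the BIC.
aic_k2   <- -2 * as.numeric(ll) + 2 * p
aic_logn <- -2 * as.numeric(ll) + log(n) * p

all.equal(aic_k2, AIC(fit))                   # same value as AIC()
all.equal(aic_logn, AIC(fit, k = log(n)))     # same value as BIC(fit)
```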

The glm.fit.GPU function internally initialises matrices of the 'GPUmatrix' class by calling the gpu.matrix function. The following parameters correspond to this function:

dtype

parameter of the function gpu.matrix: the data type. The user can indicate "float64", "float32" or "int" (for "int64"). By default it is set to "float64".

device

parameter of the function gpu.matrix: the device on which the data are loaded. If not indicated, device is set to "cuda" when it is available.

type

parameter of the function gpu.matrix: whether the gpu.matrix is "torch" (the default when type is NULL) or "tensorflow".

Details

The GPUglm function internally calls the glm function by selecting glm.fit.GPU as the method. The input parameters of the GPUglm function are equivalent to those of the glm function.

If the gpu.matrix-class object(s) are stored on the GPU, then the operations will be performed on the GPU. See gpu.matrix.
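
The dispatch described above relies on the method argument of stats::glm, which accepts any fitting function that takes the same arguments as glm.fit. The following minimal sketch shows that mechanism using only base R. The function my_fit is a hypothetical stand-in that simply forwards to glm.fit; it is not the package's glm.fit.GPU and performs no GPU computation.

```r
# Data from the Poisson example in ?glm.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

# Hypothetical fitter (illustration only, not part of GPUmatrix).
# glm() passes x, y, weights, start, family, control, etc. to it.
my_fit <- function(x, y, ...) {
  stats::glm.fit(x = x, y = y, ...)
}

fit_default <- glm(counts ~ outcome + treatment, family = poisson())
fit_custom  <- glm(counts ~ outcome + treatment, family = poisson(),
                   method = my_fit)

# Both routes give the same coefficients.
all.equal(coef(fit_default), coef(fit_custom))
```

Replacing my_fit with a fitter that works on gpu.matrix-class objects is, according to the description above, what GPUglm does with glm.fit.GPU.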

Value

Both GPUglm and glm.fit.GPU return an object of class "GPUglm". This object can be treated as a list and mimics the output of the function speedglm:

coefficients

the estimated coefficients.

logLik

the log likelihood of the fitted model.

iter

the number of iterations of IWLS used.

tol

the maximal value of tolerance reached.

family

the family object used.

link

the link function used.

df

the degrees of freedom of the model.

XTX

the product X'X (weighted, if applicable).

dispersion

the estimated dispersion parameter of the model.

ok

the set of column indices of the model matrix where the model has been fitted.

rank

the rank of the model matrix.

RSS

the estimated residual sum of squares of the fitted model.

method

TODO

aic

the estimated Akaike Information Criterion.

offset

the model offset.

sparse

a logical value which indicates if the model matrix is sparse.

deviance

the estimated deviance of the fitted model.

nulldf

the degrees of freedom of the null model.

nulldev

the estimated deviance of the null model.

ngoodobs

the number of non-zero weighted observations.

n

the number of observations.

intercept

a logical value which indicates if an intercept has been used.

convergence

a logical value which indicates if convergence was reached.

terms

the terms object used.

call

the matched call.

xlevels

(where relevant) a record of the levels of the factors used in fitting.
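
Several of the components listed above (iter, tol, XTX, deviance, convergence) are by-products of the IWLS loop. The sketch below is a textbook IWLS for a Poisson model with log link, written in base R. The function iwls_poisson is hypothetical and is not the code used by glm.fit.GPU; in particular, the stopping rule (relative change in deviance, as in stats::glm.fit) is an assumption, since the criterion used by the package is not stated here.

```r
# Textbook IWLS for a Poisson GLM with log link (illustration only).
iwls_poisson <- function(X, y, acc = 1e-8, maxit = 25) {
  mu  <- y + 0.1                 # starting means, as in poisson()$initialize
  eta <- log(mu)                 # starting linear predictor
  dev_old   <- Inf
  converged <- FALSE
  for (iter in seq_len(maxit)) {
    z    <- eta + (y - mu) / mu              # working response
    w    <- mu                               # working weights
    XTX  <- crossprod(X, w * X)              # weighted X'X
    beta <- solve(XTX, crossprod(X, w * z))  # weighted least squares step
    eta  <- drop(X %*% beta)
    mu   <- exp(eta)
    # Poisson deviance; the term y * log(y / mu) is taken as 0 when y == 0.
    dev  <- 2 * sum(ifelse(y > 0, y * log(y / mu), 0) - (y - mu))
    tol  <- abs(dev - dev_old) / (abs(dev) + 0.1)
    if (tol < acc) {
      converged <- TRUE
      break
    }
    dev_old <- dev
  }
  list(coefficients = drop(beta), iter = iter, tol = tol, XTX = XTX,
       deviance = dev, convergence = converged)
}

# Data from the Poisson example in ?glm.
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
X <- model.matrix(~ outcome + treatment)

fit_iwls <- iwls_poisson(X, counts)
fit_glm  <- glm(counts ~ outcome + treatment, family = poisson())

# The hand-written loop agrees with glm() up to numerical tolerance.
all.equal(unname(fit_iwls$coefficients), unname(coef(fit_glm)),
          tolerance = 1e-6)
```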

See Also

See also: speedglm and glm.

Also of interest may be the function LR_GradientConjugate_gpumatrix for logistic regression.

Examples



## Not run: 
require(MASS,quietly = TRUE)
require(stats,quietly = TRUE)

# linear model (example taken from 'glm'):

utils::data(anorexia, package = "MASS")
anorex_glm <- glm(Postwt ~ Prewt + Treat + offset(Prewt),
                  family = gaussian(), data = anorexia)
summary(anorex_glm)

#Using GPUglm:
anorex_GPUglm <- GPUglm(Postwt ~ Prewt + Treat + offset(Prewt),
                        family = gaussian, data = anorexia)
summary(anorex_GPUglm)

# linear model using glm.fit.GPU:
x <- model.matrix(~Treat+Prewt,data=anorexia)
y <- as.matrix(anorexia$Postwt)
s1_glm <- glm.fit(x=x,y=y)
s1_gpu <- glm.fit.GPU(x=x,y=y)

s1_glm$coefficients
s1_gpu$coefficients


# poisson (example taken from 'glm'):
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
summary(glm.D93)

gpu.glm.D93 <- GPUglm(counts ~ outcome + treatment, family = poisson())
summary(gpu.glm.D93)

#logistic:
data(menarche)
glm.out <- glm(cbind(Menarche, Total-Menarche) ~ Age, family=binomial(), data=menarche)
summary(glm.out)

glm.out_gpu <- GPUglm(cbind(Menarche, Total-Menarche) ~ Age, family=binomial(), data=menarche)
summary(glm.out_gpu)

# the same model can also be fitted with glm.fit.GPU,
# after expanding the grouped counts to one 0/1 row per subject:
new_menarche <- data.frame(Age=rep(menarche$Age,menarche$Total))
observations <- c()
for(i in 1:nrow(menarche)){
  observations <- c(observations,rep(c(0,1),c(menarche$Total[i]-menarche$Menarche[i],
                                              menarche$Menarche[i])))
}
new_menarche$observations <- observations
x <- model.matrix(~Age,data=new_menarche)
head(new_menarche)
glm.fit_gpu <- glm.fit.GPU(x=x,y=new_menarche$observations, family=binomial())
summary(glm.fit_gpu)

# the GPUmatrix package also includes the function 'LR_GradientConjugate_gpumatrix'
lr_gran_sol <- LR_GradientConjugate_gpumatrix(X = x,y = observations)

#check results
glm.out$coefficients
glm.out_gpu$coefficients
glm.fit_gpu$coefficients
lr_gran_sol

## End(Not run)



[Package GPUmatrix version 1.0.2 Index]