Logistic regression for large scale data {Rfast2}    R Documentation

Logistic regression for large scale data

Description

Logistic regression for large scale data, fitted in batches whose coefficients are then combined.

Usage

batch.logistic(y, x, k = 10)

Arguments

y

The dependent variable, a numerical vector with 0s and 1s.

x

A matrix with the continuous independent variables.

k

The number of batches to use (see details).

Details

The batch logistic regression splits the data into k distinct batches, fits a logistic regression to each batch, and in the end combines the coefficients in a meta-analytic fashion, using the fixed effects form. Using these combined coefficients, the deviance of the model is computed on all the data. The method is fairly accurate for large scale data with, say, millions or even tens of millions of observations.
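The procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the code Rfast2 runs: the name batch_logistic_sketch is hypothetical, the batches are assumed to be formed at random, and the pooling is assumed to be the usual inverse-variance (fixed effects) average applied to each coefficient separately.

```r
## Hypothetical sketch of batch logistic regression with fixed effects pooling.
## Assumptions: random batch assignment, per-coefficient inverse-variance weights.
batch_logistic_sketch <- function(y, x, k = 10) {
  n <- length(y)
  ## assign every observation to one of k roughly equal batches
  id <- sample(rep_len(seq_len(k), n))
  p <- ncol(x) + 1                      # intercept plus slopes
  est <- matrix(0, nrow = k, ncol = p)  # per-batch estimates
  w <- matrix(0, nrow = k, ncol = p)    # per-batch weights, 1 / se^2
  for (j in seq_len(k)) {
    idx <- which(id == j)
    yb <- y[idx]
    xb <- x[idx, , drop = FALSE]
    fit <- glm(yb ~ xb, family = binomial)
    co <- summary(fit)$coefficients
    est[j, ] <- co[, 1]
    w[j, ] <- 1 / co[, 2]^2
  }
  ## fixed effects pooling: weighted mean and its standard error
  be <- colSums(w * est) / colSums(w)
  se <- sqrt(1 / colSums(w))
  ## deviance on all data; for 0/1 responses it equals -2 * log-likelihood
  eta <- as.vector(cbind(1, x) %*% be)
  devi <- -2 * sum(y * eta - log1p(exp(eta)))
  list(res = cbind(estimate = be, se = se), devi = devi)
}

## compare with a single fit on all data (simulated, illustrative values)
set.seed(1)
n <- 20000
x <- matrix(rnorm(n * 3), ncol = 3)
y <- rbinom(n, 1, as.vector(plogis(0.5 + x %*% c(1, -1, 0.5))))
out <- batch_logistic_sketch(y, x, k = 5)
full <- glm(y ~ x, family = binomial)
cbind(out$res, full = coef(full))
c(batch = out$devi, full = deviance(full))
```

The deviance at the pooled coefficients cannot be smaller than the deviance of the full fit, since the full fit minimises it; the gap gives a rough sense of how much the batching costs.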

Value

A list including:

res

A two-column matrix with the regression coefficients and their associated standard errors.

devi

The deviance of the logistic regression.
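Since res holds estimates and standard errors, Wald z statistics and two-sided p-values can be derived from it directly. The sketch below uses a stand-in matrix taken from a base R glm fit, only so that it runs without Rfast2; with batch.logistic one would presumably use the res component of the returned list instead.

```r
## Stand-in for the res component: column 1 estimates, column 2 standard errors.
## The data are simulated and serve only to give the matrix the right shape.
set.seed(2)
x <- matrix(rnorm(500 * 2), ncol = 2)
y <- rbinom(500, 1, 0.5)
fit <- glm(y ~ x, family = binomial)
res <- summary(fit)$coefficients[, 1:2]

z <- res[, 1] / res[, 2]    # Wald statistic for each coefficient
pval <- 2 * pnorm(-abs(z))  # two-sided p-value from the standard normal
cbind(res, z = z, p.value = pval)
```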

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

See Also

binom.reg, sclr

Examples

y <- rbinom(1000, 1, 0.5)
x <- matrix( rnorm(1000 * 5), ncol = 5 )
## not a very good approximation since the data are not of large scale
batch.logistic(y, x, k = 2)

[Package Rfast2 version 0.1.5.2 Index]