Logistic regression for large scale data {Rfast2} | R Documentation |
Logistic regression for large scale data
Description
Logistic regression for large scale data, fitted in batches whose coefficients are then combined.
Usage
batch.logistic(y, x, k = 10)
Arguments
y |
The dependent variable, a numerical vector with 0s and 1s. |
x |
A matrix with the continuous independent variables. |
k |
The number of batches to use (see details). |
Details
The batch logistic regression splits the data into k distinct batches, fits a logistic regression to each batch and, in the end, combines the coefficients in a meta-analytic fashion, using the fixed effects form. Using these combined coefficients, the deviance of the model is computed on all the data. The approximation is pretty accurate for large scale data, with, say, millions or even tens of millions of observations.
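The procedure described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the Rfast2 source code: the function name batch_logistic_sketch is hypothetical, the observations are assumed to be assigned to batches at random, an intercept is assumed to be included, and the "fixed effects form" is taken to mean the usual inverse-variance weighting of the per-batch estimates.

```r
# Hypothetical helper, NOT the Rfast2 implementation: logistic regression per
# batch, combined with fixed-effects (inverse-variance) weighting.
batch_logistic_sketch <- function(y, x, k = 10) {
  n <- length(y)
  # Assumption: observations are assigned at random to k near-equal batches.
  batch <- sample(rep_len(seq_len(k), n))
  p <- ncol(x) + 1        # number of coefficients, intercept included
  num <- numeric(p)       # running sum of weight * estimate
  den <- numeric(p)       # running sum of weights
  for (j in seq_len(k)) {
    idx <- batch == j
    yb <- y[idx]
    xb <- x[idx, , drop = FALSE]
    fit <- glm(yb ~ xb, family = binomial())
    est <- summary(fit)$coefficients   # columns: Estimate, Std. Error, ...
    w <- 1 / est[, "Std. Error"]^2     # inverse-variance weights
    num <- num + w * est[, "Estimate"]
    den <- den + w
  }
  beta <- num / den       # combined coefficients
  se <- sqrt(1 / den)     # their standard errors under the fixed effects form
  # Deviance of the combined coefficients, computed on all the data.
  eta <- as.vector(cbind(1, x) %*% beta)
  devi <- -2 * sum(y * eta - log1p(exp(eta)))
  list(res = cbind(coefficient = beta, se = se), devi = devi)
}

# Simulated data (illustrative values only) and a comparison with a single fit.
set.seed(1)
n <- 20000
x <- matrix(rnorm(n * 3), ncol = 3)
y <- rbinom(n, 1, as.vector(plogis(0.5 + x %*% c(1, -1, 0.5))))
sketch <- batch_logistic_sketch(y, x, k = 4)
full <- glm(y ~ x, family = binomial())
cbind(batch = sketch$res[, 1], full = coef(full))
```

Because the full-data fit minimises the deviance, the deviance returned by the sketch can never be smaller than deviance(full); the gap between the two gives a rough indication of how much is lost by batching.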
Value
A list including:
res |
A two-column matrix with the regression coefficients and their associated standard errors. |
devi |
The deviance of the logistic regression. |
Author(s)
Michail Tsagris.
R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.
See Also
Examples
y <- rbinom(1000, 1, 0.5)
x <- matrix( rnorm(1000 * 5), ncol = 5 )
## not a very good approximation since the data are not of large scale
batch.logistic(y, x, k = 2)