dbglm {dbglm} | R Documentation |
Fast generalized linear model in a database
Description
Fast generalized linear model in a database
Usage
dbglm(formula, family = binomial(), tbl, sd = FALSE,
weights = .NotYetImplemented(), subset = .NotYetImplemented(), ...)
Arguments
... |
This argument is required for S3 method extension. |
formula |
A model formula. It can have interactions but cannot have any transformations except |
family |
Model family |
tbl |
An object inheriting from |
sd |
Experimental: compute the standard deviation of the score as well as the mean in the update and use it to improve the information matrix estimate |
weights |
Weights are not supported (not yet implemented) |
subset |
If you want to analyze a subset, use |
Details
For a dataset of size N the subsample is of size N^(5/9). Unless N is large the approximation won't be very good. Also, with small N it's quite likely that, eg, some factor levels will be missing in the subsample.
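To get a feel for how small the subsample is, the N^(5/9) rule can be evaluated for a few dataset sizes. This is only a sketch in base R: `subsample_size` is a hypothetical helper, not a function in this package, and rounding to the nearest integer is an assumption made for display.

```r
# Hypothetical helper (not part of dbglm): subsample size implied by the
# N^(5/9) rule in Details. Rounding to an integer is an assumption.
subsample_size <- function(n) round(n^(5/9))

n <- c(1e4, 1e6, 1e9)
sizes <- data.frame(
  N         = n,
  subsample = subsample_size(n),
  fraction  = subsample_size(n) / n
)
print(sizes)
```

The sampled fraction shrinks as N grows, so for small tables the subsample holds only a few hundred rows, which is why rare factor levels can easily be missed.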
Value
A list with elements
tildebeta |
coefficients from subsample |
hatbeta |
final estimate of the coefficients |
tildeV |
variance matrix from subsample |
hatV |
final estimate of the variance matrix |
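Given a coefficient vector and its variance matrix, Wald standard errors and z statistics follow in the usual way. The sketch below uses a mock list with the element names listed above. The numbers are made up for illustration and do not come from a real fit, and it assumes `hatbeta` can be treated as a vector whose order matches the rows of `hatV`.

```r
# Mock object using the element names from Value; values are invented
# for illustration only and are not output from dbglm().
fit <- list(
  hatbeta = c("(Intercept)" = -1.2, x = 0.5),
  hatV    = matrix(c(0.04, 0, 0, 0.01), nrow = 2,
                   dimnames = list(c("(Intercept)", "x"),
                                   c("(Intercept)", "x")))
)

beta <- drop(fit$hatbeta)      # coerce to a plain vector if it is a 1-column matrix
se   <- sqrt(diag(fit$hatV))   # standard errors from the variance matrix diagonal
z    <- beta / se              # Wald z statistics

print(cbind(estimate = beta, std.error = se, z = z))
```

The same arithmetic applies to `tildebeta` and `tildeV` if the subsample-only estimates are of interest.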
References
http://notstatschat.tumblr.com/post/171570186286/faster-generalised-linear-models-in-largeish-data