R: Reading Data File Larger than Memory for Fitting GLMs Using...

make.data {drglm}

R Documentation

Reading Data File Larger than Memory for Fitting GLMs Using `big.drglm` Function

Description

Reading Data File Larger than Memory for Fitting GLMs Using big.drglm Function

Usage

make.data(filename, chunksize, ...)

Arguments

`filename`	Path to the data set on disk.
`chunksize`	Size of the chunk or subset to be read from the large file for fitting GLMs.
`...`	Additional arguments to be passed to `read.csv`.

Value

A function that reads chunks of the data set.

Examples

# Create a toy dataset
set.seed(123)
# Number of rows to be generated
n <- 10000

# Creating dataset
dataset <- data.frame(
  Var_1 = round(rnorm(n, mean = 50, sd = 10)),
  Var_2 = round(rnorm(n, mean = 7.5, sd = 2.1)),
  Var_3 = as.factor(sample(c("0", "1"), n, replace = TRUE)),
  Var_4 = as.factor(sample(c("0", "1", "2"), n, replace = TRUE)),
  Var_5 = as.factor(sample(0:15, n, replace = TRUE)),
  Var_6 = round(rnorm(n, mean = 60, sd = 5))
)

# Save the dataset to a temporary file
temp_file <- tempfile(fileext = ".csv")
write.csv(dataset, file = temp_file, row.names = FALSE)

# Path to the temporary file
dataset_path <- temp_file
dataset_path  # Display the path to the temporary file

# Initialize the data reading function with the data set path and chunk size
da <- drglm::make.data(dataset_path, chunksize = 1000)

# Fitting MLR Models
nmodel <- drglm::big.drglm(da,
formula = Var_1 ~ Var_2 + factor(Var_3) + factor(Var_4) + factor(Var_5) + Var_6,
10, family = "gaussian")
# View the results table
print(nmodel)

# Fitting logistic Regression Model
bmodel <- drglm::big.drglm(da,
formula = factor(Var_3) ~ Var_1 + Var_2 + factor(Var_4) + factor(Var_5) + Var_6,
10, family = "binomial")
# View the results table
print(bmodel)

# Fitting Poisson Regression Model
pmodel <- drglm::big.drglm(da,
formula = Var_5 ~ Var_1 + Var_2 + factor(Var_3) + factor(Var_4) + Var_6,
10, family = "poisson")
# View the results table
print(pmodel)

[Package drglm version 1.1 Index]

Reading Data File Larger than Memory for Fitting GLMs Using big.drglm Function

Description

Usage

Arguments

Value

Examples

Reading Data File Larger than Memory for Fitting GLMs Using `big.drglm` Function