population {regressinator}    R Documentation

Define the population generalized regression relationship

Description

Specifies a hypothetical infinite population of cases. Each case has some predictor variables and one or more response variables. The relationship between the predictor variables and the response variables is defined, as well as the population marginal distribution of each predictor variable.
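To make the idea of a population concrete, here is a minimal base R sketch of the kind of data-generating process such an object describes: two predictors with their own marginal distributions and a response that is a linear function of them plus noise. It does not use regressinator and is not the package's implementation. The sample size `n`, the seed, and the reading of the error scale as the standard deviation of Gaussian noise are assumptions made for illustration.

```r
set.seed(42)  # fixed seed, only so this illustration is reproducible
n <- 1000     # illustrative sample size (an assumption, not a package default)

# Marginal distributions of the two predictors
x1 <- rnorm(n, mean = 4, sd = 10)
x2 <- runif(n, min = 0, max = 10)

# Response: linear mean function plus Gaussian noise.
# Assumption: an error scale of 1 is treated as the noise standard deviation.
y <- 0.7 + 2.2 * x1 - 0.2 * x2 + rnorm(n, sd = 1)

sim <- data.frame(x1 = x1, x2 = x2, y = y)

# Fitting the matching linear model should recover coefficients near
# the population values 0.7, 2.2, and -0.2
fit <- lm(y ~ x1 + x2, data = sim)
coef(fit)
```

A population object plays the role of the first half of this script: it records the distributions and the relationship, so that samples of any size can be drawn later.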

Usage

population(...)

Arguments

...

A sequence of named arguments defining predictor and response variables. These are evaluated in order, so later response variables may refer to earlier predictor and response variables. All predictors should be provided first, before any response variables.
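The ordering rule can be illustrated with a small base R sketch. The helper `eval_in_order()` below is hypothetical and is not part of regressinator; it only shows, under that assumption, how named expressions evaluated one after another can each see the values defined before them, which is why predictors need to come before the responses that use them.

```r
# Hypothetical helper (not a regressinator function): evaluate named,
# unevaluated expressions in order, so each one can use earlier results.
eval_in_order <- function(exprs) {
  env <- new.env()
  for (name in names(exprs)) {
    assign(name, eval(exprs[[name]], env), envir = env)
  }
  # Return the values in the order the expressions were given
  mget(names(exprs), envir = env)
}

set.seed(1)
vals <- eval_in_order(alist(
  x = rnorm(5),     # defined first
  y = 2 * x + 1     # may refer to x because x already exists
))
vals$y
```

Swapping the two entries would fail with an "object 'x' not found" error, because `y` would be evaluated before `x` exists.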

Value

A population object.

See Also

predictor() and response() to define the population; sample_x() and sample_y() to draw samples from it
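A short sketch of how these functions might fit together follows. It assumes that sample_x() takes the population and a sample size `n`, that sample_y() takes the result of sample_x(), and that the result behaves like a data frame; check the help pages for those functions before relying on this. The variable names `pop` and `samp` are illustrative, and the code is skipped if regressinator is not installed.

```r
samp <- NULL

if (requireNamespace("regressinator", quietly = TRUE)) {
  library(regressinator)

  # Define the population: one predictor and one response
  pop <- population(
    x = predictor("rnorm", mean = 0, sd = 1),
    y = response(1 + 2 * x, error_scale = 0.5)
  )

  # Assumed usage: draw predictors first, then responses given the predictors
  samp <- sample_y(sample_x(pop, n = 100))
  print(head(samp))
}
```

Drawing predictors and responses in two steps mirrors the definition of the population, where the predictor distributions and the response relationship are specified separately.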

Examples

# A population with a simple linear relationship
linear_pop <- population(
  x1 = predictor("rnorm", mean = 4, sd = 10),
  x2 = predictor("runif", min = 0, max = 10),
  y = response(0.7 + 2.2 * x1 - 0.2 * x2, error_scale = 1.0)
)

# A population whose response depends on local variables
slope <- 2.2
intercept <- 0.7
sigma <- 2.5
variable_pop <- population(
  x = predictor("rnorm"),
  y = response(intercept + slope * x, error_scale = sigma)
)

# Response error scale is heteroskedastic and depends on predictors
heteroskedastic_pop <- population(
  x1 = predictor("rnorm", mean = 4, sd = 10),
  x2 = predictor("runif", min = 0, max = 10),
  y = response(0.7 + 2.2 * x1 - 0.2 * x2,
               error_scale = 1 + x2 / 10)
)

# A binary outcome Y, using a binomial family with logistic link
binary_pop <- population(
  x1 = predictor("rnorm", mean = 4, sd = 10),
  x2 = predictor("runif", min = 0, max = 10),
  y = response(0.7 + 2.2 * x1 - 0.2 * x2,
               family = binomial(link = "logit"))
)

# A binomial outcome Y, with 10 trials per observation, using a logistic link
# to determine the probability of success for each trial
binomial_pop <- population(
  x1 = predictor("rnorm", mean = 4, sd = 10),
  x2 = predictor("runif", min = 0, max = 10),
  y = response(0.7 + 2.2 * x1 - 0.2 * x2,
               family = binomial(link = "logit"),
               size = 10)
)

# Another binomial outcome, but the number of trials depends on another
# predictor
binom_size_pop <- population(
  x1 = predictor("rnorm", mean = 4, sd = 10),
  x2 = predictor("runif", min = 0, max = 10),
  trials = predictor("rpois", lambda = 20),
  y = response(0.7 + 2.2 * x1 - 0.2 * x2,
               family = binomial(link = "logit"),
               size = trials)
)

# A population with a simple linear relationship and collinearity. Because X
# is bivariate, there will be two predictors, named x1 and x2.
library(mvtnorm)
collinear_pop <- population(
  x = predictor("rmvnorm", mean = c(0, 1),
                sigma = matrix(c(1, 0.8, 0.8, 1), nrow = 2)),
  y = response(0.7 + 2.2 * x1 - 0.2 * x2, error_scale = 1.0)
)

[Package regressinator version 0.1.3]