frrm {fairml}    R Documentation

Fair Ridge Regression Model

Description

A regression model enforcing fairness with a ridge penalty.

Usage

# a fair ridge regression model.
frrm(response, predictors, sensitive, unfairness,
  definition = "sp-komiyama", lambda = 0, save.auxiliary = FALSE)
# a fair generalized ridge regression model.
fgrrm(response, predictors, sensitive, unfairness,
  definition = "sp-komiyama", family = "binomial", lambda = 0,
  save.auxiliary = FALSE)

Arguments

response

a numeric vector, the response variable.

predictors

a numeric matrix or a data frame containing numeric and factor columns; the predictors.

sensitive

a numeric matrix or a data frame containing numeric and factor columns; the sensitive attributes.

unfairness

a number in [0, 1], how unfair the model is allowed to be. A value of 0 means the model is completely fair, while a value of 1 means the model is not constrained to be fair at all.

definition

a character string, the label of the definition of fairness used in fitting the model. Currently either "sp-komiyama", "eo-komiyama" or "if-berk". It may also be a function: see below for details.

family

a character string, either "gaussian" to fit a linear regression, "binomial" to fit a logistic regression, "poisson" to fit a log-linear regression, "cox" to fit a Cox proportional hazards regression, or "multinomial" to fit a multinomial logistic regression.

lambda

a non-negative number, a ridge-regression penalty coefficient. It defaults to zero.

save.auxiliary

a logical value, whether to save the fitted values and the residuals of the auxiliary model that constructs the decorrelated predictors. The default value is FALSE.

Details

frrm() and fgrrm() can accommodate different definitions of fairness, which can be selected via the definition argument. The labels for the built-in definitions are "sp-komiyama", "eo-komiyama" and "if-berk".

Users may also pass a function via the definition argument to plug in custom fairness definitions. This function should have the signature function(model, y, S, U, family) and return an array with an element called "value" (optionally along with others). When the function is called, its arguments contain the model fitted for the current level of fairness (model), the sanitized response variable (y), the design matrix for the sanitized sensitive attributes (S), the design matrix for the sanitized decorrelated predictors (U) and the character string identifying the family the model belongs to (family).
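As an illustration of the expected shape of such a function, here is a minimal sketch of a hypothetical custom definition. The name fitted_share is invented for this example. It makes two assumptions that are not confirmed by this page: that fitted() can be applied to the model object, and that "array" can be read as a named numeric vector. The function is exercised on a plain lm() fit as a stand-in, not on a model produced by frrm().

```r
# Hypothetical custom fairness definition (not part of fairml): the share of
# the variance of the fitted values that the sensitive attributes explain.
fitted_share <- function(model, y, S, U, family) {
  # ASSUMPTION: fitted() works on the model object passed in.
  fv <- as.numeric(fitted(model))
  # R^2 of a linear regression of the fitted values on S, a number in [0, 1].
  r2 <- summary(lm(fv ~ S))$r.squared
  # ASSUMPTION: "array" is read here as a named numeric vector.
  c(value = r2)
}

# Stand-in data and model, used only to show the function in isolation.
set.seed(7)
S <- cbind(s = rnorm(200))
U <- cbind(u = rnorm(200))
y <- as.numeric(0.5 * S + U + rnorm(200))
stand_in <- lm(y ~ S + U)

out <- fitted_share(stand_in, y, S, U, "gaussian")
print(out)
```

If the assumptions hold, such a function could be passed as definition = fitted_share; whether the returned value is compared against unfairness on the same [0, 1] scale should be checked against the package itself.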

The algorithm works like this:

  1. regresses the predictors against the sensitive attributes;

  2. constructs a new set of predictors that are decorrelated from the sensitive attributes using the residuals of this regression;

  3. regresses the response against the decorrelated predictors and the sensitive attributes; while

  4. using a ridge penalty to control the proportion of variance the sensitive attributes can explain with respect to the overall explained variance of the model.
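The steps above can be sketched in base R for the Gaussian case. This is an illustrative reconstruction under stated assumptions, not the package's implementation: the data are simulated, the helpers fit_ridge and variance_share are hypothetical names, the ridge penalty is assumed to act only on the coefficients of the sensitive attributes, and the search for the penalty with uniroot() is a choice made for this example. Standardization and rescaling are omitted.

```r
set.seed(42)
n <- 500
S <- matrix(rnorm(n), ncol = 1)                      # one sensitive attribute
X <- cbind(0.8 * S[, 1] + rnorm(n), rnorm(n))        # first predictor depends on S
y <- as.numeric(1.5 * S[, 1] + X %*% c(1, -0.5) + rnorm(n))

# Steps 1 and 2: regress the predictors on the sensitive attributes and keep
# the residuals as the decorrelated predictors.
U <- residuals(lm(X ~ S))

# Center S and y so that no intercept is needed below.
S <- scale(S, scale = FALSE)
y <- y - mean(y)

# Step 3: regress y on S and U, with a ridge penalty on the S block only
# (ASSUMPTION: the U coefficients are left unpenalized).
fit_ridge <- function(lambda) {
  Z <- cbind(S, U)
  penalty <- diag(c(rep(lambda, ncol(S)), rep(0, ncol(U))), nrow = ncol(Z))
  coefs <- solve(crossprod(Z) + penalty, crossprod(Z, y))
  list(alpha = coefs[seq_len(ncol(S)), , drop = FALSE],
       beta = coefs[-seq_len(ncol(S)), , drop = FALSE])
}

# Step 4: variance explained by S relative to the overall explained variance.
variance_share <- function(fit) {
  from_S <- as.numeric(S %*% fit$alpha)
  from_U <- as.numeric(U %*% fit$beta)
  var(from_S) / var(from_S + from_U)
}

# Increase lambda only as far as needed to meet the bound.
unfairness <- 0.05
if (variance_share(fit_ridge(0)) <= unfairness) {
  lambda <- 0
} else {
  lambda <- uniroot(function(l) variance_share(fit_ridge(l)) - unfairness,
                    interval = c(0, 1e6), tol = 1e-8)$root
}
fair_fit <- fit_ridge(lambda)
print(c(lambda = lambda, share = variance_share(fair_fit)))
```

Because U is orthogonal to S by construction, the penalty shrinks only the contribution of the sensitive attributes and leaves the coefficients of the decorrelated predictors essentially unchanged.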

Both sensitive and predictors are standardized internally before estimating the regression coefficients, which are then rescaled back to match the original scales of the variables.
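The effect of standardizing and then rescaling can be illustrated with ordinary least squares in base R. This is a sketch on simulated data with hypothetical variable names, not the package's internal code: it fits a model on standardized predictors and maps the coefficients back to the original scales, where they match a direct fit.

```r
set.seed(1)
X <- cbind(a = rnorm(100, mean = 10, sd = 5), b = rnorm(100, mean = -3, sd = 0.2))
y <- as.numeric(2 + X %*% c(0.5, 4) + rnorm(100))

# Fit on standardized predictors (mean 0, standard deviation 1).
Xs <- scale(X)
fit_std <- lm(y ~ Xs)

# Map the coefficients back to the original scales of the variables.
centers <- attr(Xs, "scaled:center")
scales <- attr(Xs, "scaled:scale")
slopes <- coef(fit_std)[-1] / scales
intercept <- coef(fit_std)[1] - sum(slopes * centers)
rescaled <- c(intercept, slopes)

# Reference fit on the original, unstandardized predictors.
direct <- coef(lm(y ~ X))
print(all.equal(unname(rescaled), unname(direct)))
```

With a penalty in place, standardization matters because it puts all penalized coefficients on a comparable scale before shrinkage is applied.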

fgrrm() is the extension of frrm() to generalized linear models; the supported models are those listed under the family argument. fgrrm() with family = "gaussian" is equivalent to frrm(). The definitions of fairness are identical between frrm() and fgrrm().

Value

frrm() returns an object of class c("frrm", "fair.model"). fgrrm() returns an object of class c("fgrrm", "fair.model").
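A minimal usage sketch follows, based only on the signature shown under Usage. The simulated data and variable names are assumptions made for illustration, and the call is guarded so that the snippet also runs where fairml is not installed.

```r
set.seed(123)
n <- 200

# Simulated data: one sensitive attribute, two predictors, a numeric response.
sensitive <- data.frame(s1 = rnorm(n))
predictors <- data.frame(x1 = 0.7 * sensitive$s1 + rnorm(n), x2 = rnorm(n))
response <- 1.2 * sensitive$s1 + predictors$x1 - 0.5 * predictors$x2 + rnorm(n)

if (requireNamespace("fairml", quietly = TRUE)) {
  # Allow the sensitive attribute at most a small share of the explained variance.
  m <- fairml::frrm(response = response, predictors = predictors,
                    sensitive = sensitive, unfairness = 0.05)
  # The returned object should carry both classes documented above.
  print(class(m))
}
```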

Author(s)

Marco Scutari

References

Scutari M, Panero F, Proissl M (2022). "Achieving Fairness with a Simple Ridge Penalty". Statistics and Computing, 32, 77.
https://link.springer.com/content/pdf/10.1007/s11222-022-10143-w.pdf

See Also

nclm, zlm, zlrm


[Package fairml version 0.8 Index]