shapr {shapr}    R Documentation

Create an explainer object with Shapley weights for test data.

Description

Create an explainer object with Shapley weights for test data.

Usage

shapr(x, model, n_combinations = NULL)

Arguments

x

Numeric matrix or data.frame/data.table. Contains the data used to estimate the (conditional) distributions for the features needed to properly estimate the conditional expectations in the Shapley formula.

model

The model whose predictions we want to explain. Run shapr:::get_supported_models() for a table of which models shapr supports natively.

n_combinations

Integer. The number of feature combinations to sample. If NULL, the exact method is used and all combinations are considered. The maximum number of combinations equals 2^ncol(x).
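The number of combinations grows exponentially with the number of columns in x, which is why sampling with n_combinations becomes relevant for wider data sets. The base R sketch below tabulates 2^m for a few feature counts. The helper n_exact_combinations is a hypothetical name used only for illustration; it is not part of shapr.

```r
# Hypothetical helper (not part of shapr): the number of feature
# combinations the exact method would enumerate for m features.
n_exact_combinations <- function(m) 2^m

m_values <- c(4, 10, 13)
counts <- vapply(m_values, n_exact_combinations, numeric(1))

# One row per feature count
print(data.frame(n_features = m_values, n_combinations = counts))
```

With 13 features the exact method already involves 8192 combinations, so a value such as n_combinations = 1e3 samples only a fraction of them.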

Value

Named list that contains the following items:

exact

Boolean. Equals TRUE if n_combinations = NULL or n_combinations >= 2^ncol(x), otherwise FALSE.

n_features

Positive integer. The number of columns in x.

S

Binary matrix. The number of rows equals the number of unique combinations, and the number of columns equals the total number of features. For example, with three features there are 2^3 = 8 unique combinations. If the element in row i and column j equals 1, the j-th feature is present in the i-th combination. Otherwise it equals 0.

W

Matrix. The Shapley weights for the feature combinations represented in S.

X

data.table. The object returned by feature_combinations.

x_train

data.table. The input x converted to a data.table.

feature_list

List. The updated_feature_list output from preprocess_data.

In addition to the items above, model and n_combinations are also present in the returned object.
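To make the layout of S concrete, the base R sketch below enumerates all combinations for three features as a binary matrix. It is an illustration of the description above, not shapr's internal code; in particular, the row ordering (by number of features present) is an assumption and may differ from what shapr returns.

```r
# Sketch only: enumerate all 2^m feature subsets as a binary matrix.
m <- 3

# One row per combination; each column takes the values 0 and 1
grid <- expand.grid(rep(list(0:1), m))
S <- as.matrix(grid)
dimnames(S) <- NULL

# Assumed ordering: sort rows by the number of features present
S <- S[order(rowSums(S)), , drop = FALSE]

print(dim(S))
# 8 3
print(S)
```

Row i lists which features are present in the i-th combination, so the first row (all zeros) is the empty combination and the last row (all ones) contains every feature.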

Author(s)

Nikolai Sellereite

Examples

if (requireNamespace("MASS", quietly = TRUE)) {
  # Load example data
  data("Boston", package = "MASS")
  df <- Boston

  # Example using the exact method
  x_var <- c("lstat", "rm", "dis", "indus")
  y_var <- "medv"
  df1 <- df[, x_var]
  model <- lm(medv ~ lstat + rm + dis + indus, data = df)
  explainer <- shapr(df1, model)

  print(nrow(explainer$X))
  # 16 (which equals 2^4)

  # Example using approximation
  y_var <- "medv"
  x_var <- setdiff(colnames(df), y_var)
  model <- lm(medv ~ ., data = df)
  df2 <- df[, x_var]
  explainer <- shapr(df2, model, n_combinations = 1e3)

  print(nrow(explainer$X))

  # Example using approximation where n_combinations > 2^m
  x_var <- c("lstat", "rm", "dis", "indus")
  y_var <- "medv"
  df3 <- df[, x_var]
  model <- lm(medv ~ lstat + rm + dis + indus, data = df)
  explainer <- shapr(df3, model, n_combinations = 1e3)

  print(nrow(explainer$X))
  # 16 (which equals 2^4)
}

[Package shapr version 0.2.2 Index]