pps {dlookr}    R Documentation

Compute Predictive Power Score

Description

pps() computes the Predictive Power Score (PPS) for exploratory data analysis.

Usage

pps(.data, ...)

## S3 method for class 'data.frame'
pps(.data, ..., cv_folds = 5, do_parallel = FALSE, n_cores = -1)

## S3 method for class 'target_df'
pps(.data, ..., cv_folds = 5, do_parallel = FALSE, n_cores = -1)

Arguments

.data

a target_df or data.frame.

...

one or more unquoted expressions separated by commas. You can treat variable names as if they were positions: positive values select variables and negative values drop them. If the first expression is negative, pps() will automatically start with all variables. These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing.

cv_folds

integer. number of cross-validation folds.

do_parallel

logical. whether to perform score calls in parallel.

n_cores

integer. number of cores to use. The default, -1, uses the maximum number of cores minus 1.

Details

The PPS is an asymmetric, data-type-agnostic score that can detect linear or non-linear relationships between two variables. The score ranges from 0 (no predictive power) to 1 (perfect predictive power).

Value

An object of class pps. The attributes of the pps class are as follows.

Information of Predictive Power Score

The information of PPS is as follows.

See Also

print.relate, plot.relate.

Examples

library(dplyr)

# If you want to use this feature, you need to install the 'ppsr' package.
if (!requireNamespace("ppsr", quietly = TRUE)) {
  cat("If you want to use this feature, you need to install the 'ppsr' package.\n")
}

# pps type is generic =======================================
pps_generic <- pps(iris)
pps_generic
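
# A sketch of the tuning arguments listed under Usage (not part of the
# original examples): more cross-validation folds, then parallel scoring.
# The values 10 and 2 are illustrative; parallel scoring is assumed to
# need a machine with at least two cores.
pps_cv <- pps(iris, cv_folds = 10)
pps_cv

pps_par <- pps(iris, do_parallel = TRUE, n_cores = 2)
pps_par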

# pps type is target_by =====================================
##-----------------------------------------------------------
# If the target variable is a categorical variable
categ <- target_by(iris, Species)

# compute all variables
pps_cat <- pps(categ)
pps_cat

# compute Petal.Length and Petal.Width variable
pps_cat <- pps(categ, Petal.Length, Petal.Width)
pps_cat
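
# A sketch of negative selection, as described for the '...' argument:
# drop Sepal.Length and score the remaining variables against the target.
# This usage is assumed from the argument description, not shown in the
# original examples.
pps_cat <- pps(categ, -Sepal.Length)
pps_cat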

# Using dplyr
pps_cat <- iris %>% 
  target_by(Species) %>% 
  pps()

pps_cat

##-----------------------------------------------------------
# If the target variable is a numerical variable
num <- target_by(iris, Petal.Length)

pps_num <- pps(num)
pps_num
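
##-----------------------------------------------------------
# A sketch of the asymmetry noted in Details: the score for x -> y need
# not equal the score for y -> x. This calls the 'ppsr' package directly
# and assumes ppsr::score() returns a list with a 'pps' element.
if (requireNamespace("ppsr", quietly = TRUE)) {
  # How well does Petal.Width predict Species?
  print(ppsr::score(iris, x = "Petal.Width", y = "Species")$pps)

  # How well does Species predict Petal.Width?
  print(ppsr::score(iris, x = "Species", y = "Petal.Width")$pps)
}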


[Package dlookr version 0.6.3 Index]