findviews_to_predict {findviews} | R Documentation |
Views of a multidimensional dataset, ranked by their predictive power.
Description
findviews_to_predict detects groups of mutually dependent columns,
ranks them by predictive power, and plots them with Shiny and ggplot.
Usage
findviews_to_predict(target, data, view_size_max = NULL,
clust_method = "complete", ...)
Arguments
target | Name of the variable to be predicted.
data | Data frame or matrix to be processed.
view_size_max | Maximum number of columns in the views. If set to
clust_method | Character describing a clustering method, used internally by
... | Optional Shiny parameters, used in Shiny's
Details
The function findviews_to_predict takes a data set and a target
variable as input. It detects clusters of statistically dependent columns in
the data set (i.e., views) and ranks those groups according to how well
they predict the target variable.
To detect the views, findviews_to_predict relies on findviews.
To evaluate their predictive power, it uses the mutual information
between the joint distribution of the columns and that of the target
variable. Internally, findviews_to_predict discretizes all the
continuous variables with equi-width binning.
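The scoring idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the helper names discretize and mutual_information, the choice of 5 bins, and the use of bits (log base 2) are all hypothetical.

```r
# Hypothetical helper: equi-width discretization. cut() with an integer
# `breaks` splits the range of x into that many intervals of equal width.
# Non-numeric columns are simply treated as factors.
discretize <- function(x, n_bins = 5) {
  if (is.numeric(x)) cut(x, breaks = n_bins) else factor(x)
}

# Hypothetical helper: mutual information (in bits) between the joint
# distribution of the columns in `view` and the distribution of `target`.
mutual_information <- function(view, target, n_bins = 5) {
  # Discretize every column, then merge them into one joint variable.
  view_binned <- lapply(view, discretize, n_bins = n_bins)
  joint_view <- interaction(view_binned, drop = TRUE)
  target_binned <- discretize(target, n_bins)

  # Empirical joint and marginal probabilities.
  p_xy <- table(joint_view, target_binned) / length(target_binned)
  p_x <- rowSums(p_xy)
  p_y <- colSums(p_xy)

  # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over non-empty cells.
  nonzero <- p_xy > 0
  sum(p_xy[nonzero] * log2(p_xy[nonzero] / outer(p_x, p_y)[nonzero]))
}

# Score the view {wt, hp} against the target mpg.
mi_wt_hp <- mutual_information(mtcars[c("wt", "hp")], mtcars$mpg)
```

A higher score means the view carries more information about the target. Note that with many columns and few rows the joint distribution becomes sparse, so estimates of this kind tend to be optimistic.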
Note: findviews_to_predict removes the column to be predicted (the
target column) from the dataset before it creates the column groups. Hence,
the views it returns may be different from those returned by calling
findviews directly on the dataset.
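As a rough illustration of the note, the snippet below separates the target column from the remaining columns in base R. It is a sketch of the idea only; the variable names target and predictors are hypothetical, and the package may perform this step differently.

```r
# Name of the column to be predicted.
target <- "mpg"

# Columns that remain once the target is dropped. Per the note above, the
# views are built from these columns only, not from the full data set.
predictors <- mtcars[, setdiff(names(mtcars), target), drop = FALSE]

# The target no longer takes part in the grouping, so it cannot appear in
# any view; a view built from all of mtcars could include it.
names(predictors)
```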
Examples
## Not run:
findviews_to_predict('mpg', mtcars)
findviews_to_predict('mpg', mtcars, view_size_max = 4)
## End(Not run)