findviews {findviews} | R Documentation
Views of a multidimensional dataset.
Description
findviews detects and plots groups of mutually dependent columns.
It is based on Shiny and ggplot.
Usage
findviews(data, view_size_max = NULL, clust_method = "complete", ...)
Arguments
data | Data frame or matrix to be processed.
view_size_max | Maximum number of columns in the views. If set to
clust_method | Character string describing a clustering method, used internally by hclust.
... | Optional Shiny parameters, used in Shiny's
Details
The function findviews takes a data frame or a matrix as input. It
computes the pairwise dependency between the columns, detects clusters in the
resulting structure, and displays the results in a Shiny app.
findviews processes numerical and categorical data separately. It excludes
columns with only one value, columns in which all the values are
distinct (e.g., primary keys), and columns with more than 75% missing values.
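The exclusion rules above can be illustrated with a small sketch. The helper keep_column and the toy data frame are hypothetical and written only for illustration; they are not the package's internal code, and the exact handling of missing values inside findviews may differ.

```r
# Hypothetical helper (not part of findviews): applies the three exclusion
# rules described above to a single column.
keep_column <- function(x, max_na = 0.75) {
  n_present  <- sum(!is.na(x))
  n_distinct <- length(unique(x[!is.na(x)]))
  too_sparse <- mean(is.na(x)) > max_na                    # > 75% missing
  constant   <- n_distinct <= 1                            # only one value
  all_unique <- n_present > 1 && n_distinct == n_present   # e.g. primary key
  !(too_sparse || constant || all_unique)
}

# Toy data, made up for this example
df <- data.frame(
  id    = 1:8,                   # all values distinct  -> excluded
  const = rep("a", 8),           # single value         -> excluded
  gaps  = c(1, rep(NA, 7)),      # 87.5% missing        -> excluded
  mpg   = c(21, 21, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4),
  cyl   = factor(c(6, 6, 4, 6, 8, 6, 8, 4))
)

kept <- names(df)[vapply(df, keep_column, logical(1))]

# Numerical and categorical columns are then handled separately
is_num           <- vapply(df[kept], is.numeric, logical(1))
numeric_cols     <- kept[is_num]
categorical_cols <- kept[!is_num]
```

In this sketch only mpg and cyl survive the filter, and they end up in different groups because one is numeric and the other is a factor.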
findviews computes the dependency between the columns differently
depending on their type. It uses Pearson's correlation coefficient for
numerical data, and Cramér's V for categorical data.
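As a rough illustration of the two measures, the sketch below computes a Pearson correlation matrix with base R's cor and a Cramér's V matrix with a hand-written helper. The function cramers_v, the choice of mtcars columns, and the decision to treat cyl, gear, and am as categorical are assumptions made for this example; findviews may compute these quantities differently.

```r
# Hypothetical helper: Cramér's V from a chi-squared statistic,
# V = sqrt(chi2 / (n * (min(rows, cols) - 1)))
cramers_v <- function(x, y) {
  tab  <- table(x, y)
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1))))
}

# Numerical columns: pairwise Pearson correlation
num     <- mtcars[, c("mpg", "disp", "hp", "wt")]
pearson <- cor(num, method = "pearson", use = "pairwise.complete.obs")

# Categorical columns (treated as factors for this example): pairwise Cramér's V
cat_cols <- c("cyl", "gear", "am")
cats     <- lapply(mtcars[cat_cols], factor)
v <- outer(cat_cols, cat_cols,
           Vectorize(function(a, b) cramers_v(cats[[a]], cats[[b]])))
dimnames(v) <- list(cat_cols, cat_cols)
```

Both results are square, symmetric matrices with one row and column per input column, which is the shape a clustering step needs.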
To cluster the columns, findviews uses the function hclust, R's
implementation of agglomerative hierarchical clustering. The parameter
clust_method specifies which flavor of agglomerative clustering to use.
The number of clusters is determined by the parameter view_size_max.
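The clustering step might look roughly like the sketch below. Two parts of it are assumptions rather than documented behavior: the conversion of dependencies into distances (1 minus the absolute correlation), and the rule for choosing the number of clusters (the smallest number for which no cluster has more than view_size_max columns). The text above only states that view_size_max determines the number of clusters, not how.

```r
# Numerical columns chosen for illustration
num <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec", "drat")]

# Assumption: strongly dependent columns should be close together
dep  <- abs(cor(num))
tree <- hclust(as.dist(1 - dep), method = "complete")  # cf. clust_method

# Assumption: increase the number of clusters until every cluster fits
# within view_size_max columns. With one column per cluster the condition
# always holds, so the loop is guaranteed to stop.
view_size_max <- 3
for (k in seq_len(ncol(num))) {
  groups <- cutree(tree, k = k)
  if (max(table(groups)) <= view_size_max) break
}

# Each element of `views` is one group of column names
views <- split(names(groups), groups)
```

Changing method to another value accepted by hclust, such as "average" or "single", changes how clusters are merged and therefore which columns end up in the same view.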
Examples
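## Not run:
library(findviews)
# Explore the built-in mtcars data set with the default settings
findviews(mtcars)
# Limit each view to at most 4 columns and pass port = 7000 on to Shiny
findviews(mtcars, view_size_max = 4, port = 7000)
## End(Not run)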