optimal_stage {dCUR} | R Documentation
optimal_stage
Description
optimal_stage selects the optimal stage of a dynamic CUR object: the value of k (the number of components used to compute leverages), the number of relevant columns, and the number of relevant rows. It also returns a data frame and the corresponding plots.
Usage
optimal_stage(data, limit = 80)
Arguments
data | An object resulting from a call to dCUR.
limit | Cumulative percentage of the average relative error rate. Defaults to 80.
Details
Select the optimal stage of dynamic CUR decomposition
The objective of CUR decomposition is to find the most relevant variables and observations within a data matrix in order to reduce its dimensionality. The relative error decreases as more columns (variables) and rows are selected; the same does not hold for k (the number of components used to compute leverages), so a larger k does not necessarily give a smaller error. This function therefore looks for the best-balanced stage: the combination of k, number of relevant columns, and number of relevant rows whose error is very close to the minimum while still maintaining the low-rank fit of the data matrix.
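The idea of cutting off at a cumulative percentage can be illustrated with a small base R sketch. This is not the package's implementation: the error values are made up, `pick_stage` is a hypothetical helper, and the rule it applies (keep the first stage whose cumulative percentage of the average relative error reaches `limit`) is an assumption about how such a threshold might be used.

```r
# Hypothetical average relative errors (in percent) for stages that
# select 1 to 6 relevant columns. These numbers are invented.
avg_error <- c(40, 25, 15, 10, 6, 4)
stages <- data.frame(columns = seq_along(avg_error), error = avg_error)

# Cumulative percentage of the total average error covered up to each stage.
stages$cum_pct <- 100 * cumsum(stages$error) / sum(stages$error)

# Assumed rule (not taken from dCUR): return the first stage whose
# cumulative percentage reaches the limit.
pick_stage <- function(stages, limit = 80) {
  stages[which(stages$cum_pct >= limit)[1], ]
}

chosen <- pick_stage(stages, limit = 80)
print(chosen)
```

Raising `limit` in this sketch moves the chosen stage toward more selected columns, while lowering it favours a smaller selection; the actual behaviour of `optimal_stage` should be checked against the package source.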
Value
data | a data frame which specifies the relative error for each stage of the CUR decomposition.
rows_plot | a plot showing the average relative error for each number of relevant rows selected.
columns_plot | a plot showing the average relative error for each number of relevant columns selected.
k_plot | a plot showing the average relative error for each k (number of components used to compute leverages), given the optimal number of relevant columns and rows.
optimal | a data frame showing the average relative error for the optimal k (number of components used to compute leverages), given the optimal number of relevant columns and rows.
Author(s)
Cesar Gamboa-Sanabria, Stefany Matarrita-Munoz, Katherine Barquero-Mejias, Greibin Villegas-Barahona, Mercedes Sanchez-Barba and Maria Purificacion Galindo-Villardon.
References
Villegas G, Martin-Barreiro C, Gonzalez-Garcia N, Hernandez-Gonzalez S, Sanchez-Barba M, Galindo-Villardon M (2019). “Dynamic CUR, an alternative to variable selection in CUR decomposition.” Revista Investigacion Operacional, 40(3), 391–399.
See Also
Examples
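library(dCUR)

# Fit a dynamic CUR decomposition on the AASP data
results <- dCUR(
  data = AASP, variables = hoessem:notabachillerato,
  k = 15, rows = 0.25, columns = 0.25, skip = 0.1, standardize = TRUE,
  cur_method = "sample_cur",
  parallelize = TRUE, dynamic_columns = TRUE,
  dynamic_rows = TRUE
)

# Select the optimal stage with limit = 80 (the default)
result <- optimal_stage(results, limit = 80)
result

# Inspect individual components of the returned object
result$k_plot
result$columns_plot
result$data
result$optimal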