
find_closer_points {archetypal}

R Documentation

Find the data points closest to the archetypes during all iterations of the PCHA algorithm

Description

This function runs the PCHA algorithm and finds the data points that lie in the local neighborhood of each archetype. The size of the neighborhood is user-defined (`npoints`). This makes it possible to study the properties of the solution or to choose an initial approximation manually when searching for a better fit.
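The idea of a neighborhood of `npoints` rows around each archetype can be illustrated in base R. This is only a minimal sketch and not the package's internal code: the helper `nearest_rows` and the toy matrices `X` (data) and `A` (archetypes) are hypothetical names introduced here, and the Euclidean distance is assumed because it is the one named in the Value section.

```r
# Hypothetical helper (not part of 'archetypal'): for each row of A,
# return the indices of the `npoints` rows of X nearest in Euclidean distance.
nearest_rows <- function(X, A, npoints = 2) {
  X <- as.matrix(X)
  A <- as.matrix(A)
  rows <- lapply(seq_len(nrow(A)), function(i) {
    # squared Euclidean distance from every data row to archetype i
    d2 <- rowSums(sweep(X, 2, A[i, ])^2)
    order(d2)[seq_len(npoints)]
  })
  # one row per archetype, one column per neighbor (nearest first)
  do.call(rbind, rows)
}

# Toy data: six points in the plane and three assumed archetypes
X <- rbind(c(0, 0), c(1, 0), c(0, 1), c(0.9, 0.1), c(0.1, 0.9), c(0.1, 0.1))
A <- rbind(c(0, 0), c(1, 0), c(0, 1))
closest <- nearest_rows(X, A, npoints = 2)
print(closest)
```

Each row of `closest` corresponds to one archetype and lists the row numbers of `X` in order of increasing distance.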

Usage

find_closer_points(df, kappas, usedata = FALSE, npoints = 2, 
                     nworkers = NULL, rseed = NULL, 
                     verbose = FALSE, doparallel = FALSE, ...)

Arguments

`df`	The data frame with dimensions n x d
`kappas`	The number of archetypes
`usedata`	If TRUE, the entire data frame will be used when `doparallel = TRUE`
`npoints`	The number of closest points to find for each archetype
`nworkers`	The number of logical processors that will be used, if `doparallel = TRUE`
`rseed`	The random seed used for the random number generator. Useful for reproducible results.
`verbose`	If TRUE, details will be printed, except those from `archetypal`
`doparallel`	If it is set to TRUE, then parallel processing will be performed
`...`	Other arguments passed to `archetypal`, except `save_history = TRUE` and `verbose = FALSE`, which are set internally. This is essential for using the optimal parameters found by `find_pcha_optimal_parameters`

Value

A list with members:

rows_history, a list with the npoints rows closest to each archetype at each iteration of the algorithm
iter_terminal, the iteration after which the rows closest to the archetypes no longer change
rows_closer, the rows closest to the archetypes in Euclidean distance, fixed after iteration iter_terminal
rows_closer_matrix, a matrix with the npoints rows closest to each archetype
solution_used, the AA output that was used. Sometimes useful, especially for big data.
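The relationship between a per-iteration history and a terminal iteration can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `terminal_iteration` is a hypothetical helper, the history is assumed to be a plain list with one entry per iteration, and "terminal" is read here as the first iteration from which the entries equal their final value.

```r
# Hypothetical helper (not part of 'archetypal'): given a list with one
# entry per iteration, return the first iteration from which every entry
# is identical to the final one.
terminal_iteration <- function(history) {
  last <- history[[length(history)]]
  same <- vapply(history, function(h) identical(h, last), logical(1))
  changed <- which(!same)  # iterations that still differ from the final state
  if (length(changed) == 0) 1L else max(changed) + 1L
}

# Toy history: the closest rows change once and then stay fixed
history <- list(c(1L, 5L), c(1L, 6L), c(1L, 6L), c(1L, 6L))
it <- terminal_iteration(history)
print(it)
```

Taking the last differing iteration, rather than the first matching one, guards against a history that returns to its final value briefly and then changes again.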

Examples

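{
# Load the package and the example data set "wd2"
library(archetypal)
data("wd2")
# Find the 2 rows closest to each of 3 archetypes
yy <- find_closer_points(df = wd2, kappas = 3, npoints = 2, nworkers = 2)
yy$rows_history        # closest rows at each iteration
yy$iter_terminal       # iteration after which the closest rows stop changing
yy$rows_closer         # final closest rows
yy$rows_closer_matrix  # matrix of the npoints closest rows per archetype
yy$solution_used$BY    # BY component of the AA solution used
}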