select_neighbours {ceterisParibus} | R Documentation |
Select Subset of Rows Closest to a Specified Observation
Description
This function selects a subset of rows from a data set, namely the rows closest to a specified observation. This is useful when the data is large and only a sample is needed to calculate profiles.
Usage
select_neighbours(
data,
observation,
variables = NULL,
distance = gower::gower_dist,
n = 20,
frac = NULL
)
Arguments
data |
set of observations |
observation |
single observation |
variables |
variables that shall be used for calculation of distance. By default these are all variables present in 'data' and 'observation' |
distance |
distance function, by default the 'gower_dist' function. |
n |
number of neighbours to select |
frac |
if 'n' is not specified (NULL), the number of neighbours is calculated as 'frac' * number of rows in 'data'. Either 'n' or 'frac' needs to be specified. |
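The interplay between 'n' and 'frac' can be sketched in base R as follows. This is only an illustration of the rule stated above, not the package's code: the helper name `resolve_n` is hypothetical, and both rounding up and capping at the number of rows are assumptions.

```r
# Hypothetical helper, not part of ceterisParibus: work out how many
# neighbours to select from 'n' or, when 'n' is NULL, from 'frac'.
resolve_n <- function(n_rows, n = 20, frac = NULL) {
  if (is.null(n)) {
    if (is.null(frac)) {
      stop("Either 'n' or 'frac' needs to be specified")
    }
    # Assumption: a fractional count is rounded up.
    n <- ceiling(n_rows * frac)
  }
  # Assumption: never ask for more rows than the data holds.
  min(n, n_rows)
}

resolve_n(1000, n = 10)                # 'n' given, 'frac' ignored
resolve_n(200, n = NULL, frac = 0.25)  # 'n' is NULL, so 'frac' is used
```

Note that because 'n' defaults to 20, 'frac' only takes effect when 'n = NULL' is passed explicitly.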
Details
Note that select_neighbours is an S3 generic.
If you want to work with non-standard data sources (such as an H2O ddf or external databases), you should overload it by providing your own method.
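A minimal sketch of such an overload is shown below. Everything specific in it is an assumption made for illustration: the class name `lazy_source`, its `fetch` field, and the simple mean-absolute-difference distance used as a stand-in for Gower distance. The stand-in generic is defined only when the package's own select_neighbours is not available, so that the sketch runs on its own.

```r
# Use the package generic when available; otherwise define a stand-in
# with the same argument names so that this sketch is self-contained.
if (!exists("select_neighbours", mode = "function")) {
  select_neighbours <- function(data, observation, variables = NULL,
                                distance = NULL, n = 20, frac = NULL) {
    UseMethod("select_neighbours")
  }
}

# Hypothetical wrapper around an external source: 'fetch' is a function
# that returns the rows as a data frame when called.
lazy_source <- function(fetch) {
  structure(list(fetch = fetch), class = "lazy_source")
}

# Method for the hypothetical class: materialise the rows, then keep
# the ones closest to 'observation'.
select_neighbours.lazy_source <- function(data, observation, variables = NULL,
                                          distance = NULL, n = 20, frac = NULL) {
  df <- data$fetch()
  if (is.null(variables)) {
    variables <- intersect(colnames(df), colnames(observation))
  }
  if (is.null(distance)) {
    # Stand-in distance for numeric columns only (not Gower distance).
    distance <- function(obs, rows) {
      rowMeans(abs(sweep(as.matrix(rows), 2, unlist(obs))))
    }
  }
  if (is.null(n)) n <- ceiling(nrow(df) * frac)
  d <- distance(observation[, variables, drop = FALSE],
                df[, variables, drop = FALSE])
  df[order(d)[seq_len(min(n, nrow(df)))], , drop = FALSE]
}

# Usage with a toy source that returns four rows.
src <- lazy_source(function() data.frame(a = c(1, 10, 2, 20),
                                         b = c(1, 10, 2, 20)))
nearest <- select_neighbours(src, data.frame(a = 1.4, b = 1.4), n = 2)
nearest
```

A real method for a remote source would more likely push the distance computation to the source rather than fetching all rows first; the sketch fetches everything only to keep the dispatch mechanism visible.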
Value
a data frame with selected rows
Examples
library("DALEX")
new_apartment <- apartments[1, 2:6]
small_apartments <- select_neighbours(apartmentsTest, new_apartment, n = 10)
new_apartment
small_apartments
[Package ceterisParibus version 0.4.2 Index]