closest_to {rearrr} | R Documentation |
Orders values by shortest distance to an origin
Description
Values are ordered by how close they are to the origin.
In 1d (when `cols` has length 1), the origin can be thought of as a target value.
In n dimensions, the origin can be thought of as coordinates.
The origin can be supplied as coordinates or as a function that returns coordinates. The latter can be useful when supplying a grouped data.frame and ordering the rows by their distance to the centroid of each group.
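The group-wise centroid ordering described above can be illustrated in base R. This is only a sketch of the idea, not the package's implementation: the helper `order_by_centroid_distance()` and the `toy` data are hypothetical, and the centroid is assumed to be the per-column mean with Euclidean distance.

```r
# Hypothetical helper (not part of rearrr): order the rows of a data frame
# by Euclidean distance to the centroid (column means) of the selected columns.
order_by_centroid_distance <- function(df, cols) {
  coords <- as.matrix(df[, cols, drop = FALSE])
  centroid <- colMeans(coords) # one coordinate per dimension
  # Subtract the centroid from each row, then take the row-wise Euclidean norm
  distances <- sqrt(rowSums(sweep(coords, 2, centroid)^2))
  df[order(distances), , drop = FALSE]
}

# Toy data with two groups (made up for illustration)
toy <- data.frame(
  g = c(1, 1, 1, 2, 2, 2),
  x = c(0, 4, 8, 10, 11, 15),
  y = c(0, 4, 8, 1, 1, 1)
)

# Apply the helper within each group, then recombine the groups
ordered <- do.call(
  rbind,
  lapply(split(toy, toy$g), order_by_centroid_distance, cols = c("x", "y"))
)
ordered
```

Each group is ordered relative to its own centroid, which is the behaviour one would expect from passing a centroid-finding function together with a grouped data.frame.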
The *_vec() version takes and returns a vector.
Example:
The column values:
c(1, 2, 3, 4, 5)
and origin = 2
are ordered as:
c(2, 1, 3, 4, 5)
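The 1d example above can be reproduced with a few lines of base R. This is a minimal sketch of the idea (sort by absolute distance to the target value), not the package's own code; how `closest_to_vec()` breaks ties internally is not specified here, and this sketch simply keeps tied values in their original order.

```r
values <- c(1, 2, 3, 4, 5)
origin <- 2

# Distance of each value to the target value
distances <- abs(values - origin)

# order() keeps tied elements in their original order
ordered <- values[order(distances)]
ordered
```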
Usage
closest_to(
data,
cols = NULL,
origin = NULL,
origin_fn = NULL,
shuffle_ties = FALSE,
origin_col_name = ".origin",
distance_col_name = ".distance",
overwrite = FALSE
)
closest_to_vec(data, origin = NULL, origin_fn = NULL, shuffle_ties = FALSE)
Arguments
data |
`data.frame` or `vector`. |
cols |
Column(s) to create sorting factor by. When `NULL` and `data` is a `data.frame`, the row numbers are used. |
origin |
Coordinates of the origin to calculate distances to. A scalar to use in all dimensions or a `vector` with one scalar per dimension. N.B. Ignored when `origin_fn` is not `NULL`. |
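origin_fn |
Function for finding the origin coordinates. Input: Each column will be passed as a `vector` in the order of `cols`. Output: A `vector` with one scalar per dimension. Can be created with `create_origin_fn()` if you want to apply the same function to each dimension. E.g. `function(x, y){c(mean(x), mean(y))}`. Built-in functions are `centroid()`, `most_centered()`, and `midrange()`. |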
shuffle_ties |
Whether to shuffle elements with the same distance to the origin. (Logical) |
origin_col_name |
Name of new column with the origin coordinates. If `NULL`, no column is added. |
distance_col_name |
Name of new column with the distances to the origin. If `NULL`, no column is added. |
overwrite |
Whether to allow overwriting of existing columns. (Logical) |
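To make the `origin_fn` argument more concrete, the sketch below shows the shape such a function could take: it receives one vector per dimension and returns one coordinate per dimension. The names `make_origin_fn()`, `median_origin`, and `mean_origin` are hypothetical stand-ins written in base R for illustration; they are not the package's `create_origin_fn()` and may differ from it in detail.

```r
# Hypothetical stand-in: wrap a single-vector summary function so that it is
# applied to each dimension and the results are returned as coordinates.
make_origin_fn <- function(fn) {
  function(...) {
    vapply(list(...), fn, numeric(1))
  }
}

# One median per dimension
median_origin <- make_origin_fn(median)
median_origin(c(1, 2, 9), c(10, 20, 30))

# A hand-written origin function for exactly two dimensions
mean_origin <- function(x, y) {
  c(mean(x), mean(y))
}
mean_origin(c(1, 2, 3), c(4, 6, 8))
```

A wrapper like this is convenient when the same summary applies to every dimension, while a hand-written function allows a different rule per dimension.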
Value
The sorted data.frame (tibble) / vector.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other rearrange functions: center_max(), center_min(), furthest_from(), pair_extremes(), position_max(), position_min(), rev_windows(), roll_elements(), shuffle_hierarchy(), triplet_extremes()
Other distance functions: dim_values(), distance(), expand_distances(), expand_distances_each(), furthest_from(), swirl_2d(), swirl_3d()
Examples
# Attach packages
library(rearrr)
library(dplyr)
# Set seed
set.seed(1)
# Create a data frame
df <- data.frame(
  "index" = 1:10,
  "A" = sample(1:10),
  "B" = runif(10),
  "G" = c(
    1, 1, 1, 2, 2,
    2, 3, 3, 3, 3
  ),
  stringsAsFactors = FALSE
)
# Closest to 3 in a vector
closest_to_vec(1:10, origin = 3)
# Closest to the third row (index of data.frame)
closest_to(df, origin = 3)$index
# By each of the columns
closest_to(df, cols = "A", origin = 3)$A
closest_to(df, cols = "A", origin_fn = most_centered)$A
closest_to(df, cols = "B", origin = 0.5)$B
closest_to(df, cols = "B", origin_fn = centroid)$B
# Shuffle the elements with the same distance to the origin
closest_to(df,
  cols = "A",
  origin_fn = create_origin_fn(median),
  shuffle_ties = TRUE
)$A
# Grouped by G
df %>%
  dplyr::select(G, A) %>% # For clarity
  dplyr::group_by(G) %>%
  closest_to(
    cols = "A",
    origin_fn = create_origin_fn(median)
  )
# Plot the rearranged values
plot(
  x = 1:10,
  y = closest_to(df,
    cols = "B",
    origin_fn = create_origin_fn(median)
  )$B,
  xlab = "Position",
  ylab = "B"
)
plot(
  x = 1:10,
  y = closest_to(df,
    cols = "A",
    origin_fn = create_origin_fn(median),
    shuffle_ties = TRUE
  )$A,
  xlab = "Position",
  ylab = "A"
)
# In multiple dimensions
df %>%
  closest_to(cols = c("A", "B"), origin_fn = most_centered)