expand_distances {rearrr} | R Documentation |
Expand the distances to an origin
Description
Moves the data points in n-dimensional space such that their distance to a specified origin is increased/decreased.
A `multiplier` greater than 1 leads to expansion, while a positive `multiplier` lower than 1 leads to contraction.
The origin can be supplied as coordinates or as a function that returns coordinates. The latter can be useful when supplying a grouped data.frame and expanding around e.g. the centroid of each group.
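As a rough illustration of what an origin function does, the base R sketch below computes one origin for the whole data set and one per group. `centroid_sketch()` is a hypothetical stand-in written for this example, not the package's own centroid(), and the assumption that each dimension arrives as a separate vector is ours.

```r
# Hypothetical stand-in for an origin function (assumption, not rearrr code):
# takes one vector per dimension and returns one coordinate per dimension.
centroid_sketch <- function(...) {
  vapply(list(...), mean, numeric(1))
}

df <- data.frame(
  x = c(0, 2, 10, 14),
  y = c(0, 4, 20, 22),
  g = c(1, 1, 2, 2)
)

# One origin for the whole data set
overall_origin <- centroid_sketch(df$x, df$y)

# One origin per group, mimicking what happens with a grouped data.frame
group_origins <- lapply(
  split(df, df$g),
  function(d) centroid_sketch(d$x, d$y)
)

overall_origin
group_origins
```

With grouped input, each group's points would then be expanded or contracted around that group's own origin rather than a shared one.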
The multiplier/exponent can be supplied as a constant or as a function that returns a constant. The latter can be useful when supplying a grouped data.frame and the multiplier/exponent depends on the data in the groups.
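A data-dependent multiplier could look like the base R sketch below. `range_multiplier()` is a hypothetical function invented for illustration; it assumes each dimension is passed as its own vector and returns a single constant, here the inverse of the widest range so that wider groups are contracted more strongly.

```r
# Hypothetical multiplier function (assumption, not part of rearrr):
# one vector per dimension in, a single numeric constant out.
range_multiplier <- function(...) {
  widths <- vapply(list(...), function(v) diff(range(v)), numeric(1))
  1 / max(widths)
}

df <- data.frame(
  x = c(0, 2, 10, 18),
  y = c(0, 4, 20, 22),
  g = c(1, 1, 2, 2)
)

# One multiplier per group, mimicking a grouped data.frame
group_multipliers <- vapply(
  split(df, df$g),
  function(d) range_multiplier(d$x, d$y),
  numeric(1)
)

group_multipliers
```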
For expansion in each dimension separately, use expand_distances_each().
NOTE: When exponentiating, the default is to first add 1 to the distances, to ensure expansion even when the distance is between 0 and 1.
If you need the purely exponentiated distances, disable `add_one_exp`.
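The effect of adding 1 before exponentiating can be checked with plain arithmetic. The sketch below uses made-up distances and an exponent of 2; it only mirrors the add-then-subtract step described in this page, not the package internals.

```r
d <- c(0.5, 1, 2) # example distances to the origin
exponent <- 2

# Pure exponentiation: a distance below 1 shrinks (0.5 becomes 0.25)
d_pure <- d^exponent

# Add 1, exponentiate, subtract 1: every positive distance grows
d_add_one <- (d + 1)^exponent - 1

rbind(d = d, pure = d_pure, add_one = d_add_one)
```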
Usage
expand_distances(
  data,
  cols = NULL,
  multiplier = NULL,
  multiplier_fn = NULL,
  origin = NULL,
  origin_fn = NULL,
  exponentiate = FALSE,
  add_one_exp = TRUE,
  suffix = "_expanded",
  keep_original = TRUE,
  mult_col_name = ifelse(isTRUE(exponentiate), ".exponent", ".multiplier"),
  origin_col_name = ".origin",
  overwrite = FALSE
)
Arguments
data |
data.frame or vector. |
cols |
Names of columns in `data` to expand coordinates of. Each column is considered a dimension. |
multiplier |
Constant to multiply/exponentiate the distances to the origin by. N.B. When `exponentiate` is TRUE, the `multiplier` becomes an exponent. |
multiplier_fn |
Function for finding the `multiplier`. Input: Each column will be passed as a vector in the order of `cols`. Output: A numeric scalar. |
origin |
Coordinates of the origin to expand around. A scalar to use in all dimensions or a vector with one scalar per dimension. N.B. Ignored when `origin_fn` is not NULL. |
origin_fn |
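Function for finding the origin coordinates. Input: Each column will be passed as a vector in the order of `cols`. Output: A vector with one scalar per dimension. Can be created with create_origin_fn() if you want to apply the same function to each dimension. E.g. function(x, y, z){c(mean(x), mean(y), mean(z))}. Built-in functions are centroid(), most_centered(), and midrange(). |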
exponentiate |
Whether to exponentiate instead of multiplying. (Logical) |
add_one_exp |
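Whether to add 1 to the distances before exponentiating, to ensure they don't contract when between 0 and 1. The added value is subtracted after the exponentiation. (Logical) The distances to the origin (d) are exponentiated as such: d <- d + 1; d <- d ^ multiplier; d <- d - 1. N.B. Ignored when `exponentiate` is FALSE. |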
suffix |
Suffix to add to the names of the generated columns. Use an empty string (i.e. "") to overwrite the original columns. |
keep_original |
Whether to keep the original columns. (Logical) Some columns may have been overwritten, in which case only the newest versions are returned. |
mult_col_name |
Name of new column with the `multiplier`. If NULL, no column is added. |
origin_col_name |
Name of new column with the origin coordinates. If NULL, no column is added. |
overwrite |
Whether to allow overwriting of existing columns. (Logical) |
Details
Increases the distance to the origin in n-dimensional space by multiplying or exponentiating it by the multiplier.
We first move the origin to the zero-coordinates (e.g. c(0, 0, 0)) and normalize each vector to unit length. We then multiply this unit vector by the multiplied/exponentiated distance and move the origin back to its original coordinates.
The distance to the specified origin is calculated with:
d(P1, P2) = sqrt( (x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 + ... )
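The steps above can be sketched in base R as follows. This is a minimal illustration for the multiplication case only, assuming a numeric matrix with one row per point. `expand_sketch()` is a hypothetical helper, and leaving points that sit exactly on the origin in place is our assumption rather than documented behaviour.

```r
# Hypothetical helper illustrating the described steps (not rearrr code)
expand_sketch <- function(coords, origin, multiplier) {
  # coords: numeric matrix, one row per point, one column per dimension
  # Move the origin to the zero-coordinates
  centered <- sweep(coords, 2, origin, "-")
  # Euclidean distance of each point to the origin
  d <- sqrt(rowSums(centered^2))
  # Normalize to unit length; points on the origin are left in place (assumption)
  unit <- centered / ifelse(d == 0, 1, d)
  # Scale by the multiplied distance, then move the origin back
  sweep(unit * (d * multiplier), 2, origin, "+")
}

pts <- rbind(c(3, 4), c(0, 0), c(1, 1))
expanded <- expand_sketch(pts, origin = c(0, 0), multiplier = 2)
shifted <- expand_sketch(rbind(c(4, 5)), origin = c(1, 1), multiplier = 2)

expanded
shifted
```

For plain multiplication this is equivalent to scaling the centered coordinates directly. The unit-vector form is kept here because it also accommodates an exponentiated distance.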
Note: By default (when `add_one_exp` is TRUE), we add 1 to the distance before the exponentiation and subtract it afterwards. See `add_one_exp`.
Value
data.frame (tibble) with the expanded columns, along with the applied multiplier/exponent and origin coordinates.
Author(s)
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
See Also
Other mutate functions:
apply_transformation_matrix(), cluster_groups(), dim_values(), expand_distances_each(), flip_values(), roll_values(), rotate_2d(), rotate_3d(), shear_2d(), shear_3d(), swirl_2d(), swirl_3d()
Other expander functions:
expand_distances_each()
Other distance functions:
closest_to(), dim_values(), distance(), expand_distances_each(), furthest_from(), swirl_2d(), swirl_3d()
Examples
# Attach packages
library(rearrr)
library(dplyr)
library(purrr)
has_ggplot <- require(ggplot2) # Attach if installed
# Set seed
set.seed(1)
# Create a data frame
df <- data.frame(
  "x" = runif(20),
  "y" = runif(20),
  "g" = rep(1:4, each = 5)
)
# Expand distances in the two dimensions (x and y)
# With the origin at x=0.5, y=0.5
# We multiply the distances by 2
expand_distances(
  data = df,
  cols = c("x", "y"),
  multiplier = 2,
  origin = c(0.5, 0.5)
)
# Expand distances in the two dimensions (x and y)
# With the origin at x=0.5, y=0.5
# We exponentiate the distances by 2
expand_distances(
  data = df,
  cols = c("x", "y"),
  multiplier = 2,
  exponentiate = TRUE,
  origin = 0.5
)
# Expand values in one dimension (x)
# With the origin at x=0.5
# We exponentiate the distances by 3
expand_distances(
  data = df,
  cols = c("x"),
  multiplier = 3,
  exponentiate = TRUE,
  origin = 0.5
)
# Expand x and y around the centroid
# We use exponentiation for a more drastic effect
# Enabling add_one_exp makes sure the points are expanded
# even when their distance to the origin is below 1
# To compare multiple exponents, we wrap the
# call in purrr::map_dfr
df_expanded <- purrr::map_dfr(
  .x = c(1, 3, 5),
  .f = function(exponent) {
    expand_distances(
      data = df,
      cols = c("x", "y"),
      multiplier = exponent,
      origin_fn = centroid,
      exponentiate = TRUE,
      add_one_exp = TRUE
    )
  }
)
df_expanded
# Plot the expansions of x and y around the overall centroid
if (has_ggplot){
  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.exponent))) +
    geom_vline(
      xintercept = df_expanded[[".origin"]][[1]][[1]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_hline(
      yintercept = df_expanded[[".origin"]][[1]][[2]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_path(size = 0.2) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "Exponent")
}
# Expand x and y around the centroid using multiplication
# To compare multiple multipliers, we wrap the
# call in purrr::map_dfr
df_expanded <- purrr::map_dfr(
  .x = c(1, 3, 5),
  .f = function(multiplier) {
    expand_distances(df,
      cols = c("x", "y"),
      multiplier = multiplier,
      origin_fn = centroid,
      exponentiate = FALSE
    )
  }
)
df_expanded
# Plot the expansions of x and y around the overall centroid
if (has_ggplot){
  ggplot(df_expanded, aes(x = x_expanded, y = y_expanded, color = factor(.multiplier))) +
    geom_vline(
      xintercept = df_expanded[[".origin"]][[1]][[1]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_hline(
      yintercept = df_expanded[[".origin"]][[1]][[2]],
      size = 0.2, alpha = .4, linetype = "dashed"
    ) +
    geom_path(size = 0.2, alpha = .8) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "Multiplier")
}
#
# Contraction
#
# Group-wise contraction to create clusters
df_contracted <- df %>%
  dplyr::group_by(g) %>%
  expand_distances(
    cols = c("x", "y"),
    multiplier = 0.07,
    suffix = "_contracted",
    origin_fn = centroid
  )
# Plot the clustered data points on top of the original data points
if (has_ggplot){
  ggplot(df_contracted, aes(x = x_contracted, y = y_contracted, color = factor(g))) +
    geom_point(aes(x = x, y = y, color = factor(g)), alpha = 0.3, shape = 16) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "g")
}