dim_values {rearrr}    R Documentation

Dim values of a dimension based on the distance to an n-dimensional origin

Description

[Experimental]

Dims the values in the dimming dimension (the last by default) based on each data point's distance to the origin.

Distance is calculated as:

d(P1, P2) = sqrt( (x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 + ... )
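
The distance formula above can be sketched in base R as follows. This is only an illustration of the calculation, not the package's internal code; `distance_to_origin`, `points` and `origin` are names made up for the example.

```r
# Euclidean distance from each row of `points` to `origin`.
# All names here are illustrative and not part of rearrr.
distance_to_origin <- function(points, origin) {
  # Subtract the origin from every row (column-wise sweep),
  # then square, sum per row and take the square root
  diffs <- sweep(points, MARGIN = 2, STATS = origin, FUN = "-")
  sqrt(rowSums(diffs^2))
}

# Three points in three dimensions
points <- rbind(c(3, 4, 0), c(1, 1, 1), c(0, 0, 0))
distances <- distance_to_origin(points, origin = c(0, 0, 0))
print(distances)  # 5, sqrt(3) and 0
```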

The default `dimming_fn` multiplies the value by the inverse square of (1 + distance):

dimming_fn(x, d) = x * (1 / (1 + d) ^ 2)

Where x is the value in the dimming dimension. The +1 is added to ensure that values are dimmed even when the distance is below 1. The quickest way to change the exponent or the +1 is with create_dimming_fn().
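
To see why the +1 matters, the small base-R comparison below evaluates the documented default formula next to a variant without the offset. The variant `dim_no_offset` is hypothetical and exists only for this comparison.

```r
# Default dimming formula as documented: x * (1 / (1 + d)^2)
dim_default <- function(x, d) x * (1 / (1 + d)^2)

# Hypothetical variant without the +1, for comparison only
dim_no_offset <- function(x, d) x * (1 / d^2)

d <- c(0, 0.5, 1, 2)

with_offset <- dim_default(1, d)       # 1, 0.444..., 0.25, 0.111...
without_offset <- dim_no_offset(1, d)  # Inf, 4, 1, 0.25

print(with_offset)
print(without_offset)
```

Without the offset, values closer than 1 to the origin are amplified rather than dimmed, and a point located exactly at the origin becomes infinite.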

The origin can be supplied as coordinates or as a function that returns coordinates. The latter can be useful when supplying a grouped data.frame and dimming around e.g. the centroid of each group.
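
As a rough base-R illustration of the group-wise idea, the sketch below computes each row's distance to the centroid of its own group. It does not call rearrr; the data frame and its column names are invented for the example, and the centroid is taken to be the per-column mean.

```r
# Toy data with two groups (illustrative names only)
toy <- data.frame(
  g = c("a", "a", "b", "b"),
  x = c(0, 2, 10, 14),
  y = c(0, 4, 10, 10)
)

# For each group, use the column means as the origin and
# measure each row's distance to that group-specific origin
dist_by_group <- unlist(lapply(split(toy, toy$g), function(grp) {
  o <- unname(colMeans(grp[, c("x", "y")]))
  sqrt((grp$x - o[1])^2 + (grp$y - o[2])^2)
}), use.names = FALSE)

print(dist_by_group)  # sqrt(5), sqrt(5), 2, 2
```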

Usage

dim_values(
  data,
  cols,
  dimming_fn = create_dimming_fn(numerator = 1, exponent = 2, add_to_distance = 1),
  origin = NULL,
  origin_fn = NULL,
  dim_col = tail(cols, 1),
  suffix = "_dimmed",
  keep_original = TRUE,
  origin_col_name = ".origin",
  overwrite = FALSE
)

Arguments

data

data.frame or vector.

cols

Names of columns in `data` to calculate distances from. The dimming column (`dim_col`) is dimmed based on all the columns. Each column is considered a dimension.

N.B. when the dimming dimension is included in `cols`, it is used in the distance calculation as well.

dimming_fn

Function for calculating the dimmed values.

Input: Two (2) input arguments:

  1. A numeric vector with the values in the dimming dimension.

  2. A numeric vector with corresponding distances to the origin.

Output: A numeric vector with the same length as the input vectors.

E.g.:

function(x, d){
  x * (1 / ((1 + d) ^ 2))
}

This kind of dimming function can be created with create_dimming_fn(), which for instance makes it easy to change the exponent (the 2 above).
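
A generator of this kind might look roughly like the base-R sketch below. `make_dimming_fn` is a hypothetical stand-in written for illustration; its parameter names mirror the `create_dimming_fn()` call shown under Usage, but the formula inside is an assumption generalised from the documented default and has not been checked against the package source.

```r
# Hypothetical stand-in for a dimming-function generator.
# Assumed formula: x * (numerator / (add_to_distance + d)^exponent)
make_dimming_fn <- function(numerator = 1, exponent = 2, add_to_distance = 1) {
  function(x, d) {
    # Both inputs must be numeric vectors of equal length
    stopifnot(is.numeric(x), is.numeric(d), length(x) == length(d))
    x * (numerator / (add_to_distance + d)^exponent)
  }
}

default_fn <- make_dimming_fn()            # reproduces x * (1 / (1 + d)^2)
cubic_fn <- make_dimming_fn(exponent = 3)  # steeper fall-off

x <- c(8, 8, 8)
d <- c(0, 1, 3)
print(default_fn(x, d))  # 8, 2, 0.5
print(cubic_fn(x, d))    # 8, 1, 0.125
```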

origin

Coordinates of the origin to dim around. A scalar to use in all dimensions or a vector with one scalar per dimension.

N.B. Ignored when `origin_fn` is not NULL.
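
The two accepted shapes of `origin` can be pictured with the sketch below. `expand_origin` is a hypothetical helper, not a rearrr function; it only illustrates how a single scalar maps to every dimension.

```r
# Hypothetical helper: expand `origin` to one coordinate per dimension
expand_origin <- function(origin, n_dims) {
  # A single scalar is reused for every dimension
  if (length(origin) == 1) {
    return(rep(origin, n_dims))
  }
  # Otherwise there must be exactly one coordinate per dimension
  stopifnot(length(origin) == n_dims)
  origin
}

scalar_origin <- expand_origin(0.5, 3)         # 0.5 0.5 0.5
vector_origin <- expand_origin(c(0, 1, 2), 3)  # 0 1 2
print(scalar_origin)
print(vector_origin)
```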

origin_fn

Function for finding the origin coordinates.

Input: Each column will be passed as a vector in the order of `cols`.

Output: A vector with one scalar per dimension.

Can be created with create_origin_fn() if you want to apply the same function to each dimension.

E.g. `create_origin_fn(median)` would find the median of each column.

Built-in functions are centroid(), most_centered(), and midrange().
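
The input/output contract described above can be sketched in base R. `make_origin_fn` below is a hypothetical stand-in that applies one function to every dimension; it is meant to show the expected shape of an `origin_fn`, not how create_origin_fn() is actually implemented.

```r
# Hypothetical stand-in: wrap `fn` so it is applied to each dimension.
# The returned function takes one vector per column (in the order of
# `cols`) and returns one scalar per dimension.
make_origin_fn <- function(fn) {
  function(...) {
    vapply(list(...), fn, numeric(1))
  }
}

median_origin <- make_origin_fn(median)

x <- c(1, 2, 9)
y <- c(10, 30, 20)
origin_xy <- median_origin(x, y)
print(origin_xy)  # 2 20
```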

dim_col

Name of column to dim. Default is the last column in `cols`.

When the `dim_col` is not present in `cols`, it is not used in the distance calculation.
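
A small worked example in base R shows how much this choice can matter. It uses a single point and the documented default formula; the variable names are illustrative.

```r
# One data point with the dimming value in `z`
x <- 3
y <- 4
z <- 12
origin <- 0

# Distance when only x and y are dimensions (z not in `cols`)
d_without_z <- sqrt((x - origin)^2 + (y - origin)^2)

# Distance when z is also one of the dimensions
d_with_z <- sqrt((x - origin)^2 + (y - origin)^2 + (z - origin)^2)

# Documented default dimming formula
dim_fn <- function(x, d) x * (1 / (1 + d)^2)

print(c(d_without_z, d_with_z))  # 5 and 13
print(dim_fn(z, d_without_z))    # 12 / 36
print(dim_fn(z, d_with_z))       # 12 / 196
```

Including the dimming column in the distance calculation moves this point further from the origin, so its value is dimmed more strongly.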

suffix

Suffix to add to the names of the generated columns.

Use an empty string (i.e. "") to overwrite the original columns.

keep_original

Whether to keep the original columns. (Logical)

Some columns may have been overwritten, in which case only the newest versions are returned.

origin_col_name

Name of new column with the origin coordinates. If NULL, no column is added.

overwrite

Whether to allow overwriting of existing columns. (Logical)

Value

data.frame (tibble) with the dimmed column, along with the origin coordinates.

Author(s)

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

See Also

Other mutate functions: apply_transformation_matrix(), cluster_groups(), expand_distances(), expand_distances_each(), flip_values(), roll_values(), rotate_2d(), rotate_3d(), shear_2d(), shear_3d(), swirl_2d(), swirl_3d()

Other distance functions: closest_to(), distance(), expand_distances(), expand_distances_each(), furthest_from(), swirl_2d(), swirl_3d()

Examples

# Attach packages
library(rearrr)
library(dplyr)
library(purrr)
has_ggplot <- require(ggplot2)  # Attach if installed

# Set seed
set.seed(7)

# Create a data frame with clusters
df <- generate_clusters(
  num_rows = 70,
  num_cols = 3,
  num_clusters = 5,
  compactness = 1.6
) %>%
  dplyr::rename(x = D1, y = D2, z = D3) %>%
  dplyr::mutate(o = 1)

# Dim the values in the z column
dim_values(
  data = df,
  cols = c("x", "y", "z"),
  origin = c(0.5, 0.5, 0.5)
)

# Dim the values in the `o` column (all 1s)
# around the centroid
dim_values(
  data = df,
  cols = c("x", "y"),
  dim_col = "o",
  origin_fn = centroid
)

# Specify a custom dimming_fn and
# dim around the centroid
dim_values(
  data = df,
  cols = c("x", "y"),
  dim_col = "o",
  origin_fn = centroid,
  dimming_fn = function(x, d) {
    x * 1 / (2^(1 + d))
  }
)

#
# Dim cluster-wise
#

# Group-wise dimming
df_dimmed <- df %>%
  dplyr::group_by(.cluster) %>%
  dim_values(
    cols = c("x", "y"),
    dim_col = "o",
    origin_fn = centroid
  )

# Plot the dimmed data such that the alpha (opacity) is
# controlled by the dimming
# (Note: This works because the `o` column is 1 for all values)
if (has_ggplot){
  ggplot(
    data = df_dimmed,
    aes(x = x, y = y, alpha = o_dimmed, color = .cluster)
  ) +
    geom_point() +
    theme_minimal() +
    labs(x = "x", y = "y", color = "Cluster", alpha = "o_dimmed")
}

[Package rearrr version 0.3.4 Index]