centralities {mmb}    R Documentation

Given a neighborhood of data, computes the similarity of each sample in the neighborhood to that neighborhood.

Description

Takes a data.frame of samples and builds a PDF/PMF or an ECDF for each of the selected features. Then, for each sample, it computes the product of the per-feature probabilities. The result is a vector that holds one probability for each sample. That probability (or relative likelihood) represents the vicinity (or similarity) of the sample to the given neighborhood.
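
The idea of multiplying per-feature probabilities can be sketched in base R. This is only an illustration under stated assumptions and not the mmb implementation: the function name centrality_sketch is hypothetical, a kernel density estimate stands in for the empirical PDF of numeric features, and relative frequencies stand in for the PMF of discrete features.

```r
# Hypothetical sketch in base R; NOT the mmb implementation.
centrality_sketch <- function(df, features) {
  # Start with a neutral product of 1 for every row.
  result <- rep(1, nrow(df))
  for (f in features) {
    x <- df[[f]]
    if (is.numeric(x)) {
      # Continuous feature: relative likelihood from a kernel density
      # estimate, interpolated at each observed value.
      d <- stats::density(x)
      p <- stats::approx(d$x, d$y, xout = x)$y
    } else {
      # Discrete feature: relative frequency (empirical PMF) of each value.
      freq <- table(x) / length(x)
      p <- as.numeric(freq[as.character(x)])
    }
    # Multiply this feature's values into the running product.
    result <- result * p
  }
  # Name each value after the row it belongs to.
  stats::setNames(result, rownames(df))
}

cent_sketch <- centrality_sketch(iris, c("Sepal.Width", "Species"))
head(sort(cent_sketch, decreasing = TRUE))
```

Rows whose values lie in dense regions of every selected feature receive the largest products, which is the sense in which the result measures closeness to the neighborhood.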

Usage

centralities(
  dfNeighborhood,
  selectedFeatureNames = c(),
  shiftAmount = 0.1,
  doEcdf = FALSE,
  ecdfMinusOne = FALSE
)

Arguments

dfNeighborhood

data.frame that holds all rows that make up the neighborhood.

selectedFeatureNames

vector of names of features to use. The centrality of each row in the neighborhood is calculated based on the selected features.

shiftAmount

numeric DEFAULT 0.1 optional amount to shift each feature's probability by. This is useful when the centrality does not need to be an actual probability and many features are selected, because the product of many small probabilities quickly approaches zero. To obtain actual probabilities, this needs to be 0, and you must use the ECDF.

doEcdf

boolean DEFAULT FALSE whether to use the ECDF instead of the empirical PDF (EPDF) to find the likelihood of continuous values.

ecdfMinusOne

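boolean DEFAULT FALSE only has an effect if the ECDF is used. If TRUE, uses 1 minus the ECDF to find the probability of a continuous value, that is, the proportion of observed values greater than the given value rather than less than or equal to it. Whether this is useful depends on how you intend to interpret the resulting probability.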
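
The effect of a shift and of the two ECDF variants can be illustrated in base R. This is a sketch under assumptions, not the package's code: the probabilities in p are made up, and it is assumed here that the shift is simply added to each feature's probability before the product is taken.

```r
# Hypothetical per-feature probabilities for a single sample.
p <- c(0.40, 0.25, 0.00, 0.10)
shift <- 0.1  # plays the role of shiftAmount; an additive shift is an assumption

unshifted <- prod(p)        # exactly 0, because one factor is 0
shifted <- prod(p + shift)  # strictly positive, but no longer a probability

# ECDF-based probabilities for a continuous feature.
x <- iris$Sepal.Width
Fn <- stats::ecdf(x)
p_le <- Fn(x)      # proportion of observations <= each value
p_gt <- 1 - Fn(x)  # proportion of observations > each value

range(p_le)  # the upper end is 1 (largest observation)
range(p_gt)  # the lower end is 0 (largest observation)
```

In this sketch a single zero factor erases the information carried by all other features, and the largest observation receives a value of exactly 0 under 1 minus the ECDF, so the two settings interact.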

Value

a named vector, where the names correspond to the rownames of the rows in the given neighborhood, and each value is the centrality of the corresponding row.
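
Because the names carry the row names, the result can be ranked or mapped back to the original rows with base R. The vector cent_demo below is made-up data standing in for a real result.

```r
# Made-up centralities, named by (hypothetical) row names.
cent_demo <- c("12" = 0.02, "57" = 0.31, "103" = 0.17)

# Row name of the most central sample.
most_central <- names(which.max(cent_demo))

# All samples, ordered from most to least central.
ranked <- sort(cent_demo, decreasing = TRUE)
names(ranked)
```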

Author(s)

Sebastian Hönel <sebastian.honel@lnu.se>

Examples

# Create a neighborhood:
nbh <- mmb::neighborhood(df = iris, features = mmb::createFeatureForBayes(
  name = "Sepal.Width", value = mean(iris$Sepal.Width)))

cent <- mmb::centralities(dfNeighborhood = nbh, shiftAmount = 0.1,
  doEcdf = TRUE, ecdfMinusOne = TRUE)

# Plot each sample's centrality against its row name to get an idea of the
# centralities in the neighborhood (the row names of iris are numeric strings):
plot(x = as.numeric(names(cent)), y = cent)

[Package mmb version 0.13.3 Index]