conditionalDataMin {mmb} | R Documentation |
Segment data according to one or more random variables.
Description
Takes a data.frame and segments it according to the selected variables. Only rows that satisfy all conditions are kept. Both discrete and continuous variables are supported. NA, NaN and NULL are supported by using is.na, is.nan and is.null as comparators.
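How a comparator might be chosen from the feature's value can be sketched in base R. This is an illustration only and not the package's code: the helper name match_rows is hypothetical, and treating NA and NaN as distinct cases is an assumption. NULL is left out because an ordinary data.frame column cannot hold NULL entries.

```r
# Hypothetical helper (not part of mmb): returns a logical vector marking
# the entries of `column` that match `value`.
match_rows <- function(column, value) {
  if (is.nan(value)) {
    # Check is.nan first, because is.na(NaN) is also TRUE in R.
    is.nan(column)
  } else if (is.na(value)) {
    # NA but not NaN (assumption: the two are treated as different values).
    is.na(column) & !is.nan(column)
  } else {
    # Ordinary value: plain equality, with missing entries counted as
    # non-matches instead of propagating NA.
    !is.na(column) & column == value
  }
}

x <- c(1, NA, NaN, 2, 1)
match_rows(x, 1)    # matches the two entries equal to 1
match_rows(x, NA)   # matches only the NA entry
match_rows(x, NaN)  # matches only the NaN entry
```

The order of the checks matters: testing is.na before is.nan would send NaN values down the NA branch.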
Usage
conditionalDataMin(
df,
features,
selectedFeatureNames = c(),
retainMinValues = 1
)
Arguments
df |
data.frame with data to segment. If it contains fewer rows than, or
as many rows as, specified by |
features |
data.frame of bayes-features that are used to segment.
Each feature's value is used to segment the data, and the features are
applied in the order given by |
selectedFeatureNames |
default |
retainMinValues |
default 1. The minimum number of rows to retain. Filtering the data by the selected features may quickly reduce the number of remaining rows, so this value can be used as an early stopping criterion. Note that filtering is done variable by variable, and the number of remaining rows is evaluated after each segmenting step. If a step leaves fewer rows than this threshold, the result from the previous step is returned. |
Value
data.frame that is segmented according to the selected variables and the minimum number of rows to retain.
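The stepwise filtering and early stopping described for retainMinValues can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the implementation used by mmb: the function segment_with_minimum, its arguments, and the idea of passing one predicate function per column are all hypothetical, as is the small example data.frame.

```r
# Hypothetical sketch (not mmb's code): filter `df` one column at a time and
# stop as soon as a step would leave fewer than `retain_min` rows.
# `conditions` is a named list of predicate functions, one per column name.
segment_with_minimum <- function(df, conditions, retain_min = 1) {
  for (name in names(conditions)) {
    keep <- conditions[[name]](df[[name]])
    keep[is.na(keep)] <- FALSE            # missing comparisons count as "drop"
    candidate <- df[keep, , drop = FALSE]
    if (nrow(candidate) < retain_min) {
      return(df)                          # threshold undercut: previous result
    }
    df <- candidate                       # accept this step and continue
  }
  df
}

# Made-up example data, for illustration only.
example_df <- data.frame(
  group = c(1, 1, 1, 2, 2),
  label = c("x", "x", "y", "x", "y"),
  flag  = c(TRUE, FALSE, TRUE, TRUE, TRUE)
)
conditions <- list(
  group = function(v) v == 1,
  label = function(v) v == "x",
  flag  = function(v) v
)

res_min1 <- segment_with_minimum(example_df, conditions, retain_min = 1)
res_min2 <- segment_with_minimum(example_df, conditions, retain_min = 2)
res_min3 <- segment_with_minimum(example_df, conditions, retain_min = 3)
```

With retain_min = 1 all three conditions are applied and a single row remains. With retain_min = 2 the last condition would leave only one row, so the two rows from the previous step are returned. With retain_min = 3 filtering already stops after the first condition.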
Author(s)
Sebastian Hönel sebastian.honel@lnu.se
See Also
getValueKeyOfBayesFeatures()
Examples
# Create one feature per variable, using the column mean as its value
feat1 <- mmb::createFeatureForBayes(
  name = "Petal.Length", value = mean(iris$Petal.Length))
feat2 <- mmb::createFeatureForBayes(
  name = "Petal.Width", value = mean(iris$Petal.Width))
feats <- rbind(feat1, feat2)

# Segment iris by both features, retaining at least one row
data <- mmb::conditionalDataMin(df = iris, features = feats,
  selectedFeatureNames = feats$name, retainMinValues = 1)