occurrencesLessThan {inverseRegex}    R Documentation

Identifies Infrequent inverseRegex Patterns in an R Object.


Description

Calls inverseRegex on the input object and identifies values whose regex pattern occurs infrequently.


Usage

occurrencesLessThan(x, fraction = 0.05, n = NULL, ...)



Arguments

x
Object to analyse for infrequent regex patterns.


fraction
Fraction of the R object size used as the threshold: regex patterns whose number of occurrences is less than or equal to this fraction of the size are identified. For a vector the fraction is multiplied by the length of the object; for a matrix, by the total number of entries; and for a data frame or tibble, by the number of rows. Defaults to 0.05.
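As an illustration of the scaling rule for fraction, the following base R sketch computes the count threshold it implies for each kind of object. The helper thresholdFor is hypothetical and not part of inverseRegex; it only mirrors the rule stated in the description of this argument.

```r
# Hypothetical helper (not part of inverseRegex): the count threshold
# implied by `fraction`, following the rule described for this argument.
thresholdFor <- function(x, fraction = 0.05) {
  size <- if (is.data.frame(x)) {
    nrow(x)      # data frames and tibbles: number of rows
  } else {
    length(x)    # vectors: length; matrices: total number of entries
  }
  fraction * size
}

thresholdFor(c(LETTERS, 1))            # vector with 27 elements
thresholdFor(matrix(1:20, nrow = 4))   # matrix with 20 entries
thresholdFor(iris)                     # data frame with 150 rows
```

Under this reading, a pattern in iris would need to occur in at most 7.5 rows, that is 7 or fewer, to count as infrequent at the default fraction.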


n
Alternative to the fraction argument: a literal number of occurrences to search for. Defaults to NULL, in which case fraction is used.


...
Other arguments to be passed to inverseRegex.


Details

This function is essentially a wrapper around calling table() on the return value of inverseRegex. It can be used to find the indices of values whose regex pattern differs from that of most other values in the R object.
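The idea of tabulating patterns and flagging the rare ones can be sketched in base R. The pattern vector below is made up for illustration rather than produced by inverseRegex, and flagInfrequent is a hypothetical helper, not the package's implementation.

```r
# Made-up stand-in for the output of inverseRegex(): one pattern
# string per element of the original vector.
patterns <- c(rep("[[:upper:]]", 26), "[[:digit:]]")

# Hypothetical helper: flag entries whose pattern occurs at most
# `threshold` times.
flagInfrequent <- function(patterns, threshold) {
  counts <- table(patterns)                    # occurrences of each pattern
  rare <- names(counts)[counts <= threshold]   # patterns at or below the threshold
  patterns %in% rare
}

flags <- flagInfrequent(patterns, threshold = 0.05 * length(patterns))
which(flags)   # position of the single digit-pattern entry
```

Here the threshold is 0.05 * 27 = 1.35, so only the pattern that occurs once is flagged.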


Value

A collection of logical values with TRUE indicating entries with an infrequent regex pattern. The class of the return value will depend on the input object; matrices, data frames, and tibbles will be returned in kind; all others are returned as vectors.


Note

NA values are not considered and will need to be identified separately.
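One way to cover NA values alongside the pattern check is a separate is.na() pass. The sketch below uses base R only; the flags vector is a made-up stand-in for the logical output of occurrencesLessThan, and combining the two checks with | is a suggestion rather than something the package does.

```r
x <- c("a", "b", NA, "c", "1")

# Made-up stand-in for occurrencesLessThan(x): same length as x,
# TRUE marks an entry with an infrequent pattern.
flags <- c(FALSE, FALSE, FALSE, FALSE, TRUE)

# Identify missing values separately, then combine both checks.
isMissing <- is.na(x)
needsReview <- flags | isMissing
which(needsReview)
```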


Author(s)

Jasper Watson

See Also

inverseRegex, regex


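Examples

# A vector of 26 letters and a single digit
occurrencesLessThan(c(LETTERS, 1))

# Introduce a typo into one Species value of iris
x <- iris
x$Species <- as.character(x$Species)
x[27, 'Species'] <- 'set0sa'
# Row indices of the flagged entries in each column
apply(occurrencesLessThan(x), 2, which)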

[Package inverseRegex version 0.1.1 Index]