xtfrm2 {stringx}    R Documentation

Sort Strings

Description

The sort method for objects of class character (sort.character) uses the locale-sensitive Unicode collation algorithm to arrange strings in a vector with respect to a chosen lexicographic order.

xtfrm2 and [DEPRECATED] xtfrm generate an integer vector that sorts in the same way as the input, and hence can be used in conjunction with order or rank.

Usage

xtfrm2(x, ...)

## Default S3 method:
xtfrm2(x, ...)

## S3 method for class 'character'
xtfrm2(
  x,
  ...,
  locale = NULL,
  strength = 3L,
  alternate_shifted = FALSE,
  french = FALSE,
  uppercase_first = NA,
  case_level = FALSE,
  normalisation = FALSE,
  numeric = FALSE
)

xtfrm(x)

## Default S3 method:
xtfrm(x)

## S3 method for class 'character'
xtfrm(x)

## S3 method for class 'character'
sort(
  x,
  ...,
  decreasing = FALSE,
  na.last = NA,
  locale = NULL,
  strength = 3L,
  alternate_shifted = FALSE,
  french = FALSE,
  uppercase_first = NA,
  case_level = FALSE,
  normalisation = FALSE,
  numeric = FALSE
)

Arguments

x

character vector whose elements are to be sorted

...

further arguments passed to other methods

locale

NULL or "" for the default locale (see stri_locale_get), or a single string with a locale identifier; see stri_locale_list

strength

see stri_opts_collator

alternate_shifted

see stri_opts_collator

french

see stri_opts_collator

uppercase_first

see stri_opts_collator

case_level

see stri_opts_collator

normalisation

see stri_opts_collator

numeric

see stri_opts_collator

decreasing

single logical value; if FALSE, the ordering is nondecreasing (weakly increasing)

na.last

single logical value; if TRUE, then missing values are placed at the end; if FALSE, they are put at the beginning; if NA, then they are removed from the output altogether.
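
As a minimal sketch of how the arguments above interact, assuming stringx is installed and attached, the following applies sort with each na.last setting and calls xtfrm2 with strength=2L. The input vectors and variable names are hypothetical, and the claim that strength=2L ignores case differences follows the ICU collator conventions described in stri_opts_collator rather than anything stated on this page.

```r
library(stringx)  # assumption: stringx is installed and attached

y <- c("b", NA, "a", "c")  # hypothetical input with one missing value

s_drop  <- sort(y)                   # na.last=NA (default): NA is dropped
s_last  <- sort(y, na.last=TRUE)     # NA goes to the end
s_first <- sort(y, na.last=FALSE)    # NA goes to the beginning
s_desc  <- sort(y, decreasing=TRUE)  # nonincreasing order, NA dropped

# strength=2L asks the collator to ignore case differences
# (assumption based on stri_opts_collator), so "a" and "A"
# should receive the same sort key
k <- xtfrm2(c("a", "A", "b"), locale="en_US", strength=2L)

print(list(s_drop, s_last, s_first, s_desc, k))
```

Because na.last defaults to NA here, code that relies on the output having the same length as the input should set na.last explicitly.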

Details

What 'xtfrm' stands for the current author does not know, but would appreciate someone's enlightening him.

Value

sort.character returns a character vector, with only the names attribute preserved. Note that the output vector may be shorter than the input one, because missing values are removed when na.last is NA (the default).

xtfrm2.character and xtfrm.character return an integer vector; most attributes are preserved.
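
The return values described above can be inspected with a short sketch, assuming stringx is installed and attached. The named vector z and the variable names are hypothetical; the code only checks the properties stated in this section (names kept, shorter output when missing values are dropped, integer sort keys).

```r
library(stringx)  # assumption: stringx is installed and attached

z <- c(first="b", second=NA, third="a")  # hypothetical named input

s <- sort(z)                 # default na.last=NA drops the missing value
print(names(s))              # names travel with their elements
print(length(s) < length(z)) # output is shorter than the input here

r <- xtfrm2(z)               # integer sort keys, one per input element
print(is.integer(r))
print(length(r) == length(z))
```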

Differences from Base R

Replacements for the default S3 methods sort and xtfrm for character vectors implemented with stri_sort and stri_rank.

Author(s)

Marek Gagolewski

See Also

The official online manual of stringx at https://stringx.gagolewski.com/

Related function(s): strcoll

Examples

x <- c("a1", "a100", "a101", "a1000", "a10", "a10", "a11", "a99", "a10", "a1")
base::sort.default(x)   # lexicographic sort
sort(x, numeric=TRUE)   # calls stringx:::sort.character
xtfrm2(x, numeric=TRUE)  # calls stringx:::xtfrm2.character

rank(xtfrm2(x, numeric=TRUE), ties.method="average")  # ranks with averaged ties
order(xtfrm2(x, numeric=TRUE))    # ordering permutation
x[order(xtfrm2(x, numeric=TRUE))] # equivalent to sort()

# order a data frame w.r.t. decreasing ids and increasing vals
d <- data.frame(vals=round(runif(length(x)), 1), ids=x)
d[order(-xtfrm2(d[["ids"]], numeric=TRUE), d[["vals"]]), ]



[Package stringx version 0.2.8 Index]