str_split {stringr}    R Documentation

Split up a string into pieces

Description

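This family of functions provides various ways of splitting a string up into pieces. These two functions return a character vector:

  • str_split_1() takes a single string and splits it into pieces, returning a single character vector.

  • str_split_i() splits each string in a character vector into pieces and extracts the ith piece, returning a character vector.

These two functions return a more complex object:

  • str_split() splits each string in a character vector into a varying number of pieces, returning a list of character vectors.

  • str_split_fixed() splits each string in a character vector into a fixed number of pieces, returning a character matrix.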

Usage

str_split(string, pattern, n = Inf, simplify = FALSE)

str_split_1(string, pattern)

str_split_fixed(string, pattern, n)

str_split_i(string, pattern, i)

Arguments

string

Input vector. Either a character vector, or something coercible to one.

pattern

Pattern to look for.

The default interpretation is a regular expression, as described in vignette("regular-expressions"). Use regex() for finer control of the matching behaviour.

Match a fixed string (i.e. by comparing only bytes), using fixed(). This is fast, but approximate. Generally, for matching human text, you'll want coll() which respects character matching rules for the specified locale.

Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").
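As a minimal sketch of how the choice of pattern engine changes the split, the following uses made-up input strings; the variable names are illustrative only.

```r
library(stringr)

x <- "a.b.c"

# Default: the pattern is a regular expression, so a literal "." must be escaped.
by_regex <- str_split_1(x, "\\.")

# fixed() compares the pattern literally, so no escaping is needed.
by_fixed <- str_split_1(x, fixed("."))

# boundary("word") splits at word boundaries rather than at a matched pattern.
by_word <- str_split_1("one two, three", boundary("word"))

# An empty pattern splits the string into individual characters.
by_char <- str_split_1("abc", "")

print(by_regex)
print(by_fixed)
print(by_word)
print(by_char)
```

coll() is used the same way as fixed() when locale-aware comparison matters.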

n

Maximum number of pieces to return. Default (Inf) uses all possible split positions.

For str_split(), this determines the maximum length of each element of the output. For str_split_fixed(), this determines the number of columns in the output; if an input is too short, the result will be padded with "".
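The effect of n can be sketched on a small, made-up input. Once n pieces have been produced, the unsplit remainder stays in the last piece.

```r
library(stringr)

# With n = 2 only the first match is used; the rest of the string is kept whole.
two_pieces <- str_split("a-b-c-d", "-", n = 2)[[1]]
print(two_pieces)
#> [1] "a"     "b-c-d"

# str_split_fixed() always returns n columns; short inputs are padded with "".
fixed_cols <- str_split_fixed(c("a-b-c-d", "a-b"), "-", n = 3)
print(fixed_cols)
```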

simplify

A boolean.

  • FALSE (the default): returns a list of character vectors.

  • TRUE: returns a character matrix.
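A short sketch of the two return shapes, using a made-up input vector; the names as_list and as_matrix are illustrative.

```r
library(stringr)

x <- c("a-b-c", "d-e")

# simplify = FALSE (the default): one character vector per input string.
as_list <- str_split(x, "-")
str(as_list)

# simplify = TRUE: one row per input string; shorter rows are padded with "".
as_matrix <- str_split(x, "-", simplify = TRUE)
print(as_matrix)
```

The list form preserves the true number of pieces per string, while the matrix form is convenient when every row is expected to have the same layout.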

i

Position of the piece to return. Use a negative value to count from the right-hand side.
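A minimal sketch of positive and negative positions on a made-up input. The last call asks for a piece that does not exist in the second string; returning NA in that case is not stated in the text above and is noted here as observed behaviour, so treat it as an assumption.

```r
library(stringr)

x <- c("a-b-c", "d-e")

# First piece of each string.
first_piece <- str_split_i(x, "-", 1)

# Last piece of each string, counting from the right-hand side.
last_piece <- str_split_i(x, "-", -1)

# Third piece: the second string has only two pieces (assumed to give NA).
third_piece <- str_split_i(x, "-", 3)

print(first_piece)
print(last_piece)
print(third_piece)
```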

Value

  • str_split_1(): a character vector.

  • str_split(): a list of character vectors, one per input string, or a character matrix when simplify = TRUE.

  • str_split_fixed(): a character matrix with n columns and one row per input string.

  • str_split_i(): a character vector with one element per input string.

See Also

stri_split() for the underlying implementation.

Examples

fruits <- c(
  "apples and oranges and pears and bananas",
  "pineapples and mangos and guavas"
)

str_split(fruits, " and ")
str_split(fruits, " and ", simplify = TRUE)

# If you want to split a single string, use `str_split_1`
str_split_1(fruits[[1]], " and ")

# Specify n to restrict the number of possible matches
str_split(fruits, " and ", n = 3)
str_split(fruits, " and ", n = 2)
# If n is greater than the number of pieces, no padding occurs
str_split(fruits, " and ", n = 5)

# Use str_split_fixed() to return a character matrix
str_split_fixed(fruits, " and ", 3)
str_split_fixed(fruits, " and ", 4)

# str_split_i extracts only a single piece from a string
str_split_i(fruits, " and ", 1)
str_split_i(fruits, " and ", 4)
# Use a negative number to select from the end
str_split_i(fruits, " and ", -1)

[Package stringr version 1.5.1 Index]