split_sentence_token {textshape}    R Documentation

Split Sentences & Tokens

Description

Split text into sentences and then into tokens.

Usage

split_sentence_token(x, ...)

## Default S3 method:
split_sentence_token(x, lower = TRUE, ...)

## S3 method for class 'data.frame'
split_sentence_token(x, text.var = TRUE, lower = TRUE, ...)

Arguments

x

A data.frame or character vector with sentences.

lower

Logical. If TRUE, the words are converted to lower case.

text.var

The name of the text variable. If TRUE, split_sentence_token tries to detect the column with sentences.

...

Ignored.

Value

Returns a list of vectors of sentences or an expanded data.frame with sentences split apart.

Examples

(x <- c(paste0(
    "Mr. Brown comes! He says hello. i give him coffee.  i will ",
    "go at 5 p. m. eastern time.  Or somewhere in between!go there"
),
paste0(
    "Marvin K. Mooney Will You Please Go Now!", "The time has come.",
    "The time has come. The time is now. Just go. Go. GO!",
    "I don't care how."
)))
split_sentence_token(x)

data(DATA)
split_sentence_token(DATA)
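
## A hedged sketch of the 'lower' and 'text.var' arguments described above.
## Assumptions: the textshape package is installed, 'y' is a made-up
## illustration vector, and DATA holds its text in a column named 'state'.
library(textshape)

## Keep the original capitalization instead of lower-casing the words
y <- c("Mr. Brown comes! He says hello.", "Just go. Go. GO!")
split_sentence_token(y, lower = FALSE)

## Name the text column explicitly rather than relying on auto-detection
data(DATA)
split_sentence_token(DATA, text.var = "state", lower = FALSE)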

## Not run: 
## Kevin S. Dias' sentence boundary disambiguation test set
data(golden_rules)
library(magrittr)

golden_rules %$%
    split_sentence_token(Text)

## End(Not run)

[Package textshape version 1.7.5 Index]