txt_count_words {doc2vec}    R Documentation

Count the number of spaces occurring in text

Description

The C++ doc2vec functionality in this package assumes that words are separated by a space or a tab character and that each document contains fewer than 1000 words.
This function determines how many words there are in each element of a character vector by counting the occurrences of the space or tab character.
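
The counting idea can be sketched in base R. The helper count_separators below is hypothetical and is not the package's implementation; it only illustrates counting matches of the separator pattern with gregexpr, and how such a count relates to a word count when every gap between words holds exactly one separator.

```r
# Hypothetical helper, not part of doc2vec: count matches of `pattern`
# in each element of a character vector using base R's gregexpr().
count_separators <- function(x, pattern = "[ \t]", ...) {
  matches <- gregexpr(pattern, x, ...)
  vapply(matches, function(m) {
    # gregexpr() yields NA for NA input and -1 when nothing matches
    if (is.na(m[1]) || m[1] == -1L) 0L else length(m)
  }, integer(1))
}

x <- c("Count me in.007", "this is a set  of words", "single")
n_sep <- count_separators(x)
n_sep        # 2 6 0

# If every gap holds exactly one space or tab, words = separators + 1
n_sep + 1L   # 3 7 1
```

In this sketch the second element yields 7 although it holds 6 words, because the double space is counted as two separators. Collapsing repeated whitespace beforehand avoids that kind of overcount.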

Usage

txt_count_words(x, pattern = "[ \t]", ...)

Arguments

x

a character vector with text

pattern

a text pattern whose occurrences in x are counted. Defaults to a pattern matching a single space or tab.

...

other arguments, passed on to gregexpr
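
As an illustration of what the pattern and the extra gregexpr arguments control, the base R sketch below calls gregexpr directly rather than txt_count_words. The helper n_matches is hypothetical and exists only to turn a gregexpr result into a count.

```r
# Base R only: how the pattern and extra arguments change what gregexpr() matches.
x <- "a  b\tc"   # "a", two spaces, "b", one tab, "c"

# Hypothetical helper: number of matches in a single gregexpr() result;
# gregexpr() returns -1 when there is no match.
n_matches <- function(m) if (m[1] == -1L) 0L else length(m)

default_n <- n_matches(gregexpr("[ \t]", x)[[1]])             # every space or tab
fixed_n   <- n_matches(gregexpr(" ", x, fixed = TRUE)[[1]])   # literal spaces only
runs_n    <- n_matches(gregexpr("\\s+", x, perl = TRUE)[[1]]) # runs of whitespace

c(default = default_n, fixed = fixed_n, runs = runs_n)
```

Here the default pattern finds 3 matches, the literal space finds 2, and the run-based pattern finds 2. Arguments such as fixed or perl would be supplied through ... in the same way.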

Value

an integer vector of the same length as x, indicating how many times the pattern occurs in each element of x

Examples

x <- c("Count me in.007", "this is a set  of words",
       "more\texamples tabs-and-spaces.only", NA)
txt_count_words(x)

[Package doc2vec version 0.2.0 Index]