count_tokens {TheOpenAIR} | R Documentation |
Count the number of tokens in a text string
Description
This function takes a file path, URL or character string as input and returns the number of tokens in the text. Tokens are defined as words and/or special characters.
Usage
count_tokens(text)
Arguments
text |
A file path, URL or character string representing the text to be tokenized. |
Value
An integer representing the number of tokens in the text.
Examples
## Not run:
# Example 1: File path
test_file_path <- tempfile()
writeLines("This is a test.", test_file_path)
expect_equal(count_tokens(test_file_path), 5)
file.remove(test_file_path)
# Example 2: URL
url <- "https://www.gutenberg.org/files/2701/2701-0.txt"
count_tokens(url)
# Example 3: Character string
text <- "This is a test string."
count_tokens(text)
## End(Not run)
[Package TheOpenAIR version 0.1.0 Index]