R: Collect comments data from reddit threads

Collect.reddit {vosonSML}

R Documentation

Collect comments data from reddit threads

Description

Collects comments made by users on one or more specified reddit conversation threads and structures the data into a tibble with the class names "datasource" and "reddit".

Usage

## S3 method for class 'reddit'
Collect(
  credential,
  threadUrls,
  waitTime = c(3, 5),
  ua = getOption("HTTPUserAgent"),
  writeToFile = FALSE,
  verbose = FALSE,
  ...
)

collect_reddit_threads(
  threadUrls,
  waitTime = c(3, 5),
  ua = getOption("HTTPUserAgent"),
  writeToFile = FALSE,
  verbose = FALSE,
  ...
)

Arguments

`credential`	A `credential` object generated from `Authenticate` with class name `"reddit"`.
`threadUrls`	Character vector. Reddit thread urls to collect data from.
`waitTime`	Numeric vector. Range in seconds from which a random wait time is chosen between successive url collection requests. Minimum is 3 seconds. Default is `c(3, 5)`, giving a wait of between 3 and 5 seconds.
`ua`	Character string. Override User-Agent string to use in Reddit thread requests. Default is the `getOption("HTTPUserAgent")` value as set by vosonSML.
`writeToFile`	Logical. Write collected data to file. Default is `FALSE`.
`verbose`	Logical. Output additional information about the data collection. Default is `FALSE`.
`...`	Additional parameters passed to function. Not used in this method.
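
As an illustration of the `waitTime` argument described above, the following is a minimal sketch of how a random wait could be drawn from a range while respecting the 3-second minimum. The helper `pick_wait` and its `floor_secs` parameter are hypothetical and are not part of vosonSML; the package's internal selection may differ (for example, it may use whole seconds).

```r
# Hypothetical helper, not part of vosonSML: draws a random wait (in seconds)
# from a c(min, max) range, raising the range to the documented 3-second minimum.
pick_wait <- function(waitTime = c(3, 5), floor_secs = 3) {
  stopifnot(is.numeric(waitTime), length(waitTime) >= 1, !anyNA(waitTime))
  lo <- max(floor_secs, min(waitTime))  # lower bound, never below the minimum
  hi <- max(lo, max(waitTime))          # upper bound, never below the lower bound
  if (lo == hi) return(lo)              # degenerate range: fixed wait
  runif(1, min = lo, max = hi)          # uniform draw within the range
}

set.seed(1)
waits <- vapply(1:3, function(i) pick_wait(c(3, 5)), numeric(1))
short_range <- pick_wait(c(1, 2))  # range below the minimum is raised to 3
```

Under these assumptions, every value in `waits` lies between 3 and 5 seconds, and a range entirely below the minimum collapses to a fixed 3-second wait.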

Value

A tibble object with class names "datasource" and "reddit".

Note

The reddit web endpoint used for collection has a maximum limit of 500 comments per thread url.
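
Because of this limit, it can be useful to check whether any collected thread has reached 500 comments, which may indicate that its comments were truncated. The sketch below uses a toy data frame in place of real collected data; the `thread_id` column name is an assumption about the returned tibble and should be checked against the actual output.

```r
# Toy stand-in for collected data; `thread_id` is an assumed column name.
comments <- data.frame(
  thread_id = c(rep("aaa111", 500), rep("bbb222", 42)),
  stringsAsFactors = FALSE
)

# Count comments per thread and flag threads at or above the documented limit.
limit <- 500
counts <- table(comments$thread_id)
possibly_truncated <- names(counts)[counts >= limit]
```

Here `possibly_truncated` holds the ids of threads whose comment count reached the limit. A count at the limit is only a hint that comments may be missing, not proof.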

Examples

## Not run: 
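# create the reddit credential object that is passed to Collect
redditAuth <- Authenticate("reddit")

# reddit thread url to collect comments from
threadUrls <- c("https://www.reddit.com/r/xxxxxx/comments/xxxxxx/x_xxxx_xxxxxxxxx/")

# collect the thread comments and also write the collected data to file
redditData <- redditAuth |>
  Collect(threadUrls = threadUrls, writeToFile = TRUE)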

## End(Not run)

[Package vosonSML version 0.32.7 Index]