detect_groups {CooRTweet}    R Documentation

detect_groups

Description

Performs the initial stage of detecting coordinated behavior: it identifies pairs of accounts that share the same objects within a time_window. See Details.

Usage

detect_groups(
  x,
  time_window = 10,
  min_participation = 2,
  remove_loops = TRUE,
  ...
)

Arguments

x

a data.table with the columns: object_id (uniquely identifies coordinated content), account_id (unique IDs for accounts), content_id (ID of account-generated content), timestamp_share (integer). See also reshape_tweets and prep_data.
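To make the expected input shape concrete, here is a minimal sketch built with a base R data.frame. All IDs and timestamps are invented for illustration, and the column check at the end is a hypothetical sanity check, not something the package is known to run.

```r
# Hypothetical toy input; every value below is made up for illustration.
x <- data.frame(
  object_id       = c("url_1", "url_1", "url_1", "url_2", "url_2"),
  account_id      = c("acc_A", "acc_B", "acc_C", "acc_A", "acc_B"),
  content_id      = c("post_1", "post_2", "post_3", "post_4", "post_5"),
  timestamp_share = c(100L, 104L, 300L, 500L, 505L),
  stringsAsFactors = FALSE
)

# detect_groups expects a data.table; if data.table is installed,
# one way to convert would be:
# x <- data.table::as.data.table(x)

# Illustrative sanity check: required columns present, timestamp is integer
required <- c("object_id", "account_id", "content_id", "timestamp_share")
has_required <- all(required %in% names(x))
timestamp_is_integer <- is.integer(x$timestamp_share)
```

Each row describes one share action: which account shared which object, through which piece of content, and when.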

time_window

the number of seconds within which shares of the same object are considered coordinated (defaults to 10 seconds).

min_participation

The minimum number of actions required for an account to be included in subsequent analysis (defaults to 2). This ensures that only accounts with a minimum level of activity in the original dataset are retained. It is important to distinguish this from the frequency of repeated interactions an account has with another specific account, which is represented by the edge weight. The edge weight parameter is used in the generate_coordinated_network function as a concluding step in identifying coordinated behavior.
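The activity threshold described above can be sketched in base R as follows. This is an illustration only: it assumes that one row equals one action, which may differ from how the package counts actions internally, and the data are invented.

```r
# Hypothetical toy data: one row per share action
shares <- data.frame(
  account_id = c("acc_A", "acc_A", "acc_B", "acc_C", "acc_C", "acc_C"),
  content_id = paste0("post_", 1:6),
  stringsAsFactors = FALSE
)
min_participation <- 2

# Count actions per account (assumption: one row = one action)
n_actions <- table(shares$account_id)
active_accounts <- names(n_actions)[as.vector(n_actions) >= min_participation]

# Keep only rows from accounts that meet the threshold
shares_filtered <- shares[shares$account_id %in% active_accounts, ]
```

Here acc_B has a single action and is dropped. Note that this filter looks only at overall activity, not at how often two specific accounts co-share, which is what the edge weight captures.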

remove_loops

Should loops (shares of the same object made by the same account within the time window) be removed? (defaults to TRUE).
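A loop, in the sense used here, is a pair in which both shares come from the same account. The sketch below shows what removing such pairs could look like on a hypothetical pair table with invented values; it is not the package's implementation.

```r
# Hypothetical pair table; the second row is a loop (acc_A paired with itself)
pairs <- data.frame(
  object_id    = c("url_1", "url_2"),
  account_id   = c("acc_A", "acc_A"),
  account_id_y = c("acc_B", "acc_A"),
  timedelta    = c(4, 5),
  stringsAsFactors = FALSE
)
remove_loops <- TRUE

# Drop pairs where the older and the newer share come from the same account
if (remove_loops) {
  pairs <- pairs[pairs$account_id != pairs$account_id_y, ]
}
```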

...

keyword arguments for backwards compatibility.

Details

This function performs the initial stage of detecting coordinated behavior by identifying accounts that share identical objects within the same time window; it is preliminary to the network analysis conducted with the generate_coordinated_network function. detect_groups groups the data by object_id (which uniquely identifies content) and calculates the time differences between all content_id (IDs of account-generated content) within each group. It then filters out all pairs of content_id whose time difference is greater than time_window (in seconds) and returns a data.table with the IDs of the remaining, coordinated contents.

The object_id can be, for example, a hashtag, the ID of a tweet being retweeted, or a URL being shared. For Twitter data, it is best to use reshape_tweets.
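The grouping and filtering logic described here can be approximated in base R with a self-join on object_id. This is a rough sketch under stated assumptions, not the package's actual data.table implementation: the data are invented, content_id is assumed to be unique, and treating a difference exactly equal to time_window as coordinated is also an assumption.

```r
# Hypothetical toy input; all values are made up
x <- data.frame(
  object_id       = c("url_1", "url_1", "url_1", "url_2", "url_2"),
  account_id      = c("acc_A", "acc_B", "acc_C", "acc_A", "acc_B"),
  content_id      = c("post_1", "post_2", "post_3", "post_4", "post_5"),
  timestamp_share = c(100L, 104L, 300L, 500L, 505L),
  stringsAsFactors = FALSE
)
time_window <- 10

# Copy of x with "_y" appended to every column except the join key
y <- x
not_key <- names(y) != "object_id"
names(y)[not_key] <- paste0(names(y)[not_key], "_y")

# Self-join: every combination of two shares of the same object
pairs <- merge(x, y, by = "object_id")

# Keep each pair once, older share first; ties are broken by content_id,
# which also drops a share paired with itself
older_first <- pairs$timestamp_share < pairs$timestamp_share_y |
  (pairs$timestamp_share == pairs$timestamp_share_y &
     pairs$content_id < pairs$content_id_y)
pairs <- pairs[older_first, ]

# Time difference in seconds between the newer and the older share
pairs$timedelta <- pairs$timestamp_share_y - pairs$timestamp_share

# Keep only pairs within the window (inclusive boundary is an assumption)
coordinated <- pairs[
  pairs$timedelta <= time_window,
  c("object_id", "account_id", "account_id_y",
    "content_id", "content_id_y", "timedelta")
]
```

In this toy data, acc_A and acc_B share url_1 four seconds apart and url_2 five seconds apart, so both pairs are kept, while acc_C's share of url_1 falls far outside the window and is dropped.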

Value

a data.table with the IDs of coordinated contents. Columns: object_id, account_id, account_id_y, content_id, content_id_y, timedelta. The account_id and content_id columns represent the "older" data points; account_id_y and content_id_y represent the "newer" data points. For example, if account A retweets account B, then account A's content is newer (i.e., account A appears in account_id_y).


[Package CooRTweet version 2.0.2 Index]