seqcheck {EloRating} | R Documentation |
runs raw data diagnostics for Elo rating
Description
runs some diagnostics on the data supplied to elo.seq, to check whether elo.seq will run without errors
Usage
seqcheck(winner, loser, Date, draw = NULL, presence = NULL)
Arguments
winner | either a factor or character vector with winner IDs of dyadic dominance interactions |
loser | either a factor or character vector with loser IDs of dyadic dominance interactions |
Date | character vector of form "YYYY-MM-DD" with the date of the respective interaction |
draw | logical, which interactions ended undecided (i.e. drawn or tied)? By default all |
presence | optional data.frame, to supply data about presence and absence of individuals for part of the time the data collection covered. see details |
Details
calendar dates (for the sequence as well as in the first column of presence, if supplied) need to be in "YYYY-MM-DD" format!
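The date format can be checked by hand before calling seqcheck. The following is a minimal base R sketch, assuming the dates are stored as character strings; the helper name is_ymd is hypothetical and not part of EloRating.

```r
# Hypothetical helper (not part of EloRating): TRUE for strings that
# match the "YYYY-MM-DD" pattern and also parse as real calendar dates.
is_ymd <- function(x) {
  x <- as.character(x)
  looks_ok <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
  parsed <- as.Date(x, format = "%Y-%m-%d")  # NA when parsing fails
  looks_ok & !is.na(parsed)
}

# Made-up example values: one valid date, one wrong format, one impossible date
dates <- c("2010-01-01", "01/02/2010", "2010-02-30")
ok <- is_ymd(dates)
dates[!ok]  # entries that would need fixing
```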
seqcheck will return two types of messages: warnings and errors. Errors will result in the data NOT working when supplied to elo.seq, and need to be fixed. Warning messages do not necessarily lead to failure of executing elo.seq. Note that by default seqcheck is part of elo.seq: if any error or warning is produced by seqcheck, these data will not work in elo.seq. Some warning (but not error) messages can be ignored (see below), and if the runcheck argument in elo.seq is set to FALSE, Elo-ratings will be calculated properly in such cases.
The actual checks (and corresponding messages) that are performed are described in more detail here:
Most likely (i.e. in our experience), problems are caused by mismatches between the interaction data and the corresponding presence data.
Errors:
Presence starts AFTER data: indicates that during interactions at the beginning of the sequence, no corresponding information was found in the presence data. Solution: augment presence data, or remove interactions until the date on which presence data starts
Presence stops BEFORE data: refers to the corresponding problem towards the end of interaction and presence data
During the following interactions, IDs were absent...: indicates that according to the presence data, IDs were absent (i.e. "0"), but interactions with them occurred on the very date(s) according to the interaction data
The following IDs occur in the data sequence but NOT...: there is/are no columns corresponding to the listed IDs in the presence data
There appear to be gaps in your presence (days missing?)...: check whether your presence data includes a line for each date starting from the date of the first interaction through to the date of the last interaction
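The mismatches behind these errors can also be looked for by hand. The sketch below uses base R only and made-up toy data (the object names seqdata and presence are illustrative); it is not how seqcheck itself is implemented, only an approximation of the conditions described above.

```r
# Toy data; all names and values here are made up for illustration.
seqdata <- data.frame(
  Date   = as.Date(c("2010-01-01", "2010-01-02", "2010-01-04")),
  winner = c("a", "b", "c"),
  loser  = c("b", "c", "a"),
  stringsAsFactors = FALSE
)
presence <- data.frame(
  Date = as.Date(c("2010-01-01", "2010-01-02", "2010-01-04")),
  a = c(1, 1, 0),
  b = c(1, 1, 1)
)

# Does the presence data cover the full range of interaction dates?
starts_after <- min(presence$Date) > min(seqdata$Date)
stops_before <- max(presence$Date) < max(seqdata$Date)

# Gaps: every day from the first to the last interaction needs a presence row
all_days <- seq(min(seqdata$Date), max(seqdata$Date), by = "day")
missing_days <- all_days[!all_days %in% presence$Date]

# IDs that interact but have no column in the presence data
ids <- unique(c(seqdata$winner, seqdata$loser))
ids_without_column <- setdiff(ids, colnames(presence))

# Interactions in which a participant is recorded as absent (0) on that date
absent_rows <- which(vapply(seq_len(nrow(seqdata)), function(i) {
  p <- presence[presence$Date == seqdata$Date[i], , drop = FALSE]
  who <- intersect(c(seqdata$winner[i], seqdata$loser[i]), colnames(presence))
  nrow(p) == 1 && any(unlist(p[1, who, drop = FALSE]) == 0)
}, logical(1)))
```

In this toy example, missing_days, ids_without_column and absent_rows each flag one problem that would have to be fixed in the presence data.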
Warnings:
Presence continues beyond data: indicates that presence and interaction data do not end on the same date.
Presence starts earlier than data: indicates that presence and interaction data do not start on the same date.
The following IDs occur in the presence data but NOT...: there are more ID columns in the presence data than IDs occurring in the interaction data
Date column is not ordered: The dates are not supplied in ascending order. elo.seq will still work but the results won't be reliable because the interactions were not in the correct sequence.
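An unordered date column can be detected and repaired before running the check. This is a base R sketch with made-up vectors; it assumes that interactions recorded on the same day are already in the intended order, because order() leaves ties in their original sequence.

```r
# Made-up interaction data with dates out of order
Date   <- c("2010-01-03", "2010-01-01", "2010-01-02")
winner <- c("a", "b", "c")
loser  <- c("b", "c", "a")

d <- as.Date(Date, format = "%Y-%m-%d")

# TRUE if any date is earlier than the one before it
unsorted <- any(diff(as.numeric(d)) < 0)

# Reorder all three vectors with the same index so rows stay aligned
o <- order(d)
Date   <- Date[o]
winner <- winner[o]
loser  <- loser[o]
```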
Other warnings/errors can result from inconsistencies in either the presence or sequence data, or be of a more general nature:
Errors:
No 'Date' column found: in the presence data, no column exists with the name/header "Date". Please rename (or add) the necessary column named "Date" to your presence data.
At least one presence entry is not 1 or 0: presence data must come in binary form, i.e. an ID was either present ("1") or absent ("0") on a given date. No NAs or other values are allowed.
Your data vectors do not match in length: at least one of the three mandatory arguments (winner, loser, Date) differs from another in length. Consider handling your data in a data.frame, which avoids this error.
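These three conditions are simple to test directly. The following base R sketch uses made-up data and only approximates the checks described above; it does not reproduce seqcheck's own code.

```r
# Made-up vectors of unequal length
winner <- c("a", "b", "c")
loser  <- c("b", "c")
Date   <- c("2010-01-01", "2010-01-02", "2010-01-03")

lens <- c(winner = length(winner), loser = length(loser), Date = length(Date))
lengths_match <- length(unique(lens)) == 1

# Made-up presence data containing an NA and a value other than 0/1
presence <- data.frame(
  Date = c("2010-01-01", "2010-01-02", "2010-01-03"),
  a = c(1, 0, NA),
  b = c(1, 2, 1)
)

has_date_column <- "Date" %in% colnames(presence)

# Everything except the Date column must be exactly 0 or 1
vals <- as.matrix(presence[, colnames(presence) != "Date", drop = FALSE])
bad <- is.na(vals) | !(vals == 0 | vals == 1)
bad_per_id <- colSums(bad)  # number of invalid entries per ID column
```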
Warnings:
IDs occur in the data with inconsistent capitalization: because R is case-sensitive, "A" and "a" are considered different individuals. If such labelling of IDs is on purpose, ignore the warning and set runcheck=FALSE when calling elo.seq()
There is (are) X case(s) in which loser ID equals winner ID: winner and loser represent the same ID
The following individuals were observed only on one day: while not per se a problem for the calculation of Elo ratings, individuals that were observed only on one day (irrespective of the number of interactions on that day) cannot be plotted. eloplot will give a warning in such cases, too.
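The conditions behind these warnings can be inspected with a few lines of base R. The data below are made up and the approach is only a sketch of what such checks could look like, not seqcheck's actual implementation.

```r
# Made-up interaction data: "a"/"A" differ only in case, and row 3 has
# the same ID as winner and loser
seqdata <- data.frame(
  Date   = c("2010-01-01", "2010-01-01", "2010-01-02", "2010-01-03"),
  winner = c("a", "A", "b", "c"),
  loser  = c("b", "b", "b", "a"),
  stringsAsFactors = FALSE
)

# IDs that collide once capitalization is ignored
ids <- unique(c(seqdata$winner, seqdata$loser))
low <- tolower(ids)
case_clashes <- ids[low %in% low[duplicated(low)]]

# Rows in which winner and loser are the same ID
self_rows <- which(seqdata$winner == seqdata$loser)

# Number of distinct observation days per ID
long <- data.frame(
  id   = c(seqdata$winner, seqdata$loser),
  Date = rep(seqdata$Date, 2),
  stringsAsFactors = FALSE
)
days_per_id <- tapply(long$Date, long$id, function(d) length(unique(d)))
one_day_ids <- names(days_per_id)[days_per_id == 1]
```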
Value
returns textual information about possible issues with the supplied data set, or states that data are fine for running with elo.seq
Author(s)
Christof Neumann
Examples
data(adv)
seqcheck(winner = adv$winner, loser = adv$loser, Date = adv$Date)
data(advpres)
seqcheck(winner = adv$winner, loser = adv$loser, Date = adv$Date,
presence = advpres)
# create faulty presence data
# remove one line from presence data
faultypres <- advpres[-1, ]
# make all individuals absent on one day
faultypres[5, 2:8] <- 0
# run check
seqcheck(winner = adv$winner, loser = adv$loser, Date = adv$Date,
presence = faultypres)
# fix first error
faultypres <- rbind(faultypres[1, ], faultypres)
faultypres$Date[1] <- "2010-01-01"
# run check again
seqcheck(winner = adv$winner, loser = adv$loser, Date = adv$Date,
presence = faultypres)
# fix presence on date for interaction number 6
faultypres[6, 2:8] <- 1
# run check again
seqcheck(winner = adv$winner, loser = adv$loser, Date = adv$Date,
presence = faultypres)
# all good now