check_dup_wrong {votesys} | R Documentation |
Check Ballots with Duplicated Values, Mistakes, or without Any Valid Entry
Description
The function checks the validity of ballots and reports the result. For a
one-step clean, set clean to TRUE and a set of cleaned ballots will be
returned. Here, duplicated values mean that a voter writes the same
candidate more than once or, when assigning scores, assigns the same score
to more than one candidate. Mistakes are names that do not appear in the candidate
list, or score values that are illegal (e.g., if voters are required to assign 1-5 to candidates,
then 6 is an illegal value). Ballots without a valid entry (that is, all entries are NAs) are also
picked out. Several input formats are accepted; see Details.
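The three kinds of problem described above can be illustrated for a single ballot with a few lines of base R. This is only a sketch of the definitions, not the package's implementation; the helper name classify_ballot and its argument allowed are hypothetical.

```r
# Hypothetical helper (not part of votesys): classify one ballot.
# `allowed` is the candidate list for ranking ballots,
# or the set of legal score values for score ballots.
classify_ballot <- function(ballot, allowed) {
  valid <- ballot[!is.na(ballot)]            # ignore NA entries
  list(
    all_na    = length(valid) == 0,          # no valid entry at all
    duplicate = anyDuplicated(valid) > 0,    # same name or score used twice
    wrong     = any(!valid %in% allowed)     # name or score not allowed
  )
}

r_dup   <- classify_ballot(c("a", "a", "b"), letters[1:5])  # 'a' written twice
r_wrong <- classify_ballot(c("a", "z"), letters[1:5])       # 'z' is not a candidate
r_na    <- classify_ballot(c(NA, NA), letters[1:5])         # nothing valid
r_score <- classify_ballot(c(1, 2, 2, 6), 1:5)              # repeated 2, illegal 6
```

The same test covers both ranking and score ballots because only the set of allowed values changes.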
Usage
check_dup_wrong(x, xtype = 2, candidate = NULL, vv = NULL, isna = NULL,
clean = FALSE)
Arguments
x |
a data.frame, matrix or list of raw ballots. See Details. |
xtype |
should be 1, 2 (default) or 3, designating the type of x. See Details. |
candidate |
if |
vv |
if |
isna |
entries which should be taken as NAs. |
clean |
the default is FALSE, that is, it does not return the cleaned data. If it is TRUE, a set of ballots without duplicated values, without mistakes and with at least one valid value, is returned. |
Details
The function accepts the following input:
(1) when xtype is 1, x must be a matrix. Column names are candidate names (if column names are NULL, they will be created: x1, x2, x3...). Candidate number is the number of columns of the matrix. Entry ij is the numeric score assigned by the ith voter to the jth candidate.
(2) when xtype is 2, x can be a matrix or data.frame. Candidate number is the length of candidate. Entries are names (character or numeric) of candidates. The i1, i2, i3... entries are the 1st, 2nd, 3rd... preferences of voter i.
(3) when xtype is 3, x should be a list. Each element of the list is a ballot, a vector containing the names (character or numeric) of candidates. The 1st preference is in the 1st position of the vector, the 2nd preference is in the 2nd position... The number of candidates is the length of candidate; as a result, a ballot with more names than the candidate number is labelled as wrong.
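As a rough illustration of the three input formats listed above, the following base R sketch builds one small object of each kind. The object names (scores, prefs, ballots) and the data are made up for the example.

```r
candidate <- c("a", "b", "c")

# xtype = 1: score matrix, one row per voter, one column per candidate
scores <- matrix(c(5, 3, 1,
                   2, 2, 4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(NULL, candidate))

# xtype = 2: preference matrix; column k holds each voter's k-th choice
prefs <- matrix(c("a", "c", "b",
                  "b", "a", NA),
                nrow = 2, byrow = TRUE)

# xtype = 3: list of ballots, which may differ in length
ballots <- list(c("a", "c", "b"), c("b", "a"))
```

With the package loaded, prefs would be passed with xtype = 2 and ballots with xtype = 3, in both cases together with candidate = candidate.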
Value
a list with 3 or 4 elements: row_with_dup is the indices (not row names) of
the rows that have duplicated values; row_with_wrong is the rows with illegal
names, or whose lengths are larger than the candidate number (this can only
happen when x is a list); row_all_na is the rows whose entries are all NAs.
For a list, NULL elements are also taken as all-NA ballots. When clean is
TRUE, the cleaned ballots are returned as well.
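If the ballots are to be filtered by hand rather than with clean = TRUE, the three index vectors can be combined. The sketch below uses a hand-written stand-in for the check result, with made-up values, so that it runs without the package; only the element names are taken from the description above.

```r
ballots <- list(c("a", "b"), c("a", "a"), c("z"), c(NA, NA), c("b", "c"))

# Stand-in for a check result; the index values are illustrative only
res <- list(row_with_dup = 2, row_with_wrong = 3, row_all_na = 4)

# Collect every flagged position once
bad <- sort(unique(c(res$row_with_dup, res$row_with_wrong, res$row_all_na)))

# Guard the empty case: ballots[-integer(0)] would drop everything
kept <- if (length(bad) > 0) ballots[-bad] else ballots
```

The guard matters because negative indexing with an empty vector returns an empty list instead of the full set of ballots.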
Examples
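library(votesys)
raw <- list(
  c('a', 'e', 'c', 'd', 'b'),
  c('b', 'a', 'e'),
  c('c', 'd', 'b'),
  c('d', 'a', 'b'),
  c('a', 'a', 'b', 'b', 'b'),   # duplicated names
  c(NA, NA, NA, NA),            # all entries are NA
  v7 = NULL,                    # NULL is taken as an all-NA ballot
  v8 = c('a', NA, NA, NA, NA, NA, NA),
  v9 = rep(" ", 3)              # " " is not a candidate name
)
# All five candidates a-e are legal names
y <- check_dup_wrong(raw, xtype = 3, candidate = letters[1:5])
# With only a-d as candidates, 'e' is no longer a legal name
y <- check_dup_wrong(raw, xtype = 3, candidate = letters[1:4])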