p_value {keyperm} | R Documentation |
Convert results of permutation test for keyness to p-values
Description
Calculate p-values from the results of keyperm() with output = "counts".
Usage
p_value(results, alternative = NULL)
Arguments
results |
results from permutation test.
Must be of class |
alternative |
direction of p-value to calculate, one of |
Details
Valid (slightly conservative) p-values are calculated from an
object of class keyperm_results_counts, which is obtained
by running keyperm() with output = "counts".
A keyperm_results_counts object is a matrix with three columns that
contain the counts of generated permutations that resulted in a score
strictly less than, equal to, and strictly greater than the observed score.
For a one-sided p-value we use
pvalue_greater = (no. greater + no. equal + 1)/(no. of perms + 1)
or
pvalue_less = (no. less + no. equal + 1)/(no. of perms + 1)
Adding 1 to both the numerator and the denominator amounts to counting the
observed score as one of the permutations. This results in a slightly
conservative p-value, but guarantees that the test is valid for any number of
random permutations. It also means that a p-value of zero is never returned:
the minimum possible p-value is 1/(no. of perms + 1).
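The one-sided formulas above can be sketched in base R as follows. This is only an illustration on a hand-made matrix, not the package's implementation: the counts are invented, and the column names "less", "equal" and "greater" are assumptions made for this sketch.

```r
# Hypothetical counts for three terms from 999 random permutations.
# Column names are assumed for this sketch, not taken from keyperm.
counts <- matrix(
  c(990,  4,   5,
    500, 10, 489,
      0,  0, 999),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("term1", "term2", "term3"),
                  c("less", "equal", "greater"))
)

# Each row sums to the number of permutations (999 here).
n_perm <- rowSums(counts)

# One-sided p-values; the "+ 1" counts the observed score itself.
pvalue_greater <- (counts[, "greater"] + counts[, "equal"] + 1) / (n_perm + 1)
pvalue_less    <- (counts[, "less"]    + counts[, "equal"] + 1) / (n_perm + 1)

pvalue_greater  # 0.010 0.500 1.000
pvalue_less     # 0.995 0.511 0.001
```

For term3 no permutation scored less than or equal to the observed score, so pvalue_less reaches the minimum possible value of 1/(999 + 1) = 0.001 rather than zero.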
The two-sided p-value is calculated by
pvalue_twosided = 2 * min(pvalue_less, pvalue_greater)
(values larger than 1 are set to 1).
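A minimal base R sketch of the two-sided rule, including the cap at 1. The one-sided p-values below are made-up inputs chosen for illustration, and the code is not taken from the package itself.

```r
# Hypothetical one-sided p-values for three terms.
pvalue_less    <- c(0.995, 0.601, 0.001)
pvalue_greater <- c(0.010, 0.600, 1.000)

# Double the smaller one-sided p-value, then cap the result at 1.
pvalue_twosided <- pmin(1, 2 * pmin(pvalue_less, pvalue_greater))

pvalue_twosided  # 0.020 1.000 0.002
```

The second term shows why the cap is needed: both one-sided p-values are around 0.6, so doubling the smaller one would give 1.2.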
If alternative is not specified by the user, different defaults are
used depending on the scoretype (which is included as an attribute
in the keyperm_results_counts object).
For llr and chisq, large values indicate a large deviation from equal
frequencies without indicating its direction, so alternative = "greater"
is essentially the only alternative of interest and is used as the default.
For diff and logratio, large absolute values indicate a large deviation
from equal frequencies; positive values correspond to higher frequencies
in A, negative values to higher frequencies in B.
For these scoretypes, the default is alternative = "two.sided".
If only "positive" keywords for A with respect to B are desired, use alternative = "greater".
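The default rules described above can be summarised in a small lookup. The helper default_alternative() is hypothetical and not part of keyperm; it takes the scoretype as a plain string, because the name of the attribute that stores it is not given here.

```r
# Hypothetical helper (not a keyperm function): maps a scoretype
# to the default alternative described in the text above.
default_alternative <- function(scoretype) {
  switch(scoretype,
    llr = ,                   # falls through to chisq
    chisq = "greater",        # direction-free scores
    diff = ,                  # falls through to logratio
    logratio = "two.sided",   # signed scores
    stop("unknown scoretype: ", scoretype)
  )
}

default_alternative("llr")       # "greater"
default_alternative("logratio")  # "two.sided"
```

Passing alternative explicitly to p_value() overrides these defaults.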
Value
a numeric vector of p-values.