R: Expectation and Variance of Log Winsorized F Distribution

mean_var_logwinf {MAnorm2}

R Documentation

Expectation and Variance of Log Winsorized F Distribution

Description

mean_var_logwinf calculates the expectation and variance of a log Winsorized F distribution by numerical integration.

Usage

mean_var_logwinf(
  df1,
  df2,
  p_low = 0.01,
  p_up = 0.1,
  nw = gauss.quad(128, kind = "legendre")
)

Arguments

df1, df2

Numeric vectors of numerator and denominator degrees of freedom. Inf is allowed.

p_low, p_up

Vectors of lower- and upper-tail probabilities for Winsorizing. Each element must be strictly larger than 0, and each pair of p_low and p_up must have a sum strictly smaller than 1.

Note that df1, df2, p_low, and p_up are recycled to align with the longest of them.
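The recycling rule and the constraints on p_low and p_up can be sketched in base R as follows. check_logwinf_args is a hypothetical helper written for illustration only; it is not part of MAnorm2, and the package may validate or report invalid input differently.

```r
# Hypothetical helper, not part of MAnorm2: recycle the four arguments to the
# length of the longest one and check the constraints stated above.
check_logwinf_args <- function(df1, df2, p_low = 0.01, p_up = 0.1) {
    n <- max(length(df1), length(df2), length(p_low), length(p_up))
    args <- list(df1 = rep_len(df1, n), df2 = rep_len(df2, n),
                 p_low = rep_len(p_low, n), p_up = rep_len(p_up, n))
    # Each tail probability must be > 0 and each pair must sum to < 1.
    ok <- args$p_low > 0 & args$p_up > 0 & (args$p_low + args$p_up) < 1
    if (!all(ok)) {
        stop("invalid (p_low, p_up) pair at position(s): ",
             paste(which(!ok), collapse = ", "))
    }
    args
}

# A scalar df1 is recycled against four values of df2.
a <- check_logwinf_args(df1 = 2, df2 = 2 ^ (0:3))
lengths(a)
```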

nw

A list containing the variables nodes and weights, used to calculate the definite integral of a function f over the interval [-1, 1], which is approximated by sum(nw$weights * f(nw$nodes)). By default, mean_var_logwinf uses 128 Gauss-Legendre nodes and the corresponding weights, as calculated by gauss.quad from the statmod package.
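To make the role of nw concrete, the sketch below builds a Gauss-Legendre rule in base R with the Golub-Welsch eigenvalue method and applies the weighted sum to a test function. gauss_legendre is a hypothetical helper written for illustration; it is not part of MAnorm2, and in practice gauss.quad would normally supply the list.

```r
# Hypothetical helper: n-point Gauss-Legendre rule on [-1, 1] via Golub-Welsch.
# Returns a list with the same two variables that nw is described as having.
gauss_legendre <- function(n) {
    k <- seq_len(n - 1)
    b <- k / sqrt(4 * k ^ 2 - 1)       # off-diagonal of the Jacobi matrix
    J <- matrix(0, n, n)
    J[cbind(k, k + 1)] <- b
    J[cbind(k + 1, k)] <- b
    e <- eigen(J, symmetric = TRUE)
    # Nodes are the eigenvalues; weights are 2 times the squared first
    # components of the normalized eigenvectors.
    list(nodes = e$values, weights = 2 * e$vectors[1, ] ^ 2)
}

nw <- gauss_legendre(128)

# Approximate the integral of exp(x) over [-1, 1] and compare it with the
# closed form exp(1) - exp(-1).
approx_int <- sum(nw$weights * exp(nw$nodes))
exact_int <- exp(1) - exp(-1)
abs(approx_int - exact_int)
```

An integral over another finite interval [a, b] can use the same rule after the change of variable x = (a + b) / 2 + (b - a) / 2 * t, which multiplies the sum by (b - a) / 2.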

Details

The function implements the method described in Phipson et al., 2016 (see "References").
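For orientation, the two quantities can be written down directly. If W is an F variable clamped to [q_low, q_up], then log(W) has a point mass of p_low at log(q_low), a point mass of p_up at log(q_up), and the F density in between. The sketch below evaluates the first two raw moments with base R's integrate. It is an illustration of the definition only: logwinf_moments is a hypothetical name, it handles scalar arguments, and it is not the quadrature scheme of Phipson et al. or the MAnorm2 implementation.

```r
# Hypothetical helper for illustration; scalar arguments only.
logwinf_moments <- function(df1, df2, p_low = 0.01, p_up = 0.1) {
    q_low <- qf(p_low, df1, df2, lower.tail = TRUE)
    q_up <- qf(p_up, df1, df2, lower.tail = FALSE)
    # k-th raw moment of log(W): two point masses plus the interior integral.
    raw_moment <- function(k) {
        inner <- integrate(function(x) log(x) ^ k * df(x, df1, df2),
                           lower = q_low, upper = q_up)$value
        p_low * log(q_low) ^ k + p_up * log(q_up) ^ k + inner
    }
    m1 <- raw_moment(1)
    list(mu = m1, v = raw_moment(2) - m1 ^ 2)
}

res <- logwinf_moments(df1 = 2, df2 = 8)
res$mu
res$v
```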

Value

A list consisting of the following components:

mu: Vector of expectations.
v: Vector of variances.

References

Phipson, B., et al., Robust Hyperparameter Estimation Protects against Hypervariable Genes and Improves Power to Detect Differential Expression. Annals of Applied Statistics, 2016. 10(2): p. 946-963.

Examples

# Derive the expectation and variance of a log Winsorized F distribution by
# simulation.
random_logwinf <- function(n, df1, df2, p_low, p_up) {
    x <- rf(n, df1, df2)
    q_low <- qf(p_low, df1, df2, lower.tail = TRUE)
    q_up <- qf(p_up, df1, df2, lower.tail = FALSE)
    x[x < q_low] <- q_low
    x[x > q_up] <- q_up
    x <- log(x)
    c(mean(x), var(x))
}

# Set parameters.
n <- 10000
df1 <- 2
df2 <- 2 ^ (0:10)
p_low <- 0.01
p_up <- 0.1

# Compare simulation results with those from numerical integration.
set.seed(100)
res1 <- vapply(df2, function(x) random_logwinf(n, df1, x, p_low, p_up),
               numeric(2))
res2 <- mean_var_logwinf(df1, df2, p_low, p_up)

# Compare mean.
plot(0:10, res1[1, ], type = "l", lwd = 2, col = "red", xlab = "Log2(df2)",
     ylab = "Mean")
lines(0:10, res2$mu, lty = 5, lwd = 2, col = "blue")
legend("topright", c("Simulation", "Numerical integration"), lty = c(1, 5),
       lwd = 2, col = c("red", "blue"))

# Compare variance.
plot(0:10, res1[2, ], type = "l", lwd = 2, col = "red", xlab = "Log2(df2)",
     ylab = "Var")
lines(0:10, res2$v, lty = 5, lwd = 2, col = "blue")
legend("topright", c("Simulation", "Numerical integration"), lty = c(1, 5),
       lwd = 2, col = c("red", "blue"))

# When df2 is Inf.
random_logwinf(n, df1, Inf, p_low, p_up)
mean_var_logwinf(df1, Inf, p_low, p_up)

[Package MAnorm2 version 1.2.2 Index]