w_mean {expss}    R Documentation

Compute various weighted statistics

Description

Usage

w_mean(x, weight = NULL, na.rm = TRUE)

w_median(x, weight = NULL, na.rm = TRUE)

w_var(x, weight = NULL, na.rm = TRUE)

w_sd(x, weight = NULL, na.rm = TRUE)

w_se(x, weight = NULL, na.rm = TRUE)

w_mad(x, weight = NULL, na.rm = TRUE)

w_sum(x, weight = NULL, na.rm = TRUE)

w_n(x, weight = NULL, na.rm = TRUE)

unweighted_valid_n(x, weight = NULL)

valid_n(x, weight = NULL)

w_max(x, weight = NULL, na.rm = TRUE)

w_min(x, weight = NULL, na.rm = TRUE)

w_cov(x, weight = NULL, use = c("pairwise.complete.obs", "complete.obs"))

w_cor(x, weight = NULL, use = c("pairwise.complete.obs", "complete.obs"))

w_pearson(x, weight = NULL, use = c("pairwise.complete.obs", "complete.obs"))

w_spearman(x, weight = NULL, use = c("pairwise.complete.obs", "complete.obs"))

Arguments

x

a numeric vector (or a matrix/data.frame for the correlation functions) containing the values whose weighted statistics are to be computed.

weight

a vector of weights, one for each element of x. Cases with missing, zero or negative weights are removed before calculations. If weight is NULL (the default), unweighted statistics are computed.

na.rm

a logical value indicating whether NA values should be stripped before the computation proceeds. Note that, unlike the base R statistical functions, the default value is TRUE (missing values are removed).

use

"pairwise.complete.obs" (default) or "complete.obs". In the first case the correlation or covariance between each pair of variables is computed using all complete pairs of observations on those variables. If use is "complete.obs" then missing values are handled by casewise deletion.

Details

If the argument of a correlation function is a data.frame with variable labels, the variable names are replaced with the labels. If this behavior is undesirable, use the drop_var_labs function: w_cor(drop_var_labs(x)). Weighted Spearman correlation coefficients are calculated with weights rounded to the nearest integer, which gives the same result as SPSS Statistics. Currently this algorithm is not memory efficient.
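
One plausible reading of "weights rounded to the nearest integer" is that the rounded weights act as frequencies: each row is repeated that many times and an ordinary Spearman correlation is computed on the expanded data. The sketch below illustrates that idea in base R only. It is an assumption about the approach, not code taken from the package, and the data frame d and the weights w are made-up values.

```r
# Toy data and weights (hypothetical values for illustration only).
d <- data.frame(
    x = c(1, 2, 3, 4, 5),
    y = c(2, 1, 5, 3, 4)
)
w <- c(1.2, 2.7, 0.9, 3.1, 1.8)

# Round the weights to the nearest integer.
w_int <- round(w)

# Treat the rounded weights as frequencies: repeat row i w_int[i] times.
# The expanded data has sum(w_int) rows, so memory use grows with the weights.
expanded <- d[rep(seq_len(nrow(d)), times = w_int), ]

# Ordinary (unweighted) Spearman correlation on the expanded data.
rho <- cor(expanded$x, expanded$y, method = "spearman")
rho
```

Because the expanded data frame has as many rows as the sum of the rounded weights, an approach of this kind would explain why memory use can become a concern with large weights.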

Value

a numeric value of length one, or a matrix for the covariance and correlation functions
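
For the scalar case, the rules given for weight and na.rm can be sketched in base R as follows. The helper weighted_mean_sketch is hypothetical and is not part of expss. It assumes the usual definition of a weighted mean, sum(x * weight) / sum(weight), and only illustrates which cases are dropped before the computation.

```r
# Hypothetical helper, not part of expss: a weighted mean that follows
# the rules stated for the 'weight' and 'na.rm' arguments.
weighted_mean_sketch <- function(x, weight = NULL, na.rm = TRUE) {
    # No weights supplied: every case counts once (unweighted mean).
    if (is.null(weight)) weight <- rep(1, length(x))
    # Drop cases with missing, zero or negative weights.
    keep <- !is.na(weight) & weight > 0
    # With na.rm = TRUE, also drop cases where x itself is missing.
    if (na.rm) keep <- keep & !is.na(x)
    x <- x[keep]
    weight <- weight[keep]
    sum(x * weight) / sum(weight)
}

# Weighted mean of horsepower with weights 1/wt, computed by hand.
data(mtcars)
manual <- weighted_mean_sketch(mtcars$hp, weight = 1 / mtcars$wt)
manual

# Cases with NA or zero weight do not contribute to the result.
weighted_mean_sketch(c(10, 20, 30, 40), weight = c(1, NA, 0, 3))
```

The result is a single number in each call, which matches the "numeric value of length one" described above.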

Examples

library(expss)
data(mtcars)
dfs = mtcars %>% columns(mpg, disp, hp, wt)

with(dfs, w_mean(hp, weight = 1/wt))

# apply labels
mtcars = mtcars %>% 
    apply_labels(
        mpg = "Miles/(US) gallon",
        cyl = "Number of cylinders",
        disp = "Displacement (cu.in.)",
        hp = "Gross horsepower",
        drat = "Rear axle ratio",
        wt = "Weight (lb/1000)",
        qsec = "1/4 mile time",
        vs = "Engine",
        vs = c("V-engine" = 0, 
                "Straight engine" = 1),
        am = "Transmission",
        am = c(automatic = 0, 
                manual=1),
        gear = "Number of forward gears",
        carb = "Number of carburetors"
    )

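# re-select the columns so that dfs carries the labels applied to mtcars above
dfs = mtcars %>% columns(mpg, disp, hp, wt)

# weighted correlations with labels
w_cor(dfs, weight = 1/dfs$wt)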

# without labels
w_cor(drop_var_labs(dfs), weight = 1/dfs$wt)
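
The correlation calls above rely on the default use = "pairwise.complete.obs". Since mtcars contains no missing values, both settings of use would give the same result there. The sketch below shows where they differ, using base R's cor(), which accepts the same two option names, on unweighted toy data. The data frame d is made up for illustration, and the weighted functions are assumed to handle missing values analogously.

```r
# Toy data (hypothetical): 'c' is missing in row 2, 'b' is missing in row 6.
d <- data.frame(
    a = c(1, 2, 3, 4, 5, 6),
    b = c(2, 1, 4, 3, 5, NA),
    c = c(1, NA, 2, 5, 4, 6)
)

# Pairwise deletion: each pair of columns uses every row in which
# both columns are observed (rows 1-5 for the pair a, b).
pairwise <- cor(d, use = "pairwise.complete.obs")

# Casewise deletion: rows with any NA (rows 2 and 6) are dropped
# for all pairs, so a and b are compared on rows 1, 3, 4 and 5 only.
casewise <- cor(d, use = "complete.obs")

# The a-b correlation differs between the two settings.
pairwise["a", "b"]
casewise["a", "b"]
```

Pairwise deletion keeps more data per coefficient, while casewise deletion guarantees that every coefficient in the matrix is based on the same set of rows.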

[Package expss version 0.11.6 Index]