match_maker {PCAmatchR}    R Documentation

Weighted matching of controls to cases using PCA results.

Description

Weighted matching of controls to cases using PCA results.

Usage

match_maker(
  PC = NULL,
  eigen_value = NULL,
  data = NULL,
  ids = NULL,
  case_control = NULL,
  num_controls = 1,
  num_PCs = NULL,
  eigen_sum = NULL,
  exact_match = NULL,
  weight_dist = TRUE,
  weights = NULL
)

Arguments

PC

Data frame containing the id variable and the individual-level principal components.

eigen_value

Computed eigenvalue for each PC. Used as the numerator to calculate the percent variance explained by each PC.

data

Data frame containing the id variable and case/control status. Optionally includes covariate data for exact matching.

ids

The unique id variable contained in both "PC" and "data."

case_control

The case/control status variable.

num_controls

The number of controls to match to each case. Default is 1:1 matching.

num_PCs

The total number of PCs calculated within the PCA. Can be used as the denominator to calculate the percent variance explained by each PC. Default is 1000.

eigen_sum

The sum of all possible eigenvalues within the PCA. Can be used as the denominator to calculate the percent variance explained by each PC.

exact_match

Optional variables contained in the data frame on which to perform exact matching (e.g., sex, race).

weight_dist

When set to TRUE, matches are produced based on the PC-weighted Mahalanobis distance. Default is TRUE.

weights

Optional user-defined weights used to compute the weighted Mahalanobis distance metric.

Value

A list of matches and weights.

Examples

# Create PC data frame by subsetting provided example dataset
pcs <- as.data.frame(PCs_1000G[,c(1,5:24)])
# Create eigenvalues vector using example dataset
eigen_vals <- c(eigenvalues_1000G)$eigen_values
# Create full eigenvalues vector using example dataset
all_eigen_vals <- c(eigenvalues_all_1000G)$eigen_values
# Create covariate data frame
cov_data <- PCs_1000G[, c(1:4)]
# Generate a case status variable using the ESN population from the 1000 Genomes data
cov_data$case <- ifelse(cov_data$pop == "ESN", 1, 0)
# With 1 to 1 matching
if (requireNamespace("optmatch", quietly = TRUE)) {
  library(optmatch)
  match_maker(PC = pcs,
              eigen_value = eigen_vals,
              data = cov_data,
              ids = c("sample"),
              case_control = c("case"),
              num_controls = 1,
              eigen_sum = sum(all_eigen_vals),
              weight_dist = TRUE
             )
}
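
The eigen_value, num_PCs, and eigen_sum arguments describe a simple ratio: each eigenvalue is the numerator, and either the sum of all eigenvalues or the total number of PCs is the denominator. The sketch below illustrates that arithmetic only. The numeric values are hypothetical, and the code is not taken from the package internals, which may scale or store the result differently.

```r
# Hypothetical eigenvalues for the first three PCs (illustrative values only)
eigen_value <- c(4.0, 2.5, 1.5)

# Hypothetical sum of all eigenvalues from the full PCA
eigen_sum <- 20

# Percent variance explained with eigen_sum as the denominator
pct_var <- eigen_value / eigen_sum * 100
print(pct_var)

# Alternative denominator: the total number of PCs (hypothetical value)
num_PCs <- 20
pct_var_alt <- eigen_value / num_PCs * 100
print(pct_var_alt)
```

Which denominator is appropriate depends on what the PCA software reports. If the full set of eigenvalues is available, their sum can be passed as eigen_sum, as in the example above.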


[Package PCAmatchR version 0.3.3 Index]