beta_kr {betaclust} | R Documentation |
Fit the K.R Model
Description
A beta mixture model for identifying differentially methylated CpG sites between R DNA sample types collected from N patients.
Usage
beta_kr(data, M = 3, N, R, parallel_process = FALSE, seed = NULL)
Arguments
data |
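A dataframe of dimension C × NR containing methylation (beta) values for C CpG sites, measured in R sample types from each of N patients. |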
M |
Number of methylation states to be identified. |
N |
Number of patients in the study. |
R |
Number of sample types collected from each patient for study. |
parallel_process |
If TRUE, the models are fitted in parallel for increased computational efficiency. The default is FALSE due to package testing limitations. |
seed |
Seed to allow for reproducibility (default = NULL). |
Details
The K.R model allows identification of the differentially methylated CpG sites between the R DNA sample types collected from each of N patients.
As each CpG site in a DNA sample can belong to one of M methylation states, there can be K = M^R methylation state changes between the R DNA sample types (for example, K = 9 when M = 3 and R = 2).
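The count of state combinations can be checked with a short sketch in base R. The integer state labels and the `changed` flag are illustrative assumptions only. The package identifies differentially methylated clusters from the fitted distributions (see the DM output), not from this enumeration.

```r
# Enumerate every combination of methylation states across sample types.
M <- 3  # methylation states per sample type
R <- 2  # sample types per patient

combos <- expand.grid(rep(list(seq_len(M)), R))
names(combos) <- paste0("sample_type_", seq_len(R))

K <- nrow(combos)  # equals M^R

# Hypothetical flag: TRUE when the state is not the same in all sample types.
combos$changed <- apply(combos[, seq_len(R), drop = FALSE], 1,
                        function(s) length(unique(s)) > 1)

print(K)
print(combos)
```

With M = 3 and R = 2 this gives K = 9 combinations, of which 6 involve a change of state between the two sample types.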
The shape parameters vary for each DNA sample type but are constrained to be equal across patients. An initial clustering using k-means is performed to identify K clusters. The resulting clustering solution is provided as
starting values to the Expectation-Maximisation algorithm. A digamma approximation is used to obtain the maximised
parameters in the M-step.
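The initialisation step described above can be sketched in base R as follows. This is a minimal illustration on simulated data, not the package's internal code. The column ordering (grouped by sample type, then patient), the simulated shape values, and the use of method-of-moments estimates to turn the k-means solution into starting shape parameters are all assumptions of this sketch.

```r
set.seed(190)
N <- 4    # patients
R <- 2    # sample types per patient
K <- 4    # clusters (M^R with M = 2)
C <- 200  # CpG sites (simulated)

# Simulated shape parameters: one row per cluster, one column per sample type.
shape1 <- cbind(c(2, 2, 20, 20), c(2, 20, 2, 20))
shape2 <- cbind(c(20, 20, 2, 2), c(20, 2, 20, 2))
truth <- rep(seq_len(K), each = C / K)

# C x NR matrix; columns assumed grouped by sample type, then patient.
x <- sapply(seq_len(N * R), function(j) {
  r <- (j - 1) %/% N + 1  # sample type of column j
  rbeta(C, shape1[truth, r], shape2[truth, r])
})

# Initial hard clustering with k-means.
km <- kmeans(x, centers = K, nstart = 10)
z0 <- outer(km$cluster, seq_len(K), "==") * 1  # C x K indicator matrix
tau0 <- colMeans(z0)                           # starting mixing proportions

# Method-of-moments shape estimates from a vector of beta values.
mom <- function(v) {
  m <- mean(v)
  w <- m * (1 - m) / var(v) - 1
  c(alpha = m * w, delta = (1 - m) * w)
}

# One (alpha, delta) pair per cluster and sample type, shared across patients.
alpha0 <- delta0 <- matrix(NA_real_, nrow = K, ncol = R)
for (k in seq_len(K)) {
  for (r in seq_len(R)) {
    cols <- (r - 1) * N + seq_len(N)  # pool the N patients of sample type r
    est <- mom(as.vector(x[km$cluster == k, cols]))
    alpha0[k, r] <- est["alpha"]
    delta0[k, r] <- est["delta"]
  }
}
```

Pooling the N patient columns within each sample type mirrors the constraint that shape parameters differ by sample type but not by patient.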
Value
A list containing:
cluster_size - The total number of CpG sites in each of the K clusters.
llk - A vector containing the log-likelihood value at each step of the EM algorithm.
alpha - The first shape parameter for the beta mixture model.
delta - The second shape parameter for the beta mixture model.
tau - The estimated mixing proportion for each cluster.
z - A matrix of dimension C × K, where C is the number of CpG sites, containing the posterior probability of each CpG site belonging to each of the K clusters.
classification - The classification corresponding to z, i.e. map(z).
uncertainty - The uncertainty of each CpG site's clustering.
DM - The AUC and WD metrics for distribution similarity in each cluster.
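As an illustration of how classification, uncertainty and cluster_size relate to z, the sketch below uses a small hypothetical posterior matrix. It assumes the common model-based clustering conventions: classification is the row-wise maximum a posteriori cluster, and uncertainty is one minus the largest posterior probability. Whether beta_kr computes uncertainty in exactly this way is an assumption here.

```r
# Hypothetical posterior probabilities for 3 CpG sites and 3 clusters.
z <- matrix(c(0.90, 0.05, 0.05,
              0.20, 0.70, 0.10,
              0.40, 0.35, 0.25),
            nrow = 3, byrow = TRUE)

# Maximum a posteriori cluster for each CpG site.
classification <- apply(z, 1, which.max)

# Assumed definition: one minus the largest posterior probability.
uncertainty <- 1 - apply(z, 1, max)

# Number of CpG sites assigned to each cluster.
cluster_size <- tabulate(classification, nbins = ncol(z))

print(classification)
print(uncertainty)
print(cluster_size)
```

A site such as the third row, whose largest posterior probability is only 0.40, is assigned to cluster 1 but carries a high uncertainty of 0.60.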
See Also
Examples
my.seed <- 190
M <- 3
N <- 4
R <- 2
data_output <- beta_kr(pca.methylation.data[1:30, 2:9], M, N, R,
                       parallel_process = FALSE, seed = my.seed)