beta_k {betaclust}    R Documentation
Fit the K.. model
Description
Fit the K.. model from the betaclust family of beta mixture models for DNA methylation data.
The K.. model analyses a single DNA sample type and identifies the thresholds between the different methylation states.
Usage
beta_k(data, M = 3, parallel_process = FALSE, seed = NULL)
Arguments
data |
A dataframe of dimension |
M |
Number of methylation states to be identified in a DNA sample type. |
parallel_process |
The "TRUE" option results in parallel processing of the models for increased computational efficiency. The default option has been set as "FALSE" due to package testing limitations. |
seed |
Seed to allow for reproducibility (default = NULL). |
Details
The K.. model clusters each of the C CpG sites into one of K methylation states, based on data from N patients for one DNA sample type (i.e. R = 1).
As each CpG site can belong to any of the M = 3 methylation states (hypomethylated, hemimethylated or hypermethylated), the default value is K = M = 3.
Under the K.. model the shape parameters of each cluster are constrained to be equal for each patient. The returned object from this function can be passed as an input parameter to the threshold function available in this package to calculate the thresholds between the methylation states.
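The constraint described above can be written out as a mixture density. The following is a sketch rather than the package's exact formulation: it assumes that, given the cluster, the N patient values for a CpG site are conditionally independent, and it introduces the notation x_{cn} for the beta value of CpG site c in patient n (not used elsewhere on this page). The second line is the standard posterior probability for a mixture model, which would correspond to the entries of z.

```latex
% Assumed form of the K.. mixture density for CpG site c:
% one pair of shape parameters (alpha_k, delta_k) per cluster, shared by all N patients.
f(x_c \mid \tau, \alpha, \delta)
  = \sum_{k=1}^{K} \tau_k \prod_{n=1}^{N} \mathrm{Beta}(x_{cn};\, \alpha_k, \delta_k)

% Posterior probability that CpG site c belongs to cluster k (E-step of the EM algorithm):
z_{ck}
  = \frac{\tau_k \prod_{n=1}^{N} \mathrm{Beta}(x_{cn};\, \alpha_k, \delta_k)}
         {\sum_{j=1}^{K} \tau_j \prod_{n=1}^{N} \mathrm{Beta}(x_{cn};\, \alpha_j, \delta_j)}
```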
Value
A list containing:
cluster_size - The total number of CpG sites in each of the K clusters.
llk - A vector containing the log-likelihood value at each step of the EM algorithm.
alpha - The first shape parameter for the beta mixture model.
delta - The second shape parameter for the beta mixture model.
tau - The estimated mixing proportion for each cluster.
z - A matrix of dimension C \times K containing the posterior probability of each CpG site belonging to each of the K clusters.
classification - The classification corresponding to z, i.e. map(z).
uncertainty - The uncertainty of each CpG site's clustering.
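To illustrate how classification, uncertainty and cluster_size relate to z, here is a minimal base R sketch on a made-up posterior matrix. It assumes the usual conventions: map(z) picks the cluster with the largest posterior probability, and uncertainty is one minus that largest probability. The values in z are invented for illustration and are not output from beta_k.

```r
# Toy posterior matrix z for C = 4 CpG sites and K = 3 clusters (made-up values)
z <- matrix(c(0.90, 0.08, 0.02,
              0.10, 0.70, 0.20,
              0.05, 0.15, 0.80,
              0.40, 0.35, 0.25),
            ncol = 3, byrow = TRUE)

# MAP classification: the cluster with the largest posterior probability per site
classification <- apply(z, 1, which.max)

# Uncertainty (assumed definition): one minus the largest posterior probability per site
uncertainty <- 1 - apply(z, 1, max)

# Cluster sizes: number of CpG sites assigned to each of the K clusters
cluster_size <- table(factor(classification, levels = seq_len(ncol(z))))
```

A site with a posterior spread across several clusters, such as the last row, ends up with a high uncertainty even though it still receives a single classification.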
See Also
Examples
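library(betaclust)

# Fix the seed so that the fit is reproducible
my.seed <- 190
M <- 3
# Fit the K.. model to rows 1-30, columns 2-5 of the example dataset
data_output <- beta_k(pca.methylation.data[1:30, 2:5], M,
                      parallel_process = FALSE, seed = my.seed)
# Calculate the thresholds between the methylation states from the fitted model
thresholds <- threshold(data_output, pca.methylation.data[1:30, 2:5], "K..")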