beta_kr {betaclust} | R Documentation |

## Fit the K.R Model

### Description

A beta mixture model for identifying differentially methylated CpG sites between `R` DNA sample types collected from `N` patients.

### Usage

```
beta_kr(data, M = 3, N, R, parallel_process = FALSE, seed = NULL)
```

### Arguments

`data` |
A dataframe of dimension |

`M` |
Number of methylation states to be identified. |

`N` |
Number of patients in the study. |

`R` |
Number of sample types collected from each patient for study. |

`parallel_process` |
If `TRUE`, the models are processed in parallel for increased computational efficiency. The default is `FALSE` due to package testing limitations. |

`seed` |
Seed to allow for reproducibility (default = NULL). |

### Details

The K.R model allows identification of the differentially methylated CpG sites between the `R` DNA sample types collected from each of `N` patients. As each CpG site in a DNA sample can belong to one of `M` methylation states, there can be `K=M^R` methylation state changes between `R` DNA sample types.
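
To make the count `K=M^R` concrete, the following base R sketch enumerates every combination of methylation states across the sample types. The names `state_patterns` and `sample_type_*` are illustrative only and are not part of the betaclust API.

```r
M <- 3  # number of methylation states
R <- 2  # number of DNA sample types

# One row per combination of states across the R sample types
state_patterns <- expand.grid(rep(list(seq_len(M)), R))
names(state_patterns) <- paste0("sample_type_", seq_len(R))

# Number of possible state patterns, equal to M^R
K <- nrow(state_patterns)

# Patterns in which the state is the same in every sample type
unchanged <- apply(state_patterns, 1, function(s) length(unique(s)) == 1)
```

Each row of `state_patterns` corresponds to one candidate cluster; the rows flagged in `unchanged` are those where the methylation state does not differ between sample types.
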
The shape parameters vary for each DNA sample type but are constrained to be equal for each patient. An initial clustering using k-means is performed to identify `K` clusters. The resulting clustering solution is provided as starting values to the Expectation-Maximisation algorithm. A digamma approximation is used to obtain the maximised parameters in the M-step.
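
The k-means initialisation described above can be sketched roughly as follows. This is not the package's internal code: the simulated data, the choice of `nstart`, and the names `x`, `z_init` and `tau_init` are assumptions made for illustration, using only `stats::kmeans` from base R.

```r
set.seed(1)
M <- 3; N <- 4; R <- 2
K <- M^R   # number of clusters
C <- 200   # number of simulated CpG sites (illustrative)

# Simulated beta values in (0, 1): C rows, N * R columns
x <- matrix(rbeta(C * N * R, shape1 = 2, shape2 = 5), nrow = C)

# Hard k-means partition of the CpG sites into K clusters
km <- kmeans(x, centers = K, nstart = 5)

# Starting values derived from the partition (hypothetical names)
z_init <- diag(K)[km$cluster, ]   # C x K indicator matrix of cluster membership
tau_init <- colMeans(z_init)      # initial mixing proportions
```

A hard partition of this kind gives the EM algorithm a starting membership matrix and starting mixing proportions, which the E-step then replaces with soft posterior probabilities.
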

### Value

A list containing:

cluster_size - The total number of CpG sites in each of the K clusters.

llk - A vector containing the log-likelihood value at each step of the EM algorithm.

alpha - The first shape parameter for the beta mixture model.

delta - The second shape parameter for the beta mixture model.

tau - The estimated mixing proportion for each cluster.

z - A matrix of dimension `C \times K` containing the posterior probability of each CpG site belonging to each of the `K` clusters.

classification - The classification corresponding to z, i.e. map(z).

uncertainty - The uncertainty of each CpG site's clustering.

DM - The AUC and WD metric for distribution similarity in each cluster.
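
As a rough illustration of how `classification`, `uncertainty` and `cluster_size` relate to `z`, the sketch below works on a toy posterior matrix in base R. It assumes the mclust-style convention that uncertainty is one minus the largest posterior probability; the documentation above does not spell out the formula, so treat that line as an assumption.

```r
# Toy posterior probability matrix: 4 CpG sites (rows), 3 clusters (columns)
z <- matrix(c(0.90, 0.05, 0.05,
              0.20, 0.70, 0.10,
              0.40, 0.35, 0.25,
              0.10, 0.10, 0.80), ncol = 3, byrow = TRUE)

# map(z): index of the most probable cluster for each CpG site
classification <- max.col(z, ties.method = "first")

# Assumed definition: 1 minus the maximum posterior probability per site
uncertainty <- 1 - apply(z, 1, max)

# Number of CpG sites assigned to each cluster
cluster_size <- tabulate(classification, nbins = ncol(z))
```
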

### Examples

```
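library(betaclust)

my.seed <- 190
M <- 3  # number of methylation states
N <- 4  # number of patients
R <- 2  # number of sample types per patient
# First 30 rows; columns 2:9 give 8 (= N * R) data columns
data_output <- beta_kr(pca.methylation.data[1:30, 2:9], M, N, R,
                       parallel_process = FALSE, seed = my.seed)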
```

*betaclust* version 1.0.3