generate {aphid} | R Documentation |
Generate random sequences from a model.
Description
The generate
function outputs a random sequence from a HMM or PHMM.
Usage
generate(x, size, ...)
## S3 method for class 'HMM'
generate(x, size, logspace = "autodetect", random = TRUE, ...)
## S3 method for class 'PHMM'
generate(
x,
size,
logspace = "autodetect",
gap = "-",
random = TRUE,
DNA = FALSE,
AA = FALSE,
...
)
Arguments
x |
an object of class |
size |
a non-negative integer representing the length of the output
sequence if x is a |
... |
additional arguments to be passed between methods. |
logspace |
logical indicating whether the emission and transition
probabilities of x are logged. If |
random |
logical indicating whether residues should be emitted randomly with probabilities defined by the emission probabilities in the model (TRUE; default), or deterministically, whereby each residue is emitted and each transition taken based on the maximum emission/transition probability in the current state. |
gap |
the character used to represent gaps (delete states)
in the output sequence (only applicable for |
DNA |
logical indicating whether the returned sequence should be a
|
AA |
logical indicating whether the returned sequence should be a
|
Details
This simple function generates a single sequence from a HMM or profile HMM by recursively simulating a path through the model. The function is fairly slow in its current state, but a faster C++ function may be made available in a future version depending on demand.
Value
a named vector giving the sequence of residues emitted by the model, with the "names" attribute representing the hidden states.
Author(s)
Shaun Wilkinson
References
Durbin R, Eddy SR, Krogh A, Mitchison G (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press, Cambridge, United Kingdom.
Examples
## Generate a random sequence from a standard HMM
## The dishonest casino example from Durbin et al (1998) chapter 3.2
states <- c("Begin", "Fair", "Loaded")
residues <- paste(1:6)
### Define the transition probability matrix
A <- matrix(c(0, 0, 0, 0.99, 0.95, 0.1, 0.01, 0.05, 0.9), nrow = 3)
dimnames(A) <- list(from = states, to = states)
### Define the emission probability matrix
E <- matrix(c(rep(1/6, 6), rep(1/10, 5), 1/2), nrow = 2, byrow = TRUE)
dimnames(E) <- list(states = states[-1], residues = residues)
### Build and plot the HMM object
x <- structure(list(A = A, E = E), class = "HMM")
plot(x, main = "Dishonest casino HMM")
### Generate a random sequence from the model
generate(x, size = 300)
##
## Generate a random sequence from a profile HMM:
## Small globin alignment data from Durbin et al (1998) Figure 5.3
data(globins)
### Derive a profile hidden Markov model from the alignment
globins.PHMM <- derivePHMM(globins, residues = "AMINO", seqweights = NULL)
plot(globins.PHMM, main = "Profile hidden Markov model for globins")
### Simulate a random sequence from the model
suppressWarnings(RNGversion("3.5.0"))
set.seed(999)
simulation <- generate(globins.PHMM, size = 20)
simulation ## "F" "S" "A" "N" "N" "D" "W" "E"
### Names attribute indicates that all residues came from "match" states