hsp_nearest_neighbor {castor} | R Documentation |
Hidden state prediction based on nearest neighbor.
Description
Predict unknown (hidden) character states of tips on a tree using nearest neighbor matching.
Usage
hsp_nearest_neighbor(tree, tip_states, check_input=TRUE)
Arguments
tree |
A rooted tree of class "phylo". |
tip_states |
A vector of length Ntips, specifying the state of each tip in the tree. Tip states can be any valid data type (e.g., characters, integers, continuous numbers, and so on). |
check_input |
Logical, specifying whether to perform some basic checks on the validity of the input data. If you are certain that your input data are valid, you can set this to FALSE to skip these checks. |
Details
For each tip with unknown state, this function seeks the closest tip with known state, in terms of patristic distance (the sum of edge lengths along the path connecting two tips). The state of the closest tip is then used as a prediction of the unknown state. If several known tips are equally close, which one is used is unpredictable (this is unlikely to occur if edge lengths are continuous numbers, but may happen frequently if, e.g., all edges have unit length). This algorithm is arguably one of the crudest methods for predicting character states, so use at your own discretion.
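The matching rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the distance matrix D is a hypothetical, precomputed patristic distance matrix for four tips, and ties are resolved by which.min (first match), which is an assumption of this sketch only.

```r
# Hypothetical patristic distances between 4 tips (symmetric, zero diagonal)
labels = c("A", "B", "C", "D")
D = matrix(c(0, 3, 5, 7,
             3, 0, 6, 8,
             5, 6, 0, 4,
             7, 8, 4, 0),
           nrow=4, byrow=TRUE, dimnames=list(labels, labels))

# NA marks tips with unknown state
tip_states = c(A="x", B=NA, C="y", D=NA)

# indices of tips with known state
known = unname(which(!is.na(tip_states)))

# for each tip, find the closest tip among those with known state
nearest = vapply(seq_along(tip_states),
                 function(i) known[which.min(D[i, known])],
                 integer(1))

# copy the state of the nearest known tip, and record the distance to it
predicted = tip_states[nearest]
names(predicted) = names(tip_states)
distances = D[cbind(seq_along(tip_states), nearest)]

print(predicted)
print(distances)
```

Tips with known state match themselves at distance zero, so their states are left unchanged.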
Any NA entries in tip_states are interpreted as unknown states.
If tree$edge.length is missing, each edge in the tree is assumed to have length 1. The tree may include multifurcations (i.e. nodes with more than 2 children) as well as monofurcations (i.e. nodes with only one child). Tips must be represented in tip_states in the same order as in tree$tip.label. tip_states need not include names; if names are included, however, they are checked for consistency with the tree's tip labels (if check_input==TRUE).
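When states are only available for some tips, and in arbitrary order, one way to build a correctly ordered tip_states vector is to index a named vector by the tip labels. The sketch below uses base R only; tip_labels is a hypothetical stand-in for tree$tip.label, and known_states is made-up illustration data.

```r
# stand-in for tree$tip.label (hypothetical labels)
tip_labels = c("A", "B", "C", "D")

# states known only for some tips, listed in arbitrary order
known_states = c(C=2, A=1)

# indexing by label reorders the states to match the tip order;
# labels absent from known_states yield NA, i.e. unknown state
tip_states = known_states[tip_labels]
names(tip_states) = tip_labels

print(tip_states)
```

The resulting vector has one entry per tip, in tip order, with NA marking every tip whose state is to be predicted.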
Value
A list with the following elements:
success |
Logical, indicating whether HSP was successful. If |
states |
Vector of length Ntips, listing the known and predicted state for each tip. |
nearest_neighbors |
Integer vector of length Ntips, listing for each tip the index of the nearest tip with known state. Hence, nearest_neighbors[i] is the index of the tip whose state was used for tip i. |
nearest_distances |
Numeric vector of length Ntips, listing for each tip the patristic distance to the nearest tip with known state. For tips with known state, distances will be zero. |
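As a rough sketch of how the returned list might be inspected, the snippet below builds a small tree from a Newick string and tabulates, for each tip with unknown state, the tip it was matched to. The tree, edge lengths and states are made up for illustration, and castor's read_tree is assumed to be available for parsing the string.

```r
library(castor)

# small example tree (edge lengths chosen for illustration only)
tree = read_tree(string="((A:1,B:2):1,(C:1,D:3):2);")

# states of A and C are known, B and D are unknown (NA);
# indexing by label keeps the order consistent with tree$tip.label
tip_states = c(A=1, B=NA, C=2, D=NA)[tree$tip.label]

hsp = hsp_nearest_neighbor(tree, tip_states)

if(hsp$success){
  unknown = which(is.na(tip_states))
  # one row per tip with unknown state: matched tip, predicted state, distance
  print(data.frame(tip      = tree$tip.label[unknown],
                   matched  = tree$tip.label[hsp$nearest_neighbors[unknown]],
                   state    = hsp$states[unknown],
                   distance = hsp$nearest_distances[unknown]))
}else{
  cat("HSP failed\n")
}
```

Checking success before using the other elements is a cautious habit here, since the remaining elements are only meaningful if the prediction succeeded.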
Author(s)
Stilianos Louca
References
J. R. Zaneveld and R. L. V. Thurber (2014). Hidden state prediction: A modification of classic ancestral state reconstruction algorithms helps unravel complex symbioses. Frontiers in Microbiology. 5:431.
See Also
hsp_max_parsimony, hsp_mk_model
Examples
## Not run:
# generate random tree
Ntips = 20
tree = generate_random_tree(list(birth_rate_intercept=1),max_tips=Ntips)$tree
# simulate a binary trait
Q = get_random_mk_transition_matrix(2, rate_model="ER")
tip_states = simulate_mk_model(tree, Q)$tip_states
# print tip states
print(tip_states)
# set half of the tips to unknown state
tip_states[sample.int(Ntips,size=as.integer(Ntips/2),replace=FALSE)] = NA
# reconstruct all tip states via nearest neighbor
predicted_states = hsp_nearest_neighbor(tree, tip_states)$states
# print predicted tip states
print(predicted_states)
## End(Not run)