map_to_state_space {castor}    R Documentation

Map states of a discrete trait to integers.


Given a list of states (e.g., for each tip in a tree), map the unique states to integers 1,..,Nstates, where Nstates is the number of possible states. This function can be used to translate states that are originally represented by characters or factors, into integer states as required by ancestral state reconstruction and hidden state prediction functions in this package.


map_to_state_space(raw_states, fill_gaps=FALSE, 
                   sort_order="natural", include_state_values=FALSE)



raw_states: A vector of values (states), each of which can be converted to a distinct character string. The vector can include the same value multiple times, for example if the values represent the trait's states for tips in a tree.


fill_gaps: Logical. If TRUE, states are converted to integers using as.integer(as.character()), and all missing intermediate integer values are included as additional possible states. For example, if raw_states contained the values 2,4,6, then 3 and 5 are assumed to also be possible states.
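The gap-filling rule can be sketched in base R. This is only an illustration of the behavior described above, not castor's internal code; the variable names int_states, possible_states and mapped are hypothetical.

```r
# Base-R sketch of the gap-filling rule (assumed behavior, not castor's source).
raw_states <- c(2, 4, 6, 4)

# Convert to integers the way the description states.
int_states <- as.integer(as.character(raw_states))

# Every integer between the smallest and largest observed value
# is treated as a possible state, including unobserved ones.
possible_states <- seq(min(int_states), max(int_states))
Nstates <- length(possible_states)

# Position of each raw state within the filled state space.
mapped <- match(int_states, possible_states)

print(possible_states)
print(Nstates)
print(mapped)
```

Here the unobserved values 3 and 5 occupy positions in the state space even though no entry of raw_states maps to them.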


sort_order: Character, specifying the order in which raw_states should be mapped to ascending integers. Either "natural" or "alphabetical". If "natural", numerical parts of characters are sorted numerically, e.g. as in "3"<"a2"<"a12"<"b1".
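To make the difference between the two orderings concrete, the following base-R sketch contrasts a plain alphabetical sort with a simplified natural sort. The helper natural_key is hypothetical and written for illustration only; castor's own natural ordering may treat edge cases (leading zeros, decimals, very long numbers) differently.

```r
# Hypothetical helper: zero-pad every run of digits so that a plain
# character sort compares the numeric parts by value.
natural_key <- function(x, width = 10) {
  vapply(x, function(s) {
    # Split the string into alternating digit and non-digit chunks.
    parts <- regmatches(s, gregexpr("[0-9]+|[^0-9]+", s))[[1]]
    is_num <- grepl("^[0-9]+$", parts)
    parts[is_num] <- sprintf(paste0("%0", width, "d"), as.integer(parts[is_num]))
    paste(parts, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

states <- c("b1", "3", "a12", "a2")

# Alphabetical (C-locale) order compares character by character.
alphabetical <- sort(states, method = "radix")

# Natural order compares numeric chunks numerically.
natural <- states[order(natural_key(states), method = "radix")]

print(alphabetical)
print(natural)
```

The two results differ only in where "a12" lands: before "a2" alphabetically, after it in natural order.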


include_state_values: Logical, specifying whether to also return a numerical version of the unique states. For example, the states "3","a2","4.5" will be mapped to the numeric values 3, NA, 4.5.
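The numeric conversion in this example matches what base R's as.numeric produces for character input. The sketch below assumes that kind of conversion purely for illustration; it does not call castor.

```r
# State names that are not valid numbers become NA when converted.
state_names <- c("3", "a2", "4.5")

# suppressWarnings() hides the "NAs introduced by coercion" warning.
state_values <- suppressWarnings(as.numeric(state_names))

print(state_values)
```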


Several ancestral state reconstruction and hidden state prediction algorithms in the castor package (e.g., asr_max_parsimony) require that the focal trait's states are represented by integer indices within 1,..,Nstates. These indices are then associated, for example, with column and row indices in the transition cost matrix (in the case of maximum parsimony reconstruction) or with column indices in the returned matrix containing marginal ancestral state probabilities (e.g., in asr_mk_model). The function map_to_state_space can be used to conveniently convert a set of discrete states into integers, for use with the aforementioned algorithms.


A list with the following elements:


Nstates: Integer. Number of possible states for the trait, based on the unique values encountered in raw_states (after conversion to characters). This may be larger than the number of unique values in raw_states, if fill_gaps was set to TRUE.


state_names: Character vector of size Nstates, storing the original name (character version) of each state. For example, if raw_states was c("b1","3","a12","a2","b1","a2") and sort_order=="natural", then Nstates will be 4 and state_names will be c("3","a2","a12","b1").


state_values: Optional, only included if include_state_values==TRUE. A numeric vector of size Nstates, providing the numerical value for each unique state.


mapped_states: Integer vector of size equal to length(raw_states), listing the integer representation of each value in raw_states.


name2index: An integer vector of size Nstates, with names(name2index) set to state_names. This vector can be used to map any new list of states (in character format) to their integer representation. In particular, name2index[as.character(raw_states)] is equal to mapped_states.
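The relationship between state_names, name2index and mapped_states can be reproduced with a named integer vector in base R. The sketch below builds these objects by hand, using the example values given above, rather than calling map_to_state_space; new_states is a hypothetical second set of observations.

```r
# Values taken from the state_names example above (natural order).
raw_states  <- c("b1", "3", "a12", "a2", "b1", "a2")
state_names <- c("3", "a2", "a12", "b1")

# Named integer vector: names are the states, values are their indices.
name2index <- setNames(seq_along(state_names), state_names)

# Looking up by name yields the integer representation of each raw state.
mapped_states <- unname(name2index[as.character(raw_states)])
print(mapped_states)

# The mapping is reversible through state_names.
print(state_names[mapped_states])

# New observations can be mapped the same way; unknown states give NA.
new_states <- c("a2", "3", "zz")
new_mapped <- unname(name2index[new_states])
print(new_mapped)
```

States that were absent from the original mapping come back as NA, so it is worth checking for missing values before passing the result to a reconstruction function.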


Stilianos Louca


# generate a sequence of random states
unique_states = c("b","c","a")
raw_states = unique_states[sample.int(n=length(unique_states),size=10,replace=TRUE)]

# map to integer state space
mapping = map_to_state_space(raw_states)

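cat(sprintf("Checking that the original unique states are the same as those inferred:\n"))
print(sort(unique_states))
print(mapping$state_names)

cat(sprintf("Checking reversibility of mapping:\n"))
print(raw_states)
print(mapping$state_names[mapping$mapped_states])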

[Package castor version 1.6.8 Index]