codeGap {monographaR}

R Documentation

Code gap

Description

This function takes a numeric vector (or a two-column data.frame holding the min and max values of each sample) and looks for breaks (gaps) in the distribution of values. If any gap is found, each sample is coded as a character state based on those gaps.

Usage

codeGap(x, n = NULL, max.states = NULL, poly.sep = "/", gap.size = NULL)

Arguments

`x`	integer/numeric or a two column data.frame (min and max)
`n`	integer, desired number of states (if NULL the function will try to suggest a number)
`max.states`	integer, the maximum possible number of states
`poly.sep`	character, to indicate polymorphic states (if any)
`gap.size`	numeric, the number that should be considered as a "gap"
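As an illustration of what `poly.sep` is for, the sketch below codes a single min/max range against a set of state limits and joins the states with `poly.sep` when the range overlaps more than one of them. The helper `code_range()`, its arguments, and the overlap rule are assumptions made for this example; they are not part of monographaR and may differ from what `codeGap` does internally.

```r
# Hypothetical helper, not part of monographaR.
# states: data.frame with the lower (lo) and upper (hi) limit of each state.
code_range <- function(x.min, x.max, states, poly.sep = "/") {
  labels <- paste(states$lo, states$hi, sep = "-")
  # assumption: a state applies when the sample range overlaps its interval
  hit <- states$lo <= x.max & states$hi >= x.min
  if (!any(hit)) return(NA_character_)
  # more than one overlapping state gives a polymorphic code
  paste(labels[hit], collapse = poly.sep)
}

states <- data.frame(lo = c(1, 15, 25), hi = c(5, 20, 42))
code_range(2, 4, states)    # "1-5"
code_range(4, 16, states)   # "1-5/15-20" (polymorphic)
code_range(8, 12, states)   # NA (falls entirely inside a gap)
```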

Details

If n = NULL, the function tries to find the best number of states (n) based on the number of polymorphic samples in the resulting classification. In large data sets it is a good idea to constrain the search with max.states (e.g., max.states = 10). This coding tries to replicate the coding traditionally used in taxonomy.
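The general idea of gap coding can be sketched in base R as follows. This is only a minimal illustration, not the algorithm implemented in `codeGap`: the helper `gap_states()` is hypothetical, and it assumes that a gap is any difference between consecutive sorted values larger than `gap.size`, with states labelled "min-max".

```r
# Hypothetical helper, not part of monographaR.
gap_states <- function(x, gap.size = 2) {
  v <- sort(unique(x[!is.na(x)]))               # distinct observed values, ascending
  # assumption: a new state starts where consecutive values differ by more than gap.size
  group <- cumsum(c(TRUE, diff(v) > gap.size))
  lo <- tapply(v, group, min)                   # lower limit of each state
  hi <- tapply(v, group, max)                   # upper limit of each state
  labels <- paste(lo, hi, sep = "-")            # e.g. "1-5"
  # map each original value to the label of its state (NA stays NA)
  unname(labels[group[match(x, v)]])
}

x <- c(NA, 1:5, 15:20, 25:42, 49:60, 68:90)
states <- gap_states(x, gap.size = 2)
unique(states)
# NA "1-5" "15-20" "25-42" "49-60" "68-90"
```

A larger `gap.size` merges neighbouring groups in this sketch; for example, `gap.size = 5` would join 15:20 and 25:42 into a single state.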

Value

A list including:

`dat`	data.frame including the original value and the coded value (state)
`polymorphic`	the number of polymorphic samples (if n = NULL, it is returned for all tested scenarios)
`dist`	a histogram of the data distribution

Author(s)

Marcelo Reginato

Examples


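# sample values with clear gaps (and one missing value)
c(NA, 1:5, 15:20, 25:42, 49:60, 68:90) -> x
# two-column data.frame of ranges: min = x, max = x + 2
# (data.frame() renames the duplicated column name to "x.1")
data.frame(x,x=x+2) -> x2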

codeGap(x, n=3, max.states = 5) -> code1
code1$dat

### check the distribution

na.omit(unique(code1$dat$state)) -> b
cols <- sort(rep(rainbow(length(b)),2))
as.numeric(unlist(strsplit(b, "-"))) -> b
plot(code1$dist)
abline(v=b, lty="dashed", col=cols, lwd=2)

### estimate "n"

codeGap(x, n=NULL, max.states = NULL) -> code1
code1$dat
plot(code1$dist)

### check the distribution

na.omit(unique(code1$dat$state)) -> b
cols <- sort(rep(rainbow(length(b)),2))
as.numeric(unlist(strsplit(b, "-"))) -> b
plot(code1$dist)
abline(v=b, lty="dashed", col=cols, lwd=2)

### ranges

codeGap(x2, n=NULL, max.states =3 , gap.size=2) -> code1
code1$dat
unique(code1$dat$state)

### check the distribution

na.omit(unique(code1$dat$state)) -> b
cols <- sort(rep(rainbow(length(b)),2))
as.numeric(unlist(strsplit(b, "-"))) -> b
plot(code1$dist)
abline(v=b, lty="dashed", col=cols, lwd=2)
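The "check the distribution" step is repeated after each call above, so it could be wrapped in a small helper. The sketch below is one possible way to do that. Both `state_bounds()` and `plot_states()` are hypothetical names, not functions of monographaR, and they assume state labels of the form "lo-hi" with non-negative limits, optionally joined by `poly.sep`.

```r
# Hypothetical helpers, not part of monographaR.

# Extract the numeric limits of all states from a vector of state labels.
state_bounds <- function(states, poly.sep = "/") {
  s <- unique(states[!is.na(states)])
  # split polymorphic codes such as "1-5/15-20" into single states
  s <- unique(unlist(strsplit(s, poly.sep, fixed = TRUE)))
  # split each "lo-hi" label into its two limits
  as.numeric(unlist(strsplit(s, "-", fixed = TRUE)))
}

# Plot a histogram object and mark the limits of each state.
plot_states <- function(dist, states, poly.sep = "/") {
  b <- state_bounds(states, poly.sep)
  cols <- rep(rainbow(length(b) / 2), each = 2)   # one colour per state
  plot(dist)
  abline(v = b, lty = "dashed", col = cols, lwd = 2)
  invisible(b)
}

state_bounds(c(NA, "1-5", "1-5", "15-20", "1-5/15-20"))
# 1 5 15 20
```

With a result from `codeGap`, the call would then be `plot_states(code1$dist, code1$dat$state)`.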

[Package monographaR version 1.3.1 Index]