tidyst_kda {eks}    R Documentation

Tidy and geospatial kernel discriminant analysis (classification)

Description

Tidy and geospatial versions of kernel discriminant analysis (classification) for 1- and 2-dimensional data.

Usage

tidy_kda(data, ...)
st_kda(x, ...)

Arguments

data

grouped tibble of data values

x

sf object with grouping attribute and with point geometry

...

other parameters passed to the ks::kda function

Details

A kernel discriminant analysis (aka classification or supervised learning) assigns each grid point to the group with the highest density value, weighted by the prior probabilities.
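The classification rule above can be sketched in 1-d with base R only. This is a minimal illustration, not the ks::kda implementation: it uses stats::density with its default bandwidth, simulated data, and sample proportions as assumed prior probabilities. The names x1, x2, weighted and kda_sketch are hypothetical.

```r
## Sketch of the KDA rule in 1-d using base R (stats::density),
## not the ks::kda implementation.
set.seed(1)
x1 <- rnorm(200, mean = 0, sd = 1)        # simulated group "a"
x2 <- rnorm(100, mean = 3, sd = 1)        # simulated group "b"
prior_prob <- c(a = 200, b = 100) / 300   # assumed priors: sample proportions

## evaluate both group densities on the same grid
grid_from <- min(x1, x2) - 1
grid_to <- max(x1, x2) + 1
d1 <- density(x1, from = grid_from, to = grid_to, n = 201)
d2 <- density(x2, from = grid_from, to = grid_to, n = 201)

## densities weighted by the prior probabilities, one column per group
weighted <- cbind(a = prior_prob[["a"]] * d1$y,
                  b = prior_prob[["b"]] * d2$y)

## label each grid point with the group of largest weighted density
label <- colnames(weighted)[max.col(weighted, ties.method = "first")]
kda_sketch <- data.frame(x = d1$x, weighted, label = label)
head(kda_sketch)
```

Grid points near the centre of group "a" are labelled "a" and those near the centre of group "b" are labelled "b"; the boundary falls where the two weighted density curves cross.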

The output from *_kda has the same structure as the kernel density estimate from *_kde, except that estimate is the weighted kernel density values at the grid points (weighted by prior_prob), and label becomes the KDA grouping variable that indicates to which of the groups the grid points belong. The output is a grouped tibble, grouped by the input grouping variable.

For details of the computation of the kernel discriminant analysis and the bandwidth selector procedure, see ?ks::kda. The bandwidth matrix of smoothing parameters is computed as in ks::kde per group.
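To illustrate what "per group" means for the bandwidth, the following sketch selects one 1-d bandwidth for each group from that group's data alone. It is an assumption-laden stand-in: stats::bw.SJ from base R is used in place of the selectors in ks, and the data frame dat and the result h_group are hypothetical names on simulated data.

```r
## Per-group bandwidth selection in 1-d, sketched with base R.
## stats::bw.SJ is a stand-in; ks has its own bandwidth selectors.
set.seed(2)
dat <- data.frame(
    value = c(rnorm(150, mean = 0, sd = 0.5),   # tightly spread group
              rnorm(150, mean = 4, sd = 2)),    # widely spread group
    group = rep(c("narrow", "wide"), each = 150))

## one bandwidth per group, each computed from that group's values only
h_group <- tapply(dat$value, dat$group, stats::bw.SJ)
h_group
```

Because each group gets its own smoothing parameter, the more dispersed group receives a larger bandwidth than the tightly clustered one.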

Value

For tidy_kda, the output is an object of class tidy_ks, which is a tibble with columns:

x

evaluation points on the x-axis (name is taken from the 1st input variable in data)

y

evaluation points on the y-axis, for 2-d data only (name is taken from the 2nd input variable in data)

estimate

weighted kernel density estimate values

prior_prob

prior probabilities for each group

ks

first row (within each group) contains the untidy kernel estimate from ks::kda

tks

short object class label derived from the ks object class

label

estimated KDA group label at (x,y)

group

grouping variable (same as input).

For st_kda, the output is an object of class st_ks, which is a list with fields:

tidy_ks

tibble of simplified output (ks, tks, label, group) from tidy_kda

grid

sf object of grid of weighted kernel density estimate values, as polygons, with attributes estimate, label, group copied from the tidy_ks object

sf

sf object of 1% to 99% contour regions of weighted kernel density estimate, as multipolygons, with attributes contlabel derived from the contour level; and estimate, group copied from the tidy_ks object.

Examples

## tidy discriminant analysis (classification)
library(eks)
data(cardio, package="ks")
cardio <- dplyr::as_tibble(cardio[,c("ASTV","Mean","NSP")])
cardio <- dplyr::mutate(cardio, NSP=ordered(NSP))
cardio <- dplyr::group_by(cardio, NSP)
## draw a random 25% subsample as the training set
set.seed(8192)
cardio.train.ind <- sample(seq_len(nrow(cardio)), round(nrow(cardio)/4,0))
cardio.train <- cardio[cardio.train.ind,]
cardio.train1 <- dplyr::select(cardio.train, ASTV, NSP)
cardio.train2 <- dplyr::select(cardio.train, ASTV, Mean, NSP)

## tidy 1-d classification
t1 <- tidy_kda(cardio.train1) 
gt1 <- ggplot2::ggplot(t1, ggplot2::aes(x=ASTV)) 
gt1 + ggplot2::geom_line(ggplot2::aes(colour=NSP)) + 
    ggplot2::geom_rug(ggplot2::aes(colour=label), sides="b", linewidth=1.5) +
    ggplot2::scale_colour_brewer(palette="Dark2", na.translate=FALSE) 

## tidy 2-d classification
t2 <- tidy_kda(cardio.train2)
gt2 <- ggplot2::ggplot(t2, ggplot2::aes(x=ASTV, y=Mean)) 
gt2 + geom_contour_ks(ggplot2::aes(colour=NSP)) + 
    ggplot2::geom_tile(ggplot2::aes(fill=label), alpha=0.2) +
    ggplot2::scale_fill_brewer(palette="Dark2", na.translate=FALSE) +
    ggplot2::scale_colour_brewer(palette="Dark2") + ggplot2::theme_bw()

## geospatial classification
data(wa)
data(grevilleasf)
grevillea_gr <- dplyr::filter(grevilleasf, species=="hakeoides" |
    species=="paradoxa")
grevillea_gr <- dplyr::mutate(grevillea_gr, species=factor(species))  
grevillea_gr <- dplyr::group_by(grevillea_gr, species)
s1 <- st_kda(grevillea_gr)
## keep only grid cells inside the convex hull of the density support
s2 <- st_ksupp(st_kde(grevillea_gr))
s1$grid <- s1$grid[sf::st_intersects(s1$grid, 
    sf::st_convex_hull(sf::st_union(s2$sf)), sparse=FALSE),]

## base R plot
xlim <- c(1.2e5, 1.1e6); ylim <- c(6.1e6, 7.2e6)
plot(wa, xlim=xlim, ylim=ylim)
plot(s1, which_geometry="grid", add=TRUE, border=NA, legend=FALSE)
plot(s1, add=TRUE, lwd=2, border=rep(colorspace::qualitative_hcl(
    palette="Dark2", n=2, alpha=0.5), each=3))

## geom_sf plot
gs1 <- ggplot2::ggplot(s1) + ggplot2::geom_sf(data=wa, fill=NA) + 
    ggplot2::geom_sf(data=dplyr::mutate(s1$grid, species=label), 
    ggplot2::aes(fill=species), alpha=0.1, colour=NA) + 
    ggthemes::theme_map()
gs1 + ggplot2::geom_sf(data=st_get_contour(s1), 
    ggplot2::aes(colour=species), fill=NA) +
    colorspace::scale_colour_discrete_qualitative(palette="Dark2") +
    colorspace::scale_fill_discrete_qualitative(palette="Dark2") +
    ggplot2::facet_wrap(~species) +
    ggplot2::coord_sf(xlim=xlim, ylim=ylim)

[Package eks version 1.0.4 Index]