create_taxonomic_update_lookup {APCalign}    R Documentation

Create a table with the best-possible scientific name match for Australian plant names

Description

This function takes a list of Australian plant names that need to be reconciled with current taxonomy and generates a lookup table of the best-possible scientific name match for each input name.

Use case: This is APCalign’s core function. It combines the alignment of names and the updating of taxonomy in a single call.

Usage

create_taxonomic_update_lookup(
  taxa,
  stable_or_current_data = "stable",
  version = default_version(),
  taxonomic_splits = "most_likely_species",
  full = FALSE,
  fuzzy_abs_dist = 3,
  fuzzy_rel_dist = 0.2,
  fuzzy_matches = TRUE,
  APNI_matches = TRUE,
  imprecise_fuzzy_matches = FALSE,
  identifier = NA_character_,
  resources = load_taxonomic_resources(),
  quiet = FALSE,
  output = NULL
)

Arguments

taxa

A character vector of Australian plant names that need to be reconciled with current taxonomy.

stable_or_current_data

Either "stable" for a consistent version, or "current" for the leading-edge version.

version

The version number of the dataset to use.

taxonomic_splits

How to handle one-to-many taxonomic matches. Default is "most_likely_species". The other options are "return_all" and "collapse_to_higher_taxon". "most_likely_species" defaults to the original_name if that name is accepted by the APC; this will be correct for some species subsets but will produce errors in other cases, so use it with caution.

full

Logical; whether the full lookup table is returned (TRUE) or only the key columns (FALSE, the default).

fuzzy_abs_dist

The number of characters allowed to be different for a fuzzy match.

fuzzy_rel_dist

The proportion of characters allowed to be different for a fuzzy match.

fuzzy_matches

Logical; fuzzy matches are turned on by default. The relative and absolute distances allowed for fuzzy matches to species and infraspecific taxon names are set by the parameters fuzzy_abs_dist and fuzzy_rel_dist.

APNI_matches

Logical; name matches to the APNI (Australian Plant Names Index) are turned on by default.

imprecise_fuzzy_matches

Logical; imprecise fuzzy matching uses the fuzzy matching function with lenient thresholds (absolute distance of 5 characters; relative distance of 0.25). It offers a way to get a wider range of possible names, possibly corresponding to very distant spelling mistakes. It is FALSE by default, and all outputs should be checked because it often makes erroneous matches.
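
How the absolute and relative limits interact can be illustrated with a small base R sketch. This is not APCalign's implementation: within_fuzzy_limits() is a hypothetical helper, and it assumes that distance is the edit distance from utils::adist() and that the relative distance is measured against the length of the input name. The package's own matching may differ in both respects.

```r
# Hypothetical helper for illustration only; not part of APCalign.
# Assumption: edit distance from utils::adist(), relative distance
# taken as edit distance divided by the length of the input name.
within_fuzzy_limits <- function(input, candidate,
                                abs_dist = 3, rel_dist = 0.2) {
  d <- as.integer(utils::adist(input, candidate))
  d <= abs_dist && (d / nchar(input)) <= rel_dist
}

# One substituted character in a 15-character name passes both limits.
within_fuzzy_limits("Banksia serrate", "Banksia serrata")   # TRUE

# One edit in a 3-character name is 1/3 of the name: relative limit fails.
within_fuzzy_limits("Poa", "Pea")                           # FALSE

# Four extra characters exceed the default absolute limit of 3 ...
long_input <- "Eucalyptus camaldulensisxxxx"
within_fuzzy_limits(long_input, "Eucalyptus camaldulensis") # FALSE

# ... but pass the lenient limits quoted for imprecise_fuzzy_matches.
within_fuzzy_limits(long_input, "Eucalyptus camaldulensis",
                    abs_dist = 5, rel_dist = 0.25)          # TRUE
```

Under these assumptions, short names are constrained mainly by the relative limit and long names by the absolute limit.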

identifier

A dataset, location or other identifier, which defaults to NA.

resources

The taxonomic resources used for cleaning. By default they are loaded from a local copy on your computer. If the function is to be called repeatedly, it is much faster to load the resources once with load_taxonomic_resources() and pass the data in.

quiet

Logical to indicate whether to display messages while aligning taxa.

output

File path to save the output. If this file already exists, the function checks whether it covers a subset of the species passed in and tries to add to the file. This can be useful for large and growing projects.
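
The incremental idea behind output can be sketched in base R as follows. This is only an illustration of the general pattern, not the package's code: update_saved_lookup() and toy_lookup() are hypothetical names, the file is assumed to be a CSV with an original_name column, and the real function may check and merge the file differently.

```r
# Hypothetical helper for illustration only; not part of APCalign.
# `lookup_fun` stands in for whatever produces one row per input name.
update_saved_lookup <- function(taxa, path, lookup_fun) {
  if (file.exists(path)) {
    # Reuse rows saved by an earlier run and find names not yet covered.
    saved <- utils::read.csv(path, stringsAsFactors = FALSE)
    new_taxa <- setdiff(taxa, saved$original_name)
  } else {
    saved <- NULL
    new_taxa <- taxa
  }
  if (length(new_taxa) > 0) {
    # Process only the new names, append them, and save the combined table.
    saved <- rbind(saved, lookup_fun(new_taxa))
    utils::write.csv(saved, path, row.names = FALSE)
  }
  saved
}

# Toy stand-in that only echoes the names; a real lookup would align them.
toy_lookup <- function(x) {
  data.frame(original_name = x, stringsAsFactors = FALSE)
}

path <- tempfile(fileext = ".csv")
first <- update_saved_lookup(
  c("Banksia serrata", "Acacia melanoxylon"), path, toy_lookup
)
second <- update_saved_lookup(
  c("Banksia serrata", "Acacia melanoxylon", "Eucalyptus regnans"),
  path, toy_lookup
)
nrow(second)  # 3: only "Eucalyptus regnans" was processed in the second call
```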

Value

A lookup table containing the accepted and suggested names for each original name input, and additional taxonomic information such as taxon rank, taxonomic status, taxon IDs and genera.

See Also

load_taxonomic_resources

Other taxonomic alignment functions: align_taxa(), update_taxonomy()

Examples


resources <- load_taxonomic_resources()

# Example 1: three valid species names and one string that is not a plant name
create_taxonomic_update_lookup(c("Eucalyptus regnans",
                                 "Acacia melanoxylon",
                                 "Banksia integrifolia",
                                 "Not a species"),
                                 resources = resources)
                                 
# Example 2: misspellings of "Banksia serrata" plus a genus-level name,
# returning the full lookup table
input <- c("Banksia serrata", "Banksia serrate", "Banksia cerrata",
           "Banksea serrata", "Banksia serrrrata", "Dryandra")

create_taxonomic_update_lookup(
  taxa = input,
  identifier = "APCalign test",
  full = TRUE,
  resources = resources
)

# Example 3: names read from the test file shipped with the package,
# using its notes column as the identifier
taxon_list <-
  readr::read_csv(
    system.file("extdata", "test_taxa.csv", package = "APCalign"),
    show_col_types = FALSE
  )

create_taxonomic_update_lookup(
  taxa = taxon_list$original_name,
  identifier = taxon_list$notes,
  full = TRUE,
  resources = resources
)



[Package APCalign version 1.0.1 Index]