R: Wraps jbd_coordinates_transposed to identify and fix...

jbd_Ctrans_chunker {BeeBDC}

R Documentation

Wraps jbd_coordinates_transposed to identify and fix transposed occurrences

Description

Because the jbd_coordinates_transposed() function is very RAM-intensive, this wrapper allows a user to specify chunk-sizes and only analyse a small portion of the occurrence data at a time. The prefix jbd_ is used to highlight the difference between this function and the original bdc::bdc_coordinates_transposed(). This function will preferably use the countryCode column generated by bdc::bdc_country_standardized().
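
The chunking idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: `chunk_bounds()` is a hypothetical helper, the toy data frame stands in for occurrence records, and `chunkStart` is treated as a chunk number, as the Arguments section describes it.

```r
# Hypothetical helper (not part of BeeBDC): row ranges covered by each chunk.
# Assumes chunkStart is a chunk number, so a restart skips finished chunks.
chunk_bounds <- function(n_rows, stepSize, chunkStart = 1) {
  n_chunks <- ceiling(n_rows / stepSize)
  if (chunkStart > n_chunks) {
    return(data.frame(chunk = integer(0), first = integer(0), last = integer(0)))
  }
  chunk <- seq(chunkStart, n_chunks)
  data.frame(
    chunk = chunk,
    first = (chunk - 1) * stepSize + 1,
    # The final chunk may be shorter than stepSize
    last = pmin(chunk * stepSize, n_rows)
  )
}

# Toy data standing in for occurrence records
toy <- data.frame(id = 1:25, value = seq(0.5, 12.5, by = 0.5))
bounds <- chunk_bounds(nrow(toy), stepSize = 10)

# Process one chunk at a time, then bind the pieces back together
pieces <- lapply(seq_len(nrow(bounds)), function(i) {
  toy[bounds$first[i]:bounds$last[i], , drop = FALSE]
})
result <- do.call(rbind, pieces)

# Restarting from the second chunk skips the rows already processed
restart <- chunk_bounds(nrow(toy), stepSize = 10, chunkStart = 2)
```

Only one chunk needs to be held by the expensive step at any moment, which is why a smaller `stepSize` trades run time for lower peak memory.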

Usage

jbd_Ctrans_chunker(
  data = NULL,
  lat = "decimalLatitude",
  lon = "decimalLongitude",
  idcol = "database_id",
  country = "country_suggested",
  countryCode = "countryCode",
  sci_names = "scientificName",
  border_buffer = 0.2,
  save_outputs = TRUE,
  stepSize = 1e+06,
  chunkStart = 1,
  progressiveSave = TRUE,
  path = tempdir(),
  append = TRUE,
  scale = "large",
  mc.cores = 1
)

Arguments

`data`	A data frame or tibble. Occurrence records as input.
`lat`	Character. The column with latitude in decimal degrees. Default = "decimalLatitude".
`lon`	Character. The column with longitude in decimal degrees. Default = "decimalLongitude".
`idcol`	Character. The column name with a unique record identifier. Default = "database_id".
`country`	Character. The name of the column containing country names. Default = "country_suggested".
`countryCode`	Character. The name of the column containing ISO-2 country codes. Default = "countryCode".
`sci_names`	Character. The column containing scientific names. Default = "scientificName".
`border_buffer`	Numeric. The buffer, in decimal degrees, around points to help match them to countries. Default = 0.2 (~22 km at equator).
`save_outputs`	Logical. If TRUE, transposed occurrences will be saved to their own file.
`stepSize`	Numeric. The number of occurrences to process in each chunk. Default = 1000000.
`chunkStart`	Numeric. The chunk number to start from. This can be > 1 when you need to restart the function from a certain chunk; for example if R failed unexpectedly.
`progressiveSave`	Logical. If TRUE then the country output list will be saved between each iteration so that `append` can be used if the function is stopped part way through.
`path`	Character. The path to a directory in which to save the 01_coordinates_transposed_ output. Default = tempdir().
`append`	Logical. If TRUE, the function will try to append to an existing file.
`scale`	Passed to rnaturalearth's ne_countries(). Scale of map to return, one of 110, 50, 10 or 'small', 'medium', 'large'. Default = "large".
`mc.cores`	Numeric. If > 1, the jbd_correct_coordinates function will run in parallel via mclapply on the number of cores specified. If = 1, it will run as a serial loop. NOTE: Windows machines must use a value of 1 (see ?parallel::mclapply). Additionally, be aware that each thread can use a large amount of memory. Default = 1.
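
The "~22 km at equator" figure quoted for `border_buffer` can be checked with a rough conversion from decimal degrees to kilometres. The sketch below assumes a spherical Earth with an equatorial circumference of about 40,075 km; `buffer_km()` is a hypothetical helper and not part of BeeBDC.

```r
# Approximate length of one degree of arc at the equator, in km
# (spherical approximation, circumference of ~40,075 km)
km_per_degree <- 40075 / 360

# Hypothetical helper (not part of BeeBDC): convert a buffer in decimal
# degrees to approximate north-south and east-west distances in km.
buffer_km <- function(buffer_deg, latitude = 0) {
  north_south <- buffer_deg * km_per_degree
  # East-west distance shrinks with the cosine of the latitude
  east_west <- north_south * cos(latitude * pi / 180)
  c(north_south = north_south, east_west = east_west)
}

buffer_km(0.2)                 # roughly 22 km in both directions at the equator
buffer_km(0.2, latitude = 60)  # the east-west distance is about half of that
```

Because the buffer is given in degrees, its east-west extent on the ground narrows towards the poles, which may matter for datasets that span high latitudes.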

Value

Returns the input data frame with a new column, coordinates_transposed, where FALSE = records that had their coordinates transposed.
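
As a small illustration of how the returned flag might be used, the sketch below builds a toy data frame in place of real output and pulls out the flagged records. The values are made up for illustration, and the presence of NA in the column is an assumption.

```r
# Toy stand-in for the function's output (values invented for illustration)
out <- data.frame(
  database_id = c("id_1", "id_2", "id_3", "id_4"),
  coordinates_transposed = c(TRUE, FALSE, TRUE, NA)
)

# Count the flags, including any missing values
flag_counts <- table(out$coordinates_transposed, useNA = "always")

# FALSE marks records whose coordinates were transposed;
# which() silently drops NA so they are not selected by accident
flagged <- out[which(!out$coordinates_transposed), , drop = FALSE]
flagged$database_id
```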

Examples

if(requireNamespace("rnaturalearthdata")){
  library(dplyr)
  # Import and prepare the data
  data(beesFlagged)
  beesFlagged <- beesFlagged %>%
    dplyr::select(!c(.val, .sea)) %>%
    # Cut down the dataset to run the example quicker
    dplyr::filter(dplyr::row_number() %in% 1:20)
  # Run the function
  beesFlagged_out <- jbd_Ctrans_chunker(
    # bdc_coordinates_transposed inputs
    data = beesFlagged,
    idcol = "database_id",
    lat = "decimalLatitude",
    lon = "decimalLongitude",
    country = "country_suggested",
    countryCode = "countryCode",
    # in decimal degrees (1 degree is ~111 km at the equator)
    border_buffer = 1,
    save_outputs = FALSE,
    sci_names = "scientificName",
    # chunker inputs
    # How many rows to process at a time
    stepSize = 1000000,
    # Chunk number to start from
    chunkStart = 1,
    # Progressively save the output between each iteration?
    progressiveSave = FALSE,
    path = tempdir(),
    # If FALSE it may overwrite existing dataset
    append = FALSE,
    # Users should select scale = "large" as it is more thoroughly tested
    scale = "medium",
    mc.cores = 1
  )
  table(beesFlagged_out$coordinates_transposed, useNA = "always")
} # END if require

[Package BeeBDC version 1.2.0 Index]