MLFS {MLFS}    R Documentation

MLFS

Description

Machine Learning Forest Simulator

Usage

MLFS(
  data_NFI,
  data_site,
  data_tariffs = NULL,
  data_climate = NULL,
  df_volumeF_parameters = NULL,
  thinning_weights_species = NULL,
  final_cut_weights_species = NULL,
  thinning_weights_plot = NULL,
  final_cut_weights_plot = NULL,
  form_factors = NULL,
  form_factors_level = "species_plot",
  uniform_form_factor = 0.42,
  sim_steps,
  volume_calculation = "volume_functions",
  merchantable_whole_tree = "merchantable",
  sim_harvesting = TRUE,
  sim_mortality = TRUE,
  sim_ingrowth = TRUE,
  sim_crownHeight = TRUE,
  harvesting_sum = NULL,
  forest_area_ha = NULL,
  harvest_sum_level = NULL,
  plot_upscale_type = NULL,
  plot_upscale_factor = NULL,
  mortality_share = NA,
  mortality_share_type = "volume",
  mortality_model = "glm",
  ingrowth_model = "ZIF_poiss",
  BAI_rf_mtry = NULL,
  ingrowth_rf_mtry = NULL,
  mortality_rf_mtry = NULL,
  nb_laplace = 0,
  harvesting_type = "final_cut",
  share_thinning = 0.8,
  final_cut_weight = 10,
  thinning_small_weight = 1,
  species_n_threshold = 100,
  height_model = "brnn",
  crownHeight_model = "brnn",
  BRNN_neurons_crownHeight = 1,
  BRNN_neurons_height = 3,
  height_pred_level = 0,
  include_climate = FALSE,
  select_months_climate = c(1, 12),
  set_eval_mortality = TRUE,
  set_eval_crownHeight = TRUE,
  set_eval_height = TRUE,
  set_eval_ingrowth = TRUE,
  set_eval_BAI = TRUE,
  k = 10,
  blocked_cv = TRUE,
  max_size = NULL,
  max_size_increase_factor = 1,
  ingrowth_codes = c(3),
  ingrowth_max_DBH_percentile = 0.9,
  measurement_thresholds = NULL,
  area_correction = NULL,
  export_csv = FALSE,
  sim_export_mode = TRUE,
  include_mortality_BAI = TRUE,
  intermediate_print = FALSE
)

Arguments

data_NFI

data frame with individual tree variables

data_site

data frame with site descriptors. This data is related to data_NFI based on the 'plotID' column

data_tariffs

optional, but mandatory if volume is calculated using the one-parametric tariff functions. Data frame with plotID, species and V45. See details.

data_climate

data frame with climate data, covering the initial calibration period and all the years which will be included in the simulation

df_volumeF_parameters

optional, data frame with species-specific volume function parameters

thinning_weights_species

data frame with thinning weights for each species. The first column represents species code, each next column consists of species-specific thinning weights applied in each simulation step

final_cut_weights_species

data frame with final cut weights for each species. The first column represents species code, each next column consists of species-specific final cut weights applied in each simulation step
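The layout described for thinning_weights_species and final_cut_weights_species can be sketched as below. This is only an illustration: the species codes, the weights and the column names are hypothetical. The only structure taken from the descriptions above is that the first column holds the species code and each following column holds the weights for one simulation step.

```r
# Hypothetical example for a simulation with three steps
sim_steps <- 3

thinning_weights_species <- data.frame(
  species = c(10, 11, 30),     # hypothetical species codes
  step_1  = c(1.0, 1.5, 0.5),  # weights applied in simulation step 1
  step_2  = c(1.0, 1.2, 0.8),  # weights applied in simulation step 2
  step_3  = c(1.0, 1.0, 1.0)   # weights applied in simulation step 3
)

# One species column plus one weight column per simulation step
n_weight_columns <- ncol(thinning_weights_species) - 1
```

A data frame for final_cut_weights_species would follow the same shape.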

thinning_weights_plot

data frame with harvesting weights related to plot IDs, used for thinning

final_cut_weights_plot

data frame with harvesting weights related to plot IDs, used for final cut

form_factors

optional, data frame with species-specific form factors

form_factors_level

character, the level of specified form factors. It can be 'species', 'plot' or 'species_plot'

uniform_form_factor

numeric, uniform form factor to be used for all species and plots. Used only if form_factors is not provided

sim_steps

The number of simulation steps

volume_calculation

character string defining the method for volume calculation: 'tariffs', 'volume_functions', 'form_factors' or 'slo_2p_volume_functions'

merchantable_whole_tree

character, 'merchantable' or 'whole_tree'. It indicates which type of volume functions will be used. This parameter is used only for volume calculation using the 'slo_2p_volume_functions'.

sim_harvesting

logical, should harvesting be simulated?

sim_mortality

logical, should mortality be simulated?

sim_ingrowth

logical, should ingrowth be simulated?

sim_crownHeight

logical, should crown heights be simulated? If TRUE, a crownHeight column is expected in data_NFI

harvesting_sum

a value, or a vector of values defining the harvesting sums through the simulation stage. If a single value, then it is used in all simulation steps. If a vector of values, the first value is used in the first step, the second in the second step, etc.
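The rule for single values and vectors can be illustrated with a small helper. The function harvest_for_step() is hypothetical and is not part of MLFS; it only mirrors the behaviour described above and makes no claim about how the package implements it internally.

```r
# Hypothetical helper, not an MLFS function: returns the harvesting sum
# that the description above assigns to a given simulation step
harvest_for_step <- function(harvesting_sum, step) {
  if (length(harvesting_sum) == 1) {
    harvesting_sum        # a single value is used in every step
  } else {
    harvesting_sum[step]  # otherwise the step-th value is used
  }
}

constant_sum <- 100000                    # same sum in all steps
varying_sum  <- c(100000, 120000, 90000)  # one sum per step (hypothetical)

step_2_constant <- harvest_for_step(constant_sum, 2)
step_2_varying  <- harvest_for_step(varying_sum, 2)
```

When a vector is supplied, it seems safest to give it one value per simulation step.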

forest_area_ha

the total area (in hectares) of all forests that are subject to the simulation

harvest_sum_level

integer with value 0 or 1 defining the level of specified harvesting sum: 0 for plot level and 1 for regional level

plot_upscale_type

character defining the upscale method of plot-level values. It can be 'area' or 'factor'. If 'area', provide the forest area represented by all plots in hectares (forest_area_ha argument). If 'factor', provide the fixed factor used to upscale the area of all plots (plot_upscale_factor argument). Please note: forest_area_ha / plot_upscale_factor = number of unique plots. This argument is important when the harvesting sum is defined at the regional level.
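The relation between the two upscaling options can be checked with simple arithmetic. The number of plots below is hypothetical; in practice it would be the number of unique plotID values in the input data.

```r
# Hypothetical number of unique plots in the input data
n_plots <- 250

# Option 'factor': a fixed factor applied to every plot
plot_upscale_factor <- 1600

# Equivalent total area for option 'area', following
# forest_area_ha / plot_upscale_factor = number of unique plots
forest_area_ha <- n_plots * plot_upscale_factor

implied_n_plots <- forest_area_ha / plot_upscale_factor
```

Under these assumed numbers, either plot_upscale_type = "factor" with plot_upscale_factor = 1600 or plot_upscale_type = "area" with the computed forest_area_ha describes the same upscaling.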

plot_upscale_factor

numeric value to be used to upscale area of each plot

mortality_share

a value, or a vector of values defining the proportion of the volume which is to be the subject of mortality. If a single value, then it is used in all simulation steps. If a vector of values, the first value is used in the first step, the second in the second step, and so on.

mortality_share_type

character, it can be 'volume' or 'n_trees'. If 'volume' then the mortality share relates to total standing volume, if 'n_trees' then mortality share relates to the total number of standing trees

mortality_model

model to be used for mortality prediction: 'glm' for generalized linear models; 'rf' for random forest algorithm; 'naiveBayes' for Naive Bayes algorithm

ingrowth_model

model to be used for ingrowth predictions. 'glm' for generalized linear models (Poisson regression), 'ZIF_poiss' for zero inflated Poisson regression and 'rf' for random forest

BAI_rf_mtry

the number of variables randomly sampled as candidates at each split of a random forest model for predicting basal area increments (BAI). If NULL, default settings are applied.

ingrowth_rf_mtry

the number of variables randomly sampled as candidates at each split of a random forest model for predicting ingrowth. If NULL, default settings are applied.

mortality_rf_mtry

the number of variables randomly sampled as candidates at each split of a random forest model for predicting mortality. If NULL, default settings are applied.

nb_laplace

value used for Laplace smoothing (additive smoothing) in naive Bayes algorithm. Defaults to 0 (no Laplace smoothing)

harvesting_type

character, it could be 'random', 'final_cut', 'thinning' or 'combined'. The latter combines 'final_cut' and 'thinning' options, where the share of each is specified with the argument 'share_thinning'

share_thinning

numeric, a number or a vector of numbers between 0 and 1 that specifies the share of thinning in comparison to final_cut. Only used if harvesting_type is 'combined'

final_cut_weight

numeric value affecting the probability distribution of harvested trees. Greater value increases the share of harvested trees having larger DBH. Default is 10.

thinning_small_weight

numeric value affecting the probability distribution of harvested trees. Greater value increases the share of harvested trees having smaller DBH. Default is 1.

species_n_threshold

a positive integer defining the minimum number of observations required to treat a species as an independent group

height_model

character string defining the model to be used for height prediction. If brnn, then ANN method with Bayesian Regularization is applied.

crownHeight_model

character string defining the model to be used for crown heights. Available are ANN with Bayesian regularization (brnn) or linear regression (lm)

BRNN_neurons_crownHeight

a positive integer defining the number of neurons to be used in the brnn method for predicting crown heights

BRNN_neurons_height

a positive integer defining the number of neurons to be used in the brnn method for predicting tree heights

height_pred_level

integer with value 0 or 1 defining the level of prediction for height-diameter (H-D) models. The value 1 defines a plot-level prediction, while the value 0 defines regional-level predictions. Default is 0. If using 1, make sure to have representative plot-level data for each species.

include_climate

logical, should climate variables be included as predictors

select_months_climate

vector of subset months to be considered. Default is c(1,12), which uses all months.

set_eval_mortality

logical, should the mortality model be evaluated and returned as the output

set_eval_crownHeight

logical, should the crownHeight model be evaluated and returned as the output

set_eval_height

logical, should the height model be evaluated and returned as the output

set_eval_ingrowth

logical, should the ingrowth model be evaluated and returned as the output

set_eval_BAI

logical, should the BAI model be evaluated and returned as the output

k

the number of folds to be used in the k-fold cross-validation

blocked_cv

logical, should the blocked cross-validation be used in the evaluation phase?

max_size

a data frame with the maximum values of DBH for each species. If a tree exceeds this value, it dies. If not provided, the maximum is estimated from the input data. Two columns must be present, i.e. 'species' and 'DBH_max'

max_size_increase_factor

numeric value used to increase the maximum DBH for each species when the maximum is estimated from the input data. If the argument 'max_size' is provided, 'max_size_increase_factor' is ignored. Default is 1. To increase the maximum by 10 percent, use 1.1.
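The two ways of controlling the maximum DBH can be sketched as follows. The species codes and DBH values are hypothetical; only the column names 'species' and 'DBH_max' and the meaning of the factor 1.1 come from the descriptions above.

```r
# Option 1: supply the maxima directly through max_size.
# The two column names are required; the values are hypothetical.
max_size <- data.frame(
  species = c(10, 11, 30),
  DBH_max = c(90, 110, 75)
)

# Option 2: let the maximum be estimated from the input data and
# widen it by 10 percent with max_size_increase_factor
estimated_DBH_max <- 100          # hypothetical estimate for one species
max_size_increase_factor <- 1.1
widened_DBH_max <- estimated_DBH_max * max_size_increase_factor
```

Because max_size_increase_factor is ignored whenever max_size is provided, only one of the two options takes effect in a given call.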

ingrowth_codes

numeric value or a vector of codes which refer to ingrowth trees

ingrowth_max_DBH_percentile

which percentile should be used to estimate the maximum simulated value of ingrowth trees?

measurement_thresholds

data frame with two variables: 1) DBH_threshold and 2) weight. This information is used to assign the correct weights in BAI and increment sub-model; and to upscale plot-level data to hectares.
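A measurement_thresholds table might be built as below. The thresholds and plot areas are hypothetical, and deriving the weight as 10000 divided by the plot area in square metres is an assumption based on the stated purpose of upscaling plot-level data to hectares; the example data shipped with the package should be checked for the convention actually used.

```r
# Hypothetical concentric plot design:
#   trees with DBH >= 10 are measured on a 200 m2 plot,
#   trees with DBH >= 30 are measured on a 500 m2 plot
plot_area_m2 <- c(200, 500)

measurement_thresholds <- data.frame(
  DBH_threshold = c(10, 30),
  # Assumed convention: number of trees per hectare that one
  # measured tree represents (10000 m2 per hectare / plot area)
  weight = 10000 / plot_area_m2
)
```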

area_correction

optional data frame with three variables: 1) plotID, 2) DBH_threshold and 3) the correction factor by which the weight for this particular category is multiplied.

export_csv

logical, if TRUE, at each simulation step, the results are saved in the current working directory as csv file

sim_export_mode

logical, if FALSE, the results of the individual simulation steps are not merged into the final export table. Therefore, output element 1 ($sim_results) will be empty. This was introduced to allow simulations when using larger data sets and long term simulations that might exceed the available RAM. In such cases, we recommend setting the argument export_csv = TRUE, which will export each simulation step to the current working directory.

include_mortality_BAI

logical, should basal area increments (BAI) be used as an independent variable for predicting individual tree mortality?

intermediate_print

logical, if TRUE intermediate steps will be printed while MLFS is running

Value

a list of class mlfs with at least 15 elements:

  1. $sim_results - a data frame with the simulation results

  2. $height_eval - a data frame with predicted and observed tree heights, or a character string indicating that tree heights were not evaluated

  3. $crownHeight_eval - a data frame with predicted and observed crown heights, or a character string indicating that crown heights were not evaluated

  4. $mortality_eval - a data frame with predicted and observed probabilities of dying for all individual trees, or a character string indicating that the mortality sub-model was not evaluated

  5. $ingrowth_eval - a data frame with predicted and observed number of new ingrowth trees, separately for each ingrowth level, or a character string indicating that the ingrowth model was not evaluated

  6. $BAI_eval - a data frame with predicted and observed basal area increments (BAI), or a character string indicating that the BAI model was not evaluated

  7. $height_model_species - the output model for tree heights (species level)

  8. $height_model_speciesGroups - the output model for tree heights (species group level)

  9. $crownHeight_model_species - the output model for crown heights (species level)

  10. $crownHeight_model_speciesGroups - the output model for crown heights (species group level)

  11. $mortality_model - the output model for mortality

  12. $BAI_model_species - the output model for basal area increments (species level)

  13. $BAI_model_speciesGroups - the output model for basal area increments (species group level)

  14. $max_size - a data frame with maximum allowed diameter at breast height (DBH) for each species

  15. $ingrowth_model_3 - the output model for ingrowth (level 1) – the output name depends on ingrowth codes

  16. $ingrowth_model_15 - the output model for ingrowth (level 2) – optional and the output name depends on ingrowth codes

Examples


library(MLFS)

# open example data
data(data_NFI)
data(data_site)
data(data_climate)
data(df_volume_parameters)
data(measurement_thresholds)

test_simulation <- MLFS(data_NFI = data_NFI,
 data_site = data_site,
 data_climate = data_climate,
 df_volumeF_parameters = df_volume_parameters,
 sim_steps = 2,
 sim_harvesting = TRUE,
 harvesting_sum = 100000,
 harvest_sum_level = 1,
 plot_upscale_type = "factor",
 plot_upscale_factor = 1600,
 measurement_thresholds = measurement_thresholds,
 ingrowth_codes = c(3,15),
 volume_calculation = "volume_functions",
 select_months_climate = seq(6,8),
 intermediate_print = FALSE
 )


[Package MLFS version 0.4.2 Index]