meltt_validate {meltt} | R Documentation |
Validation method to assess data integrated by meltt.
Description
Function to efficiently sample a subset of integrated data to generate performance statistics.
Usage
meltt_validate(object, description.vars = NULL, sample_prop = .1, within_window = TRUE,
               spatial_window = NULL, temporal_window = NULL, reset = FALSE)
Arguments
object |
object of class |
description.vars |
String vector referencing column names located in the input data. These are the variables that will be folded into the description being validated; if none are provided, taxonomy levels are used by default. |
sample_prop |
Proportion of the total matched pairs to sample. The size of this sample also determines the size of the control group, which consists of entries not flagged as matches, paired with other unique entries and with matches; these pairs should not be matches. For example, if |
within_window |
Use the same spatio-temporal window used in the initial data integration to calculate what counts as a "proximate event" for all entries in the control group. |
spatial_window |
If |
temporal_window |
If |
reset |
If TRUE, the validation step will be reset and a new validation sample frame will be produced. Default = FALSE. |
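As a rough illustration of how sample_prop scales the review workload, the sketch below computes the number of sampled matched pairs and an equally sized control group. The pair count n_matched_pairs is hypothetical, and rounding up with ceiling() is an assumption; the package may round differently.

```r
# Hypothetical number of matched pairs flagged by an integration.
n_matched_pairs <- 250
sample_prop <- 0.1

# Assumption: the sample size is rounded up to a whole number of pairs.
n_match_sample <- ceiling(n_matched_pairs * sample_prop)

# The control group is drawn to the same size as the matched sample.
n_control_sample <- n_match_sample

# Total number of pairs a reviewer would need to assess.
n_total_to_review <- n_match_sample + n_control_sample
print(c(matches = n_match_sample,
        control = n_control_sample,
        total   = n_total_to_review))
```

Under these assumptions, lowering sample_prop shrinks both groups at once, which is the main lever for keeping manual review manageable.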
Details
meltt_validate offers an efficient way to assess the performance of meltt for a specific integration. It randomly samples a proportion of the event pairs that the algorithm flagged as the same event, then samples a "control group" of equal size from events identified as unique (yielding both unique-unique and unique-match pairs). The function compiles the samples and launches a shiny app to ease assessment. Once every entry in the sample has been assessed, the function returns accuracy statistics as a confusion matrix. Performance is measured by how far the reviewer's qualitative assessment departs from the meltt integration.
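The kind of confusion matrix described here can be sketched in base R. The vectors meltt_flag and reviewer_coding are hypothetical stand-ins (1 = same event, 0 = distinct events); they are not the package's internal column names, and the report that meltt_validate actually prints may be laid out differently.

```r
# Hypothetical validation sample: what the algorithm decided for each pair ...
meltt_flag      <- c(1, 1, 1, 1, 0, 0, 0, 0)
# ... and what a human reviewer decided after reading the descriptions.
reviewer_coding <- c(1, 1, 1, 0, 0, 0, 0, 1)

# Cross-tabulate the two codings. Fixing the factor levels keeps the
# table 2 x 2 even if one category happens to be absent from the sample.
confusion <- table(
  meltt    = factor(meltt_flag, levels = c(1, 0),
                    labels = c("match", "unique")),
  reviewer = factor(reviewer_coding, levels = c(1, 0),
                    labels = c("match", "unique"))
)
print(confusion)

# Cell counts: rows are the algorithm's decision, columns the reviewer's.
true_pos  <- confusion["match", "match"]    # flagged and confirmed
false_pos <- confusion["match", "unique"]   # flagged but rejected
false_neg <- confusion["unique", "match"]   # missed by the algorithm
true_neg  <- confusion["unique", "unique"]  # correctly left unmatched

# Share of pairs on which reviewer and algorithm agree.
accuracy <- (true_pos + true_neg) / sum(confusion)
print(accuracy)
```

Disagreements in the "match" row point to false positives among flagged matches, while disagreements in the "unique" row come from the control group and point to matches the algorithm missed.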
Value
The function automatically overwrites the input "meltt" object. If the validation set has been completely reviewed, the function prints the performance statistics.
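For readers unfamiliar with functions that update their input without an explicit assignment, the sketch below shows one way such behavior can be written in R. The function mark_reviewed and the list result are hypothetical; this is only an illustration of the calling pattern, not meltt's implementation.

```r
# Hypothetical stand-in for a function that updates its input in place.
mark_reviewed <- function(object) {
  # Recover the variable name the caller used for the argument.
  object_name <- deparse(substitute(object))
  object$reviewed <- TRUE
  # Write the updated object back under the same name in the caller's scope.
  assign(object_name, object, envir = parent.frame())
  invisible(NULL)
}

result <- list(reviewed = FALSE)
mark_reviewed(result)   # note: no `result <- ...` assignment is needed
print(result$reviewed)
```

The practical consequence for meltt_validate is that the call can be made on its own line, as in the Examples, and the validation progress is carried by the same object on the next call.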
Author(s)
Karsten Donnay and Eric Dunford.
References
Karsten Donnay, Eric T. Dunford, Erin C. McGrath, David Backer, David E. Cunningham. (2018). "Integrating Conflict Event Data." Journal of Conflict Resolution.
See Also
Examples
library(meltt)
data(crashMD)
output <- meltt(crash_data1, crash_data2, crash_data3,
                taxonomies = crash_taxonomies, twindow = 1, spatwindow = 3)
## Not run:
# app will activate to validate sample.
meltt_validate(output)
# to draw a new sample (e.g. with a different sample_prop), reset must be
# TRUE to overwrite the existing validation sample
meltt_validate(output, sample_prop = .1, reset = TRUE)
# override the validation codings (bypassing manual review) to get a sense of the report
output$validation$validation_set$coding <- 1
meltt_validate(output)
## End(Not run)