pai_summary {ptools}R Documentation

Summary Table for Multiple PAI Stats

Description

Takes a list of multiple PAI summary tables (for different predictions) and returns summaries at fixed area thresholds

Usage

pai_summary(pai_list, thresh, labs, wide = TRUE)

Arguments

pai_list

list of data frames that have the PAI stats from the pai function

thresh

vector of area numbers to select, e.g. 10 would select the top 10 areas, c(10,100) would select the top 10 and the top 100 areas

labs

vector of characters that specifies the labels for each PAI dataframe, should be the same length as pai_list

wide

boolean, if TRUE (default), returns data frame in wide format. Else returns summaries in long format

Details

Given predictions over an entire sample, this returns a dataframe with the sorted best PAI (sorted by density of predicted counts per area). PAI is defined as:

PAI = \frac{c_t/C}{a_t/A}

Where the numerator is the percent of crimes in cumulative t areas, and the denominator is the percent of the area encompassed. PEI is the observed PAI divided by the best possible PAI if you were a perfect oracle, so is scaled between 0 and 1. RRI is predicted/observed, so if you have very bad predictions can return Inf or undefined! See Wheeler & Steenbeek (2019) for the definitions of the different metrics. User note, PEI may behave funny with different sized areas.

Value

A dataframe with the PAI/PEI/RRI, and cumulative crime/predicted counts, for each original table

References

Drawve, G., & Wooditch, A. (2019). A research note on the methodological and theoretical considerations for assessing crime forecasting accuracy with the predictive accuracy index. Journal of Criminal Justice, 64, 101625.

Wheeler, A. P., & Steenbeek, W. (2021). Mapping the risk terrain for crime using machine learning. Journal of Quantitative Criminology, 37(2), 445-480.

See Also

pai() for a summary table of metrics for multiple pai tables given fixed N thresholds

Examples


# Making some very simple fake data
crime_dat <- data.frame(id=1:6,
                        obs=c(6,7,3,2,1,0),
                        pred=c(8,4,4,2,1,0))
crime_dat$const <- 1
p1 <- pai(crime_dat,'obs','pred','const')
print(p1)

# Combining multiple predictions, making
# A nice table
crime_dat$rand <- sample(crime_dat$obs,nrow(crime_dat),FALSE)
p2 <- pai(crime_dat,'obs','rand','const')
pai_summary(list(p1,p2),c(1,3,5),c('one','two'))


[Package ptools version 2.0.0 Index]