R: Summary Table for Multiple PAI Stats

pai_summary {ptools}

R Documentation

Summary Table for Multiple PAI Stats

Description

Takes a list of PAI summary tables (one per set of predictions) and returns summaries at fixed area thresholds.

Usage

pai_summary(pai_list, thresh, labs, wide = TRUE)

Arguments

`pai_list`	list of data frames that have the PAI stats from the `pai` function
`thresh`	vector of area counts to select; e.g. 10 selects the top 10 areas, and c(10,100) selects the top 10 and the top 100 areas
`labs`	character vector of labels, one for each PAI data frame; should be the same length as `pai_list`
`wide`	boolean; if TRUE (default), returns the data frame in wide format, otherwise returns the summaries in long format

Details

Given predictions over an entire sample, this returns a dataframe with the sorted best PAI (sorted by density of predicted counts per area). PAI is defined as:

PAI = \frac{c_t/C}{a_t/A}

Here the numerator is the proportion of crimes captured in the cumulative top t areas, and the denominator is the proportion of the total area those areas encompass. PEI is the observed PAI divided by the best possible PAI (what a perfect oracle would achieve), so it is scaled between 0 and 1. RRI is predicted/observed, so very bad predictions can return Inf or an undefined value (for example, when the cumulative observed count is zero). See Wheeler & Steenbeek (2021) for the definitions of the different metrics. Note that PEI may behave unexpectedly when areas differ in size.
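The definitions above can be sketched in base R. This is only an illustration of the formulas as described here, not the code used inside `pai` or `pai_summary`; the variable names (`pai_val`, `pei_val`, `rri_val`) are made up for the sketch, and every area is assumed to have size 1.

```r
# Toy data matching the Examples section; equal-sized areas are an assumption
obs  <- c(6, 7, 3, 2, 1, 0)
pred <- c(8, 4, 4, 2, 1, 0)
area <- rep(1, length(obs))

# Rank areas by predicted density, highest first
ord <- order(pred / area, decreasing = TRUE)
cum_obs  <- cumsum(obs[ord])
cum_pred <- cumsum(pred[ord])
cum_area <- cumsum(area[ord])

# PAI = (c_t / C) / (a_t / A)
pai_val <- (cum_obs / sum(obs)) / (cum_area / sum(area))

# Best possible PAI: rank by observed density instead (the perfect oracle)
best <- order(obs / area, decreasing = TRUE)
pai_best <- (cumsum(obs[best]) / sum(obs)) / (cumsum(area[best]) / sum(area))

# PEI = observed PAI / best possible PAI; RRI = predicted / observed
pei_val <- pai_val / pai_best
rri_val <- cum_pred / cum_obs

res <- data.frame(t = seq_along(obs), PAI = pai_val, PEI = pei_val, RRI = rri_val)
print(res[c(1, 3, 5), ])  # rows at thresholds 1, 3 and 5

# With zero observed crimes, predicted/observed is Inf (or NaN for 0/0)
zero_case <- c(2, 0) / c(0, 0)
print(zero_case)
```

At t = 1 the top predicted area holds 6 of the 19 crimes in 1 of 6 areas, so PAI is (6/19)/(1/6) = 36/19; the oracle would have picked the area with 7 crimes, giving PEI = 6/7.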

Value

A dataframe with the PAI/PEI/RRI, and cumulative crime/predicted counts, for each original table

References

Drawve, G., & Wooditch, A. (2019). A research note on the methodological and theoretical considerations for assessing crime forecasting accuracy with the predictive accuracy index. Journal of Criminal Justice, 64, 101625.

Wheeler, A. P., & Steenbeek, W. (2021). Mapping the risk terrain for crime using machine learning. Journal of Quantitative Criminology, 37(2), 445-480.

Examples


# Making some very simple fake data
library(ptools)
set.seed(10) # makes the random predictions below reproducible
crime_dat <- data.frame(id=1:6,
                        obs=c(6,7,3,2,1,0),
                        pred=c(8,4,4,2,1,0))
crime_dat$const <- 1 # every area has the same size
p1 <- pai(crime_dat,'obs','pred','const')
print(p1)

# Combining multiple predictions into one summary table
# (second set of predictions is a random shuffle of the observed counts)
crime_dat$rand <- sample(crime_dat$obs,nrow(crime_dat),FALSE)
p2 <- pai(crime_dat,'obs','rand','const')
pai_summary(list(p1,p2),c(1,3,5),c('one','two'))

[Package ptools version 2.0.0 Index]