| dataCar {actuaRE} | R Documentation |
data Car
Description
This data set is taken from the dataCar data set of the insuranceData package and slightly adjusted (see the code in examples for reproducing this data set).
The original data set is based on one-year vehicle insurance policies taken out in 2004 or 2005. There are 67566 policies, of which 4589 (6.8%) had at least one claim.
Usage
data(dataCar)
Format
A data frame with 67566 observations on the following 15 variables.
veh_value: vehicle value, in $10,000s
exposure: exposure, ranging from 0 to 1
clm: occurrence of claim (0 = no, 1 = yes)
numclaims: number of claims
claimcst0: claim amount (0 if no claim)
veh_body: vehicle body, coded as BUS CONVT COUPE HBACK HDTOP MCARA MIBUS PANVN RDSTR SEDAN STNWG TRUCK UTE
veh_age: 1 (youngest), 2, 3, 4
gender: a factor with levels F M
area: a factor with levels A B C D E F
agecat: 1 (youngest), 2, 3, 4, 5, 6
X_OBSTAT_: a factor with levels 01101 0 0 0
Y: the loss ratio, defined as the claim amount (claimcst0) divided by the exposure, as computed in the Examples
w: the exposure, identical to exposure
VehicleType: type of vehicle, common vehicle or uncommon vehicle
VehicleBody: vehicle body, identical to veh_body
Details
This is an adjusted version of the dataCar data set, from which observations with a loss ratio larger than 1 000 000 were removed. In addition, the exposure was summed per vehicle body, and the vehicle bodies whose
summed exposure was less than 100 were removed. This ensures that there is sufficient exposure for each vehicle body category.
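The two filtering steps can be illustrated on a small toy data frame. This is only a sketch in base R, not the code used to build the data set (that code is given in the Examples). The values in toy are hypothetical, and min_exposure is a hypothetical stand-in for the threshold of 100 so that the example stays small. The loss ratio is computed as claimcst0 / exposure, as in the Examples.

```r
# Toy data with hypothetical values (not taken from dataCar)
toy <- data.frame(
  veh_body  = factor(c("SEDAN", "SEDAN", "SEDAN", "UTE", "UTE", "BUS")),
  claimcst0 = c(0, 500, 0, 1200, 0, 0),
  exposure  = c(0.9, 0.5, 0.8, 0.0001, 0.7, 0.3)
)

# Hypothetical threshold, standing in for the value 100 used for dataCar
min_exposure <- 0.5

# Loss ratio and weight, defined as in the Examples
toy$Y <- toy$claimcst0 / toy$exposure
toy$w <- toy$exposure

# Step 1: keep only observations with a loss ratio below 1e6
toy <- toy[toy$Y < 1e6, ]

# Step 2: sum the exposure per vehicle body and keep the bodies
# whose summed exposure reaches the threshold
total_w <- tapply(toy$w, toy$veh_body, sum)
keep <- names(total_w)[total_w >= min_exposure]
toy <- toy[toy$veh_body %in% keep, ]

# Drop factor levels that no longer occur
toy$veh_body <- droplevels(toy$veh_body)
```

In this toy example, the UTE observation with a very small exposure is removed in step 1 because its loss ratio exceeds 1e6, and BUS is removed in step 2 because its summed exposure (0.3) is below min_exposure.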
Source
http://www.acst.mq.edu.au/GLMsforInsuranceData
References
De Jong P., Heller G.Z. (2008), Generalized linear models for insurance data, Cambridge University Press
Examples
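# Reconstruct this data set from the original dataCar data set in the insuranceData package
library(plyr)      # ddply() and .()
library(magrittr)  # %<>% assignment pipe
data("dataCar", package = "insuranceData")
# Loss ratio Y (claim amount per unit of exposure) and weight w (the exposure)
dataCar$Y <- with(dataCar, claimcst0 / exposure)
dataCar$w <- dataCar$exposure
# Keep only observations with a loss ratio below 1e6
dataCar <- dataCar[which(dataCar$Y < 1e6), ]
# Per vehicle body: exposure-weighted mean loss ratio (V1) and summed exposure (V2)
Yw <- ddply(dataCar, .(veh_body), function(x) c(crossprod(x$Y, x$w) / sum(x$w), sum(x$w)))
# Remove vehicle bodies whose summed exposure is below 100, then drop the unused levels
dataCar <- dataCar[!dataCar$veh_body %in% Yw[Yw$V2 < 1e2, "veh_body"], ]
dataCar$veh_body %<>% droplevels()
# Group the vehicle bodies into common and uncommon vehicle types
dataCar$VehicleType <- sapply(tolower(dataCar$veh_body), function(x) {
  if (x %in% c("sedan", "ute", "hback"))
    "Common vehicle"
  else
    "Uncommon vehicle"
})
dataCar$VehicleBody <- dataCar$veh_body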