hdfail {frailtySurv} | R Documentation
Hard drive failure dataset
Description
This dataset contains the observed follow-up times and SMART statistics of 52k unique hard drives.
Daily snapshots of a large backup storage provider's drive population, covering 2 years, were made publicly available. On each day, the Self-Monitoring, Analysis, and Reporting Technology (SMART) statistics of operational drives are recorded. When a hard drive is no longer operational, it is marked as a failure and removed from subsequent daily snapshots. New hard drives are also continuously added to the population. In total, the data cover over 52k unique hard drives across approximately 2 years, with 2885 (5.5%) observed failures.
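As a quick sanity check on these counts, the number of unique drives and the observed failure proportion can be computed directly from the data frame. The following is a minimal sketch using only base R; it assumes the frailtySurv package is installed and that status is coded 0/1.

library(frailtySurv)
data(hdfail)

# Number of unique hard drives in the data
length(unique(hdfail$serial))

# Number and proportion of observed failures (roughly 2885 and 0.055)
sum(hdfail$status)
mean(hdfail$status)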
Usage
data("hdfail")
Format
A data frame with 52422 observations on the following 8 variables.
serial
unique serial number of the hard drive
model
hard drive model
time
the observed follow-up time
status
failure indicator
temp
temperature in Celsius
rsc
binary covariate, where 1 indicates sectors that encountered read, write, or verification errors.
rer
binary covariate, where 1 indicates a non-zero rate of errors that occurred in hardware when reading data from the disk.
psc
binary covariate, where 1 indicates there were sectors waiting to be remapped due to an unrecoverable error.
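The coding of these variables can be verified by inspecting the data frame directly. A brief sketch, assuming hdfail has been loaded as above:

# Overall structure and variable types
str(hdfail)

# Distribution of drive temperature (Celsius)
summary(hdfail$temp)

# Tabulate the binary covariates
table(hdfail$rsc)
table(hdfail$rer)
table(hdfail$psc)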
Source
https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data
Examples
## Not run:
data(hdfail)

# Select only Western Digital hard drives
dat <- subset(hdfail, grepl("WDC", model))

# Fit a shared gamma frailty model, with hard drives clustered by model
fit.hd <- fitfrail(Surv(time, status) ~ temp + rer + rsc
                                      + psc + cluster(model),
                   dat, frailty = "gamma", fitmethod = "score")

fit.hd

## End(Not run)
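Approximate standard errors for the estimated parameters can then be obtained from the estimated covariance matrix. This is a minimal sketch, assuming a vcov method is provided for fitfrail objects (as is conventional for fitted models in R); for a dataset of this size the computation may be slow.

## Not run:
# Standard errors from the estimated variance-covariance matrix
se <- sqrt(diag(vcov(fit.hd)))
se

## End(Not run)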